Effect of Hyperparameter Tuning Using Random Search on Tree-Based Classification Algorithm for Software Defect Prediction

https://doi.org/10.22146/ijccs.90437

Muhammad Hevny Rizky(1), Mohammad Reza Faisal(2*), Irwan Budiman(3), Dwi Kartini(4), Friska Abadi(5)

(1) Lambung Mangkurat University
(2) Lambung Mangkurat University
(3) Lambung Mangkurat University
(4) Lambung Mangkurat University
(5) Lambung Mangkurat University
(*) Corresponding Author

Abstract


Software is central to information technology, yet it frequently suffers from defects. Improving software quality and reliability therefore requires defect prediction. Tree-based algorithms such as Decision Tree, Random Forest, and Deep Forest show potential in this domain, but proper hyperparameter configuration is crucial for optimal results. This study applies the Random Search hyperparameter tuning technique to software defect prediction in order to improve prediction accuracy. Using the ReLink datasets, we identified effective hyperparameter settings for predicting software defects. With Random Search, Decision Tree, Random Forest, and Deep Forest achieved an average AUC of 0.73. Tuning with Random Search gave better results than the other tree-based configurations evaluated. The main contribution is the application of Random Search hyperparameter tuning to tree-based algorithms, particularly Random Forest.

Keywords


Software Defect Prediction; Hyperparameter Tuning; Decision Tree; Random Forest; Deep Forest
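
The tuning procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`auc`, `random_search`, `evaluate`), the toy data, and the search space are all hypothetical, and the single-split scorer is only a stand-in for training and validating a real tree-based classifier. The sketch shows the two ingredients the abstract names: sampling hyperparameter configurations at random, and ranking them by AUC.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def random_search(space, evaluate, n_iter, seed=0):
    """Draw n_iter configurations uniformly at random from `space`
    and return the best-scoring configuration with its score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy data, made up for illustration: two metrics per module, label 1 = defective.
X = [(1, 9), (2, 7), (3, 8), (6, 2), (7, 3), (8, 1), (4, 5), (5, 6)]
y = [0, 0, 0, 1, 1, 1, 0, 1]


def evaluate(params):
    """Stand-in for 'train a tree with these settings and validate it':
    scores each module with a single split on one feature."""
    f, t = params["feature"], params["threshold"]
    scores = [1.0 if row[f] > t else 0.0 for row in X]
    return auc(y, scores)


# Hypothetical search space; a real study would list tree hyperparameters here.
space = {"feature": [0, 1], "threshold": [2, 4, 6, 8]}
best_params, best_score = random_search(space, evaluate, n_iter=20)
print(best_params, best_score)
```

In practice `evaluate` would fit a Decision Tree, Random Forest, or Deep Forest with the sampled settings and return a cross-validated AUC; the search loop itself stays the same.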











Copyright (c) 2024 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Copyright of:
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
is a scientific journal publishing research results in Computing
and Cybernetics Systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
Email: ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs


