Comparison of Classification Algorithms for Software Defect Prediction Using the CRISP-DM Approach

Nurtriana Hidayati
Joko Suntoro
Galet Guntoro Setiaji

Abstract

Software defect prediction is an essential part of software quality testing, often referred to as software quality assurance, which aims to assess how well software meets its functional and performance requirements. Machine learning methods have shown better performance in finding software defects than manual methods. Classification algorithms in machine learning that have been used for software defect prediction include k-Nearest Neighbor (k-NN), Naïve Bayes (NB), and Decision Tree (CART). This study compares the performance of the classification algorithms k-NN, NB, and CART for software defect prediction using the CRISP-DM approach. CRISP-DM is a data mining process model with six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment, which frame the comparison of classification algorithms for predicting software defects. The software metrics used in this study come from seven NASA MDP datasets. The results show that the average accuracy of the CART algorithm, at 0.867, is better than that of the k-NN and NB algorithms, whose average accuracies are 0.859 and 0.778, respectively.
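The Modeling and Evaluation phases described above can be sketched with a minimal, hypothetical scikit-learn comparison of the same three classifiers by cross-validated accuracy. This is not the authors' code: the NASA MDP datasets are not bundled here, so a synthetic imbalanced binary dataset stands in, and the hyperparameters (k = 5 neighbors, default CART settings) are illustrative assumptions.

```python
# Sketch of comparing k-NN, Naive Bayes, and CART by mean accuracy
# over 10-fold cross-validation, mirroring the paper's evaluation setup.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a defect dataset: minority "defective" class.
X, y = make_classification(n_samples=500, n_features=20,
                           weights=[0.8, 0.2], random_state=42)

models = {
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "NB": GaussianNB(),
    "CART": DecisionTreeClassifier(random_state=42),
}

# Mean accuracy per algorithm across the 10 folds.
mean_acc = {name: cross_val_score(m, X, y, cv=10, scoring="accuracy").mean()
            for name, m in models.items()}

for name, acc in sorted(mean_acc.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

On the real NASA MDP data, the same loop would simply replace the synthetic `X, y` with each dataset's metric matrix and defect labels, averaging accuracy per algorithm across the seven datasets.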
