Comparison of Naive Bayes Method, K-NN (K-Nearest Neighbor) and Decision Tree for Predicting the Graduation of ‘Aisyiyah University Students of Yogyakarta

Authors

  • Tikaridha Hardiani, Department of Information Technology, Science and Technology Faculty, University ‘Aisyiyah Yogyakarta

DOI:

https://doi.org/10.31101/ijhst.v2i1.1829

Keywords:

data mining, prediction, student, graduation, decision tree, naive bayes, K-NN

Abstract

The number of students at Universitas ‘Aisyiyah Yogyakarta (UNISA) has been increasing, including in the Faculty of Health Sciences. In 2016 the total number of UNISA students was 1851. The growing enrollment each year leads to a large volume of data stored in the university database. These data can help the university predict each student's study period: whether a student graduates on time, with a study period of 4 years, or late, with a study period of more than 4 years. This prediction can be made with a data mining technique, namely classification. Classification requires data on students who have already graduated as training data and data on students who are still enrolled as testing data. The training data comprised 501 records with 10 goals, and the testing data comprised 428 records. The data mining process followed the Cross-Industry Standard Process for Data Mining (CRISP-DM). The algorithms used in this study were Naive Bayes, K-Nearest Neighbor (K-NN) and Decision Tree. The three algorithms were compared on accuracy using RapidMiner software. K-NN was the most accurate in predicting student graduation, with an accuracy of 91.82%. The K-NN algorithm predicted that 100% of the students in the Nursing study program of Universitas ‘Aisyiyah Yogyakarta will graduate on time.
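The classification step described in the abstract can be illustrated with a minimal K-NN sketch. This is not the study's implementation, which used RapidMiner. The two features (GPA and credits earned per semester), the toy records, the value k=3, and the function names `knn_predict` and `accuracy` are all hypothetical, chosen only to show how a new student is labeled by the majority vote of the nearest graduated students and how accuracy is then computed.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Return the majority label among the k training records nearest to query."""
    # train is a list of (features, label) pairs; distance is Euclidean.
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test records whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Hypothetical toy records: (GPA, credits per semester) -> graduation label.
# These values are invented for illustration and are not the study's data.
train = [
    ((3.6, 22), "on time"), ((3.4, 21), "on time"),
    ((3.8, 23), "on time"), ((3.2, 20), "on time"),
    ((2.4, 15), "late"), ((2.1, 14), "late"),
    ((2.6, 16), "late"), ((2.8, 17), "late"),
]
test = [((3.5, 22), "on time"), ((2.3, 15), "late"), ((3.0, 19), "on time")]

print(f"accuracy = {accuracy(train, test):.2%}")
```

In this sketch the features are left unscaled, so the credits column dominates the distance. With real attributes on different scales, normalizing each feature before computing distances is the usual choice.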

Published

2021-01-21

How to Cite

Hardiani, T. (2021). Comparison of Naive Bayes Method, K-NN (K-Nearest Neighbor) and Decision Tree for Predicting the Graduation of ‘Aisyiyah University Students of Yogyakarta. International Journal of Health Science and Technology, 2(1), 75–85. https://doi.org/10.31101/ijhst.v2i1.1829

Section

Articles
