Comparison of Decision Tree Classification Methods and Gradient Boosted Trees
Author(s): Arif Rinaldi Dikananda, Sri Jumini, Nafan Tarihoran, Santy Christinawati, Wahyu Trimastuti, Robbi Rahim
Subject(s): Environmental interactions
Published by: UIKTEN - Association for Information Communication Technology Education and Science
Keywords: Comparison; Data mining; Classification; C4.5; Random Forest; Accuracy;
Summary/Abstract: The purpose of this research is to analyze the C4.5 and Random Forest algorithms for classification. The two methods were compared to see which is more accurate in the classification process. The case studied is the success of university students at a private university. The data were obtained from the data set at https://osf.io/jk2ac. The input attributes were gender, student, average evaluation (NEM), reading session, school origin, and presence, with success as the output (label). The analysis was carried out in RapidMiner using the same test parameters (k-folds = 2, 3, 4, 5) and the same sampling types (stratified sampling, linear sampling, shuffled sampling) for both methods. The first result shows that the k-fold test with stratified sampling achieved an average accuracy of 55.76 percent (C4.5) and 56.18 percent (Random Forest). The second result shows that the k-fold test with linear sampling achieved an average precision of 58.06 percent (C4.5) and 65.06 percent (Random Forest). The third result shows that the k-fold test with shuffled sampling achieved an average precision of 58.68 percent (C4.5) and 60.76 percent (Random Forest). Across the three test results, for the case of student success at a private university, the Random Forest method performs better than C4.5.
Journal: TEM Journal
- Issue Year: 11/2022
- Issue No: 1
- Page Range: 316-322
- Page Count: 7
- Language: English
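
The abstract evaluates both classifiers with k-fold cross-validation (k = 2, 3, 4, 5) under linear, shuffled, and stratified sampling. The following is a minimal Python sketch of how those three ways of building folds differ and how an average accuracy over the folds can be computed. It is an illustration only, not the RapidMiner implementation used in the study; every function name here (`linear_folds`, `shuffled_folds`, `stratified_folds`, `mean_accuracy`) and the `fit_predict` callback are hypothetical.

```python
import random
from collections import defaultdict


def linear_folds(n, k):
    """Split indices 0..n-1 into k consecutive blocks, keeping the original order."""
    bounds = [i * n // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def shuffled_folds(n, k, seed=0):
    """Shuffle the indices, then deal them into k folds in round-robin order."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def stratified_folds(labels, k, seed=0):
    """Deal each class into the k folds separately so that every fold
    keeps roughly the same class proportions as the full data set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across classes to balance fold sizes
    for members in by_class.values():
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[(offset + j) % k].append(i)
        offset += len(members)
    return folds


def mean_accuracy(folds, labels, fit_predict):
    """Average accuracy over all folds.

    fit_predict(train_idx, test_idx) is a hypothetical callback that trains
    on train_idx and returns one predicted label per entry of test_idx.
    """
    scores = []
    for f, test in enumerate(folds):
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        pred = fit_predict(train, test)
        correct = sum(p == labels[i] for p, i in zip(pred, test))
        scores.append(correct / len(test))
    return sum(scores) / len(scores)
```

To reproduce a comparison of the kind the abstract describes, one would pass a `fit_predict` callback wrapping a decision tree and another wrapping a random forest, then call `mean_accuracy` for each sampling type and each k. The classifiers themselves are deliberately left out of the sketch.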