Proposed Approach for Overcoming the Impact of Unbalanced Distribution in Predicting Students' Performance

Author(s): Gabrijela Dimić, Ljiljana Pecić
Subject(s): Classification, School education
Published by: UIKTEN - Association for Information Communication Technology Education and Science
Keywords: Classification; SMOTE; unbalanced distribution; machine learning; educational data mining

Summary/Abstract: The paper presents a method for mitigating the impact of an unbalanced distribution of multidimensional class features on grade prediction accuracy. For the purposes of the case study, an educational data set named APOD was created by integrating data from heterogeneous sources. The input features and the multidimensional class feature were defined. The effectiveness of adopting the Synthetic Minority Over-Sampling Technique (SMOTE) to handle data imbalance issues was explored using various classification methods. To determine which algorithm performed best in terms of minority class distribution, three experiments were carried out. The SMOTE approach with automatic minority class detection and a 100% sampling factor demonstrated a considerable improvement in model performance for four out of five classifiers that were tested. The primary objective of the study described in this paper is to address the problem of predicting students' final grades in situations where a small dataset causes data imbalance. Small datasets provide insufficient representation of instances within specific classes, resulting in unreliable models with poor performance in predicting student success. The proposed approach for implementing SMOTE is based on an algorithm for identifying minority classes, with a predetermined minimum number of samples per class. This approach enables the development of precise models for predicting students' final test results, even with small educational datasets. The contribution of the proposed research lies in achieving greater accuracy in predicting students' final grades, regardless of dataset size and the presence of minority classes.
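The idea summarized above, detecting minority classes against a minimum sample count and then oversampling them with SMOTE at a 100% factor, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `find_minority_classes` and `smote_oversample`, the threshold `min_samples`, the neighbour count `k`, and the toy data are all assumptions made for illustration, and the interpolation step follows the generic SMOTE recipe (a synthetic point placed at a random position on the segment between a minority sample and one of its nearest same-class neighbours).

```python
import math
import random
from collections import Counter


def find_minority_classes(labels, min_samples):
    """Return classes with fewer than min_samples instances.

    The threshold rule is a hypothetical stand-in for the paper's
    minority-class detection step.
    """
    counts = Counter(labels)
    return sorted(c for c, n in counts.items() if n < min_samples)


def smote_oversample(X, y, minority_classes, percent=100, k=5, seed=0):
    """Append SMOTE-style synthetic samples for each minority class.

    percent=100 yields one synthetic sample per original minority sample.
    """
    rng = random.Random(seed)
    per_sample = percent // 100
    X_out, y_out = [list(row) for row in X], list(y)
    for cls in minority_classes:
        members = [x for x, label in zip(X, y) if label == cls]
        if len(members) < 2:
            continue  # no same-class neighbour to interpolate with
        for x in members:
            # k nearest same-class neighbours, excluding the point itself
            neighbours = sorted(
                (m for m in members if m is not x),
                key=lambda m: math.dist(x, m),
            )[:k]
            for _ in range(per_sample):
                nb = rng.choice(neighbours)
                gap = rng.random()  # position along the segment x -> nb
                X_out.append([a + gap * (b - a) for a, b in zip(x, nb)])
                y_out.append(cls)
    return X_out, y_out


# Toy data (invented): two features per student, final grade as the class.
X = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.7], [1.0, 1.2],
     [5.0, 5.0], [5.5, 5.2], [4.8, 5.4]]
y = [6, 6, 6, 6, 6, 6, 10, 10, 10]

minority = find_minority_classes(y, min_samples=5)
X_res, y_res = smote_oversample(X, y, minority, percent=100, k=2)
print(Counter(y_res))
```

In practice, a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be preferred over a hand-written version; the sketch only makes the mechanics visible.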

  • Issue Year: 13/2024
  • Issue No: 4
  • Page Range: 2839-2849
  • Page Count: 11
  • Language: English