EVALUATION OF THE PERFORMANCE OF AUTOMATED MACHINE LEARNING TOOLS

Author(s): Desislava Koleva, Yanka Aleksandrova
Subject(s): Economy, Business Economy / Management, ICT Information and Communications Technologies
Published by: University of Economics - Varna (Икономически университет - Варна)
Keywords: AutoML; automated machine learning; Azure; Amazon SageMaker; H2O; Altair
Summary/Abstract: The ubiquitous application of predictive models has created demand for optimized ways of building, deploying and enhancing machine learning models. Traditionally, building and deploying machine learning models requires the involvement of highly qualified data scientists with good knowledge of machine learning algorithms, specialized programming languages, mathematics, statistics and data engineering. The relatively new and developing area of automated machine learning (AutoML) has made the whole process of building and deploying an AI model more accessible and automated by providing solutions for an automated machine learning pipeline encompassing data collection, preprocessing, feature selection, model building and hyperparameter tuning. Automated machine learning tools allow business users who are not necessarily machine learning experts to develop and implement high-quality predictive models. The purpose of this research paper is to assess and compare AutoML tools for solving classification problems. Several of the leading AutoML tools were chosen: Azure Automated Machine Learning, Amazon SageMaker Autopilot, H2O AutoML, H2O Flow and Altair AI Studio Auto Model. Classification models are trained with each AutoML tool on a dataset for customer churn prediction, and the models are compared on predefined measures. The AutoML tools are also assessed on criteria such as ease of use, functionality, user orientation, limitations and generated output format. Results show remarkable predictive performance of the generated classification models in general. The best trained classification models are ensemble models trained in Amazon SageMaker Autopilot and H2O AutoML. Conclusions and recommendations are drawn to help data science practitioners choose and implement automated machine learning tools.
