UPM Institutional Repository

Hyperparameter tuning and pipeline optimization via grid search method and tree-based AutoML in breast cancer prediction


Citation

Mat Radzi, Siti Fairuz and Abdul Karim, Muhammad Khalis and Saripan, M. Iqbal and Abd Rahman, Mohd Amiruddin and Che Isa, Iza Nurzawani and Ibahim, Mohammad Johari (2021) Hyperparameter tuning and pipeline optimization via grid search method and tree-based AutoML in breast cancer prediction. Journal of Personalized Medicine, 11 (10). pp. 1-12. ISSN 2075-4426

Abstract

Automated machine learning (AutoML) has been recognized as a powerful tool for building systems that automate the design and optimize the model selection of machine learning (ML) pipelines. In this study, we present the tree-based pipeline optimization tool (TPOT) as a method for finding high-performing, less complex ML pipelines for breast cancer diagnosis. Pre-processors and ML models are represented as expression trees, and pipelines are optimized using genetic programming (GP), a stochastic search method. Radiomics features from the breast cancer data set were used to guide the TPOT-based ML pipeline selection. Using the breast cancer data, the TPOT-generated ML pipelines were compared with selected ML classifiers optimized by a grid search (GS) approach. The pipeline combining principal component analysis (PCA) with random forest (RF) classification proved to be the most reliable and had the lowest complexity. The TPOT model selection technique outperformed GS optimization. Among the models, the RF classifier gave the best result, in combination with only two pre-processors, with a precision of 0.83. By comparison, the GS-optimized support vector machine (SVM) classifier showed a difference of 12%, while the other two classifiers, naïve Bayes (NB) and artificial neural network-multilayer perceptron (ANN-MLP), showed a difference of almost 39%. Performance was evaluated by sensitivity, specificity, accuracy, precision, and receiver operating characteristic (ROC) curve analysis.
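The evaluation measures named in the abstract (sensitivity, specificity, accuracy, and precision) can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the helper names (`ConfusionCounts`, `count_outcomes`, `classification_metrics`), the choice of label 1 as the positive (malignant) class, and the example labels are assumptions made for illustration only and are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class = 1 by assumption)."""
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def count_outcomes(y_true, y_pred, positive=1):
    """Tally TP/FP/TN/FN from paired true and predicted labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c):
    """Compute the four threshold-based measures listed in the abstract."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": _ratio(c.tp, c.tp + c.fp),    # positive predictive value
    }


# Made-up labels for illustration only; not results from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = count_outcomes(y_true, y_pred)
metrics = classification_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

ROC analysis is not covered by this sketch because it requires continuous classifier scores rather than hard labels.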


Official URL or Download Paper: https://www.mdpi.com/2075-4426/11/10/978

Additional Metadata

Item Type: Article
Divisions: Faculty of Engineering; Faculty of Science
DOI Number: https://doi.org/10.3390/jpm11100978
Publisher: Multidisciplinary Digital Publishing Institute
Keywords: Machine learning; Breast cancer; Genetic programming; Tree-based pipeline optimization tool
Depositing User: Ms. Nuraida Ibrahim
Date Deposited: 25 Jul 2022 02:28
Last Modified: 25 Jul 2022 02:28
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.3390/jpm11100978
URI: http://psasir.upm.edu.my/id/eprint/97584