Abstract
Objective: To predict hypothyroid disorder at an earlier stage using machine learning algorithms.
Method: Early diagnosis of hypothyroidism was addressed through a machine-learning framework comprising data pre-processing, handling of the imbalanced dataset, feature selection, data splitting, and model optimization. The imbalanced dataset was balanced using SMOTE-ENN to mitigate the effect of class imbalance and improve learning on minority-class samples. Feature selection used a filter-based Chi-square test to identify the most relevant features, thereby enhancing model performance and reducing overfitting. Random Search Cross-Validation (Randomized CV) was used to train and tune a set of machine learning classifiers, namely Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbors (KNN), and to determine their optimal hyperparameters.
Findings: An ensemble learning strategy using voting techniques was implemented to improve predictive performance. Each model was evaluated using precision, recall, F1-score, and accuracy, with an accuracy of 98.81% reported. The experimental results show that the proposed method substantially improves classification and can serve as a reliable instrument for the early detection of hypothyroidism in clinical decision support systems.
Novelty: Random Search provides a well-defined framework for hyperparameter tuning in earlier and faster hypothyroidism prediction, ensuring computational efficiency and robust performance, and serving as a stepping stone toward more advanced optimization strategies.
Keywords: Hypothyroidism, SMOTE-ENN, Pre-processing, Feature selection, Performance Metrics
Chitra et al. (Sun,) studied this question.
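The workflow summarized in the abstract (Chi-square feature selection, randomized hyperparameter search over DT, RF, and KNN, then a voting ensemble) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation: the synthetic dataset, the number of selected features (`k=10`), the parameter ranges, the `f1` scoring choice, the hard-voting mode, and the helper name `tuned` are all assumptions made for illustration. The SMOTE-ENN resampling step is omitted to keep the example dependent on scikit-learn only; in practice it would likely be applied to the training split with a resampler such as the one provided by the imbalanced-learn package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for a thyroid dataset (assumption: the
# real data and its features are not given in the abstract).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def tuned(estimator, param_distributions):
    """Hypothetical helper: tune one classifier with randomized search.

    MinMaxScaler makes the training features non-negative, which the
    chi2 scoring function requires.
    """
    pipe = Pipeline([
        ("scale", MinMaxScaler()),
        ("select", SelectKBest(chi2, k=10)),  # k=10 is an assumed value
        ("clf", estimator),
    ])
    search = RandomizedSearchCV(
        pipe, param_distributions, n_iter=5, cv=3,
        scoring="f1", random_state=0,
    )
    search.fit(X_train, y_train)
    return search.best_estimator_


# Illustrative search spaces; the paper's actual ranges are not stated.
dt = tuned(DecisionTreeClassifier(random_state=0),
           {"clf__max_depth": [3, 5, 8, None],
            "clf__min_samples_split": [2, 5, 10]})
rf = tuned(RandomForestClassifier(random_state=0),
           {"clf__n_estimators": [50, 100],
            "clf__max_depth": [4, 8, None]})
knn = tuned(KNeighborsClassifier(),
            {"clf__n_neighbors": [3, 5, 7, 9],
             "clf__weights": ["uniform", "distance"]})

# Majority (hard) voting over the three tuned pipelines; the ensemble
# refits clones of each pipeline on the training split.
ensemble = VotingClassifier(
    estimators=[("dt", dt), ("rf", rf), ("knn", knn)], voting="hard"
)
ensemble.fit(X_train, y_train)

y_pred = ensemble.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"accuracy={accuracy:.3f}  f1={f1:.3f}")
```

Placing scaling and feature selection inside the pipeline means both are refit on each cross-validation training fold, so the held-out fold does not influence which features are chosen. The scores printed here describe the synthetic data only and say nothing about the 98.81% accuracy reported in the abstract.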