Early identification of lung cancer from questionnaire-based data offers a low-cost, non-invasive way to support clinical decision-making. However, such datasets often contain redundant, noisy, and imbalanced attributes that limit the performance of traditional classifiers. This study introduces a hybrid LSTM–GRU framework that uses a Grey Wolf–Whale Optimization (GWO–WOA) algorithm for hyperparameter tuning and Binary Particle Swarm Optimization (BPSO) for feature selection. Two public lung cancer datasets from the Kaggle repository were used, the first with 309 samples and the second with 3000 samples. For both datasets, the preprocessing pipeline included missing-value imputation, categorical encoding, outlier removal, and z-score normalization to ensure feature consistency. Each dataset was then split 70%/20%/10% into training, validation, and test sets, respectively. BPSO selected the most informative features for diagnosis, while GWO–WOA tuned key hyperparameters of the hybrid architecture, such as the learning rate, the number of hidden units, and the layer depth. In the experiments, the proposed GWO–WOA–LSTM–GRU model achieved 100.00% accuracy, precision, recall, and F1-score on the 309-sample dataset, and 99.33% accuracy and F1-score (precision: 99.34%, recall: 99.33%) on the 3000-sample dataset. In comparison, tuned single models (LSTM, GRU, CNN, and SVM) achieved accuracies ranging from 77.42% to 98.33%. These findings indicate that combining metaheuristic optimization with hybrid recurrent networks enhances the robustness and generalization of lung cancer classification across diverse datasets, offering a reliable tool for early detection and clinical risk stratification.
Amrir et al. (Fri,) studied this question.
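
The BPSO feature-selection step mentioned in the abstract can be illustrated with a minimal sketch. It follows the standard binary PSO formulation, in which a sigmoid of each velocity gives the probability that a feature bit is set; it is not the authors' implementation. The function names, the parameter values (`w`, `c1`, `c2`, `v_max`), and the toy fitness function are all assumptions made for illustration. In practice the fitness would more plausibly be the validation score of a classifier trained on the selected features.

```python
import math
import random

def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))

def bpso_select(fitness, n_features, n_particles=20, n_iters=50,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Binary PSO sketch: maximize fitness(mask) over 0/1 feature masks.

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each particle is a 0/1 mask; bit d == 1 means feature d is kept.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # The sigmoid of the velocity is the probability the bit is 1.
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Hypothetical toy problem: features 0, 2 and 5 are "informative".
INFORMATIVE = {0, 2, 5}

def toy_fitness(mask):
    # Reward informative features and penalize the mask size.
    hits = sum(1 for d, bit in enumerate(mask) if bit and d in INFORMATIVE)
    return hits - 0.1 * sum(mask)

mask, score = bpso_select(toy_fitness, n_features=8)
print("selected features:", [d for d, bit in enumerate(mask) if bit])
print("fitness:", score)
```

The size penalty in `toy_fitness` stands in for the usual trade-off between predictive performance and the number of retained features. The weighting of 0.1 is arbitrary and would need tuning on real data.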