This study presents a comprehensive assessment of deep learning models for classifying four Brassica species (Brassica juncea, Brassica napus, Brassica oleracea, and Brassica rapa) based on codon usage frequency patterns mined from their whole-genome coding sequences (CDS). We compared a novel Gradient Guided Adaptive Regularized (GGAR) Multilayer Perceptron (MLP) model against five MLP-based approaches (the penalized variants Adaptive, Elastic Net, Fixed L1, and Fixed L2, plus a baseline MLP) and one traditional 1D-CNN model, across multiple hyperparameter configurations (learning rates: 0.01, 0.001, 0.0001; batch sizes: 32, 64, 128, 256). The models were evaluated using 10-fold cross-validation, with performance metrics including accuracy, precision, recall, F1-score, and Matthews Correlation Coefficient (MCC). The results show that GGAR consistently outperformed the existing models at the low learning rate of 0.0001 with batch sizes of 32, 64, and 128, attaining near-perfect accuracy, recall, MCC, and F1-score (all approximately 1). Statistical validation via Kruskal–Wallis and ANOVA tests confirmed GGAR’s superiority (p < 0.001) over the comparative models, as well as over the traditional CNN model, in all evaluation scenarios. Notably, Fixed L1 and the CNN excelled at the higher learning rates of 0.01 and 0.001, while GGAR dominated in fine-tuned, low-learning-rate regimes, signifying its effectiveness in handling indirect genomic patterns. The analysis of training durations showed that Fixed L1 was computationally efficient, completing training in 5.90–91.52 min. In contrast, GGAR required more time, from 6.38 to 124.78 min, but achieved higher accuracies. While the baseline MLP performed competitively, its results were less consistent, and Elastic Net and Fixed L2 showed clear tradeoffs between speed and precision. The CNN also performed exceptionally well but was the slowest model, requiring 99.49 to 179.25 min.
These results highlight the significance of adaptive regularization in genomic classification, with GGAR proving particularly effective for precise species classification. This study offers practical guidance for selecting deep learning models in bioinformatics, stressing how regularization approaches and hyperparameter tuning influence model performance.
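The codon usage features mentioned above can be illustrated with a short sketch. The abstract does not state how the frequencies were computed, so this example assumes the simplest definition: the relative frequency of each of the 64 codons, read in frame from the first base of a coding sequence. The function name `codon_usage_frequencies` and the toy sequence are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every feature vector has the same layout.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_usage_frequencies(cds: str) -> list[float]:
    """Return the relative frequency of each of the 64 codons in one CDS.

    Assumption (not from the paper): frequencies are normalised by the
    total number of unambiguous codons in the sequence.
    """
    seq = cds.upper().replace("U", "T")
    # Read in frame from the first base and drop any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Skip codons that contain ambiguous bases such as N.
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


# Hypothetical toy sequence: ATG GCT GCT TAA
freqs = codon_usage_frequencies("ATGGCTGCTTAA")
```

Each sequence yields a 64-dimensional vector that sums to 1 whenever at least one unambiguous codon is present. A vector of this kind could serve as input to an MLP or a 1D-CNN, although the exact feature construction used in the study may differ.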
Shahzad et al. (Sat,) studied this question.