Abstract
The increasing complexity and scale of internet infrastructure demand computational frameworks capable of accurate, scalable, and interpretable inference over high-dimensional network data. Conventional detection strategies often struggle to remain robust and efficient when confronted with heterogeneous feature spaces and rapidly evolving threat patterns. This study presents a scalable hybrid computational intelligence framework that integrates discriminative statistical modeling, gradient-based inference, and bio-inspired meta-heuristic optimization for large-scale malicious URL detection. The proposed framework couples Linear and Quadratic Discriminant Analysis with a categorical gradient-boosted inference engine, while automated parameter exploration is conducted using the Mother Optimization Algorithm and the Osprey Optimization Algorithm. A large-scale dataset of 63,191 URLs, described by both application-layer and network-layer attributes, is used to evaluate the framework’s performance. Statistical robustness is assessed through exploratory distribution assessment (Shapiro–Wilk), nonparametric hypothesis testing (Kruskal–Wallis), pairwise model comparison, and cross-validation-based performance consistency. These procedures provide quantitative support for model comparison and feature relevance under non-Gaussian conditions. Model transparency and reproducibility are further strengthened using SHAP-based feature attribution to quantify the influence of individual variables. Results demonstrate that the models tuned with bio-inspired optimization achieve superior performance, attaining an accuracy of 96.35%, precision of 96.54%, recall of 96.35%, F1-score of 96.40%, and specificity of 96.36%.
These findings indicate that the synergistic integration of hybrid discriminative intelligence and bio-inspired optimization significantly enhances inference capability on a real-world URL classification dataset with moderate-dimensional features. Beyond cybersecurity, the proposed framework offers a transferable and computationally efficient paradigm for high-dimensional classification and decision-making tasks across engineering systems and data-intensive scientific applications.
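For readers less familiar with the five reported metrics, the following minimal sketch shows how accuracy, precision, recall, F1-score, and specificity are commonly derived from a binary confusion matrix. The function name `binary_metrics` and the counts used in the example are hypothetical and are not taken from the study; the averaging scheme used in the paper is not stated in the abstract, so this sketch assumes the plain binary definitions with "malicious" as the positive class.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: malicious URLs correctly / incorrectly classified (positive class).
    tn/fp: benign URLs correctly / incorrectly classified (negative class).
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    def ratio(num: float, den: float) -> float:
        # Return 0.0 when the denominator is zero instead of raising.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)      # share of flagged URLs that are malicious
    recall = ratio(tp, tp + fn)         # share of malicious URLs that are flagged
    specificity = ratio(tn, tn + fp)    # share of benign URLs that are passed
    f1 = ratio(2 * precision * recall, precision + recall)

    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts, for illustration only (not results from the study).
example = binary_metrics(tp=90, fp=10, tn=85, fn=15)
```

Specificity is reported alongside recall because, in URL filtering, the rate at which benign URLs are wrongly blocked matters as much as the rate at which malicious ones are caught.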
Hua Liu (Tue,) studied this question.