Abstract

Accurate multiclass classification of dermoscopic skin lesions is clinically important yet remains challenging because of strong inter-class visual similarity, intra-class variability, and severe class imbalance in real-world benchmarks. In this work, we address 8-class diagnosis on the ISIC 2019 dataset by introducing DermaFusionNet (DFN), a dual-branch hybrid fusion framework designed to jointly capture fine-grained texture cues and broader contextual patterns. DFN combines two streams: a multiscale convolutional stream augmented with local-window attention and dynamic channel gating to refine cross-scale representations, and a lightweight depthwise-separable convolutional stream equipped with Squeeze-and-Excitation (SE) blocks to enhance discriminative channel responses. The two streams are fused by feature concatenation, followed by a compact classification head. We evaluate DFN exclusively on ISIC 2019 using the standard 8 categories (AK, BCC, BKL, DF, MEL, NV, SCC, VASC) with the imbalanced training set of 25,331 images (NV 50.8%, MEL 17.9%) and report results under fivefold cross-validation (CV = 5). DFN achieves 97.85% accuracy with 98.06% precision, 97.85% recall, and 97.91% F1-score (all weighted), while one-vs-rest ROC-AUC values range from 0.98 to 1.00 across classes. Moreover, DFN outperforms strong baselines, including ConvNeXtV2-Tiny (97.12% accuracy) and DeiT-Base (96.94% accuracy), under the same evaluation protocol. These findings indicate that the proposed attention-gated multiscale fusion with an SE-calibrated separable-CNN branch provides an effective and generalizable representation-learning strategy for high-performance dermoscopic lesion classification on ISIC 2019.
Melon et al. (Sun,) studied this question.
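The abstract reports support-weighted precision, recall, and F1 alongside accuracy. The following minimal sketch shows how such weighted metrics can be computed, and why weighted recall coincides with accuracy (both are 97.85% above). It assumes the common convention of weighting each class by its share of true samples; the abstract does not state the exact formula, so this is an assumption. The function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the paper or its data.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1), the last three support-weighted.

    Assumption: each class is weighted by its fraction of true samples.
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n

    precision = recall = f1 = 0.0
    for c in sorted(support):
        tp = sum(t == c and p == c for t, p in pairs)
        predicted = sum(p == c for p in y_pred)
        p_c = tp / predicted if predicted else 0.0  # per-class precision
        r_c = tp / support[c]                       # per-class recall
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        w = support[c] / n                          # class weight
        precision += w * p_c
        recall += w * r_c
        f1 += w * f_c
    return accuracy, precision, recall, f1


# Hypothetical, imbalanced toy labels (not ISIC 2019 data).
y_true = ["NV"] * 6 + ["MEL"] * 3 + ["DF"]
y_pred = ["NV"] * 5 + ["MEL"] + ["MEL"] * 2 + ["NV"] + ["DF"]

acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

Under this weighting, each class's recall is multiplied by its support and the sum is divided by the total sample count, which leaves the number of correct predictions over the total. Weighted recall is therefore always equal to accuracy, while weighted precision and F1 can differ from it.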