ABSTRACT
Cervical cancer remains a major contributor to cancer‐related mortality among women worldwide, with a disproportionately high burden in low‐ and middle‐income countries. Pap smear imaging is a standard screening modality for detecting precancerous and malignant cervical abnormalities; however, manual interpretation is labor‐intensive, subjective, and susceptible to interobserver variability. To mitigate these limitations, this study proposes a hybrid deep learning framework for automated cervical cell classification that integrates Vision Transformers (ViT) with Convolutional Neural Networks (CNN). The proposed framework incorporates a structured preprocessing pipeline, including image resampling and data augmentation strategies such as random horizontal flipping and controlled rotations, to enhance model generalization and mitigate overfitting. Input images are divided into fixed‐size patches and processed through a ViT backbone to capture long‐range contextual dependencies. Complementary CNN layers extract localized morphological features critical for cytological analysis. The extracted representations are combined through a feature fusion mechanism and passed to fully connected layers for classification. The ViT component is initialized with pretrained weights and subsequently fine‐tuned on cervical cytology datasets. On the Herlev and SIPaKMeD datasets, the framework achieved classification accuracies of 97.31% and 96.62%, respectively. Ablation analysis showed that the CNN branch improves local morphological feature discrimination, while class‐wise evaluation indicated stable performance across multiple cytological categories. These results support the effectiveness of the proposed CNN–ViT fusion framework for automated cervical cell classification and motivate further validation on larger and patient‐indexed clinical datasets.
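The patch-splitting and feature-fusion steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the patch size, the image format, or the fusion rule, so the function names `split_into_patches` and `fuse_features`, the grayscale nested-list image, and fusion by simple concatenation are all assumptions made for illustration.

```python
from typing import List

# Assumption: a grayscale image stored as rows of pixel values.
Image = List[List[float]]


def split_into_patches(image: Image, patch_size: int) -> List[List[float]]:
    """Split an H x W image into non-overlapping patch_size x patch_size patches.

    Each patch is flattened row by row, the form a ViT-style backbone
    would then project linearly into a token embedding.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def fuse_features(vit_features: List[float], cnn_features: List[float]) -> List[float]:
    """Fuse two feature vectors by concatenation (an assumed fusion rule)."""
    return list(vit_features) + list(cnn_features)


# Usage: a 4 x 4 toy image split into four 2 x 2 patches.
toy_image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
patches = split_into_patches(toy_image, patch_size=2)
fused = fuse_features([0.1, 0.2], [0.3])
```

In a real pipeline the two inputs to `fuse_features` would be the outputs of the ViT backbone and the CNN branch, and the fused vector would feed the fully connected classification layers.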
Fida Hussain Dahri
Ghulam Mustafa
Ashfaque Khowaja
International Journal of Imaging Systems and Technology
Sun Yat-sen University
Southeast University
Southern University of Science and Technology
Dahri et al. studied this question.
www.synapsesocial.com/papers/69d893eb6c1944d70ce04f17 — DOI: https://doi.org/10.1002/ima.70352