Binary classification of lung cancer using vision transformer models on CT images

Key Points

The model achieved 92.3% accuracy on binary (benign vs. malignant) lung nodule classification.
Precision was 90.5%, indicating that most nodules the model flagged as malignant were truly malignant.
The observational analysis used a Kaggle-hosted subset of the LIDC–IDRI dataset comprising 315 CT nodule patches.
These results highlight the promise of transformer models for early lung cancer identification in medical imaging.
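
For readers less familiar with the metrics quoted above, accuracy and precision are simple ratios of confusion-matrix counts. The following is a minimal Python sketch of those definitions. The function names and the counts are hypothetical, chosen only for illustration, and are not taken from the study.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all cases (benign and malignant) classified correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def precision(tp: int, fp: int) -> float:
    """Fraction of cases predicted malignant that are truly malignant."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Hypothetical confusion-matrix counts (NOT the study's results):
# tp = malignant predicted malignant, fp = benign predicted malignant,
# fn = malignant predicted benign,    tn = benign predicted benign.
tp, fp, fn, tn = 8, 2, 1, 9

example_accuracy = accuracy(tp, fp, fn, tn)   # 17 correct out of 20
example_precision = precision(tp, fp)         # 8 of 10 flagged cases

print(f"accuracy={example_accuracy:.3f}, precision={example_precision:.3f}")
```

High precision means few benign nodules are wrongly flagged as malignant. It says nothing about missed malignant cases, which recall measures.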

Abstract

Lung cancer remains a leading cause of cancer-related deaths, primarily due to late-stage detection. Although medical imaging and biopsy-based evaluations have improved, early identification of lung cancer continues to be challenging. To address this, we propose a Vision Transformer (ViT)-based model for binary lung nodule classification using computed tomography (CT) images. This study uses a Kaggle-hosted subset of the LIDC–IDRI dataset containing 315 CT nodule patches, where the original malignancy scores were converted into benign and malignant binary classes. Given the small dataset size, an extensive augmentation pipeline was designed to enhance model generalization. The lightweight ViT-Small/16 architecture demonstrated strong performance, achieving 92.3% accuracy, 90.5% precision, 93.8% recall, and a 92.1% F1-score. These results highlight the potential of compact transformer models for early lung cancer identification. This work is among the first to evaluate ViT-Small/16 on a small-scale CT nodule dataset using a tailored augmentation strategy for limited-data medical imaging.
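
The abstract states that malignancy scores were converted to binary classes but does not give the conversion rule. The Python sketch below shows one plausible way to do it, assuming the 1 to 5 LIDC–IDRI malignancy scale with a cut-off at 3 and indeterminate scores dropped. The threshold, the handling of a score of exactly 3, and the function names are assumptions for illustration, not the authors' method. The second function checks that the reported F1-score follows from the reported precision and recall.

```python
from typing import Optional


def binarize_malignancy(score: float, threshold: float = 3.0) -> Optional[int]:
    """Map a 1-5 malignancy score to a binary label.

    Assumed rule (not stated in the source): scores above the threshold
    are malignant (1), scores below are benign (0), and a score equal to
    the threshold is treated as indeterminate and returned as None.
    """
    if score > threshold:
        return 1
    if score < threshold:
        return 0
    return None


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Example scores are made up for illustration.
labels = [binarize_malignancy(s) for s in (1, 2.5, 3, 4, 5)]
print(labels)  # [0, 0, None, 1, 1]

# Consistency check using the figures reported in the abstract.
f1 = f1_score(0.905, 0.938)
print(round(100 * f1, 1))  # 92.1
```

The reported 92.1% F1-score is consistent with the reported 90.5% precision and 93.8% recall.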
