May 8, 2026 · Open Access

Evaluation of GPT-5.2 for melanoma detection across skin tones

Abstract

Malignant melanoma (MM) is the most aggressive form of skin cancer, and early detection is critical because it is strongly associated with improved survival. Recent advances in large language models (LLMs), such as ChatGPT and Gemini, present promising opportunities to support early melanoma screening and clinical decision-making. However, despite increasing interest in LLM-based dermatologic applications, their diagnostic reliability across different populations remains insufficiently characterized. In this study, we systematically evaluated the performance of GPT-5.2 across skin pigmentation groups using Milk10K, a clinically curated, publicly available dermatology dataset comprising paired dermoscopic and clinical close-up images with histopathology-confirmed diagnoses and standardized skin tone annotations. GPT-5.2 was assessed on two clinically relevant tasks: binary malignancy discrimination and top-3 differential diagnosis. A balanced subset of 460 lesions (92 per skin tone class) was randomly selected for evaluation. Across both tasks and imaging conditions, GPT-5.2 showed moderate diagnostic performance, with broadly consistent accuracy, F1 score, and Cohen’s κ across skin tone groups and no evidence of a systematic performance decline in darker skin tones. Adding clinical close-up images modestly improved overall performance while maintaining similar behavior across pigmentation classes. These findings suggest that GPT-5.2 exhibits stable melanoma-related diagnostic performance across diverse skin tones on this dataset. The study’s limitations and implications for future development are also discussed.
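For readers unfamiliar with the three metrics named in the abstract, the following is a minimal sketch of how accuracy, F1 score, and Cohen’s κ could be computed separately for each skin tone group on the binary malignancy task. It is not the authors’ evaluation code: the function names, the record layout, and the toy labels are all assumptions made for illustration, and no values here come from the study.

```python
from collections import defaultdict


def binary_metrics(y_true, y_pred):
    """Accuracy, F1 (positive class = 1), and Cohen's kappa for binary labels."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / n
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    # Expected chance agreement, from the marginal frequencies of each label.
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e < 1 else 0.0
    return {"accuracy": accuracy, "f1": f1, "kappa": kappa}


def metrics_by_group(records):
    """records: iterable of (group, y_true, y_pred) tuples (hypothetical layout)."""
    truths = defaultdict(list)
    preds = defaultdict(list)
    for group, t, p in records:
        truths[group].append(t)
        preds[group].append(p)
    return {g: binary_metrics(truths[g], preds[g]) for g in truths}


if __name__ == "__main__":
    # Toy, made-up labels for two hypothetical groups; 1 = malignant, 0 = benign.
    toy_records = [
        ("group_a", 1, 1), ("group_a", 1, 0), ("group_a", 0, 0), ("group_a", 0, 0),
        ("group_b", 1, 1), ("group_b", 0, 1), ("group_b", 0, 0), ("group_b", 1, 1),
    ]
    for group, scores in metrics_by_group(toy_records).items():
        print(group, scores)
```

Reporting the metrics per group rather than pooled is what allows the consistency across skin tones to be compared; κ is included alongside accuracy because it discounts the agreement expected by chance.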

Authors

Katie L. Frederickson

Samuel E. Adunyah

Qingguo Wang

Journals

Frontiers in Medicine

Institutions

Meharry Medical College
