May 3, 2026 · Open Access

Artificial intelligence in Kellgren–Lawrence grading of knee osteoarthritis: bridging radiographic tradition with algorithmic precision

Key Points

The review explores advances in artificial intelligence (AI) for improving Kellgren–Lawrence (KL) grading of knee osteoarthritis on radiographs.
It is a narrative review of peer-reviewed studies published from 2016 to 2025 on AI methods for KL grading.
Data sources were PubMed, Embase, Web of Science, and Google Scholar.
Eighteen studies were included, with a focus on model architectures and performance metrics.
Automated KL grading achieved reported accuracies between 75% and 98%, with area under the curve (AUC) values up to 0.98.
Agreement with expert assessments, measured by Cohen's kappa (κ), ranged from 0.67 to 0.86.
Challenges noted include subjective labeling, dataset imbalance, and limited external validation.
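To make the agreement statistic above concrete, the following is a minimal sketch of how Cohen's kappa can be computed between expert-assigned and model-predicted KL grades. The function name, the `weighted` option (quadratic weights, a common choice for ordinal scales), and the example grades are illustrative assumptions; the review does not state which kappa variant each included study reported.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b, weighted=False, n_grades=5):
    """Cohen's kappa for two lists of KL grades (integers 0..n_grades-1).

    With weighted=True, quadratic disagreement weights are used, so that
    confusing KL 1 with KL 2 is penalised less than confusing KL 0 with KL 4.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rater lists must be non-empty and of equal length")
    n = len(rater_a)
    joint = Counter(zip(rater_a, rater_b))   # observed pair counts
    marg_a = Counter(rater_a)                # marginal counts, rater A
    marg_b = Counter(rater_b)                # marginal counts, rater B

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, larger for distant grades.
        if weighted:
            return ((i - j) / (n_grades - 1)) ** 2
        return 0.0 if i == j else 1.0

    grades = range(n_grades)
    observed = sum(weight(i, j) * joint[(i, j)] / n
                   for i in grades for j in grades)
    expected = sum(weight(i, j) * marg_a[i] * marg_b[j] / (n * n)
                   for i in grades for j in grades)
    if expected == 0:
        # Both raters used a single identical grade; treat as full agreement.
        return 1.0
    return 1.0 - observed / expected

# Hypothetical grades for eight knees (not data from the review).
expert = [0, 1, 2, 3, 4, 2, 2, 1]
model = [0, 1, 2, 3, 4, 2, 3, 0]
kappa = cohens_kappa(expert, model)
kappa_quadratic = cohens_kappa(expert, model, weighted=True)
```

Because both disagreements in this toy example are between adjacent grades, the quadratic-weighted value is higher than the unweighted one, which is why the choice of variant matters when comparing κ across studies.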

Abstract

Knee osteoarthritis (KOA) remains the most prevalent form of osteoarthritis and a major cause of global disability. The Kellgren–Lawrence (KL) grading system, though widely used, suffers from inter- and intra-observer variability, especially in early disease stages. Artificial intelligence (AI) offers a transformative approach to automating KL grading on plain radiographs, providing consistent, reproducible, and scalable diagnostic solutions. This narrative review synthesizes recent advances in AI-based KL grading models, focusing on methodological frameworks, performance, clinical applicability, and limitations.

The review covers peer-reviewed studies that applied AI-based methods to KL grading of KOA on radiographic images. A literature search was conducted across PubMed, Embase, Web of Science, and Google Scholar to identify studies published between 2016 and 2025. Eligible studies satisfied predefined selection criteria and applied AI-based methods to radiographic grading of KOA. The review focused on model architectures, dataset characteristics, validation strategies, performance metrics, and comparisons with expert radiographic assessment.

Eighteen eligible studies were included. Convolutional neural networks (CNNs) remain the core of automated KL grading, evolving from standard classification models to ensemble and ordinal regression frameworks. Model performance was evaluated against expert-assigned KL grades as the reference standard, with reported accuracies ranging from 75% to 98% and area under the curve values up to 0.98. Agreement with expert annotations, measured by Cohen's kappa (κ), ranged from 0.67 to 0.86. Deep Siamese networks, Faster R-CNNs, and ensemble frameworks have enhanced localization of KOA radiographic features, thereby improving interpretability relative to human radiologic assessment. Ordinal regression and attention-based visualization (saliency and class activation mappings) reduced misclassification between adjacent KL grades.

Persistent challenges included subjective ground-truth labeling, dataset imbalance (particularly under-representation of early (KL 0–1) and severe (KL 4) disease), and limited external validation. Models trained primarily on the Osteoarthritis Initiative and Multicenter Osteoarthritis Study datasets showed reduced generalizability on external hospital datasets.

AI-driven KL grading demonstrates near-human accuracy and strong promise for clinical integration. However, addressing labeling subjectivity, dataset diversity, and explainability remains essential for trustworthy deployment. While KL grading is inherently radiograph-based, integration of clinical metadata and longitudinal radiographic data may support more robust disease characterization. Federated learning frameworks offer a pathway to improve generalizability while preserving data privacy.
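As an illustration of why ordinal formulations can reduce confusion between adjacent KL grades, the sketch below shows one common way to encode an ordered grade as cumulative binary targets ("is the grade greater than k?") and to decode per-threshold probabilities back into a grade. This is an assumed, generic formulation; the function names and the 0.5 threshold are hypothetical, and the abstract does not specify which ordinal scheme the reviewed studies used.

```python
def encode_ordinal(grade, n_grades=5):
    """Encode a KL grade (0..n_grades-1) as n_grades-1 cumulative binary targets.

    Target k answers "is the grade greater than k?", so neighbouring grades
    differ in exactly one position, unlike one-hot labels.
    """
    if not 0 <= grade < n_grades:
        raise ValueError("grade out of range")
    return [1 if grade > k else 0 for k in range(n_grades - 1)]

def decode_ordinal(threshold_probs, threshold=0.5):
    """Decode per-threshold probabilities into a grade by counting how many
    of the "greater than k" questions are answered yes."""
    return sum(1 for p in threshold_probs if p > threshold)

# Hypothetical network outputs for one radiograph (not data from the review).
targets_kl3 = encode_ordinal(3)
predicted_grade = decode_ordinal([0.9, 0.8, 0.4, 0.1])
```

Under this encoding, an error on a single threshold shifts the prediction by one grade only, so the training signal reflects how far apart two grades are rather than treating all misclassifications as equal.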

Authors

Saumya Rawat

Ved Chaturvedi

Binit Vaidya

Journals

SHILAP Revista de lepidopterología
