February 27, 2026, Open Access

Multifaceted structural classification of Indian legal judgments

Key Points

Aims to automatically identify the structural components of Indian court judgments to improve legal document analysis.
Proposes a multiclass structural classification framework built on a dataset of nearly 6,500 judgment segments from Indian Kanoon, manually annotated into 15 structural categories.
Uses a domain-specific lexicon of 5,000 legal n-grams to support feature construction for the TF-IDF representation.
Evaluates TF-IDF representations, PCA for dimensionality reduction, data augmentation, and contextual embeddings from transformer-based models across multiple machine learning and deep learning classifiers.
Combines the best-performing models using interpolation-based fusion.
Achieves 84% accuracy and 84% weighted F1-score with the fused Legal-BERT and Indian Legal-BERT model, without data augmentation or manual feature engineering.
Reports 80% macro recall for the same fused model.
Validates the performance gains with a paired Wilcoxon signed-rank test (p < 0.05).
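The accuracy, weighted F1-score, and macro recall figures above can diverge when classes are imbalanced, which is why the three are reported separately. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. The label names `FACTS` and `RATIO` and the toy predictions are hypothetical and do not come from the paper.

```python
def per_class_stats(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for each label in y_true."""
    stats = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        stats[label] = (precision, recall, f1, tp + fn)
    return stats


def accuracy(y_true, y_pred):
    """Fraction of segments whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def macro_recall(stats):
    """Unweighted mean of per-class recall: every class counts equally."""
    return sum(recall for _, recall, _, _ in stats.values()) / len(stats)


def weighted_f1(stats):
    """Mean of per-class F1, weighted by how many true segments each class has."""
    total = sum(support for _, _, _, support in stats.values())
    return sum(f1 * support for _, _, f1, support in stats.values()) / total


# Hypothetical toy data: 8 segments of a frequent class, 2 of a rare one.
y_true = ["FACTS"] * 8 + ["RATIO"] * 2
y_pred = ["FACTS"] * 8 + ["FACTS", "RATIO"]  # one rare-class segment is missed

stats = per_class_stats(y_true, y_pred)
acc = accuracy(y_true, y_pred)
m_recall = macro_recall(stats)
w_f1 = weighted_f1(stats)
```

In this toy case a single error on the rare class lowers macro recall more than it lowers accuracy or weighted F1, because macro averaging gives the rare class the same weight as the frequent one.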

Abstract

Automatically identifying structural components of Indian court judgments is critical for effective legal document analysis but remains challenging due to complex legal language, class imbalance, and limited annotated data. This paper proposes a multiclass structural classification framework for Indian legal judgments using a dataset of nearly 6,500 judgment segments from Indian Kanoon, manually annotated into 15 structural categories. A domain-specific lexicon of 5,000 legal n-grams is used to support feature construction for TF-IDF representation. We evaluate statistical representations (TF-IDF), dimensionality reduction (PCA), data augmentation, and contextual embeddings from transformer-based models across multiple machine learning and deep learning classifiers. The best-performing models are further combined using interpolation-based fusion. Experimental results show that a fused Legal-BERT and Indian Legal-BERT model achieves the best performance, with 84% accuracy, 84% weighted F1-score, and 80% macro recall, without data augmentation or manual feature engineering. Performance gains are validated using a paired Wilcoxon signed-rank test (p < 0.05), demonstrating robust and consistent improvements across structural classes. Explainability tools are also used to identify the tokens that most influence the model's decisions.
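The abstract does not spell out the fusion formula. One common reading of interpolation-based fusion is a weighted average of the two models' class probabilities, sketched below under that assumption. The function names, the weight `alpha`, and the three-class probability vectors are hypothetical and chosen only for illustration (the paper uses 15 categories).

```python
def interpolate(probs_a, probs_b, alpha):
    """Blend two per-class probability vectors as alpha * a + (1 - alpha) * b."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same set of classes")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(probs_a, probs_b)]


def predict(probs):
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical softmax outputs for one judgment segment over three classes.
legal_bert_probs = [0.50, 0.30, 0.20]
indian_legal_bert_probs = [0.20, 0.60, 0.20]

# alpha = 0.5 weights both models equally; this value is an assumption here.
fused = interpolate(legal_bert_probs, indian_legal_bert_probs, alpha=0.5)
fused_label = predict(fused)
```

Because the blend is a convex combination, the fused vector is still a probability distribution. In practice the weight would typically be chosen on held-out validation data rather than fixed in advance.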
