Help desks receive hundreds or even thousands of user requests every day. Sorting these requests manually takes considerable time and often leads to routing errors, which reduces the speed and quality of customer service. Automating request categorization is therefore a pressing problem for organizations of all types, including IT support, medical institutions, banks, government agencies, and online stores. This paper proposes a universal method for automatically sorting text requests into categories using a pre-trained Sentence-BERT (SBERT) neural network model. The paper examines the poor performance of pre-trained language models on texts from highly specialized subject areas. To address this issue, contrastive fine-tuning of the model on domain-specific data was applied, which significantly improved the quality of the vector text representations. Four approaches were compared systematically: a baseline model without fine-tuning, unsupervised contrastive learning on unlabeled data, supervised fine-tuning with the CosineSimilarityLoss criterion, and fine-tuning with the Multiple Negatives Ranking Loss (MNRL) criterion. Experiments were conducted on a dataset of 6,500 Russian-language requests, of which 1,119 were labeled into 16 categories. Clustering quality was assessed with both internal metrics (Silhouette Score, Davies-Bouldin Index) and external metrics (Purity, NMI, ARI). The MNRL method gave the best results: compared to the baseline model, Purity increased by 123%, NMI by 233%, and ARI by 658%. A mechanism for assessing classification confidence is also proposed, based on the individual Silhouette Score of each request, which allows uncertain cases to be redirected for manual processing. The developed approach is universal and can be adapted to automate request processing in any subject area when 10-20% of the data is labeled.
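The confidence mechanism and the Purity metric mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the cosine distance, the `threshold=0.2` cut-off, the function names, and the toy 2-D vectors (standing in for SBERT embeddings) are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def silhouette_per_sample(vectors, labels):
    """Per-sample silhouette s(i) = (b - a) / max(a, b), where a is the mean
    distance to the sample's own cluster and b is the mean distance to the
    nearest other cluster."""
    clusters = set(labels)
    scores = []
    for i, (v, own) in enumerate(zip(vectors, labels)):
        mean_dist = {}
        for c in clusters:
            dists = [
                cosine_distance(v, w)
                for j, (w, lab) in enumerate(zip(vectors, labels))
                if lab == c and j != i  # exclude the sample itself
            ]
            if dists:
                mean_dist[c] = sum(dists) / len(dists)
        # Convention: singleton clusters (or a single cluster overall) score 0.
        if own not in mean_dist or len(mean_dist) < 2:
            scores.append(0.0)
            continue
        a = mean_dist[own]
        b = min(d for c, d in mean_dist.items() if c != own)
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return scores


def purity(cluster_labels, true_labels):
    """Fraction of samples that belong to the majority class of their cluster."""
    by_cluster = {}
    for c, t in zip(cluster_labels, true_labels):
        by_cluster.setdefault(c, Counter())[t] += 1
    majority = sum(cnt.most_common(1)[0][1] for cnt in by_cluster.values())
    return majority / len(true_labels)


def route(scores, threshold=0.2):
    """Hypothetical routing rule: low-silhouette requests go to a human."""
    return ["auto" if s >= threshold else "manual" for s in scores]


# Toy stand-ins for request embeddings and their cluster assignments.
vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (0.7, 0.7)]
cluster_ids = [0, 0, 1, 1, 0]
true_categories = ["billing", "billing", "login", "login", "login"]

scores = silhouette_per_sample(vectors, cluster_ids)
decisions = route(scores)
cluster_purity = purity(cluster_ids, true_categories)
print(decisions, cluster_purity)
```

In this toy data the last vector lies midway between the two clusters, so its silhouette is close to zero and it is routed to manual processing, while the other four are handled automatically. In practice the threshold would need to be tuned on labeled requests.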
A. N. Isenbaev
I. M. Yannikov
Intellekt Sist Proizv
Izhevsk State Technical University
Isenbaev et al. studied this question.
www.synapsesocial.com/papers/69db37404fe01fead37c5468 — DOI: https://doi.org/10.22213/2410-9304-2026-1-13-25