What question did this study set out to answer?

The study aims to develop a framework for analyzing telecom fraud crimes using large language model (LLM) techniques.

March 13, 2026 · Open Access

Predictive analysis of telecommunication network fraud crimes based on big language modeling

Key Points

Collected a dataset of 500 real telecom fraud cases from a police platform.
Applied two-stage de-identification using regular expressions and BERT-NER for data compliance.
Constructed a knowledge base with over 200 police-specific terms for better model understanding.
Utilized the MEMIT method to integrate police terminology into the LLaMA model.
Performed fine-tuning with LoRA to enhance model performance.
Entity extraction precision increased from 77.3% to 85.0% after knowledge editing.
Recall improved by 6.4 percentage points after knowledge editing.
Fine-tuning with LoRA improved precision by a further 3.3 percentage points and recall by a further 3.9 percentage points.
Achieved a classification Macro-F1 score of 0.862, outperforming TextCNN and BERT models.
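
The Macro-F1 score cited in the key points is the unweighted mean of the per-class F1 scores, so every fraud type counts equally regardless of how many cases it has. The sketch below shows that calculation in plain Python. The function name `macro_f1`, the label strings, and the four toy cases are illustrative assumptions; they are not taken from the study's data or code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical), loosely named after the fraud types in the abstract.
y_true = ["rebate", "rebate", "impersonation", "investment"]
y_pred = ["rebate", "impersonation", "impersonation", "investment"]
score = macro_f1(y_true, y_pred)
```

In this toy case the per-class F1 scores are 2/3, 2/3, and 1, so the macro average is 7/9.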

Abstract

This paper proposes a telecom fraud crime analysis framework that integrates privacy protection, domain knowledge injection, parameter-efficient fine-tuning, prompt engineering, and time-series prediction. A dataset of 500 real cases is collected from a police platform, covering typical fraud types such as rebate fraud, impersonation of law enforcement, and fake investment schemes. First, a two-stage de-identification process using regular expressions and BERT-NER thoroughly masks sensitive information, ensuring data compliance while retaining essential case elements. Next, a knowledge base of over 200 police-specific terms and items of common knowledge is constructed, and the MEMIT method is used to locally inject police terminology into the LLaMA model, significantly enhancing the model’s understanding of terms such as “running points” and “card farmers.” Experimental results show that, after knowledge editing, entity extraction precision improved from 77.3% to 85.0% and recall increased by 6.4 percentage points. Further fine-tuning with LoRA improved precision and recall by 3.3 and 3.9 percentage points, respectively. Finally, the case classification Macro-F1 score reached 0.862, outperforming TextCNN (0.846) and a fine-tuned BERT model (0.850). The framework demonstrates strong performance in telecom fraud case analysis and provides valuable support for intelligent policing and crime prediction.
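
The two-stage de-identification described in the abstract could look roughly like the sketch below: a regular-expression pass masks identifiers with a fixed format, then a named-entity pass masks free-form mentions such as personal names. Everything here is an assumption made for illustration. The abstract does not list the actual patterns, so the three expressions are guesses at common formats, and `toy_ner` is a hypothetical stand-in for a BERT-NER model that returns `(start, end, label)` character spans.

```python
import re

# Stage 1 patterns (assumed, not from the paper). Order matters: longer
# identifiers are masked before shorter ones so no digits are left behind.
PATTERNS = {
    "ID_CARD": re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)"),   # 18-character ID number
    "BANK_CARD": re.compile(r"(?<!\d)\d{16,19}(?!\d)"),    # 16 to 19 digit card number
    "PHONE": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),      # 11-digit mobile number
}


def mask_with_regex(text):
    """Stage 1: replace fixed-format identifiers with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def mask_with_ner(text, ner):
    """Stage 2: replace entity spans returned by `ner`, working right to left
    so that earlier character offsets stay valid after each replacement."""
    for start, end, label in sorted(ner(text), reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


def deidentify(text, ner):
    return mask_with_ner(mask_with_regex(text), ner)


def toy_ner(text):
    """Hypothetical stand-in for a BERT-NER model: finds one hard-coded name."""
    name = "Zhang San"
    start = text.find(name)
    return [(start, start + len(name), "PERSON")] if start >= 0 else []


sample = ("Victim Zhang San transferred money from card 6222021234567890123 "
          "and was called from 13812345678.")
masked = deidentify(sample, toy_ner)
```

Replacing values with typed placeholders rather than deleting them is one way to keep the case elements (who paid, through which channel) readable after masking, which matches the abstract's goal of compliance without losing essential case details.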

Cite this study

Li et al. studied this question.

www.synapsesocial.com/papers/69b3aaa802a1e69014ccb7fb — DOI: https://doi.org/10.1186/s44147-026-00965-0

Authors

Danyang Li

Lin Zhan

Journals

Journal of Engineering and Applied Science

Institutions

China People's Public Security University

Shanghai Public Security Bureau

Jiangsu Police Officer College
