What does this research mean for the field?

A fine-tuned KoBART model that processes each entire conversation as a single input sequence can detect online grooming risk in Korean conversations with 99.18% accuracy and zero false negatives. Novelty: methodological. Consensus alignment: establishes a new direction.

What question did this study set out to answer?

The aim is to develop a Korean-language model for detecting online grooming by analyzing conversational context.

March 22, 2026

A KoBART-Based Model for Detecting Online Grooming Risk in Korean Conversations

Key Points

The study aims to develop a Korean-language model that detects online grooming by analyzing conversational context.
The international public dataset PAN12 was translated and refined using DeepL and a large language model, then reviewed by the research team.
The resulting corpus is a high-quality dataset that reflects the characteristics of spoken Korean.
The KoBART model was fine-tuned for binary classification of grooming risk in conversations.
The confusion matrix was analyzed to assess false negatives and detection reliability.
The model achieved an accuracy of 99.18% and an F1-score of 0.9918.
The confusion matrix analysis showed zero false negatives, indicating high detection reliability.
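The evaluation described in the key points (accuracy, F1-score, and the false-negative count read off a confusion matrix) can be illustrated with a minimal Python sketch. The function names and the toy labels below are hypothetical and chosen only for illustration; they are not the study's data or code. Label 1 is assumed to mean "grooming risk" and label 0 "no risk".

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary classifier."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth == positive and pred != positive:
            fn += 1  # a risky conversation the model missed
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall, and F1 from the four counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (illustrative only, not the study's results).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
metrics = binary_metrics(*counts)
print("TP, FP, FN, TN =", counts)
print(metrics)
```

In this toy case the false-negative count is zero, so recall is 1.0 even though one safe conversation is flagged. For a safety-oriented detector, that is the trade-off the false-negative analysis is meant to expose.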

Abstract

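As digital platforms have developed, online activity among children and adolescents has surged, and online grooming crimes targeting them have emerged as a serious social problem. Existing grooming detection research relies mostly on English datasets and concentrates on the fragmentary features of individual messages. As a result, it shows clear limitations in capturing, in a Korean-language setting, the nature of a sophisticated crime in which trust is built gradually over the full context of a conversation. To overcome these limitations, this study proposes a context-based online grooming detection model optimized for the Korean-language environment. To this end, the international public dataset PAN12 was translated and refined using DeepL and a large language model and then reviewed by the research team, yielding a high-quality detection corpus that reflects the colloquial characteristics of Korean and conversational context. Based on this dataset, a binary classification model was implemented by fine-tuning KoBART, which processes the entire conversation sequence as a single input in order to understand contextual flow effectively. In experiments, the proposed model achieved high performance, with an accuracy of 99.18% and an F1-score of 0.9918, and converged stably. In particular, confusion matrix analysis showed that false negative errors, in which actual grooming-risk conversations are missed, converged to zero, confirming high reliability and practical effectiveness as a detection system. Given the scarcity of Korean-language grooming detection research, this study is significant in that it experimentally demonstrates the effectiveness of a new approach that considers the full context of a conversation and provides an important foundation for future development of related technology.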
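The idea of processing a whole conversation as a single input, rather than classifying messages one by one, can be sketched as follows. This is only an assumed preprocessing step, not the authors' pipeline: the `Message` type, the `[SEP]` separator string, the character budget, and the choice to drop the oldest turns when the text is too long are all hypothetical. In practice the joined string would then be passed to a tokenizer and a fine-tuned sequence classifier.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    speaker: str  # hypothetical speaker identifier, e.g. "A" or "B"
    text: str


SEP = " [SEP] "  # hypothetical turn separator, not taken from the paper


def build_conversation_input(messages: List[Message], max_chars: int = 2000) -> str:
    """Join all turns of one conversation into a single input string.

    Empty turns are skipped. If the joined text exceeds max_chars, the
    oldest turns are dropped first (an assumption made for this sketch).
    """
    parts = [f"{m.speaker}: {m.text.strip()}" for m in messages if m.text.strip()]
    while len(parts) > 1 and len(SEP.join(parts)) > max_chars:
        parts.pop(0)  # drop the oldest remaining turn
    return SEP.join(parts)


conversation = [
    Message("A", "hi"),
    Message("B", " hello "),
    Message("A", "   "),  # empty turn, skipped
]

full_input = build_conversation_input(conversation)
short_input = build_conversation_input(conversation, max_chars=10)
print(full_input)
print(short_input)
```

Keeping every turn in one sequence is what lets a classifier see gradual trust-building across messages. A real system would budget length in tokens rather than characters.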

Authors

Jun-kyu Kang

Min-su Jung

Seongmin Kim

Journals

The Journal of Korean Institute of Communications and Information Sciences
