Large Language Models (LLMs) have advanced rapidly, driving major progress in Natural Language Processing (NLP) tasks. However, this advancement has also raised significant concerns about their potential misuse, particularly in academic and research contexts, such as the production of unoriginal or fabricated content. To address this challenge, we propose a robust approach for detecting artificial intelligence (AI)-generated text that emphasizes a deep understanding of linguistic context. Our methodology uses two pretrained language models, Distilled Bidirectional Encoder Representations from Transformers (DistilBERT) and Cross-lingual Language Model-Robustly Optimized BERT Pretraining Approach (XLM-RoBERTa), to extract deep semantic features from text. These features capture linguistic nuances, including contextual cues, stylistic patterns, and semantic relationships, which are critical for accurately differentiating human-written text from machine-generated text (MGT). To enhance classification accuracy, we classify the extracted features with the Extreme Gradient Boosting (XGBoost) algorithm, a machine learning technique known for its efficiency and predictive capability. We evaluated the proposed approach on two extensive English datasets, Daigt-V4 and LLM-Detect AI-Generated Text, extracting features primarily from the uppermost transformer layers, which capture high-level semantic information. For multilingual evaluation using XLM-RoBERTa on the Urdu Human and AI Text (UHAT) dataset, we instead applied a layer-weighting mechanism that combines representations from all transformer layers. This mechanism assigns a trainable weight to each layer’s output, enabling the model to balance low-level syntactic and high-level semantic patterns, thereby enhancing cross-lingual robustness. Our experiments showed that DistilBERT outperformed XLM-RoBERTa by an average of 2 percentage points on the Daigt-V4 dataset: XLM-RoBERTa achieved 94% accuracy, while DistilBERT reached 96%. On the LLM-Detect AI-Generated Text dataset, both models achieved 99% accuracy. On the UHAT dataset, XLM-RoBERTa with layer weighting achieved a lower but promising accuracy of 85%, which suggests that the layer-weighting mechanism is effective in handling cross-lingual challenges.
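The layer-weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the per-layer weights are normalized or trained, so the softmax normalization here is an assumption, and the names `softmax`, `mix_layers`, `layers`, and `scores` are hypothetical. The sketch shows only the forward combination step in plain Python: one vector per transformer layer is merged into a single feature vector. In a real system the scores would be learned jointly with the model rather than set by hand.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw per-layer scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_layers(layer_vectors: List[List[float]],
               layer_scores: List[float]) -> List[float]:
    """Weighted sum of one representation vector per transformer layer.

    Assumption: weights are a softmax over one scalar score per layer.
    The paper may normalize or train its weights differently.
    """
    if len(layer_vectors) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    dim = len(layer_vectors[0])
    mixed = [0.0] * dim
    for weight, vector in zip(weights, layer_vectors):
        for i in range(dim):
            mixed[i] += weight * vector[i]
    return mixed


# Toy input: three "layers", each giving a 2-dimensional text representation.
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]  # equal scores reduce to a plain average of layers
features = mix_layers(layers, scores)
```

With equal scores the result is the simple mean of the layer vectors. Raising one layer's score shifts the combined features toward that layer, which is how such a mechanism could trade off lower-layer and upper-layer information before the features are passed to a classifier.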
Akabra Javed
Zakia Jalil
Muhammad Nasir
PeerJ Computer Science
Umm al-Qura University
International Islamic University, Islamabad
Javed et al. studied this question.
www.synapsesocial.com/papers/69c7725e8bbfbc51511e2ccd — DOI: https://doi.org/10.7717/peerj-cs.3672