March 3, 2026

Machine Learning-Based Detection of AI-Generated Text via Stylistic and Statistical Feature Modeling

Key Points

Fully AI-generated text is detectable using machine learning models with an F1-score over 90%, depending on the LLM used.
The analysis correlated 220 stylistic and statistical features of human-written and AI-generated texts from three LLMs (GPT-3.5, LLaMA3, and Qwen).
AI-generated texts show a high Kuperman age, i.e. high word complexity, whereas human-written texts show higher lexical variation and richness.
Results highlight the potential for developing robust tools against disinformation generated by AI systems.

Abstract

Advances in large language models (LLMs) make it easy to create AI-generated text. However, these tools can also pose a threat, e.g. through the creation of disinformation. In this work, we analysed texts from the CUDRT dataset generated by three LLMs: GPT-3.5, LLaMA3, and Qwen. We extracted 220 stylistic and statistical features of human-written and AI-generated text using the LFTK library. First, we analysed the features using the Pearson correlation. Second, we trained five machine learning models and tested the classifiers on detecting text that was completely AI-generated, polished, rewritten, or summarised by AI. We calculated an F1-score of over 90% for text generated entirely by AI, depending on the LLM used. We found that AI-generated texts, independent of the LLM, can be identified through a high Kuperman age, i.e. high word complexity, whereas human-written texts show higher lexical variation and richness. We provide an explanation for the classification results and a comparison with a fine-tuned RoBERTa model.
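
The two quantities named in the abstract, the Pearson correlation between a feature and the class label and the F1-score of a detector, can be illustrated with a minimal sketch. This is not the authors' pipeline: the feature values are invented toy data, the function names are hypothetical, and a single threshold rule stands in for the five trained classifiers. It only shows how the two measures are computed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for constant input; this sketch reports 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only (not taken from the paper).
labels = [1, 1, 1, 0, 0, 0]                # 1 = AI-generated, 0 = human-written
feature = [6.1, 5.8, 6.4, 4.9, 5.2, 5.0]   # hypothetical word-complexity scores

r = pearson(feature, labels)               # how strongly the feature tracks the label

# Assumed stand-in for a trained classifier: a fixed threshold on one feature.
threshold = 5.5
preds = [1 if value > threshold else 0 for value in feature]
score = f1_score(labels, preds)

print(f"Pearson r = {r:.3f}, F1 = {score:.3f}")
```

In practice the correlation step would be repeated for each of the 220 extracted features, and the F1-score would be computed on held-out texts rather than on the data used to choose the decision rule.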

Authors

Karla Schäfer

M. Steinebach

Institutions

Fraunhofer Institute for Secure Information Technology
