Clinical research published in internal medicine journals relies heavily on statistical analysis and quantitative inference, making the quality of statistical reporting and statistical peer review central to the credibility of this literature. Despite long-standing methodological recommendations, the quality of statistical analyses and reporting in medical journals remains suboptimal, and the proportion of manuscripts undergoing formal statistical review has not improved over recent decades. At the same time, generative artificial intelligence (AI) tools have been increasingly adopted in biomedical research, raising expectations that they may support statistical analysis and elements of the peer-review process. This narrative review synthesizes evidence published between 2023 and 2025 on the use of AI-assisted tools in statistical analysis and statistical review within medical research. The reviewed studies show that large language models can support selected tasks, including generation of analytical code, reproduction of simple statistical procedures, preliminary selection of statistical tests, and detection of certain formal statistical errors. However, AI performance is highly variable and frequently limited by incomplete consideration of statistical assumptions and reduced reliability in complex analytical scenarios. Current generative AI tools should not be regarded as fully autonomous instruments for statistical analysis or statistical peer review. Their effective use depends on statistical expertise, independent validation, and contextual judgment by human users. The review discusses implications for statistical practice and statistical review in internal medicine, a research setting characterized by heterogeneous observational data, multimorbidity, and frequent use of non-randomized study designs, including pragmatic clinical trials.
Michal Ordak studied this question.