Précis: DeepSeek, a biomedically enriched AI model, achieved the highest accuracy in generating PubMed citations for glaucoma research, outperforming general-purpose models and highlighting the necessity of human oversight to mitigate AI-related citation errors.

Purpose: This study evaluated the accuracy and reliability of four artificial intelligence (AI) models in generating PubMed citations for glaucoma research: ChatGPT (OpenAI GPT-3.5), Copilot (GitHub/Microsoft), DeepSeek (DeepSeek AI), and Gemini (Google AI). It also aimed to assess the potential of AI tools for academic reference generation and to identify their limitations, particularly in specialized ophthalmology fields.

Methods: Thirty-five standardized clinical paragraphs from The Review of Ophthalmology (4th edition) were used to test citation accuracy. Each model was instructed to generate AMA 11-style PubMed citations. Citations were evaluated for accuracy, DOI matching, and clinical relevance. An expert review validated the outputs and classified them as “Fully Cited,” “Partially Cited,” or “Not Cited.”

Results: DeepSeek, a biomedically enriched model, outperformed the others, with an accuracy of 92.0%. Copilot and Gemini achieved accuracies of 66.7% and 25.8%, respectively, while ChatGPT achieved the lowest citation accuracy at 19.4%. Frequent errors included DOI mismatches, incorrect journal names, and irrelevant references. Expert review confirmed that even the best model produced citation errors, emphasizing the need for human oversight. We interpret DeepSeek's apparent advantage cautiously, as model details, updates, and changes in underlying data may influence performance.

Conclusion: AI models, particularly biomedically enriched tools such as DeepSeek, can accelerate citation drafting, but citation hallucinations and metadata errors remain common. AI should serve as a decision support tool for reference retrieval and formatting, not a substitute for rigorous manual verification before submission.
Civelekler et al. studied this question.
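The DOI-matching step and the three-way classification named in the Methods can be illustrated with a short Python sketch. This is not the authors' procedure, which the abstract does not describe. The helper names `normalize_doi`, `doi_matches`, and `label_shares` are hypothetical, and the DOIs in the usage note are placeholders, not study data. The sketch only assumes that a generated DOI is compared with a verified one after stripping resolver prefixes and ignoring case (DOI names are case-insensitive), and that expert labels are tallied into percentages.

```python
from collections import Counter

# Hypothetical helpers for illustration; the abstract does not describe
# the tooling the authors actually used.
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip a common resolver prefix and lowercase the DOI.

    DOI names are case-insensitive, so lowercasing is safe for comparison.
    """
    doi = raw.strip().lower()
    for prefix in _DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.strip()


def doi_matches(generated: str, reference: str) -> bool:
    """Return True when both strings denote the same DOI after normalization."""
    return normalize_doi(generated) == normalize_doi(reference)


def label_shares(labels: list[str]) -> dict[str, float]:
    """Return each label's share of all labels, in percent (one decimal)."""
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}
```

For example, `doi_matches("https://doi.org/10.1000/XYZ123", "doi:10.1000/xyz123")` treats the two placeholder strings as the same DOI, and `label_shares` can be applied to a list of expert labels such as “Fully Cited” or “Not Cited” to summarize one model's output. A check like this catches only metadata mismatches; judging clinical relevance still requires the expert review the study emphasizes.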