August 5, 2025 · Open Access

Evaluating large language models in biomedical data science challenges through a classroom experiment

Key Points

LLMs showed potential in solving biomedical data science challenges, although their submissions did not top the Kaggle leaderboards.
Prediction scores from participants' LLM-generated submissions were often close to those of leading human participants.
Self-refinement, in which the LLM improves its own initial solution, was the most effective prompting strategy.
These findings suggest that LLMs can design competitive machine learning solutions even when used by non-experts.

Abstract

Large language models (LLMs) have shown remarkable capabilities in algorithm design, but their effectiveness in solving data science challenges remains poorly understood. We conducted a classroom experiment in which graduate students used LLMs to solve biomedical data science challenges on Kaggle. Although their submissions did not top the leaderboards, their prediction scores were often close to those of leading human participants. LLMs frequently recommended gradient boosting methods, which were associated with better performance. Among prompting strategies, self-refinement, in which the LLM improves its own initial solution, was the most effective, a result validated using additional LLMs. These findings demonstrate that LLMs can design competitive machine learning solutions, even when used by non-experts.
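To make the self-refinement strategy concrete, the following is a minimal sketch of such a loop, not the prompts or code used in the study. The function name `self_refine`, the prompt wording, the number of rounds, and the stand-in `toy_llm` are all hypothetical; a real setup would pass a function that calls an actual LLM.

```python
from typing import Callable, List


def self_refine(task: str, llm: Callable[[str], str], rounds: int = 2) -> List[str]:
    """Return the initial solution followed by each refined version.

    `llm` is any function mapping a prompt string to a response string.
    The prompt wording here is illustrative, not taken from the study.
    """
    # Step 1: ask for an initial solution.
    solutions = [llm(f"Write a machine learning solution for this task:\n{task}")]

    # Step 2: repeatedly show the model its latest solution and ask for a better one.
    for _ in range(rounds):
        prompt = (
            f"Task:\n{task}\n\n"
            f"Your previous solution:\n{solutions[-1]}\n\n"
            "Improve this solution and return the full revised version."
        )
        solutions.append(llm(prompt))
    return solutions


# Stand-in for a real LLM client, used only so the sketch runs on its own.
def toy_llm(prompt: str) -> str:
    return f"solution drafted from a {len(prompt)}-character prompt"


history = self_refine("Predict patient outcome from tabular features", toy_llm, rounds=2)
```

Here `history` holds the initial answer plus one entry per refinement round, so each version could be submitted and scored separately to see whether refinement helps.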

Cite this study

Yan et al. (2025) studied this question.

www.synapsesocial.com/papers/689a0f93e6551bb0af8d130b — DOI: https://doi.org/10.1101/2025.07.12.664517

Authors

Cairui Yan

Zhicheng Ji
