Debugging consumes a substantial portion of the software development lifecycle, yet the effectiveness of Large Language Models (LLMs) at this task is not well understood. Competitive programming offers a rich benchmark for such an evaluation, given its diverse problem domains and strict efficiency requirements. We present an empirical study of LLM-based debugging on competitive programming problems and introduce DePro, a test-case-driven approach that assists programmers by correcting existing code rather than generating new solutions. DePro combines brute-force reference generation, stress testing, and iterative LLM-guided refinement to identify and resolve errors efficiently. Experiments on 13 faulty user submissions from Codeforces demonstrate that DePro consistently produces correct solutions, reducing debugging attempts by up to 64% and debugging time by an average of 7.6 minutes per problem compared with human programmers and zero-shot LLM debugging.
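The stress-testing idea named in the abstract can be illustrated with a minimal sketch. This is not DePro's implementation, which the abstract does not detail. It assumes a hypothetical maximum-subarray problem, a hypothetical buggy submission, and a hand-written brute-force reference, and it omits the LLM-guided refinement step entirely. All function names are illustrative.

```python
import random
from typing import Callable, List, Optional


def brute_force_max_subarray(a: List[int]) -> int:
    """Slow reference (assumed correct): try every non-empty subarray."""
    best = a[0]
    for i in range(len(a)):
        total = 0
        for j in range(i, len(a)):
            total += a[j]
            best = max(best, total)
    return best


def faulty_max_subarray(a: List[int]) -> int:
    """Hypothetical buggy submission: starts from 0, so it implicitly
    allows an empty subarray and is wrong when every element is negative."""
    best = 0
    current = 0
    for x in a:
        current = max(0, current + x)
        best = max(best, current)
    return best


def stress_test(
    candidate: Callable[[List[int]], int],
    reference: Callable[[List[int]], int],
    trials: int = 1000,
    seed: int = 0,
) -> Optional[List[int]]:
    """Compare candidate and reference on small random inputs.

    Returns the first input on which they disagree, or None if no
    disagreement is found within the given number of trials.
    """
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        n = rng.randint(1, 6)  # small cases are easier to debug by hand
        case = [rng.randint(-5, 5) for _ in range(n)]
        if candidate(case) != reference(case):
            return case
    return None


if __name__ == "__main__":
    failing = stress_test(faulty_max_subarray, brute_force_max_subarray)
    print("counterexample:", failing)
```

In a test-case-driven workflow, a failing input found this way, together with the expected and actual outputs, is the kind of concrete evidence that could be passed to an LLM alongside the faulty code when asking for a fix.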
Nabiha Parvez
Tanvin Sarkar Pallab
Mia Mohammad Imran
Parvez et al. (Thu,) studied this question.
www.synapsesocial.com/papers/69f5943c71405d493affefd2 — DOI: https://doi.org/10.13016/m2db25-prl0