Formal verification of Python programs remains challenging due to the language’s dynamic nature and rich semantic constructs. We present an approach that combines Large Language Models (LLMs) with Bounded Model Checking (BMC) to verify Python code. Our system uses an LLM to translate Python programs into C code suitable for formal verification using ESBMC, enabling the detection of critical bugs, including arithmetic overflows, array bounds violations, and concurrency errors. We evaluate our approach on a benchmark of 23 carefully designed Python programs with planted bugs representing common verification challenges. The LLM orchestrator successfully detected all planted bugs by iteratively coordinating static, dynamic, and formal verification tools. Our results demonstrate that LLM-assisted translation can make mature C verification tools accessible for Python code analysis, though the approach is limited to programs amenable to BMC (15–50 lines, bounded loops, statically-sized data structures).
Building similarity graph...
Analyzing shared references across papers
Loading...
Shivkumar Shivaji
Natalia Lobakhina
Klaus Havelund
Building similarity graph...
Analyzing shared references across papers
Loading...
Shivaji et al. (Sun,) studied this question.