Despite the rapid adoption of Large Language Models (LLMs) for automatic code generation, their output often exhibits syntax errors, security vulnerabilities, and functional inconsistencies. To address these issues, we present CodeEnhancer, a two-stage framework that tightly integrates LLMs with static application security testing (SAST) tools and targeted fine-tuning. The goal is to produce more secure and functionally correct Python code. In the first stage, our iterative validation pipeline couples LLM-generated code with tools such as Pylint and Bandit, which automatically identify issues that are then remediated through structured feedback loops. When applied to the GPT-4o model and tested on the LLMSecEval dataset, this process eliminated 82.8% of the initial vulnerabilities and resolved all detected functional correctness issues. In the second stage, we fine-tune the LLMs using two types of secure code examples: expert-written samples and code refined by our framework. Comparative experiments demonstrate that the framework-tuned model outperforms both the baseline and expert-tuned models. The framework-tuned model generates only 18.4% vulnerable code snippets on the LLMSecEval dataset, whereas the baseline and expert-tuned models produce 43.6% and 54.7% vulnerable code snippets, respectively. The framework-tuned model reduces final vulnerability rates to 6.7% on LLMSecEval and 3.5% on the SecurityEval dataset. Our results highlight the synergistic effect of integrating static analysis with feedback-informed fine-tuning. They also reveal limitations in current evaluation metrics and dataset representativeness. These findings suggest a scalable, robust approach to achieving more secure, trustworthy, and practical AI-assisted code generation.
• Combines language models with SAST tools to improve the syntax, security, and functional correctness of Python code.
• First approach to address syntax, security, and functional correctness in LLM-generated code.
• Automated feedback and learning process helps LLMs generate more secure, correct code.
• Fine-tuning on framework-refined code leads to better security than training on expert-written code.
• Scalable approach enables robust and trustworthy AI-assisted code generation and refinement with minimal manual effort.
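The first-stage feedback loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Finding`, `refine`, `syntax_check`, `eval_check`, and `fake_model` are hypothetical, and the two analyzers are standard-library stand-ins for where tools such as Pylint and Bandit would be plugged in. The model call is replaced by a stub so the example runs on its own.

```python
import ast
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Finding:
    tool: str      # which analyzer reported the issue
    message: str   # description that is fed back to the model


# Hypothetical interfaces: an analyzer maps source code to findings,
# and a generator maps (prompt, previous findings) to new source code.
Analyzer = Callable[[str], List[Finding]]
Generator = Callable[[str, List[Finding]], str]


def syntax_check(code: str) -> List[Finding]:
    """Stand-in for a linter: report a finding if the code does not parse."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        return [Finding("syntax", f"line {exc.lineno}: {exc.msg}")]
    return []


def eval_check(code: str) -> List[Finding]:
    """Stand-in for a security scanner: flag direct calls to eval()."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return []  # unparsable code is left to syntax_check
    return [
        Finding("security", f"line {node.lineno}: use of eval()")
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "eval"
    ]


def refine(prompt: str, generate: Generator, analyzers: List[Analyzer],
           max_rounds: int = 3) -> Tuple[str, List[Finding]]:
    """Regenerate code until no analyzer reports findings or rounds run out."""
    code, feedback = "", []
    for _ in range(max_rounds):
        code = generate(prompt, feedback)
        feedback = [f for analyze in analyzers for f in analyze(code)]
        if not feedback:
            break
    return code, feedback


def fake_model(prompt: str, feedback: List[Finding]) -> str:
    """Stub in place of an LLM: unsafe code first, a safer version after feedback."""
    if feedback:
        return "def parse(s):\n    return int(s)\n"
    return "def parse(s):\n    return eval(s)\n"


code, remaining = refine("parse an integer", fake_model, [syntax_check, eval_check])
```

In this sketch the loop stops as soon as every analyzer returns an empty list, and any findings still open after `max_rounds` are returned to the caller rather than silently dropped. How the real framework structures its feedback prompts and stopping criteria is not stated in the abstract.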
Lee et al. studied this question.
www.synapsesocial.com/papers/69d0aefd659487ece0fa4e64 — DOI: https://doi.org/10.1016/j.knosys.2026.115925
Jongmin Lee
Khang Mai
Nakul D. Ghate
Knowledge-Based Systems