August 28, 2024 · Open Access

SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models

Abstract

There is a growing trend of teaching large language models (LLMs) to solve mathematical problems through coding. Existing studies primarily focus on prompting powerful, closed-source models to generate seed training data followed by in-domain data augmentation, equipping LLMs with considerable capabilities for code-aided mathematical reasoning. However, continually training these models on augmented data derived from a few datasets such as GSM8K may impair their generalization abilities and restrict their effectiveness to a narrow range of question types. Conversely, the potential of improving such LLMs by leveraging large-scale, expert-written, diverse math question-answer pairs remains unexplored. To utilize these resources and tackle unique challenges such as code response assessment, we propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation. We also explore different alignment algorithms with self-generated instruction/preference data to foster continuous improvement. Experiments across both in-domain (up to +5.7%) and out-of-domain (+4.4%) benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
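The quality-control step described above can be pictured with a small sketch. It is not the authors' implementation: the abstract does not specify how the critic works, so `toy_critic` below is a hypothetical rule-based stand-in for the code-based critic model, and `run_candidate`, `build_training_data`, and the sample question are illustrative names and data only. The sketch executes each self-generated program, lets the stand-in critic compare its output with the expert-written reference answer, and turns the accepted and rejected programs into instruction and preference records.

```python
import contextlib
import io
import math


def run_candidate(code: str) -> str:
    """Execute a candidate program and return what it printed (empty string on error)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {"math": math})  # illustration only: no sandboxing here
    except Exception:
        return ""
    return buffer.getvalue().strip()


def toy_critic(output: str, reference: str, tol: float = 1e-6) -> bool:
    """Hypothetical stand-in for a learned critic.

    Accepts a numeric match within `tol`; falls back to exact string match
    when either side is not a number.
    """
    try:
        return math.isclose(float(output), float(reference), rel_tol=tol, abs_tol=tol)
    except ValueError:
        return output == reference


def build_training_data(question, reference, candidates):
    """Split candidates by critic verdict and form instruction and preference records."""
    accepted, rejected = [], []
    for code in candidates:
        verdict = toy_critic(run_candidate(code), reference)
        (accepted if verdict else rejected).append(code)
    instruction = [{"question": question, "response": c} for c in accepted]
    preference = [
        {"question": question, "chosen": a, "rejected": r}
        for a in accepted
        for r in rejected
    ]
    return instruction, preference


# Made-up question and candidate programs, for illustration only.
question = "A box holds 12 pencils. How many pencils are in 7 boxes?"
candidates = [
    "print(12 * 7)",
    "print(12 + 7)",
    "print(84.0)",
    "print(undefined_name)",
]
instruction, preference = build_training_data(question, "84", candidates)
print(len(instruction), len(preference))
```

In this toy setting the answer format mismatch (`84.0` versus `84`) is resolved by a numeric tolerance; a learned critic would presumably be needed for the harder cases where the reference answer and the program output differ in form.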

Authors

Dian Yu

Baolin Peng

Ye Tian
