Recent advances in large language models (LLMs) have leveraged explicit Chain-of-Thought (CoT) prompting to improve reasoning accuracy. However, most existing efficiency methods focus on compressing verbose reasoning outputs. These Long-to-Short transformations aim to improve efficiency but still rely on explicit reasoning during inference. In this work, we introduce 3TF (Thought-Training and Thought-Free inference), a framework for efficient reasoning that takes a Short-to-Long perspective. We first train a hybrid model that can operate in both reasoning and non-reasoning modes, then further train it on CoT-annotated data to internalize structured reasoning, and at inference time enforce concise, thought-free outputs by using the non-reasoning mode. Unlike compression-based approaches, 3TF improves the reasoning quality of non-reasoning outputs, enabling models to reason implicitly while keeping external outputs short. Empirically, 3TF-trained models obtain large improvements on reasoning benchmarks under thought-free inference, demonstrating that high-quality reasoning can be learned and executed implicitly without explicit step-by-step generation.
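The split between thought-training and thought-free inference can be illustrated with a minimal sketch. It is not the authors' implementation: the mode flags `/think` and `/no_think`, the `<think>` tags, the prompt layout, and the names `Example`, `build_training_text`, and `build_inference_prompt` are all assumptions made for illustration, since the abstract does not say how the hybrid model switches modes.

```python
from dataclasses import dataclass

# Hypothetical mode markers and tags; the abstract does not specify
# how the hybrid model is told which mode to use.
THINK_FLAG = "/think"
NO_THINK_FLAG = "/no_think"


@dataclass
class Example:
    question: str
    chain_of_thought: str
    answer: str


def build_training_text(ex: Example) -> str:
    """Thought-training: the target contains the CoT trace, then the answer."""
    return (
        f"User: {ex.question} {THINK_FLAG}\n"
        f"Assistant: <think>{ex.chain_of_thought}</think>\n{ex.answer}"
    )


def build_inference_prompt(question: str) -> str:
    """Thought-free inference: request the non-reasoning mode, so the
    model is expected to emit only a short answer."""
    return f"User: {question} {NO_THINK_FLAG}\nAssistant: "


if __name__ == "__main__":
    ex = Example("What is 2 + 3?", "2 plus 3 equals 5.", "5")
    print(build_training_text(ex))
    print(build_inference_prompt(ex.question))
```

Under these assumptions, reasoning traces appear only in the training text; the inference prompt never asks for or contains one.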
Canhui Wu
Qingjiu Cao
Chao Xue
Wu et al. (Wed,) studied this question.
www.synapsesocial.com/papers/690fdcdaf60c54d04ea381ba — DOI: https://doi.org/10.48550/arxiv.2511.03408