We introduce KuraFormer, a parameter-efficient adapter that injects Kuramoto oscillatory dynamics into pretrained Transformers, enabling iterative refinement of hidden representations at inference time. Unlike LoRA, which adapts weights, KuraFormer adapts computation depth: the same trained adapter can be run for a varying number of integration steps without retraining. We evaluate on GSM8K with Mistral-7B and LLaMA-3-8B and report two findings: (1) a convergence-window phenomenon, in which accuracy first improves and then degrades as the number of steps grows, and (2) that warm-start initialization with integration schedules eliminates this window entirely, producing flat accuracy curves across 4–64 steps. KuraFormer comes within 2.9–3.5 percentage points of LoRA while using 18% fewer parameters, and it offers variable-depth computation that weight-based adapters cannot provide.
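To make "the same adapter run for a varying number of integration steps" concrete, here is a minimal sketch of the textbook Kuramoto model integrated with forward Euler. It is not the KuraFormer implementation: the abstract does not give the adapter's equations, so the function names (`kuramoto_step`, `integrate`, `order_parameter`), the all-to-all coupling, the step size `dt`, and the toy phases below are all assumptions chosen for illustration.

```python
import numpy as np


def kuramoto_step(theta, omega, coupling, dt):
    """One forward-Euler step of d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)."""
    # diff[i, j] = theta[j] - theta[i]
    diff = theta[None, :] - theta[:, None]
    dtheta = omega + (coupling / theta.size) * np.sin(diff).sum(axis=1)
    return theta + dt * dtheta


def integrate(theta0, omega, coupling, n_steps, dt=0.1):
    """Apply the same fixed parameters for a caller-chosen number of steps."""
    theta = np.array(theta0, dtype=float)  # copy so the input is not modified
    for _ in range(n_steps):
        theta = kuramoto_step(theta, omega, coupling, dt)
    return theta


def order_parameter(theta):
    """Phase coherence r in [0, 1]; r = 1 means all phases coincide."""
    return float(np.abs(np.exp(1j * np.asarray(theta)).mean()))


# Toy setup (assumed, not from the paper): 16 oscillators with identical
# natural frequencies and random initial phases.
rng = np.random.default_rng(0)
theta0 = rng.uniform(0.0, 2.0 * np.pi, size=16)
omega = np.zeros(16)

# The parameters stay fixed; only the number of integration steps changes.
for n_steps in (4, 16, 64):
    r = order_parameter(integrate(theta0, omega, coupling=2.0, n_steps=n_steps))
    print(n_steps, r)
```

The point of the sketch is only that depth becomes a runtime argument (`n_steps`) rather than a property of the trained parameters. How the phases are derived from, and written back into, Transformer hidden states is not described in the abstract and is left out here.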
Jesus Tabares Montilla
Jesus Tabares Montilla (Fri,) studied this question.
www.synapsesocial.com/papers/69b6069b83145bc643d1cbb3 — DOI: https://doi.org/10.5281/zenodo.19007694