We present Compile-Gated DPO (CG-DPO), a method that automatically generates training pairs for Direct Preference Optimization (DPO). CG-DPO writes model-generated C# completions to a scratch namespace, triggers Unity's incremental compiler, captures the structured error diagnostics, and emits (prompt, chosen, rejected) pairs in which the rejection context includes the full diagnostic message. These error-conditioned rejection examples teach the model not just what failed but why. For CI portability, CG-DPO implements a three-tier compilation fallback (Unity dotnet, UnityEngineShim, static analysis). Compile results are persisted to DuckDB so that runs can be compared statistically over time. To our knowledge, CG-DPO is the first system to use game-engine compiler diagnostics as error-conditioned DPO rejection signals.
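The pair-emission step described above can be sketched roughly as follows. This is a minimal illustration and not the CG-DPO implementation: the names `Diagnostic`, `parse_diagnostics`, `make_pair`, and the `rejected_diagnostics` field are hypothetical, and the sketch assumes the compiler output uses the usual C# diagnostic line format `File.cs(line,col): error CSxxxx: message`. It also assumes that a completion with no error diagnostics counts as compiling, which a real pipeline would confirm from the compiler's exit status.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed diagnostic line format, e.g.
# Assets/Scratch/Foo.cs(12,5): error CS0103: The name 'x' does not exist ...
DIAG_RE = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+),(?P<col>\d+)\):\s+"
    r"(?P<severity>error|warning)\s+(?P<code>CS\d+):\s+(?P<message>.*)$"
)


@dataclass(frozen=True)
class Diagnostic:
    """One structured compiler diagnostic (hypothetical container)."""
    file: str
    line: int
    col: int
    severity: str
    code: str
    message: str


def parse_diagnostics(compiler_output: str) -> list[Diagnostic]:
    """Extract structured diagnostics from raw compiler output text."""
    diags = []
    for raw in compiler_output.splitlines():
        m = DIAG_RE.match(raw.strip())
        if m:
            diags.append(Diagnostic(m["file"], int(m["line"]), int(m["col"]),
                                    m["severity"], m["code"], m["message"]))
    return diags


def make_pair(prompt: str, candidates: list[tuple[str, str]]) -> Optional[dict]:
    """Build one DPO pair from (completion, compiler_output) candidates.

    The first candidate without errors becomes `chosen`; the first one with
    errors becomes `rejected`, with its diagnostics attached as context.
    Returns None when either side is missing.
    """
    chosen = None
    rejected = None
    for completion, output in candidates:
        errors = [d for d in parse_diagnostics(output) if d.severity == "error"]
        if not errors and chosen is None:
            chosen = completion  # assumption: no error lines means it compiled
        elif errors and rejected is None:
            rejected = (completion, errors)
    if chosen is None or rejected is None:
        return None
    bad_completion, errors = rejected
    # Assumed layout of the rejection context: one line per error.
    context = "\n".join(f"{d.code} at line {d.line}: {d.message}" for d in errors)
    return {
        "prompt": prompt,
        "chosen": chosen,
        "rejected": bad_completion,
        "rejected_diagnostics": context,
    }
```

Keeping the diagnostics in a separate field rather than concatenating them into the rejected text is a design choice of this sketch; it leaves the downstream trainer free to decide how the error context is presented to the model.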
Weslyn Cory Whitehead studied this question.