This preprint introduces cognitive-effort-gated FFN concentration, a lightweight adaptive activation mechanism for compact ternary language models. A scalar active fraction is derived from per-token cross-entropy during supervised training, or from output entropy in target-free settings. This signal is normalized by an initial entropy reference and used to apply an ordinal prefix mask over the intermediate dimension of each SwiGLU feed-forward network (FFN). The mechanism preserves lower-index FFN neurons and masks the tail, coupling measured difficulty to active model capacity without sparse experts, token-level routing, or a learned scheduler. The mechanism is implemented and studied in a compact 44.8M-parameter ternary transformer using BitNet-style ternary linear semantics, grouped-query attention, and SwiGLU FFNs. Archived runs on a repeated-token identity task show task learning and record active-width telemetry. Included heatmaps of the FFN gate projections show stronger row norms at lower indices in archived checkpoints, which supports the motivation for a broader ablation study. The work is presented as a technical preprint: the current evidence establishes the mechanism and a qualitative concentration pattern, while controlled prefix-versus-tail metrics, fixed-width baselines, multiple seeds, and matched-compute comparisons remain part of the validation plan.
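The gating idea described above can be illustrated with a minimal pure-Python sketch. It is not the preprint's implementation: the function names, the linear clamped mapping from normalized entropy to active fraction, and the `min_fraction` floor are all assumptions made for illustration, since the abstract does not specify the exact mapping.

```python
import math


def active_fraction(entropy, reference_entropy, min_fraction=0.1):
    """Map a difficulty signal to an active fraction in [min_fraction, 1].

    Assumption: a linear ratio entropy / reference_entropy, clamped.
    The abstract only says the signal is normalized by an initial
    entropy reference; the clamp and floor are hypothetical.
    """
    if reference_entropy <= 0:
        raise ValueError("reference_entropy must be positive")
    ratio = entropy / reference_entropy
    return max(min_fraction, min(1.0, ratio))


def prefix_mask(width, fraction):
    """Ordinal prefix mask: keep the first k neurons, zero the tail."""
    k = max(1, math.ceil(fraction * width))
    return [1.0 if i < k else 0.0 for i in range(width)]


def silu(x):
    """SiLU (swish) activation used inside SwiGLU."""
    return x / (1.0 + math.exp(-x))


def gated_swiglu_hidden(gate_pre, up_pre, fraction):
    """Apply the prefix mask to the SwiGLU intermediate activations.

    gate_pre and up_pre stand for the outputs of the gate and up
    projections for one token (hypothetical names); the down
    projection would consume the returned list.
    """
    mask = prefix_mask(len(gate_pre), fraction)
    return [m * silu(g) * u for m, g, u in zip(mask, gate_pre, up_pre)]


# Example: a token whose loss is half the reference keeps half the width.
frac = active_fraction(entropy=2.0, reference_entropy=4.0)
hidden = gated_swiglu_hidden([1.0] * 4, [2.0] * 4, frac)
```

In this sketch an easy token (low entropy relative to the reference) activates only the lower-index neurons, while a hard token activates the full width. Because the mask is a deterministic function of one scalar, no router or learned scheduler is involved, which matches the design described in the abstract.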
Soufiane Hiyadi studied this question.