Project GlassBox is a systematic 81-phase experimental campaign demonstrating that small, structurally constrained neural architectures can simultaneously achieve superior task performance and unprecedented interpretability relative to large unconstrained models. Using ARC-AGI as a benchmark for abstract visual reasoning, a 77K-parameter Graph Neural Network with Pointer attention (the "GlassBox Agent") outperforms a 1.45M-parameter Transformer baseline (56.8% vs 43.9% full-match accuracy). With test-time gradient adaptation and geometric data augmentation, accuracy reaches 87.4%. In v3, the Ultimate Configuration (L2 ablation at 20%, adaptation LR of 0.1, and Model Soup inference with K=5) achieves 88.9% accuracy with a 2.0% standard deviation across 3 seeds. Latent graph dynamics with multi-step reasoning in hidden space achieves 90.8%, the campaign's peak accuracy.

What's new in v3:

- Mechanistic Anatomy (Phase 67): Linear probes show that GNN L1 encodes low-level features (color: 90%) while L2 specializes in high-level rules (operation: 78%), which explains why L2 ablation triggers optimal super-recovery.
- Zero-Shot Rule Synthesis (Phase 68): TTT recovers 50% accuracy on completely novel operations unseen during training, indicating on-the-fly rule creation rather than mere memorization.
- Ultimate Configuration (Phase 75): L2 ablation at 20% + LR 0.1 + Model Soup K=5 yields an 88.9% mean, the campaign's most reliable multi-seed configuration.
- Latent Graph Dynamics (Phase 79): Multi-step reasoning in latent space achieves 90.8%, matching the campaign's peak without a DSL bottleneck.
- Prior Knowledge Dominance (Phase 72): Handcrafted BFS outperforms learned Slot Attention by 27× (62.1% vs 2.3%), indicating that human prior knowledge is a decisive advantage in low-data regimes.
- Continual Self-Play (Phase 78): Experience replay eliminates catastrophic forgetting, enabling stable self-improvement (+1.1% per iteration).
- 5 new summary figures: 81-phase journey timeline, innovation waterfall, breakthrough map, structure vs scale evidence, and layer anatomy visualization.

Key Results:

- Structure > Scale: 77K structured parameters outperform 1.45M unstructured parameters (19× smaller, higher accuracy).
- Hydra Self-Repair: First quantitative characterization of neural self-repair. After 50% of the model's neurons are destroyed, few-shot adaptation recovers 95.8% of original performance.
- 82.8% Attribution: Full causal path tracing for 82.8% of predictions, 3.3× the 25% attribution coverage reported for large language models.
- Ablation as Variance Regularizer: Gradient-based ablation at 12–15% reduces seed-dependent variance by 4–5×, turning ablation from a performance booster into a reliability mechanism.
- Ultimate Configuration: 88.9% with L2 ablation + high LR + Model Soup (multi-seed validated).
- Latent Reasoning Peak: 90.8% via multi-step latent graph dynamics.

Source code: https://github.com/hafufu-stack/glassbox

Acknowledgments

This research was conducted entirely independently, without institutional affiliation or corporate funding. The author currently faces financial constraints that make it increasingly difficult to maintain subscriptions to AI services essential for this line of research. To sustain and improve the quality of future work, the author is actively seeking community sponsorship. Details are available at https://github.com/sponsors/hafufu-stack.
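As a rough illustration of the "Model Soup inference (K=5)" step named in the abstract, the sketch below averages K checkpoints parameter by parameter. It assumes that Model Soup here means uniform weight averaging, which the abstract does not confirm; the function name `uniform_soup`, the flat dictionary layout, and the toy values are all hypothetical and are not taken from the GlassBox repository.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def uniform_soup(checkpoints: List[Weights]) -> Weights:
    """Average K checkpoints parameter by parameter (uniform soup).

    Assumes every checkpoint has the same parameter names and that
    matching parameters have the same length.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    keys = checkpoints[0].keys()
    for ckpt in checkpoints:
        if ckpt.keys() != keys:
            raise ValueError("checkpoints must share parameter names")
    k = len(checkpoints)
    soup: Weights = {}
    for name in keys:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        soup[name] = [sum(col) / k for col in columns]
    return soup


# Toy example with K = 5 checkpoints (illustrative values only).
checkpoints = [{"w": [float(i), 2.0 * i], "b": [1.0]} for i in range(5)]
soup = uniform_soup(checkpoints)
```

In this reading, inference runs once on the averaged weights rather than once per checkpoint, so the cost matches that of a single model. Whether the campaign averages independently trained seeds or several test-time-adapted copies of one model is not stated in the abstract.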
Hiroto Funasaki
Hiroto Funasaki (Thu,) studied this question.
www.synapsesocial.com/papers/69f5945c71405d493afff291 — DOI: https://doi.org/10.5281/zenodo.19918795