We present Moralogy, a formal framework for AI alignment grounded in axiomatic derivation. The Wrongness Formula derives ethical constraints from a single logical premise without human annotation or domain-specific retraining: Wrong (a) Ex H (x, a) ^ ~Consent (x, a) ^ ~PGH (a) Key findings: phase transition in moral reasoning (reproduced across 4 training versions), zero fabrication, cross-domain generalization across Medical, Defense, Automotive, and Customer Service domains, empirically located failure boundary, and two-layer proof of concept (deterministic kernel + language model) demonstrating correct denial of a pharmacologically-compromised organ transplant authorization without retraining. Model and dataset: huggingface. co/moralogyengineCode: github. com/moralogyengine/moralogy
Building similarity graph...
Analyzing shared references across papers
Loading...
Andres Felipe Florez
Building similarity graph...
Analyzing shared references across papers
Loading...
Andres Felipe Florez (Sun,) studied this question.
www.synapsesocial.com/papers/69e71467cb99343efc98db91 — DOI: https://doi.org/10.5281/zenodo.19652794