Current AI safety frameworks assert their requirements rather than deriving them. Alignment theory asserts that AI goals should match human goals. Constitutional AI asserts a set of principles. RLHF asserts that human preferences should guide model behavior. Each is reasonable; none can explain why those requirements and not others. This paper presents two commitments, derived from first principles, that address gaps the current discourse leaves open. Commitment 1 (the Instrument Thesis): computational systems should be classified as instruments for human flourishing (not agents, not tools that might become agents, but instruments), and this classification follows from a principle that cannot be coherently rejected. Commitment 2 (the Accountability Principle): the entity that deploys a system bears accountability for its effects, whether that system is a human workforce or a computational one. These commitments interlock: without the first, accountability has no standard; without the second, the standard has no enforcement. Together they provide the structural foundation any viable safety framework requires.
Douglas Doane
Agricultural Research Institute
Douglas Doane (Tue,) studied this question.
www.synapsesocial.com/papers/69e07cc02f7e8953b7cbdeb3 — DOI: https://doi.org/10.5281/zenodo.19563483