The dominant concern in AI safety research is whether AI systems will produce harmful outputs. This paper argues that a more foundational concern has been systematically under-addressed: whether AI systems know the boundaries within which their outputs remain valid. We propose the Fisherman Principle, derived from the classical Chinese text "The Fisherman" (渔父) in the Chuci (Songs of the South) and attributed to Qu Yuan (c. 340–278 BCE), as an operational criterion for AI judgment systems:

The first duty of a judgment system is not to produce answers, but to recognize the boundaries within which answers remain valid.

The paper distinguishes between two failure modes that are often conflated: error (an incorrect output within a system's competence boundary) and fabrication (structurally coherent output generated beyond the system's epistemic boundary). Error is correctable; fabrication, coated in fluency, is not. Epistemic humility, in this framework, is not a moral virtue but an architectural requirement: a judgment system without boundary-awareness is structurally untrustworthy, regardless of its average accuracy.
Chen et al. (Sun,) studied this question.