We derive a mechanistic bound on the effective information per serial decoding event using the M-ary rate-distortion function. Applied to three independently characterized systems, namely the ribosome (M=21), human phonology (M=31), and the chromatic scale (M=12), the framework predicts throughput values within the observed range. We test vocabulary-independence by evaluating 1,749 models across two architectures on a shared reference corpus. Vocabulary size does not predict bits-per-byte in causal language models (p=0.643) or in language-matched translation models (Spearman p=0.13). Unexpectedly, we also find a quantifiable information cost of +1.33 bits when models process input in an unfamiliar language.
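As a rough illustration of the quantity the abstract names, the sketch below evaluates the textbook rate-distortion function for an M-ary source, R(D) = log2(M) - H_b(D) - D log2(M-1), at the three alphabet sizes mentioned. It assumes a uniform source and Hamming (symbol-error) distortion, which the abstract does not state; the function names and the example distortion level of 0.1 are hypothetical and chosen only for illustration, not taken from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H_b(p) in bits; defined as 0 at the endpoints."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def rate_distortion_uniform(m: int, d: float) -> float:
    """Bits per symbol needed to describe a uniform M-ary source
    with average symbol-error probability at most d.

    Assumption (not from the abstract): uniform source, Hamming distortion.
    """
    if m < 2:
        raise ValueError("alphabet size m must be at least 2")
    if d < 0.0:
        raise ValueError("distortion d must be non-negative")
    d_max = 1.0 - 1.0 / m  # beyond this, zero bits suffice
    if d >= d_max:
        return 0.0
    return math.log2(m) - binary_entropy(d) - d * math.log2(m - 1)


if __name__ == "__main__":
    # Alphabet sizes quoted in the abstract; the distortion level is illustrative.
    systems = {"ribosome": 21, "human phonology": 31, "chromatic scale": 12}
    for name, m in systems.items():
        lossless = rate_distortion_uniform(m, 0.0)
        lossy = rate_distortion_uniform(m, 0.1)
        print(f"{name} (M={m}): R(0)={lossless:.3f} bits, R(0.1)={lossy:.3f} bits")
```

At zero distortion the function reduces to log2(M), the lossless ceiling for a uniform alphabet; allowing some symbol error lowers the bits required per decoding event.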
Grant Lavell Whitmer III studied this question.
www.synapsesocial.com/papers/69cb6541e6a8c024954b95b6 — DOI: https://doi.org/10.5281/zenodo.19322973
Grant Lavell Whitmer III
Wind Power Engineering (Japan)