January 1, 2023 · Open Access

Future Lens: Anticipating Subsequent Tokens from a Single Hidden State

Abstract

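We conjecture that the hidden state vector corresponding to a single input token encodes enough information to accurately predict several subsequent tokens. More concretely, this paper asks: given the hidden (internal) representation of a single token at position t in an input, can we reliably anticipate the tokens that will appear at positions ≥ t + 2? To test this, we evaluate linear approximation and causal intervention methods in GPT-J-6B, measuring whether individual hidden states in the network carry a signal rich enough to predict future hidden states and, ultimately, token outputs. We find that, at some layers, a single hidden state lets us approximate the model's predictions of subsequent tokens with more than 48% accuracy. Finally, we present a "Future Lens" visualization that uses these methods to create a new view of transformer states.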
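The linear approximation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it uses small synthetic vectors in place of GPT-J-6B activations, and the names `fit_linear_map`, `apply_map`, `argmax_token`, `true_map`, and `unembed` are hypothetical. It assumes that a "future" hidden state is roughly a linear function of the current one, fits that map by ordinary least squares, and then checks how often decoding the approximated state gives the same token as decoding the actual one.

```python
import random


def fit_linear_map(xs, ys):
    """Least-squares W (d_in x d_out) with x @ W ~= y, via the normal equations."""
    d_in, d_out = len(xs[0]), len(ys[0])
    # Build the augmented system [X^T X | X^T Y].
    aug = []
    for i in range(d_in):
        gram_row = [sum(x[i] * x[j] for x in xs) for j in range(d_in)]
        cross_row = [sum(x[i] * y[k] for x, y in zip(xs, ys)) for k in range(d_out)]
        aug.append(gram_row + cross_row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(d_in):
        pivot = max(range(col, d_in), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(d_in):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[d_in:] for row in aug]


def apply_map(x, w):
    """Compute the row vector x @ w."""
    return [sum(x[i] * w[i][k] for i in range(len(x))) for k in range(len(w[0]))]


def argmax_token(h, unembed):
    """Decode a hidden state to a token id with a toy unembedding matrix."""
    logits = [sum(hi * ui for hi, ui in zip(h, row)) for row in unembed]
    return max(range(len(logits)), key=logits.__getitem__)


# Synthetic stand-ins (assumption): tiny dimensions instead of real activations.
rng = random.Random(0)
D, VOCAB = 4, 6
true_map = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(D)]
unembed = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(VOCAB)]


def sample(n):
    """Draw (state at position t, noisy state at a later position) pairs."""
    xs = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(n)]
    ys = [[v + rng.gauss(0, 0.05) for v in apply_map(x, true_map)] for x in xs]
    return xs, ys


train_x, train_y = sample(200)
test_x, test_y = sample(50)

learned_map = fit_linear_map(train_x, train_y)

# Fraction of held-out cases where the approximated future state decodes
# to the same token as the actual future state.
matches = sum(
    argmax_token(apply_map(x, learned_map), unembed) == argmax_token(y, unembed)
    for x, y in zip(test_x, test_y)
)
accuracy = matches / len(test_x)
print(f"token agreement on held-out pairs: {accuracy:.2f}")
```

With a real model, the training pairs would instead be hidden states collected at position t in a chosen layer and the states (or output tokens) at a later position; the synthetic linear relationship here is far cleaner than what the paper's reported accuracy suggests for actual activations.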

Authors

Koyena Pal

Jiuding Sun

Andrew C. Yuan

Institutions

University of Massachusetts Amherst

Northeastern University
