April 14, 2024 · Open Access

TransformerFAM: Feedback attention is working memory

Abstract

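Transformers have revolutionized deep learning, but the quadratic complexity of their attention limits their ability to process infinitely long inputs. We propose Feedback Attention Memory (FAM), a novel Transformer architecture that uses a feedback loop to let the network attend to its own latent representations. This design fosters the emergence of working memory within the Transformer, allowing it to process infinitely long sequences. TransformerFAM requires no additional weights, so it integrates seamlessly with pre-trained models. Our experiments show that TransformerFAM significantly improves Transformer performance on long-context tasks across model sizes (1B, 8B, and 24B). These results demonstrate the potential for large language models (LLMs) to process sequences of unlimited length.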
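The feedback loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the block-wise processing, the `block_size` parameter, the names `attend` and `feedback_attention`, the identity projections, and the non-causal attention inside each block are all assumptions made for illustration. The sketch only shows the general idea of a fixed-size memory that attends to the network's own activations and is carried forward, so that each step sees a bounded number of keys regardless of total sequence length.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys_values):
    """Single-head dot-product attention with identity projections.

    Assumption for illustration: no learned weights, and the same
    vectors serve as both keys and values.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_values]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, keys_values)) for d in range(dim)]
        )
    return outputs


def feedback_attention(sequence, block_size, memory):
    """Process `sequence` block by block while carrying a fixed-size memory.

    Hypothetical sketch: each block attends to [memory, block], then the
    memory attends to [block, memory] to produce the memory for the next block.
    """
    outputs = []
    for start in range(0, len(sequence), block_size):
        block = sequence[start:start + block_size]
        # Block tokens read from the carried memory and the current block.
        outputs.extend(attend(block, memory + block))
        # Feedback step: the memory is updated from the block and itself,
        # reusing the same attention function (no extra parameters).
        memory = attend(memory, block + memory)
    return outputs, memory
```

Under these assumptions, every call to `attend` sees at most `block_size + len(memory)` keys, so the total work grows linearly with sequence length rather than quadratically. For example, `feedback_attention(seq, block_size=4, memory=[[0.0, 0.0]])` on a 10-token sequence of 2-dimensional vectors returns 10 output vectors and a memory that still holds a single vector.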

Authors

Dongseong Hwang

Weiran Wang

Zhuoyuan Huo
