Deep neural networks can be understood as discretizations of a continuous dynamical system. This literature review analyzes how a multi-particle dynamical system formulation models the self-attention mechanism in transformers. It shows how this formulation enables a systematic study of the system's convergence toward clusters and of its relation to the Kuramoto oscillator model.
Yuxuan Zhang (Thu,) studied this question.
www.synapsesocial.com/papers/68e02f40f0e39f13e7fa2af1 — DOI: https://doi.org/10.54254/2753-8818/2025.dl27324
Yuxuan Zhang
Theoretical and Natural Science