Although Transformer-based trackers have achieved impressive tracking accuracy owing to their strong capability for global context modeling, they still suffer from substantial model complexity and high computational latency. To address these limitations, this paper proposes a lightweight Transformer-based single object tracking method, termed TPTTrack. Specifically, a target-state-guided prompt token is introduced and concatenated with the template and search region features. Constructed from compact target-state information, this token guides cross-region feature interaction toward target-relevant information, thereby enhancing tracking robustness. Furthermore, a hierarchical attention decoupling mechanism is developed to improve the efficiency of shallow feature extraction and reduce redundant self-attention in deeper layers. In addition, a lightweight autoregressive prediction module is employed for dynamic target-state modeling and efficient state estimation. In experiments, TPTTrack achieves 65.0% AO on GOT-10k and 68.5% precision on LaSOT, indicating a favorable trade-off between tracking accuracy and computational cost.
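The token concatenation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the prompt token is built from the target state, so the token here is random, and the token counts, feature dimension, and identity query/key/value projections are all assumptions chosen for brevity. The sketch only shows that, once a prompt token is concatenated with template and search tokens, joint self-attention lets every token attend to it.

```python
import math
import random

DIM = 8  # assumed feature dimension, for illustration only


def random_tokens(count):
    """Stand-in features; a real tracker would use backbone embeddings."""
    return [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(count)]


def concat_tokens(prompt, template, search):
    """Build the joint sequence [prompt | template | search]."""
    return prompt + template + search


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections (queries = keys = values = tokens).
    """
    scale = 1.0 / math.sqrt(len(tokens[0]))
    weights = [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in tokens])
        for q in tokens
    ]
    out = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(len(tokens[0]))]
        for row in weights
    ]
    return out, weights


random.seed(0)
prompt = random_tokens(1)     # hypothetical target-state prompt token
template = random_tokens(4)   # assumed number of template tokens
search = random_tokens(16)    # assumed number of search-region tokens

tokens = concat_tokens(prompt, template, search)
out, weights = self_attention(tokens)

# Column 0 of the attention matrix is how strongly each token attends
# to the prompt token.
attention_to_prompt = [row[0] for row in weights]
```

Because the prompt token sits in the same sequence as the template and search tokens, its influence on cross-region interaction is visible directly in the first column of the attention matrix.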
Zhu et al. studied this question.
www.synapsesocial.com/papers/69fd7f86bfa21ec5bbf08059 — DOI: https://doi.org/10.3390/app16094550
Haoran Zhu
Hailong Zhang
Weining Chen
Applied Sciences
Chinese Academy of Sciences
University of Chinese Academy of Sciences
Xi'an Institute of Optics and Precision Mechanics