Short-term facial landmark forecasting is important for anticipatory facial behavior in human–robot interaction, yet models trained with pointwise reconstruction losses often suffer from mean reversion, producing low-error predictions with weakened motion dynamics. To address this issue, we propose a peak-aware gated recurrent unit (GRU) framework that separates forecasting into peak planning and peak-conditioned trajectory generation. The planning stage estimates the timing and intensity of a salient motion peak within the forecast horizon together with a global motion direction, and the generation stage produces short-horizon landmark displacements through temporal gating and structured motion composition. The model is trained with reconstruction loss, peak supervision, peak-integrity regularization, and correlation-based temporal-shape regularization. Experiments on the MEAD dataset using 3D facial landmarks under a subject-independent protocol show a clear distortion–dynamics trade-off. Compared with static and sequence-to-sequence baselines, the proposed method better preserves peak-related facial dynamics while maintaining competitive 24-step prediction accuracy.
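To make the peak and temporal-shape terms concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that motion intensity is the mean per-landmark displacement norm, that the peak is the step with maximum intensity, and that the shape term is one minus the Pearson correlation of the intensity curves. The helper names (`motion_magnitude`, `peak_of`, `shape_loss`, `make_bump`) are hypothetical.

```python
import numpy as np

def motion_magnitude(disp):
    """Per-step motion intensity (assumed definition): mean L2 norm over landmarks.
    disp has shape (T, N, 3): T forecast steps, N landmarks, 3D displacements."""
    return np.linalg.norm(disp, axis=-1).mean(axis=-1)

def peak_of(disp):
    """Return (step index, intensity) of the most salient motion step."""
    mag = motion_magnitude(disp)
    t = int(np.argmax(mag))
    return t, float(mag[t])

def shape_loss(pred, target, eps=1e-8):
    """One minus the Pearson correlation of the two intensity curves.
    Scale-invariant: it compares temporal shape, not amplitude."""
    p = motion_magnitude(pred)
    q = motion_magnitude(target)
    p = p - p.mean()
    q = q - q.mean()
    corr = (p * q).sum() / (np.sqrt((p ** 2).sum() * (q ** 2).sum()) + eps)
    return 1.0 - corr

def make_bump(T=24, N=4, center=10, width=3.0):
    """Toy target: all landmarks move along x with a Gaussian intensity bump."""
    t = np.arange(T)
    amp = np.exp(-0.5 * ((t - center) / width) ** 2)
    disp = np.zeros((T, N, 3))
    disp[:, :, 0] = amp[:, None]
    return disp

target = make_bump()
damped = 0.3 * target  # a mean-reverted forecast: right shape, weak motion

print(peak_of(target), peak_of(damped))
print(shape_loss(damped, target))
```

In this toy case the damped forecast keeps the peak timing and has a shape loss near zero, yet its peak intensity is only 30% of the target's. This illustrates why a scale-invariant shape term alone would not penalize weakened motion, and why it is paired with explicit peak supervision.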
Yan et al. (Mon,) studied this question.