Leveraging hierarchical attention and dynamic fusion mechanisms for multi-modal speech emotion recognition | Synapse