Deep multi-modal fusion transformer for emotion recognition | Synapse