The rapid evolution of social networks has made multimodal content, including text, images, and audio, a pivotal medium for self-expression and a key input to public sentiment analysis. However, existing multimodal fusion methods are often limited by privacy risks, parameter redundancy, and insufficient exploitation of intermodal correlations. To overcome these challenges, this study introduces a federated learning framework that integrates high-order tensor-based multimodal data fusion with privacy-aware decentralized training in which raw data stay local. The framework uses Tucker tensor decomposition to capture complex spatial and semantic relationships between modalities, improving fusion accuracy. Experimental results on two separate datasets, the TREC2017 Precision Medicine Track Scientific Abstracts dataset and the CMU-MOSI multimodal sentiment benchmark, demonstrate that the proposed algorithm outperforms existing methods. The TREC2017 experiments validate the framework’s performance in text-dominant conditions (higher Mean Average Precision, MAP), while the CMU-MOSI experiments confirm the effectiveness of the high-order tensor fusion in modeling intermodal correlations for multimodal tasks. Furthermore, the framework demonstrates adaptive learning capabilities, efficiently processing diverse multimodal data types without adding redundant model parameters. This research opens new avenues for privacy-aware multimodal data fusion in social media, offering a robust solution for monitoring and managing online public opinion while supporting user privacy through local data retention.
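The abstract does not spell out the model architecture, so the following is only a minimal NumPy sketch of the two ideas it names, under stated assumptions. It assumes three modality feature vectors are fused by an outer product into a third-order tensor, that the tensor is compressed Tucker-style by multiplying each mode with a small factor matrix, and that clients share only those factor matrices, which a server combines by size-weighted averaging. The function names (`fuse`, `tucker_project`, `fed_avg`), the dimensions, and the averaging rule are all hypothetical illustrations, not the authors' implementation.

```python
import numpy as np


def fuse(text_vec, image_vec, audio_vec):
    """Outer-product fusion: entry (i, j, k) is text[i] * image[j] * audio[k]."""
    return np.einsum("i,j,k->ijk", text_vec, image_vec, audio_vec)


def tucker_project(tensor, factors):
    """Compress a tensor by contracting each mode with its factor matrix.

    factors[m] has shape (d_m, r_m); the result has shape (r_0, r_1, ...).
    This is the projection step of a Tucker-format model (assumed here).
    """
    core = tensor
    for mode, factor in enumerate(factors):
        # tensordot moves the new rank axis to the end; move it back to `mode`.
        core = np.moveaxis(
            np.tensordot(core, factor, axes=([mode], [0])), -1, mode
        )
    return core


def fed_avg(client_factors, client_sizes):
    """Size-weighted average of per-client factor matrices.

    Only the factor matrices are exchanged; raw features never leave a client.
    The weighting rule is an assumption, not taken from the paper.
    """
    total = float(sum(client_sizes))
    n_modes = len(client_factors[0])
    return [
        sum((size / total) * factors[m]
            for factors, size in zip(client_factors, client_sizes))
        for m in range(n_modes)
    ]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dims, ranks = (8, 6, 4), (3, 2, 2)  # hypothetical feature sizes and ranks

    # Two simulated clients, each holding its own locally trained factors.
    clients = [
        [rng.normal(size=(d, r)) for d, r in zip(dims, ranks)]
        for _ in range(2)
    ]
    shared = fed_avg(clients, client_sizes=[100, 300])

    features = [rng.normal(size=d) for d in dims]
    core = tucker_project(fuse(*features), shared)
    print(core.shape)
```

In this sketch the full fused tensor has 8 × 6 × 4 = 192 entries, while the compressed core has 3 × 2 × 2 = 12. Because the input is a rank-one outer product, the same core can be obtained by projecting each modality vector first and fusing afterwards, which avoids materializing the large tensor at all.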
Wan Li
Bin Zhang
PLoS ONE
www.synapsesocial.com/papers/69fd7fa1bfa21ec5bbf082b2 — DOI: https://doi.org/10.1371/journal.pone.0344980