What question did this study set out to answer?

The aim is to develop a lightweight framework for real-time detection of deepfake content across various media types.

April 10, 2026 · Open Access

Real-Time Generalized Deepfake Detection via Multi-Modal Fusion and Explainable Artificial Intelligence for Cross-Platform Validation

Key Points

Developed a multimodal deepfake detection framework.
Integrated ResNet18 for spatial feature extraction.
Employed EfficientNet for analyzing video frames.
Utilized Wav2Vec2 for audio representation learning.
Tested on datasets including FaceForensics++ and Celeb-DF.
Achieved 88.25% accuracy for image-based detection.
Reported 70.56% accuracy in video frame analysis.
Recorded 81.50% accuracy for audio classification.
Demonstrated real-time performance with low latency and computational overhead.

Abstract

The rapid spread of deepfake content on social networks, video conferencing platforms, and voice communication systems calls for detection methods that are both fast and accurate. This paper presents a lightweight multimodal deepfake detection framework designed for real-time deployment in resource-constrained environments. The system integrates a hybrid architecture combining ResNet18-based convolutional neural networks for spatial feature extraction, EfficientNet for frame-level video analysis, and Wav2Vec2 for audio representation learning. Each model extracts features from its own media type, and the per-modality outputs are then combined to reach a final decision. We evaluated the system on datasets including FaceForensics++, Celeb-DF, and ASVspoof 2019. Experimental results demonstrate an accuracy of 88.25% for image-based detection, 70.56% for video frame analysis, and 81.50% for audio classification under CPU-only deployment. The system achieves real-time performance with low latency and reduced computational overhead, making it suitable for practical applications. Our approach provides an effective trade-off between detection accuracy and computational efficiency, enabling deployment in real-world scenarios such as social media content moderation, secure video conferencing, and voice phishing prevention.
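The abstract says the per-modality outputs are combined into a decision but does not specify how. As a minimal sketch only, the block below assumes a weighted-average late fusion over per-modality "fake" probabilities. The function name `fuse_scores`, the weights, and the 0.5 threshold are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import Dict, Optional, Tuple


def fuse_scores(
    scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
    threshold: float = 0.5,
) -> Tuple[float, str]:
    """Weighted-average late fusion of per-modality fake probabilities.

    ASSUMPTION: the paper does not state its fusion rule; this is one
    common, simple option. `scores` maps a modality name (e.g. "image",
    "video", "audio") to that detector's probability that the input is fake.
    """
    if not scores:
        raise ValueError("at least one modality score is required")
    for name, p in scores.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1]")

    # Default to equal weighting when no weights are supplied.
    if weights is None:
        weights = {name: 1.0 for name in scores}

    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Normalised weighted mean, so a missing modality can simply be omitted.
    fused = sum(weights[name] * scores[name] for name in scores) / total
    label = "fake" if fused >= threshold else "real"
    return fused, label


if __name__ == "__main__":
    # Hypothetical detector outputs for one clip.
    fused, label = fuse_scores({"image": 0.9, "video": 0.4, "audio": 0.8})
    print(f"fused score = {fused:.3f}, decision = {label}")
```

Because the weights are normalised over whichever modalities are present, the same function handles audio-only or image-only inputs. A learned fusion layer would be an alternative, at the cost of extra training and inference overhead.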

Authors

Sanjana Shetty

Ketaki Sakhadeo

Tejashree Deore

Journals

Cureus Journal of Computer Science.

Institutions

MIT Art, Design and Technology University
