With the rapid advancement of Generative AI, voice cloning has emerged as a significant threat, often used as a primary tool for financial scams and misinformation. My work, titled "Real Talk," details the development of an AI-driven system designed to distinguish between authentic human speech and fabricated vocal patterns. The system utilizes Convolutional Neural Networks (CNNs) to classify audio data by analyzing digital discrepancies. Leveraging signal processing techniques such as MFCC extraction and spectrogram visualization via Librosa, the model was trained on industry-standard datasets, including ASVspoof and LibriSpeech. Technical implementation was performed using Python with the TensorFlow, Keras, and PyTorch frameworks. Key features of the application include multi-format support (.wav, .mp3, .m4a), real-time processing, and a percentage-based confidence meter for authenticity reporting. Experimental results demonstrate a detection rate of over 85% with an analysis latency of less than 10 seconds, providing an accessible and efficient solution for non-technical users to verify audio integrity. This work was conducted at Arab International University (AIU), Syria. The official website of the university is: https://www.aiu.edu.sy
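The percentage-based confidence meter mentioned above can be illustrated with a minimal sketch. It assumes the classifier emits a single probability that a clip is synthetic, which the abstract does not specify. The names `Verdict` and `to_verdict` and the 0.5 threshold are hypothetical choices for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    label: str         # "authentic" or "synthetic"
    confidence: float  # confidence in the reported label, as a percentage


def to_verdict(p_synthetic: float, threshold: float = 0.5) -> Verdict:
    """Map a model probability to a label plus a percentage confidence.

    Assumption: p_synthetic is the model's estimated probability that the
    clip is synthetic (for example, a sigmoid output). The threshold is a
    hypothetical default, not a value reported in the paper.
    """
    if not 0.0 <= p_synthetic <= 1.0:
        raise ValueError("p_synthetic must lie in [0, 1]")
    if p_synthetic >= threshold:
        # Report confidence in the "synthetic" label.
        return Verdict("synthetic", round(p_synthetic * 100, 1))
    # Otherwise report confidence in the "authentic" label.
    return Verdict("authentic", round((1.0 - p_synthetic) * 100, 1))


if __name__ == "__main__":
    for p in (0.9, 0.2):
        v = to_verdict(p)
        print(f"p_synthetic={p}: {v.label} ({v.confidence}%)")
```

Reporting confidence in the chosen label, rather than the raw probability, keeps the displayed number intuitive for non-technical users: a higher percentage always means the system is more certain of the verdict shown.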
Mohammad Kanj
Tarek Barhoum
Arab International University
Kanj et al. studied this question.
www.synapsesocial.com/papers/69ddda0de195c95cdefd786e — DOI: https://doi.org/10.5281/zenodo.19540979