Data augmentation is a common strategy adopted to increase the quantity of training data, avoid overfitting and improve robustness of the models. In this paper, we investigate audio-level speech augmentation methods which directly process the raw signal. The method we particularly recommend is to change the speed of the audio signal, producing 3 versions of the original signal with speed factors of 0.9, 1.0 and 1.1. The proposed technique has a low implementation cost, making it easy to adopt. We present results on 4 different LVCSR tasks with training data ranging from 100 hours to 1000 hours, to examine the effectiveness of audio augmentation in a variety of data scenarios. An average relative improvement of 4.3% was observed across the 4 tasks.
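The three-way speed perturbation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that changing the speed means resampling the waveform along the time axis (so pitch and duration change together), and it uses simple linear interpolation where a production tool would use a proper band-limited resampler. The function names `change_speed` and `speed_perturb` are hypothetical.

```python
import numpy as np


def change_speed(signal, factor):
    """Return `signal` resampled so it plays `factor` times faster.

    Assumption: speed change is done by resampling the time axis with
    linear interpolation; the sample rate of the output is unchanged.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    signal = np.asarray(signal, dtype=np.float64)
    # A faster signal is shorter: output length scales by 1 / factor.
    n_out = int(round(len(signal) / factor))
    # Fractional positions in the original signal read by each output sample.
    src_positions = np.arange(n_out) * factor
    # np.interp holds the last sample value for positions past the end.
    return np.interp(src_positions, np.arange(len(signal)), signal)


def speed_perturb(signal, factors=(0.9, 1.0, 1.1)):
    """Build one perturbed copy of `signal` per speed factor."""
    return {factor: change_speed(signal, factor) for factor in factors}


if __name__ == "__main__":
    sample_rate = 16000  # hypothetical sample rate for the demo
    t = np.arange(sample_rate) / sample_rate
    tone = np.sin(2 * np.pi * 440.0 * t)  # one second of a 440 Hz tone
    for factor, audio in speed_perturb(tone).items():
        print(factor, len(audio))
```

With the default factors, each training utterance yields three versions, so the amount of training audio roughly triples; the 0.9 copy is longer than the original and the 1.1 copy is shorter.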
Ko et al. (2015) studied this question.
www.synapsesocial.com/papers/69fb895c6d730ca589dd5ba1 — DOI: https://doi.org/10.21437/interspeech.2015-711
Tom Ko
Vijayaditya Peddinti
Daniel Povey
Johns Hopkins University