June 9, 2015 · Open Access

Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks


Abstract

Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists of maximizing the likelihood of each token in the sequence given the current (recurrent) state and the previous token. At inference, the unknown previous token is then replaced by a token generated by the model itself. This discrepancy between training and inference can yield errors that can accumulate quickly along the generated sequence. We propose a curriculum learning strategy to gently change the training process from a fully guided scheme using the true previous token, towards a less guided scheme which mostly uses the generated token instead. Experiments on several sequence prediction tasks show that this approach yields significant improvements. Moreover, it was used successfully in our winning entry to the MSCOCO image captioning challenge, 2015.
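The mechanism described in the abstract can be sketched as a per-step coin flip: with probability epsilon the true previous token is fed to the model, otherwise the model's own generated token is used, and epsilon shrinks as training proceeds. The following is a minimal sketch, not the authors' implementation. The abstract does not specify a decay schedule, so the inverse-sigmoid form, the constant `k`, the `start_token` value, and the `predict_next` callback (a stand-in for one step of the model) are all assumptions made for illustration.

```python
import math
import random


def inverse_sigmoid_decay(step, k=100.0):
    """Probability of feeding the TRUE previous token at training step `step`.

    Assumed schedule (not stated in the abstract): k / (k + exp(step / k)).
    It starts near 1 (fully guided) and decays towards 0 (mostly generated).
    The exponent is clamped to avoid float overflow at very large steps.
    """
    exponent = min(step / k, 700.0)
    return k / (k + math.exp(exponent))


def choose_previous_token(true_prev, model_prev, epsilon, rng):
    """Return the true token with probability epsilon, else the model's token."""
    return true_prev if rng.random() < epsilon else model_prev


def scheduled_sampling_inputs(targets, predict_next, epsilon, rng, start_token=0):
    """Build the 'previous token' inputs for one training sequence.

    `predict_next(prev_token)` is a hypothetical stand-in for running the
    model one step and sampling (or taking the argmax of) its output.
    """
    inputs = []
    prev = start_token
    for target in targets:
        inputs.append(prev)                 # token fed to the model at this step
        model_token = predict_next(prev)    # what the model itself would generate
        # The next step sees either the ground truth or the model's own output.
        prev = choose_previous_token(target, model_token, epsilon, rng)
    return inputs


if __name__ == "__main__":
    rng = random.Random(0)
    toy_model = lambda prev: prev + 100     # toy "model" for demonstration only
    for step in (0, 300, 600, 1000):
        eps = inverse_sigmoid_decay(step)
        fed = scheduled_sampling_inputs([5, 6, 7, 8], toy_model, eps, rng)
        print(step, round(eps, 3), fed)
```

With epsilon equal to 1 this reduces to the fully guided scheme, and with epsilon equal to 0 every input after the first comes from the model, which matches the inference-time setting the abstract describes.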


Authors

Samy Bengio

Oriol Vinyals

Navdeep Jaitly


Institutions

Google (United States)


Cite this study

Bengio et al. (2015) studied this question.

www.synapsesocial.com/papers/69d99f2c2a25b240b7a3d225 — DOI: https://doi.org/10.48550/arxiv.1506.03099

Also consider

Synapse has enriched 5 closely related papers on similar questions. Consider them for comparative context:

Learning long-term dependencies with gradient descent is difficult · 1994 · 8,395 citations
OntoNotes · 2006 · 817 citations
Curriculum learning · 2009 · 4,925 citations
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift · 2015 · 24,339 citations
Grammar as a Foreign Language · 2014 · 402 citations
