Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists of maximizing the likelihood of each token in the sequence given the current (recurrent) state and the previous token. At inference, the unknown previous token is then replaced by a token generated by the model itself. This discrepancy between training and inference can yield errors that can accumulate quickly along the generated sequence. We propose a curriculum learning strategy to gently change the training process from a fully guided scheme using the true previous token, towards a less guided scheme which mostly uses the generated token instead. Experiments on several sequence prediction tasks show that this approach yields significant improvements. Moreover, it was used successfully in our winning entry to the MSCOCO image captioning challenge, 2015.
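The curriculum described above can be illustrated with a minimal sketch. It assumes a linear decay schedule and that the model's generated tokens are already available as a list; both are simplifications for illustration, not details stated in the abstract. In actual training, each generated token would depend on the inputs chosen at earlier steps. The names `teacher_forcing_prob` and `choose_previous_tokens` are hypothetical.

```python
import random


def teacher_forcing_prob(step, total_steps, floor=0.0):
    """Probability of feeding the true previous token at a given training step.

    Assumed linear decay from 1.0 down to `floor`; the schedule shape is an
    illustrative choice, not taken from the abstract.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return max(floor, 1.0 - frac)


def choose_previous_tokens(true_tokens, generated_tokens, prob_true, rng):
    """Per position, flip a coin: use the true previous token with probability
    `prob_true`, otherwise use the model's own generated token."""
    return [
        true_tok if rng.random() < prob_true else gen_tok
        for true_tok, gen_tok in zip(true_tokens, generated_tokens)
    ]


rng = random.Random(0)
true_prev = ["<s>", "a", "cat", "sat"]
model_prev = ["<s>", "the", "dog", "sat"]  # hypothetical model outputs

# Start of training: fully guided, only true tokens are fed back.
early = choose_previous_tokens(
    true_prev, model_prev, teacher_forcing_prob(0, 1000), rng
)
# End of the schedule: only the model's own tokens are fed back.
late = choose_previous_tokens(
    true_prev, model_prev, teacher_forcing_prob(1000, 1000), rng
)
```

Between these two extremes the inputs are a random mixture of true and generated tokens, so the training condition moves gradually toward the one the model faces at inference.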
Samy Bengio
Oriol Vinyals
Navdeep Jaitly
Google (United States)
Bengio et al. (2015) studied this question.
www.synapsesocial.com/papers/69d99f2c2a25b240b7a3d225 — DOI: https://doi.org/10.48550/arxiv.1506.03099