Skip-Thought Vectors

We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next introduce a simple vocabulary expansion method to encode words that were not seen as part of training, allowing us to expand our vocabulary to a million words. After training our model, we extract and evaluate our vectors with linear models on 8 tasks: semantic relatedness, paraphrase detection, image-sentence ranking, question-type classification and 4 benchmark sentiment and subjectivity datasets. The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice.
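
The abstract does not say how the vocabulary expansion works. One plausible reading, sketched below under stated assumptions, is to fit a linear map from a larger pretrained word-embedding space into the encoder's own word-embedding space, using the words both vocabularies share, and then project the unseen words through that map. The function names, the toy words, and the 4-d and 3-d dimensions are all hypothetical and chosen only for illustration; this is not the authors' code.

```python
import numpy as np

def fit_linear_map(src_vecs, tgt_vecs):
    """Return the least-squares W such that src_vecs @ W approximates tgt_vecs."""
    W, *_ = np.linalg.lstsq(src_vecs, tgt_vecs, rcond=None)
    return W

def expand_vocabulary(W, src_table, known_words):
    """Project every source-space word that the encoder has not seen."""
    return {w: v @ W for w, v in src_table.items() if w not in known_words}

# Toy data (assumption): 4-d "pretrained" vectors and 3-d "encoder" vectors
# that are related by an exact linear map, so the fit can be checked.
rng = np.random.default_rng(0)
true_map = rng.normal(size=(4, 3))
words = ["cat", "dog", "book", "read", "tree", "run", "zebra"]
src_table = {w: rng.normal(size=4) for w in words}
# The encoder was "trained" on every word except "zebra".
encoder_table = {w: src_table[w] @ true_map for w in words[:-1]}

# Fit the map on the shared vocabulary only.
shared = sorted(encoder_table)
X = np.stack([src_table[w] for w in shared])
Y = np.stack([encoder_table[w] for w in shared])
W = fit_linear_map(X, Y)

# Words outside the encoder vocabulary now get encoder-space vectors.
new_vectors = expand_vocabulary(W, src_table, set(encoder_table))
print(sorted(new_vectors))
```

In this toy setting the target vectors are an exact linear function of the source vectors, so the fitted map recovers it; with real embeddings the fit would only be approximate, and the quality of the projected vectors would need to be checked on downstream tasks.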

Sanja Fidler | Raquel Urtasun | Ruslan Salakhutdinov | Richard S. Zemel | Yukun Zhu | Antonio Torralba | Ryan Kiros
