A Review on Lexical Semantic Representation

YUAN Shuhan, XIANG Yang

Journal of Chinese Information Processing, 2016, 30(5): 1-8.
Review

Abstract

Constructing word representations that express semantic features is a key problem in natural language processing. This paper first introduces lexical semantic representation methods based on the distributional hypothesis and on prediction models, and describes the current evaluation metrics for word representations. It then reviews the new applications enabled by the semantic information carried in word representations. Finally, it analyzes the research methods of lexical semantic representation, discusses existing problems, and offers an outlook on future directions.
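
To make the surveyed methods concrete, the following is a minimal illustrative sketch (not from the paper) of a count-based representation built under the distributional hypothesis: each word is described by the contexts it co-occurs with, and the sparse counts are compressed into dense vectors with truncated SVD. The toy corpus, window size, and dimensionality are arbitrary assumptions for illustration; real systems work on large corpora and usually reweight counts (e.g., with PPMI) before the SVD step.

    # Illustrative sketch (assumed toy data): count-based word vectors
    # from a co-occurrence matrix, reduced with truncated SVD.
    import numpy as np

    corpus = [
        "the cat sat on the mat".split(),
        "the dog sat on the log".split(),
        "a cat and a dog can play".split(),
    ]

    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}

    # Count symmetric co-occurrences within a +/-2 word window.
    window = 2
    counts = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    counts[idx[w], idx[sent[j]]] += 1.0

    # Truncated SVD maps sparse count rows to dense low-dimensional vectors.
    U, S, _ = np.linalg.svd(counts, full_matrices=False)
    dim = 3
    vectors = {w: U[idx[w], :dim] * S[:dim] for w in vocab}
    print(vectors["cat"], vectors["dog"])  # similar contexts -> similar vectors

Prediction-based models such as word2vec's skip-gram instead learn the vectors directly by predicting surrounding words; the two families are closely related, since skip-gram with negative sampling implicitly factorizes a shifted PMI matrix.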

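The evaluation metrics mentioned above can be sketched in the same spirit: the standard intrinsic test compares model cosine similarities with human similarity judgments (as collected in benchmarks such as WordSim-353 and SimLex-999) via Spearman's rank correlation. The word pairs and gold scores below are invented placeholders, and the evaluate helper is a hypothetical name.

    # Illustrative sketch: intrinsic evaluation by Spearman correlation
    # between model similarities and (made-up) human judgments.
    import numpy as np
    from scipy.stats import spearmanr

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def evaluate(vectors, gold_pairs):
        """vectors: dict word -> np.ndarray; gold_pairs: (w1, w2, human score) triples."""
        model_scores, gold_scores = [], []
        for w1, w2, score in gold_pairs:
            if w1 in vectors and w2 in vectors:  # skip out-of-vocabulary pairs
                model_scores.append(cosine(vectors[w1], vectors[w2]))
                gold_scores.append(score)
        rho, _ = spearmanr(model_scores, gold_scores)
        return rho

    # Hypothetical usage with the toy vectors from the sketch above:
    # evaluate(vectors, [("cat", "dog", 7.5), ("cat", "mat", 1.2), ("sat", "on", 0.5)])
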
Key words

word representation / semantics / distributional hypothesis / deep learning

Cite this article

YUAN Shuhan, XIANG Yang. A Review on Lexical Semantic Representation. Journal of Chinese Information Processing, 2016, 30(5): 1-8.

Funding

National Basic Research Program of China (973 Program) (2014CB340404); Science and Technology Commission of Shanghai Municipality Research Program (14511108002); National Natural Science Foundation of China (71171148, 71571136); Science and Technology Commission of Shanghai Municipality Basic Research Project (16JC1403000)