Abstract
In natural language processing tasks, distributed word representations have succeeded in capturing semantic regularities and have been used as extra features. However, most word representation models are based on shallow context windows, which are not sufficient to express the meaning of words. The essence of word meaning lies in word relations, which consist of three elements: relation type, relation direction, and related items. In this paper, we leverage a large set of unlabeled texts to make explicit the semantic regularities that emerge in word relations, including dependency relations and context relations, and put forward a novel architecture for computing continuous vector representations. We define three top layers in the neural network architecture, corresponding to relation type, relation direction, and related words, respectively. Unlike other models, the relation model can exploit deep syntactic information to train word representations. Evaluated on a word analogy task and a protein-protein interaction extraction task, the results show that the relation model captures semantic regularities better overall than the skip-gram and CBOW models.
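The three-top-layer design described above can be sketched as a shared embedding feeding three independent softmax heads, one per relation attribute. This is a minimal illustrative sketch, not the paper's actual implementation: all sizes, weight names, and the simple joint cross-entropy loss are assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: vocabulary, embedding dimension,
# number of relation types, and number of directions (head->dep, dep->head).
VOCAB, DIM = 1000, 50
N_TYPES, N_DIRS = 10, 2

# Shared word embeddings -- the vectors being trained.
E = rng.normal(0.0, 0.1, (VOCAB, DIM))

# Three independent top layers, one per attribute of a word relation.
W_item = rng.normal(0.0, 0.1, (DIM, VOCAB))    # predicts the related item
W_type = rng.normal(0.0, 0.1, (DIM, N_TYPES))  # predicts the relation type
W_dir = rng.normal(0.0, 0.1, (DIM, N_DIRS))    # predicts the relation direction

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def forward(word_id):
    """Score one input word against the three top layers."""
    h = E[word_id]                       # the word's embedding
    return (softmax(h @ W_item),         # distribution over related items
            softmax(h @ W_type),         # distribution over relation types
            softmax(h @ W_dir))          # distribution over directions

def loss(word_id, item_id, type_id, dir_id):
    """Joint negative log-likelihood of one observed relation triple."""
    p_item, p_type, p_dir = forward(word_id)
    return -(np.log(p_item[item_id])
             + np.log(p_type[type_id])
             + np.log(p_dir[dir_id]))
```

Training would then minimize this joint loss over relation instances extracted from dependency parses and context windows, updating `E` and the three weight matrices; a hierarchical softmax or negative sampling would replace the full softmax over the vocabulary at realistic scale.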
Key words: word representation / word embedding / word vectors / neural network / relation model
Funding
National Natural Science Foundation of China (61672126, 61173101)