A Study of Knowledge Motivated Explainable Word Embedding Vector

LIN Xingxing, QIU Xiaofeng, LIU Yang, YU Mengxia, QI Jing, KANG Sichen

Journal of Chinese Information Processing ›› 2020, Vol. 34 ›› Issue (8): 1-9.
Language Analysis and Computation



Abstract

Neural network language models are widely applied but offer weak interpretability. One important and direct aspect of their interpretability is the association between the dimension values of word embedding vectors and linguistic features such as syntax and semantics. Previous interpretability work has focused on injecting knowledge into corpus-trained word embeddings and on analyzing algorithm performance with respect to training and tasks, without directly verifying or examining the correlation between word embedding vectors and linguistic features. In this paper, a pseudo-corpus method derived from linguistic knowledge bases is applied: semantic features are injected under control, and the resulting word embedding vectors are analyzed. Some preliminary findings include: 1) semantic features can be injected into word embedding vectors under control; 2) word embedding vectors with injected features exhibit strong semantic compositionality, i.e., an upper concept can be represented by its lower concepts; 3) the injection of semantic features is reflected in all dimensions of the word embedding vectors.
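The pseudo-corpus idea and the compositionality finding can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the tiny knowledge base, the feature names, and the count-based "embedding" (one transparent dimension per feature) are hypothetical stand-ins for a real linguistic knowledge base and a trained neural embedding.

```python
import random
from collections import defaultdict

# Hypothetical toy semantic knowledge base: word -> set of semantic features.
# The paper builds its pseudo-corpus from a real knowledge base; these
# entries are illustrative assumptions only.
KB = {
    "dog":     {"animal", "mammal", "pet"},
    "cat":     {"animal", "mammal", "pet"},
    "sparrow": {"animal", "bird"},
}
FEATURES = sorted({f for fs in KB.values() for f in fs})

def make_pseudo_corpus(kb, n_sentences=1000, seed=0):
    """Emit (word, feature) pseudo-sentences: each pairs a word with one of
    its features, so the feature is injected into the context under control."""
    rng = random.Random(seed)
    words = sorted(kb)
    return [(w, rng.choice(sorted(kb[w])))
            for w in (rng.choice(words) for _ in range(n_sentences))]

def embed(corpus, features=FEATURES):
    """Count-based stand-in for a trained embedding: one dimension per
    feature, normalized so each word vector sums to 1."""
    idx = {f: i for i, f in enumerate(features)}
    vecs = defaultdict(lambda: [0.0] * len(features))
    for w, f in corpus:
        vecs[w][idx[f]] += 1.0
    return {w: [x / sum(v) for x in v] for w, v in vecs.items()}

def compose(vecs, members):
    """Semantic compositionality: represent an upper concept as the
    average of its lower concepts' vectors."""
    dims = len(next(iter(vecs.values())))
    return [sum(vecs[m][d] for m in members) / len(members)
            for d in range(dims)]

vecs = embed(make_pseudo_corpus(KB))
# Upper concept built from lower concepts, per finding 2):
mammal = compose(vecs, ["dog", "cat"])
```

Because every dimension here corresponds to a named semantic feature, the injection is directly inspectable: "dog" has zero mass on the "bird" dimension, "sparrow" has substantial mass there, and the composed "mammal" vector inherits only its members' feature mass.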


Key words

interpretability / word embedding vector / pseudo-corpus method

Cite this article

LIN Xingxing, QIU Xiaofeng, LIU Yang, YU Mengxia, QI Jing, KANG Sichen. A Study of Knowledge Motivated Explainable Word Embedding Vector. Journal of Chinese Information Processing. 2020, 34(8): 1-9


Funding

National Social Science Foundation of China (16BYY137, 18ZDA295)