Semantic Representation Learning Based on HowNet

ZHU Jingwen, YANG Yuji, XU Bin, LI Juanzi

Journal of Chinese Information Processing, 2019, Vol. 33, Issue 3: 33-41.
Knowledge Representation and Knowledge Acquisition

Abstract

HowNet is a large-scale, high-quality cross-lingual (Chinese-English) commonsense knowledge base that contains a wealth of semantic information. This paper uses knowledge graph methods to disassemble HowNet's complex structure layer by layer, obtaining HownetGraph, a representation of HowNet in knowledge graph form. Network representation learning and knowledge representation learning methods are then applied to HownetGraph to obtain vector representations that are cross-lingual (Chinese and English) and span different semantic units: characters and words, senses, DEF_CONCEPTs, and sememes. Experiments on word similarity and word analogy are conducted on Chinese and English datasets, and the results show that the proposed method achieves the best results on the word semantic similarity task.
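
The two evaluation tasks named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code. It assumes that word similarity is scored by the Spearman rank correlation between cosine similarities and human ratings, and that analogies are solved with the vector-offset rule; both are common conventions that the abstract does not confirm. The embedding values, the word list, and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def word_similarity_score(emb, gold):
    """Spearman correlation between model cosines and human ratings.

    gold is a list of (word1, word2, human_score); pairs containing a word
    without an embedding are skipped.
    """
    pairs = [(a, b, s) for a, b, s in gold if a in emb and b in emb]
    model = [cosine(emb[a], emb[b]) for a, b, _ in pairs]
    human = [s for _, _, s in pairs]
    return pearson(ranks(model), ranks(human))


def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset rule."""
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))


# Hypothetical toy embeddings; real vectors would come from the trained model.
toy_emb = {
    "king": [1.0, 1.0, 0.1],
    "queen": [-1.0, 1.0, 0.1],
    "man": [1.0, 0.0, 0.2],
    "woman": [-1.0, 0.0, 0.2],
    "apple": [0.0, 0.1, 1.0],
}
# Hypothetical human ratings, not taken from any real dataset.
toy_gold = [("king", "man", 8.0), ("woman", "apple", 2.0), ("king", "apple", 1.0)]

similarity_score = word_similarity_score(toy_emb, toy_gold)
analogy_answer = solve_analogy(toy_emb, "man", "king", "woman")
```

A higher `similarity_score` means the model ranks word pairs more like human annotators do. Analogy accuracy would be the fraction of questions for which `solve_analogy` returns the expected word.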

Keywords

HowNet / knowledge graph / semantic representation / representation learning

Cite this article

ZHU Jingwen, YANG Yuji, XU Bin, LI Juanzi. Semantic Representation Learning Based on HowNet. Journal of Chinese Information Processing. 2019, 33(3): 33-41

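Funding

National High Technology Research and Development Program of China (863 Program) (2015AA015401); National Key Research and Development Program of the Ministry of Science and Technology of China (2018YFB100283)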