孙媛,王丽客,郭莉莉. 基于改进词向量GRU神经网络模型的藏语实体关系抽取[J]. 中文信息学报, 2019, 33(6): 35-41.
SUN Yuan, WANG Like, GUO Lili. Tibetan Entity Relation Extraction Based on Optimized Word Embedding with GRU Neural Network. Journal of Chinese Information Processing, 2019, 33(6): 35-41.
Tibetan Entity Relation Extraction Based on Optimized Word Embedding with GRU Neural Network
SUN Yuan1,2, WANG Like1,2, GUO Lili1,2
1. School of Information Engineering, Minzu University of China, Beijing 100081, China; 2. Minority Languages Branch, National Language Resource and Monitoring Research Center, Minzu University of China, Beijing 100081, China
Abstract: To support the structural analysis of Tibetan text and the deeper development of Tibetan knowledge, this paper proposes a method for Tibetan entity relation extraction based on optimized word vectors and a GRU neural network model. The model is trained on the optimized word vectors. To obtain a better word vector model, we add Tibetan syllable vectors, syllable position vectors, part-of-speech vectors, and other features to further optimize the word vectors. We also select Tibetan lexical features and Tibetan sentence features. Experiments show that the proposed method achieves an F1 value of 78.43%.
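The following minimal sketch illustrates the general idea summarized in the abstract: each word's base vector is joined with extra feature vectors (syllable, syllable position, part of speech), and the resulting sequence is read by a GRU. It is not the authors' implementation. The concatenation scheme, all vector dimensions, the random weights, the omission of bias terms, and the names `concat_features`, `GRUCell`, and `encode` are assumptions made for illustration only, and the lexical and sentence features mentioned in the abstract are not modeled here.

```python
import math
import random


def concat_features(word_vec, syllable_vec, position_vec, pos_tag_vec):
    """Join the base word vector with the extra feature vectors (assumed scheme)."""
    return word_vec + syllable_vec + position_vec + pos_tag_vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def init_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """A plain GRU cell without bias terms (hypothetical, untrained)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, h = candidate state.
        self.W = {g: init_matrix(hidden_size, input_size, rng) for g in "zrh"}
        self.U = {g: init_matrix(hidden_size, hidden_size, rng) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b)
                for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], reset_h))]
        # New state: interpolate between the old state and the candidate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode(sentence, cell):
    """Run the GRU over a sequence of input vectors; return the final state."""
    h = [0.0] * cell.hidden_size
    for x in sentence:
        h = cell.step(x, h)
    return h


# Toy usage with made-up dimensions: 8 (word) + 4 (syllable) + 2 (position) + 3 (POS) = 17.
rng = random.Random(1)


def rand_vec(n):
    return [rng.uniform(-1, 1) for _ in range(n)]


sentence = [concat_features(rand_vec(8), rand_vec(4), rand_vec(2), rand_vec(3))
            for _ in range(5)]
cell = GRUCell(input_size=17, hidden_size=6)
state = encode(sentence, cell)
```

In a full relation extraction system, the final state (or the per-step states) would typically feed a classifier over relation types; that stage is left out of this sketch.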