Joint Model for Entity Alias Extraction in Tourism Domain

YANG Yifan, CHEN Wenliang

Journal of Chinese Information Processing, 2020, Vol. 34, Issue 6: 55-63.
Information Extraction and Text Mining

Abstract

The Internet contains a large volume of entity introduction texts, which provide a resource base for building entity knowledge. An alias, as an attribute of an entity, is an alternative expression of the entity's official name and is important for knowledge graph construction. This paper uses introduction texts of tourist attractions as the corpus, proposes an alias annotation strategy that covers the different ways aliases are described, and manually builds an alias-annotated dataset. Alias extraction can be divided into two subtasks: entity recognition and relation classification. The paper proposes a deep-learning-based joint model for attraction entity alias extraction that performs both subtasks simultaneously. Experiments on the constructed dataset show that the joint model significantly outperforms a pipeline model.
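As a rough illustration of the two subtasks named in the abstract, the following Python sketch runs them pipeline-style on a single sentence. It is not the paper's joint model or annotation scheme: the BIO tag set with `NAME` and `ALIAS` labels, the helper names `decode_bio` and `pair_aliases`, and the nearest-name pairing rule are all assumptions made for this example, and the tags are written by hand rather than predicted.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (label, start, end), end exclusive


def decode_bio(tags: List[str]) -> List[Span]:
    """Subtask 1 (entity recognition output): collect labelled spans from BIO tags."""
    spans: List[Span] = []
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        inside = tag.startswith("I-") and label == tag[2:]
        if label is not None and not inside:
            spans.append((label, start, i))
            label = None
        if tag.startswith("B-"):
            label, start = tag[2:], i
    return spans


def pair_aliases(tokens: List[str], spans: List[Span]) -> List[Tuple[str, str]]:
    """Subtask 2 (naive stand-in for relation classification):
    link each ALIAS span to the nearest NAME span in the sentence."""
    names = [s for s in spans if s[0] == "NAME"]
    pairs: List[Tuple[str, str]] = []
    for label, start, end in spans:
        if label != "ALIAS" or not names:
            continue
        nearest = min(names, key=lambda n: abs(n[1] - start))
        name_text = "".join(tokens[nearest[1]:nearest[2]])
        alias_text = "".join(tokens[start:end])
        pairs.append((name_text, alias_text))
    return pairs


# Character-level tokens for "故宫又称紫禁城" (hypothetical, hand-written tags).
tokens = list("故宫又称紫禁城")
tags = ["B-NAME", "I-NAME", "O", "O", "B-ALIAS", "I-ALIAS", "I-ALIAS"]

spans = decode_bio(tags)
pairs = pair_aliases(tokens, spans)
print(pairs)  # [('故宫', '紫禁城')]
```

In a pipeline like this, a tagging mistake in the first step is passed unchanged to the second, which is the kind of error propagation a joint model is intended to reduce.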

Key words

tourist attraction / entity alias / joint model / entity recognition

Cite this article

YANG Yifan, CHEN Wenliang. Joint Model for Entity Alias Extraction in Tourism Domain. Journal of Chinese Information Processing. 2020, 34(6): 55-63


Funding

National Natural Science Foundation of China (61936010, 61525205)