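Cross-border Ethnic Cultural Entity Relation Extraction Based on Pointer Annotation

YANG Zhenping, MAO Cunli, LEI Xiongli, HUANG Yuxin, ZHANG Yongbing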

Journal of Chinese Information Processing, 2024, Vol. 38, Issue 3: 75-83.
Information Processing for Ethnic, Cross-border, and Neighboring Languages


Abstract

Texts in the cross-border ethnic culture domain contain many domain-specific terms, which makes it hard for models to capture domain information and leaves contextual representations lacking it. Entities are also densely distributed in this domain, so entity relations frequently overlap. Because domain information plays an important role in the semantic representation of such texts, this paper proposes a pointer annotation-based method for cross-border ethnic cultural entity relation extraction. Domain lexicon information is fused into the character vector representations to strengthen domain information and address inaccurate labeling of domain entities, and multi-layer pointer annotation is used to handle overlapping entity relations. Experiments on a cross-border ethnic cultural entity relation extraction dataset show that the proposed method improves the F1 score by 2.34% over the baseline method.
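This page does not describe the exact tagging scheme, so the following is only a minimal sketch of one common form of multi-layer pointer annotation, in the style of cascade binary tagging. It assumes character-level start and end pointers for subjects and, for each subject, one pair of object pointer layers per relation type. The sentence, relation names, and function names are hypothetical illustrations, and the lexicon-fusion step and any neural encoder are left out.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]          # inclusive (start, end) character indices
Triple = Tuple[str, str, str]   # (subject, relation, object)
Layers = Dict[str, Tuple[List[int], List[int]]]  # relation -> (start tags, end tags)


def find_span(text: str, mention: str) -> Span:
    """Locate the first occurrence of a mention (toy assumption: it exists)."""
    start = text.index(mention)
    return start, start + len(mention) - 1


def encode_pointer_tags(text: str, triples: List[Triple], relations: List[str]):
    """Build subject pointers plus one object pointer layer per relation."""
    n = len(text)
    subj_start, subj_end = [0] * n, [0] * n
    obj_tags: Dict[Span, Layers] = {}
    for subj, rel, obj in triples:
        s_span, o_span = find_span(text, subj), find_span(text, obj)
        subj_start[s_span[0]] = 1
        subj_end[s_span[1]] = 1
        # Each subject gets its own stack of relation-specific layers, so a
        # subject shared by several triples never overwrites another triple.
        if s_span not in obj_tags:
            obj_tags[s_span] = {r: ([0] * n, [0] * n) for r in relations}
        starts, ends = obj_tags[s_span][rel]
        starts[o_span[0]] = 1
        ends[o_span[1]] = 1
    return subj_start, subj_end, obj_tags


def decode_spans(starts: List[int], ends: List[int]) -> List[Span]:
    """Pair every start pointer with the nearest end pointer at or after it."""
    spans = []
    for i, flag in enumerate(starts):
        if flag:
            for j in range(i, len(ends)):
                if ends[j]:
                    spans.append((i, j))
                    break
    return spans


def decode_triples(text: str, obj_tags: Dict[Span, Layers]) -> Set[Triple]:
    """Read triples back from the per-subject, per-relation pointer layers."""
    triples: Set[Triple] = set()
    for (s0, s1), layers in obj_tags.items():
        for rel, (starts, ends) in layers.items():
            for o0, o1 in decode_spans(starts, ends):
                triples.add((text[s0:s1 + 1], rel, text[o0:o1 + 1]))
    return triples


# Hypothetical toy sentence with overlapping triples: "Dai" is the subject of
# two triples, and "Yunnan" is the object of two triples.
TEXT = "The Dai celebrate the Water-Splashing Festival in Yunnan."
RELATIONS = ["festival", "located_in"]
TRIPLES = [
    ("Dai", "festival", "Water-Splashing Festival"),
    ("Dai", "located_in", "Yunnan"),
    ("Water-Splashing Festival", "located_in", "Yunnan"),
]

subj_start, subj_end, obj_tags = encode_pointer_tags(TEXT, TRIPLES, RELATIONS)
decoded = decode_triples(TEXT, obj_tags)
print(sorted(decoded))
```

Because every relation has its own pointer layer under each subject, triples that share an entity are tagged independently, which is the property that lets this kind of scheme represent overlapping relations. In a trained model the 0/1 tags would be predicted probabilities thresholded at decoding time rather than gold labels.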

Keywords

cross-border ethnic culture / entity relation extraction / pointer annotation / domain lexicon information

Cite this article

YANG Zhenping, MAO Cunli, LEI Xiongli, HUANG Yuxin, ZHANG Yongbing. Cross-border Ethnic Cultural Entity Relation Extraction Based on Pointer Annotation. Journal of Chinese Information Processing. 2024, 38(3): 75-83


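Funding

National Natural Science Foundation of China (62166023, 61866019); Yunnan Provincial Natural Science Foundation (2019FA023); Yunnan Provincial Major Science and Technology Special Program (202103AA080015, 202002AD080001)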