A BERT-based End-to-End Model for Chinese Document-level Event Extraction

ZHANG Hongkuan, SONG Hui, XU Bo, WANG Shuyi

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 10: 97-106.
Information Extraction and Text Mining

Abstract

Document-level event extraction detects the events in a whole document, identifies the arguments each event contains, and assigns every argument a specific role. This paper proposes a BERT-based end-to-end model for domain-specific Chinese documents. During argument and role identification, the model successively introduces the event type output by the preceding layer and the entity embeddings, which strengthens the joint representation of events, arguments, and roles and improves the accuracy of assigning arguments to the individual events in a document. On this basis, the title information and the embeddings of event quintuples are used to separate principal from subordinate events and to fuse their elements. Experimental results show that the proposed method clearly outperforms existing work.

Key words

document-level event extraction / end-to-end / principal and subordinate events

Cite this article

ZHANG Hongkuan, SONG Hui, XU Bo, WANG Shuyi. A BERT-based End-to-End Model for Chinese Document-level Event Extraction. Journal of Chinese Information Processing. 2022, 36(10): 97-106

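Funding

National Natural Science Foundation of China (61906035); Shanghai Sailing Program for Young Scientific and Technological Talents (19YF140230)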