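News Salient Event Detection Incorporating Document Graph and Event Graph

ZHAO Qingjue, YU Zhengtao, WANG Jian, HUANG Yuxin, ZHU Enchang

Journal of Chinese Information Processing, 2024, Vol. 38, Issue 5: 99-106.
Information Extraction and Text Mining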


Abstract

News salient event detection aims to identify, from unstructured news text, the events that best represent the core content of a news article. The events reported in a news article are related in complex ways, and the elements of a single event are often scattered across different sentences or even different paragraphs. Traditional methods model neither the relations among events nor the global semantic information of events adequately. To address this, this paper proposes a news salient event detection method that incorporates a document graph and an event graph. The method first constructs a document graph and an event graph to model the global semantic features of the news text and the association features among events. It then uses graph convolutional networks to capture higher-order neighborhood information and obtain document representations and event representations. Finally, cross-attention is applied to the document and event representations to further capture the global semantic information of events. Experiments on the New York Times Annotated Corpus validate the effectiveness of the method, which improves NR@1 by 2.18% over the baseline.
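The pipeline summarized in the abstract (graph construction, graph convolution, then cross-attention) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the toy graphs, the feature values, the identity weight matrix, and all function names are assumptions made for illustration. The graph convolution follows the standard symmetric-normalization form of Kipf and Welling's GCN, and the cross-attention is plain scaled dot-product attention without learned projections.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, the symmetric normalization used by GCN."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer: ReLU(A_norm @ H @ W)."""
    hidden = matmul(matmul(normalize_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Each query row attends over all key/value rows (no learned projections)."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[d] for w, kv in zip(weights, keys_values))
                    for d in range(len(q))])
    return out

# Hypothetical document graph: 4 nodes connected in a chain.
doc_adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
doc_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

# Hypothetical event graph: event 0 is linked to events 1 and 2.
event_adj = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
event_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# An identity matrix stands in for a trained weight matrix.
weight = [[1.0, 0.0], [0.0, 1.0]]

# Stacking two layers lets each node see its 2-hop neighborhood.
doc_repr = gcn_layer(doc_adj, gcn_layer(doc_adj, doc_feats, weight), weight)
event_repr = gcn_layer(event_adj, gcn_layer(event_adj, event_feats, weight), weight)

# Event representations query the document representations.
fused = cross_attention(event_repr, doc_repr)
```

Here `fused` holds one document-aware vector per event. In a trained model, a classifier on top of such vectors would score each event's salience; that step is omitted because the abstract does not specify it.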

Key words

salient event detection / document graph / event graph / cross-attention mechanism

Cite this article

ZHAO Qingjue, YU Zhengtao, WANG Jian, HUANG Yuxin, ZHU Enchang. News Salient Event Detection Incorporating Document Graph and Event Graph. Journal of Chinese Information Processing. 2024, 38(5): 99-106


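Funding

National Natural Science Foundation of China (U21B2027, 61972186, 61732005, 61866019); Yunnan Provincial Major Science and Technology Special Project (202002AD080001, 202202AD080003, 202103AA080015); Yunnan Provincial High-Tech Industry Special Project (201606)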