Event Causality Extraction, Eventic Graph Construction and Application in Financial Domain

YANG Jixing, YANG Bo, ZHU Jianlin, KANG Yilin

Journal of Chinese Information Processing, 2023, Vol. 37, Issue 7: 131-142.
Section: Natural Language Processing Applications

Abstract

The eventic graph is an effective means of studying how events develop dynamically. Construction of financial causal eventic graphs is currently hampered by a scarcity of datasets and a lack of practical comparison among construction schemes. To address this, the paper studies methods for constructing eventic graphs for frequently occurring hot-topic events in the financial domain. It proposes a new definition of event arguments for the financial domain, a syntactic analysis scheme based on the ATT+SBV structure, and a sequence labeling definition for financial causal events for the information extraction task. It also proposes an information extraction scheme based on a BERT+Bi-LSTM+CRF model and compares it with other neural network models. The experimental results show that the model achieves an F1 score of 95.78% on the information extraction task, a considerable improvement in accuracy. Finally, the financial causal eventic graph is stored and constructed in the Neo4j graph database. Visualizing the event relationships reveals the evolutionary logic of real financial events and supports analysis of the risk transmission and diffusion mechanism of financial networks.
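As a rough illustration of the sequence labeling idea mentioned in the abstract, the following minimal sketch decodes a BIO-tagged sentence into cause and effect spans and pairs them into causal edges. The tag names (`B-CAUSE`, `I-EFFECT`, and so on), the helper functions, and the example sentence are assumptions made for illustration only; the abstract does not give the paper's actual label scheme, and in practice the tags would be predicted by a trained model rather than written by hand.

```python
from typing import List, Optional, Tuple


def decode_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) spans.

    Hypothetical tag set: B-CAUSE / I-CAUSE / B-EFFECT / I-EFFECT / O.
    """
    spans: List[Tuple[str, str]] = []
    label: Optional[str] = None
    buf: List[str] = []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span; flush any open one first.
            if label is not None:
                spans.append((label, " ".join(buf)))
            label, buf = tag[2:], [tok]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continue the currently open span of the same label.
            buf.append(tok)
        else:
            # "O" or an inconsistent I- tag closes the open span;
            # the current token is not added to any span.
            if label is not None:
                spans.append((label, " ".join(buf)))
            label, buf = None, []
    if label is not None:
        spans.append((label, " ".join(buf)))
    return spans


def causal_edges(spans: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Pair every CAUSE span with every EFFECT span in the same sentence."""
    causes = [text for lab, text in spans if lab == "CAUSE"]
    effects = [text for lab, text in spans if lab == "EFFECT"]
    return [(c, e) for c in causes for e in effects]


# Hand-written example tags standing in for model predictions.
tokens = ["interest", "rate", "hike", "leads", "to", "stock", "market", "decline"]
tags = ["B-CAUSE", "I-CAUSE", "I-CAUSE", "O", "O",
        "B-EFFECT", "I-EFFECT", "I-EFFECT"]

spans = decode_spans(tokens, tags)
edges = causal_edges(spans)
print(spans)
print(edges)
```

Each resulting (cause, effect) pair could then be stored as a directed relationship between two event nodes in a graph database such as Neo4j, which is the storage layer the abstract names.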

Keywords

eventic graph / event extraction / information extraction model

Cite this article

YANG Jixing, YANG Bo, ZHU Jianlin, KANG Yilin. Event Causality Extraction, Eventic Graph Construction and Application in Financial Domain. Journal of Chinese Information Processing. 2023, 37(7): 131-142

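Funding

National Natural Science Foundation of China (72104254); National Key Research and Development Program of China (2020YFC1522600); Natural Science Foundation of Hubei Province (2022CFB469)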