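The emotion-cause pair extraction (ECPE) task aims to jointly extract emotion clauses and their corresponding cause clauses from a given document, and it has been widely studied in the news domain. ECPE in the social media domain, however, has received comparatively little attention, mainly because suitable datasets are lacking. Compared with the news domain, social media is both more challenging and of greater practical value: (1) emotional expression on social media is more diverse and less standardized; (2) previous studies have overlooked the subjective intentions that emotions give rise to, which are highly valuable for decision analysis. To address these problems, this paper first constructs an emotion-cause extraction dataset for Chinese microblogs (Weibo), 5,009 entries of which are manually annotated. The dataset has the following characteristics: (1) it includes emotional expressions in forms such as metaphor and irony, annotated with fine-grained emotion categories; (2) it defines three types of intention and annotates intention clauses; (3) it is currently the largest Chinese emotion-cause pair extraction dataset. Building on these characteristics, the paper proposes an ECPE method that integrates emotion category and intention information, and compares it with several mainstream ECPE methods. Experimental results show that the proposed method improves emotion-cause pair extraction in the social media domain more effectively.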
Abstract
Emotion-cause pair extraction (ECPE) aims to extract emotion clauses and their corresponding cause clauses simultaneously, and has been widely studied in the news domain. In the social media domain, there are few studies on the ECPE task owing to the lack of datasets. Compared with the news domain, social media is more challenging in that: (a) emotional expression in social media texts is more diverse and often ill-formed; (b) people's subjective intentions, which are significant for decision analysis, have been widely ignored in prior studies. To alleviate these issues, this paper constructs WeiboEmotion, a Chinese microblog dataset for ECPE with 5009 manually annotated samples. The dataset includes emotional expressions in the form of metaphor and irony, and defines fine-grained emotion categories and three types of intentions. Building on the features of this dataset, the paper proposes an ECPE method that integrates emotion category and intention information. Experimental results show that this method is more effective than mainstream ECPE methods.
Key words
emotion-cause pair extraction /
Chinese social media /
Weibo dataset
Funding
Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0108600)