Abstract
The difficulty of detecting sarcastic sentences in case-related microblog comments lies in the large gap between a comment's literal meaning and its actual sentiment, which makes detection based on the comment's own features alone unreliable. Since the body of a case-related microblog is a factual description of the case, it can serve as evidence for sarcasm detection in the comments. This paper therefore proposes a sarcasm detection method based on a dynamic memory of the case description. First, a dynamic memory mechanism extracts case features from the microblog body; next, an attention mechanism derives features of the comment sentence, which are compared with the case features for consistency; finally, the sarcasm classification is made from the comparison features. Experimental results show that the proposed method achieves an accuracy of 85.65% and an F1 score of 85.91%, a substantial improvement over the baseline models, verifying that the case description provides strong support for detecting sarcastic sentences in case-related microblog comments.
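The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch only: the toy dimensions, random stand-in encodings, number of memory passes, and the [c; m; |c−m|; c⊙m] comparison form are all assumptions for exposition, not the paper's actual architecture or trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, q):
    """Weighted sum of token states H (T, d) under query vector q (d,)."""
    return softmax(H @ q) @ H

d = 8                                         # toy hidden size
case_tokens = rng.standard_normal((12, d))    # stand-in for encoded microblog body
comment_tokens = rng.standard_normal((6, d))  # stand-in for encoded comment sentence

# Episodic updates in the style of a dynamic memory network: the memory
# starts from a summary of the comment and is refined over several passes
# of attention on the case description.
W_mem = rng.standard_normal((d, 2 * d)) * 0.1
m = comment_tokens.mean(axis=0)               # initial memory state
for _ in range(3):                            # pass count is a free choice here
    episode = attention_pool(case_tokens, m)  # attend to case text w.r.t. memory
    m = np.tanh(W_mem @ np.concatenate([m, episode]))

# Attention over the comment, then a consistency comparison of the two
# feature vectors in a common matching style: [c; m; |c - m|; c * m].
c = attention_pool(comment_tokens, m)
v = np.concatenate([c, m, np.abs(c - m), c * m])

w_cls = rng.standard_normal(4 * d) * 0.1      # untrained classifier weights
p_sarcastic = 1.0 / (1.0 + np.exp(-(w_cls @ v)))
```

In a real implementation the random stand-ins would be replaced by a trained sentence encoder, and the comparison vector `v` would feed a learned classification layer rather than fixed random weights.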
Keywords
case-related microblog /
sarcasm detection /
case description /
dynamic memory mechanism
Funding
National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830101, 2018YFC0830100); Yunnan Province High-tech Industry Special Project (201606)