Pragmatic Analysis of Irony Based on Hybrid Neural Network Model with Multi-feature

Journal of Chinese Information Processing, 2016, Vol. 30, Issue 6: 215-223.

Review

SUN Xiao¹; HE Jiajin¹; REN Fuji¹,²

Abstract

Social media contains a great deal of irony and satire, which often carry a particular sentiment orientation. The meaning these expressions convey, however, is usually far removed from their literal surface meaning, which makes sentiment analysis of social media text harder. This paper studies the recognition of ironic language use in Chinese social media and constructs a corpus covering both irony and satire. Using this corpus to examine the linguistic characteristics of irony and satire, the paper compares several effective domain features and verifies the importance of deep structural and semantic features for recognizing ironic and satirical text. It also proposes a multi-feature fusion hybrid neural network model that combines a Convolutional Neural Network (CNN) with an LSTM sequence model, learning deep semantic and deep structural features through the deep model. The model achieves better recognition accuracy than traditional single neural network models and the BOW (bag-of-words) model.
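
The abstract does not spell out how the CNN, the LSTM, and the additional features are combined, so the following is only a minimal sketch of one plausible fusion scheme, not the authors' architecture. It assumes the fusion is a plain concatenation of max-pooled convolution features, the final LSTM hidden state, and a few hand-crafted features, followed by a logistic output. All function names, dimensions, and the random, untrained weights are hypothetical and chosen for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rand_matrix(rng, rows, cols):
    """Small random matrix as a list of rows (stand-in for learned weights)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def conv_max_pool(embs, filt):
    """Slide one filter over windows of consecutive word vectors, then
    keep the maximum activation (max-over-time pooling).
    Assumes the sentence has at least len(filt) tokens."""
    width = len(filt)
    best = -math.inf
    for start in range(len(embs) - width + 1):
        window = embs[start:start + width]
        score = sum(w * x
                    for f_row, e_row in zip(filt, window)
                    for w, x in zip(f_row, e_row))
        best = max(best, math.tanh(score))
    return best

def lstm_last_hidden(embs, weights, hidden):
    """Run a single-layer LSTM (biases omitted for brevity) over the
    sequence and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in embs:
        inp = x + h  # concatenated input [x_t, h_{t-1}]
        gates = {name: [sum(w * v for w, v in zip(row, inp))
                        for row in weights[name]]
                 for name in ("i", "f", "o", "g")}
        c = [sigmoid(gates["f"][k]) * c[k]
             + sigmoid(gates["i"][k]) * math.tanh(gates["g"][k])
             for k in range(hidden)]
        h = [sigmoid(gates["o"][k]) * math.tanh(c[k]) for k in range(hidden)]
    return h

def build_model(emb_dim=4, n_filters=3, width=2, hidden=3, n_extra=2, seed=0):
    """Create untrained, randomly initialised parameters (illustrative sizes)."""
    rng = random.Random(seed)
    return {
        "filters": [rand_matrix(rng, width, emb_dim) for _ in range(n_filters)],
        "lstm": {g: rand_matrix(rng, hidden, emb_dim + hidden) for g in "ifog"},
        "hidden": hidden,
        "out": [rng.uniform(-0.5, 0.5)
                for _ in range(n_filters + hidden + n_extra)],
    }

def fuse(model, embs, extra_features):
    """Concatenate CNN features, the LSTM summary, and hand-crafted features."""
    cnn_part = [conv_max_pool(embs, f) for f in model["filters"]]
    lstm_part = lstm_last_hidden(embs, model["lstm"], model["hidden"])
    return cnn_part + lstm_part + list(extra_features)

def predict(model, embs, extra_features):
    """Logistic output over the fused vector: a score for 'ironic'."""
    fused = fuse(model, embs, extra_features)
    return sigmoid(sum(w * v for w, v in zip(model["out"], fused)))

if __name__ == "__main__":
    rng = random.Random(1)
    sentence = rand_matrix(rng, 5, 4)  # 5 tokens with 4-dim stand-in embeddings
    extra = [1.0, 0.0]                 # hypothetical hand-crafted features
    model = build_model()
    print(len(fuse(model, sentence, extra)))  # 3 CNN + 3 LSTM + 2 extra = 8
    print(predict(model, sentence, extra))
```

Because the weights are random, the printed score carries no meaning; the sketch only shows how features from a convolutional branch and a recurrent branch can be joined into one vector before classification. In practice the embeddings would come from a trained word-vector model and all parameters would be learned jointly.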

Cite this article

SUN Xiao; HE Jiajin; REN Fuji. Pragmatic Analysis of Irony Based on Hybrid Neural Network Model with Multi-feature. Journal of Chinese Information Processing. 2016, 30(6): 215-223

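Funding

Anhui Provincial Natural Science Foundation (1508085QF119); National Natural Science Foundation of China (61432004); Open Project of the National Laboratory of Pattern Recognition (NLPR) (201407345); China Postdoctoral Science Foundation (2015M580532)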

Received: 2016-09-27
Published: 2016-12-15
Issue Date: 2016-12-15
