Abstract
Fine-grained opinion mining aims to extract sentiment elements from opinionated text and determine their sentiment polarity. Most existing methods are based on sequence labeling models but rarely exploit sentiment lexicon resources. This paper proposes a fine-grained opinion mining method based on feature representations derived from a domain sentiment lexicon: the lexicon is used to build a feature representation over the opinion text, which is then added to the input of a sequence labeling model. We first build a new sentiment lexicon for the e-commerce domain and then, on real e-commerce review data, design lexicon-based feature representations for two widely used sequence labeling models, conditional random fields (CRF) and bidirectional long short-term memory with a CRF layer (BiLSTM-CRF). Experimental results show that the feature representation based on the e-commerce domain sentiment lexicon performs well with both models and outperforms representations built from other sentiment lexicons.
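The idea of turning a sentiment lexicon into per-token input features can be illustrated with a minimal sketch. The abstract does not specify the matching scheme or the feature encoding, so everything here is an assumption made for illustration: the function names `lexicon_tags` and `crf_features`, the longest-match strategy, the `B-`/`I-`/`O` tag scheme, and the toy lexicon entries are all hypothetical.

```python
from typing import Dict, List


def lexicon_tags(tokens: List[str], lexicon: Dict[str, str], max_len: int = 4) -> List[str]:
    """Tag each token with a lexicon-match label using greedy longest match.

    Assumption: lexicon maps a word or phrase to a polarity label such as
    "POS" or "NEG"; tokens are segmented Chinese words joined without spaces.
    """
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = "".join(tokens[i:i + n])
            polarity = lexicon.get(phrase)
            if polarity is not None:
                tags[i] = f"B-{polarity}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{polarity}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


def crf_features(tokens: List[str], lex_tags: List[str]) -> List[Dict[str, str]]:
    """Build one feature dict per token, adding the lexicon tag to the word itself."""
    return [{"word": w, "lex": t} for w, t in zip(tokens, lex_tags)]


# Toy example (hypothetical lexicon entries, not from the paper's lexicon).
toy_lexicon = {"快": "POS", "好": "POS", "不好": "NEG"}
toy_tokens = ["物流", "很", "快", ",", "质量", "不", "好"]
toy_tags = lexicon_tags(toy_tokens, toy_lexicon)
toy_features = crf_features(toy_tokens, toy_tags)
```

For a CRF, feature dicts of this kind could be passed alongside other hand-crafted features; for a BiLSTM-CRF, one plausible variant is to embed each lexicon tag and concatenate it with the word embedding. Both are possible readings of "adding the representation to the model input", not a description of the authors' exact design.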
Key words
fine-grained opinion mining /
sentiment lexicon /
feature representation /
sequence labeling model
Funding
National Natural Science Foundation of China (61572338, 61876115); Major Project of Natural Science Research of Jiangsu Higher Education Institutions (16KJA520001)