DAI Min, ZHU Zhu, LI Shoushan, ZHOU Guodong. Corpus Construction on Opinion Information Extraction in Chinese[J]. Journal of Chinese Information Processing, 2015, 29(4): 67-73.
Corpus Construction on Opinion Information Extraction in Chinese
DAI Min, ZHU Zhu, LI Shoushan, ZHOU Guodong
NLP Lab, School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
Abstract: Opinion information extraction (OIE) is an important sub-task of sentiment analysis. A pressing issue in Chinese OIE is that no Chinese corpus is readily available. This paper focuses on an annotation framework for Chinese OIE and constructs a richly annotated Chinese corpus. Specifically, in addition to the commonly annotated elements (sentiment orientation, opinion target, and opinion keyword), our corpus marks opinion target ellipsis, opinions expressed without sentiment words, and sentiment polarity shifting. Corpus statistics show that these special phenomena (e.g., opinion target ellipsis) are common in Chinese texts and therefore need to be annotated.
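The annotation elements listed in the abstract can be pictured as one record per opinionated sentence. The following is only a minimal sketch under stated assumptions: the class name `OpinionAnnotation`, its field names, the helper `phenomenon_rates`, and the toy English sentences are all hypothetical and are not taken from the paper's annotation scheme or corpus.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class OpinionAnnotation:
    """Hypothetical record for one annotated sentence (field names are assumptions)."""
    sentence: str
    orientation: str                 # sentiment orientation, e.g. "positive" / "negative"
    target: Optional[str]            # opinion target; None models target ellipsis
    keyword: Optional[str]           # opinion keyword; None models an opinion
                                     # expressed without a sentiment word
    polarity_shifted: bool = False   # True when context flips the keyword's polarity

    @property
    def target_elided(self) -> bool:
        return self.target is None

    @property
    def implicit_opinion(self) -> bool:
        return self.keyword is None


def phenomenon_rates(records: List[OpinionAnnotation]) -> Dict[str, float]:
    """Return the share of records showing each special phenomenon."""
    n = len(records)
    if n == 0:
        return {"target_ellipsis": 0.0, "implicit_opinion": 0.0, "polarity_shift": 0.0}
    return {
        "target_ellipsis": sum(r.target_elided for r in records) / n,
        "implicit_opinion": sum(r.implicit_opinion for r in records) / n,
        "polarity_shift": sum(r.polarity_shifted for r in records) / n,
    }


# Invented toy records, for illustration only (not corpus data).
toy_records = [
    OpinionAnnotation("The screen is great.", "positive", "screen", "great"),
    OpinionAnnotation("Really not good.", "negative", None, "good", polarity_shifted=True),
    OpinionAnnotation("The battery died after two days.", "negative", "battery", None),
    OpinionAnnotation("Very satisfied.", "positive", None, "satisfied"),
]

rates = phenomenon_rates(toy_records)
print(rates)
```

Counting these flags over a whole corpus is one simple way to obtain the kind of statistics the abstract refers to. The actual label inventory and counting procedure are defined in the paper itself.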