Abstract
The sentiment orientation of a word directly influences the sentiment orientation of higher-level linguistic units such as phrases, sentences, paragraphs, and whole texts. For the problem of selecting paradigm words, this paper proposes a method that combines a word's category-distinguishing ability with a sentiment word list. Because a word largely shares the sentiment orientation of its synonyms, we further propose a synonym-based method for determining word sentiment orientation, which alleviates the data sparseness problem to some extent. Experimental results show that the synonym-based method outperforms the method that uses only the target word and the paradigm words.
Key words: computer application; Chinese information processing; word sentiment orientation; paradigm word; relation intensity; synonym
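The synonym-based idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's relation-intensity formula or say how synonym scores are combined, so everything below is an assumption: smoothed pointwise mutual information (PMI) over document co-occurrence stands in for relation intensity, synonym scores are combined by a plain average, and the function names (`association`, `orientation`, `orientation_with_synonyms`), the toy corpus, the paradigm words, and the synonym table are all hypothetical.

```python
import math

def association(w1, w2, docs, eps=0.01):
    """Smoothed PMI between two words from document co-occurrence.

    Assumption: PMI stands in for the paper's relation intensity.
    """
    n = len(docs)
    c1 = sum(1 for d in docs if w1 in d)
    c2 = sum(1 for d in docs if w2 in d)
    c12 = sum(1 for d in docs if w1 in d and w2 in d)
    # eps keeps the logarithm defined when a count is zero
    return math.log2(((c12 + eps) * n) / ((c1 + eps) * (c2 + eps)))

def orientation(word, pos_seeds, neg_seeds, docs):
    """Association with positive paradigm words minus negative ones."""
    pos = sum(association(word, p, docs) for p in pos_seeds)
    neg = sum(association(word, q, docs) for q in neg_seeds)
    return pos - neg

def orientation_with_synonyms(word, synonyms, pos_seeds, neg_seeds, docs):
    """Average the score over the target word and its synonyms.

    Assumption: a plain average; the paper may weight synonyms differently.
    """
    group = [word] + list(synonyms.get(word, []))
    scores = [orientation(w, pos_seeds, neg_seeds, docs) for w in group]
    return sum(scores) / len(scores)

# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"good", "excellent", "superb"},
    {"good", "superb", "nice"},
    {"bad", "poor", "awful"},
    {"bad", "awful", "terrible"},
    {"splendid"},  # the target word never co-occurs with a paradigm word
]
pos_seeds = ["good"]
neg_seeds = ["bad"]
synonyms = {"splendid": ["superb", "excellent"]}

# Score from the target word and paradigm words only.
direct = orientation("splendid", pos_seeds, neg_seeds, docs)
# Score that also draws on the synonyms' co-occurrence evidence.
pooled = orientation_with_synonyms("splendid", synonyms, pos_seeds, neg_seeds, docs)
print(direct, pooled)
```

In this toy setup the target word never co-occurs with either paradigm word, so its direct score carries no signal. Its synonyms do co-occur with the positive paradigm word, so the pooled score is positive. This mirrors the data sparseness problem that the abstract says the synonym-based method alleviates.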
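Funding
Supported by the National Natural Science Foundation of China (60875040); the Key Project of Science and Technology Research of the Ministry of Education (2007018); the Doctoral Program Foundation of Higher Education of the Ministry of Education (200801080006); the Natural Science Foundation of Shanxi Province (2007011042); the Open Fund of the Shanxi Provincial Key Laboratory; and the Science and Technology Research and Development Project of Shanxi Higher Education Institutions (200611002).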