Abstract
Opinion mining on subjective text is a language technology with broad applications across many fields. This paper takes evaluative morphemes as its object of study. Working from corpus output produced by the Language Technology Platform (LTP) of Harbin Institute of Technology, it uses the SBV polarity transfer method as its core and adds anaphora resolution, an ATT chain algorithm, and a mutual information method to extract evaluated objects from the corpus. When judging the orientation of polarity words, it takes into account different sentence types and the influence of adverbs and conjunctions on polarity, with a detailed discussion of general adverbs, derogatory adverbs, and the adverb “tai” (太, "too"). Finally, a comprehensive solution is proposed. The solution has a clear layered structure, is easy to understand, and has relatively low algorithmic complexity. However, because it relies on relatively shallow syntactic analysis results and experience-based language patterns, it depends heavily on the syntactic analysis results.
Key words: evaluated object; orientation; SBV polarity transfer algorithm; anaphora resolution
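The abstract names SBV polarity transfer as the core step but does not spell it out. The toy sketch below shows one plausible reading: the subject of a polarity-bearing predicate (the SBV dependency relation in LTP's label set) inherits that predicate's polarity, and a negating adverb attached through an ADV arc flips the sign. The lexicon, the negator list, the arc format, and the function name `sbv_polarity_transfer` are all hypothetical, and words stand in for the token indices a real parser would return. It is not the paper's implementation.

```python
# Toy sketch of SBV polarity transfer. The lexicon, negator list, arc format
# and function name are illustrative assumptions, not taken from the paper.
from typing import Dict, List, Tuple

# One dependency arc: (dependent word, relation label, head word).
# Real parser output uses token indices; words are used here for brevity.
Arc = Tuple[str, str, str]

POLARITY: Dict[str, int] = {"good": 1, "clear": 1, "poor": -1}  # assumed lexicon
NEGATORS = {"not"}  # assumed negating adverbs


def sbv_polarity_transfer(arcs: List[Arc]) -> Dict[str, int]:
    """Give each SBV subject the (possibly negated) polarity of its predicate."""
    result: Dict[str, int] = {}
    for dep, rel, head in arcs:
        # Only subject arcs whose predicate is a known polarity word qualify.
        if rel != "SBV" or head not in POLARITY:
            continue
        polarity = POLARITY[head]
        # A negating adverb attached to the same predicate via ADV flips the sign.
        for adv, adv_rel, adv_head in arcs:
            if adv_rel == "ADV" and adv_head == head and adv in NEGATORS:
                polarity = -polarity
        result[dep] = polarity
    return result


arcs = [
    ("screen", "SBV", "clear"),   # "the screen is clear"
    ("battery", "SBV", "good"),   # "the battery is not good"
    ("not", "ADV", "good"),
]
print(sbv_polarity_transfer(arcs))
```

In this sketch the evaluated objects are simply the SBV dependents; the anaphora resolution, ATT chain, and mutual information steps mentioned in the abstract would refine that candidate set and are not modeled here.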
Funding
Supported by the National Natural Science Foundation of China (Grant No. 60773087)