该文从研究背景、设计思路、标注体系和方法、加工步骤等方面介绍了汉语语义倾向语料库的建设过程。该语料库是一个以研究语言主观性表达为目的的共时、非平衡、单语标注语料库,依据语言主观性多维度描述体系而设计,规模为100万字,配备有集检索与统计、结果检查与可视化于一体的专用语料库工具箱系统,具有可用性大、标注质量高、语言学理据强等特点。
Abstract
This paper introduces the construction of the Chinese Semantic Orientation Corpus (CSOC), covering its research background, design, annotation scheme and methods, and processing steps. The CSOC is a synchronic, unbalanced, monolingual annotated corpus built for research on the linguistic expression of subjectivity. Designed according to a multi-dimensional descriptive system of linguistic subjectivity, the corpus contains one million Chinese characters and ships with a dedicated toolkit that integrates retrieval and statistics, result checking, and visualization. It is characterized by high annotation quality, strong linguistic motivation, and usability for both linguistics and natural language processing.
关键词
语义倾向 /
语料库 /
主观性 /
建设
Key words
semantic orientation /
corpus /
subjectivity /
construction
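Funding
Humanities and Social Sciences Research Project of the Ministry of Education (11YJC740127); Outstanding Youth Scientific Research Project of the Hunan Provincial Department of Education (14B068)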