Research on the Corpus Effect to the Chinese Noun Phrase Anaphora Resolution

GAO Junwei, KONG Fang, ZHU Qiaoming, LI Peifeng

Journal of Chinese Information Processing, 2013, Vol. 27, Issue 3: 61-69.
Review

Abstract

Coreference is a common phenomenon in natural language that simplifies expression and reduces redundancy. Coreference resolution is the task of automatically identifying these phenomena. Research on English coreference resolution has made substantial progress in recent years, but far less work has been done for Chinese. One reason is that Chinese natural language processing research started later and has accumulated less related knowledge; another is the scarcity of Chinese corpora for this task, the only known ones being ACE2005, OntoNotes, and a few others. To investigate the effect of the corpus on Chinese noun phrase coreference resolution, we implement two platforms, one based on supervised learning and one based on unsupervised clustering. Using these platforms, we examine how the corpus affects Chinese noun phrase coreference resolution in terms of both its quantity and its quality.

Key words

coreference resolution / noun phrase / unsupervised / clustering / corpus

Cite this article

GAO Junwei, KONG Fang, ZHU Qiaoming, LI Peifeng. Research on the Corpus Effect to the Chinese Noun Phrase Anaphora Resolution. Journal of Chinese Information Processing. 2013, 27(3): 61-69

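Funding

Supported by the National Natural Science Foundation of China (90920004, 60970056, 61070123, 61003153) and the Major Basic Research Program of Natural Science for Universities in Jiangsu Province (08KJA520002)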