A Chinese Parser Based on Probabilistic Context Free Grammar

LIN Ying, SHI Xiao-dong, GUO Feng

Journal of Chinese Information Processing, 2006, Vol. 20, Issue 2: 3-9, 34.

Abstract

This paper examines the limitations of the independence assumptions of probabilistic context-free grammar (PCFG). To address them, it proposes the concept of co-occurrence of syntactic structures as a way to introduce context information, and gives a method for computing it. To overcome the small size of the Chinese treebank, the parameters of the syntactic rules are estimated iteratively with the Inside-Outside algorithm. Finally, a statistical top-down Chinese parser is presented. In a closed test, its labeled precision and labeled recall are 88.1% and 86.8%, respectively. The experimental results indicate that the approach does improve labeled precision and recall and deserves further study.
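As an illustration of the PCFG independence assumption and of the inside probabilities that Inside-Outside re-estimation builds on, the following is a minimal sketch in Python. It is not the authors' parser: the grammar, the rule probabilities, and the names `BINARY_RULES`, `LEXICAL_RULES`, `inside_probabilities`, and `sentence_probability` are hypothetical and chosen only for this example. The grammar is assumed to be in Chomsky normal form.

```python
from collections import defaultdict

# Toy PCFG in Chomsky normal form. The rules and probabilities are
# hypothetical and are NOT taken from the paper.
BINARY_RULES = {
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 1.0,
}
LEXICAL_RULES = {
    ("NP", "我"): 0.5,  # "I"
    ("NP", "书"): 0.5,  # "book"
    ("V", "读"): 1.0,   # "read"
}


def inside_probabilities(words):
    """Return chart[(i, j)][A] = P(A derives words[i:j]) for the toy grammar."""
    n = len(words)
    chart = defaultdict(lambda: defaultdict(float))

    # Spans of width 1: apply lexical rules A -> word.
    for i, word in enumerate(words):
        for (lhs, terminal), prob in LEXICAL_RULES.items():
            if terminal == word:
                chart[(i, i + 1)][lhs] += prob

    # Wider spans: sum over every split point k and every rule A -> B C.
    # Each rule probability depends only on A, never on the surrounding
    # tree. That is the independence assumption the abstract refers to.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (lhs, left, right), prob in BINARY_RULES.items():
                    left_p = chart[(i, k)].get(left, 0.0)
                    right_p = chart[(k, j)].get(right, 0.0)
                    if left_p and right_p:
                        chart[(i, j)][lhs] += prob * left_p * right_p
    return chart


def sentence_probability(words, start="S"):
    """Inside probability of the start symbol over the whole sentence."""
    chart = inside_probabilities(words)
    return chart[(0, len(words))].get(start, 0.0)


if __name__ == "__main__":
    print(sentence_probability(["我", "读", "书"]))
```

In this sketch an `NP` receives the same rule probabilities whether it is a subject or an object. A context-sensitive extension such as the structural co-occurrence described in the abstract would condition on more than the left-hand-side symbol; how the paper does so is not shown here.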

Key words

artificial intelligence / natural language processing / statistical parsing / probabilistic context-free grammar / Chinese NLP

Cite this article

LIN Ying, SHI Xiao-dong, GUO Feng. A Chinese Parser Based on Probabilistic Context Free Grammar. Journal of Chinese Information Processing. 2006, 20(2): 3-9, 34

Funding

Supported by the National High Technology Research and Development Program of China (863 Program) (2002AA117010)