Chinese Dependency Parsing Based on Action Modeling

DUAN Xiang-yu, ZHAO Jun, XU Bo

Journal of Chinese Information Processing, 2007, Vol. 21, Issue 5: 25-30.
Review


Abstract

Action-based dependency parsing, also known as deterministic dependency parsing, is often regarded as an efficient parsing algorithm whose accuracy is slightly lower than the best results reported for more complex parsing models. In this paper, we compare action-based dependency parsers with more complex methods, namely generative and discriminative parsers, on the standard Penn Chinese Treebank data set. The results show that, for Chinese dependency parsing, action-based parsers outperform both generative and discriminative parsers. We further observe that action-based parsing is greedy: at each parsing step it selects only the most probable parsing action, and so loses the global view of the dependency analysis of the whole sentence. To avoid this greediness, we propose two models of the parsing actions in action-based Chinese dependency parsing, taking the original action-based dependency parsers as baseline systems. The results show that both models outperform the baselines while maintaining the same low time complexity, and our best result improves substantially over the baseline.
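The greediness described in the abstract can be illustrated with a small sketch. The abstract does not specify the transition system or the two proposed models, so everything below is an assumption made for illustration: an arc-standard style action set (`SHIFT`, `LEFT`, `RIGHT`), a hypothetical scoring function `toy_score`, and beam search over action sequences as a stand-in for any method that scores whole action sequences rather than single steps. It is not the authors' model.

```python
from typing import Callable, List, Tuple

Arc = Tuple[int, int]                       # (head index, dependent index)
State = Tuple[List[int], List[int], List[Arc]]  # (stack, buffer, arcs)

def legal_actions(stack: List[int], buffer: List[int]) -> List[str]:
    """Actions allowed in the current state (arc-standard style, assumed)."""
    acts = []
    if buffer:
        acts.append("SHIFT")
    if len(stack) >= 2:
        acts += ["LEFT", "RIGHT"]
    return acts

def apply_action(state: State, action: str) -> State:
    """Return a new state; the input state is left unchanged."""
    stack, buffer, arcs = list(state[0]), list(state[1]), list(state[2])
    if action == "SHIFT":
        stack.append(buffer.pop(0))
    elif action == "LEFT":      # top of stack becomes head of the item below it
        dep = stack.pop(-2)
        arcs.append((stack[-1], dep))
    else:                       # RIGHT: item below the top becomes head of the top
        dep = stack.pop()
        arcs.append((stack[-1], dep))
    return stack, buffer, arcs

def greedy_parse(n_words: int, score: Callable[[State, str], float]) -> List[Arc]:
    """Pick the single best-scoring action at every step (the greedy baseline)."""
    state: State = ([], list(range(n_words)), [])
    while state[1] or len(state[0]) > 1:
        acts = legal_actions(state[0], state[1])
        best = max(acts, key=lambda a: score(state, a))
        state = apply_action(state, best)
    return state[2]

def beam_parse(n_words: int, score: Callable[[State, str], float],
               beam_size: int = 4) -> List[Arc]:
    """Keep the beam_size best action sequences by cumulative score."""
    beam = [(0.0, ([], list(range(n_words)), []))]
    for _ in range(2 * n_words - 1):        # every full parse takes 2n - 1 actions
        candidates = []
        for total, state in beam:
            for a in legal_actions(state[0], state[1]):
                candidates.append((total + score(state, a), apply_action(state, a)))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    return beam[0][1][2]

def toy_score(state: State, action: str) -> float:
    """Hypothetical scores: an early RIGHT looks good locally, but shifting
    all three words first unlocks a much better LEFT later."""
    stack, _, _ = state
    if action == "SHIFT":
        return 0.9
    if action == "LEFT" and len(stack) == 3:
        return 5.0
    if action == "RIGHT":
        return 1.0
    return 0.0

print(greedy_parse(3, toy_score))             # [(0, 1), (0, 2)]
print(beam_parse(3, toy_score, beam_size=4))  # [(2, 1), (0, 2)]
```

With these toy scores the greedy parser commits to `RIGHT` as soon as it is locally best and never reaches the higher-scoring sequence, while the beam keeps the `SHIFT` alternative alive long enough to find it. The number of steps stays at 2n - 1 for an n-word sentence; the beam only multiplies the work per step by a constant width.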

Key words

computer application / Chinese information processing / Chinese dependency parsing / deterministic dependency parsing / parsing action modeling

Cite this article

DUAN Xiang-yu, ZHAO Jun, XU Bo. Chinese Dependency Parsing Based on Action Modeling. Journal of Chinese Information Processing. 2007, 21(5): 25-30


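Funding

National Natural Science Foundation of China (60673042); National High Technology Research and Development Program of China (2006AA01Z144); Beijing Natural Science Foundation (4052027, 4073043)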