基于序列标注模型的分层式依存句法分析方法

鉴 萍,宗成庆

PDF(999 KB)
PDF(999 KB)
中文信息学报 ›› 2010, Vol. 24 ›› Issue (6) : 14-23.
综述

基于序列标注模型的分层式依存句法分析方法

  • 鉴 萍,宗成庆
作者信息 +

Layer Based Dependency Parsing by Sequence Labeling Models

  • JIAN Ping, ZONG Chengqing
Author information +
History +

摘要

该文提出了一种全新的分层式依存句法分析方法。该方法以依存深度不大于1的依存层作为分析单位,自底向上构建句子的依存结构。在层内,通过穷尽搜索得到层最优子结构;在层与层之间,分析状态确定性地转移。依存层的引入,使该模型具有比典型的基于图的方法更低的算法复杂度,与基于转换的方法相比,又一定程度上缓解了确定性过程的贪婪性。此外,该方法使用典型序列标注模型进行层依存子结构搜索,证明了序列标注技术完全可以胜任句法分析等层次结构分析任务。实验结果显示,该文提出的分层式依存分析方法具有与主流方法可比的分析精度和非常高的分析效率,在宾州树库上可以达到每秒2 500个英语单词。

Abstract

A layer-based projective dependency parsing approach is presented. This novel approach works layer by layer in a bottom-up manner, in which the depth of token dependency is allowed no more than one. Inside the layer the dependency graphs are searched exhaustively while between the layers the parser state transfers deterministically. Taking the dependency layer as the parsing unit, the proposed parser has a lower computational complexity than graph-based models which search for a whole dependency graph, alleviating the error propagation in transition-based models to some extent. Furthermore, our parser adopts the sequence labeling models to find the optimal sub-graph of the layer, which demonstrates the sequence labeling techniques qualified for hierarchical structure analysis tasks. Experimental results indicate that the proposed approach offers desirable accuracies and especially a very fast parsing speed, with 2500 words per second for Penn Treebank.
Key wordsdependency parsing; dependency layer; sequence labeling

关键词

依存句法分析 / 依存层 / 序列标注

Key words

dependency parsing / dependency layer / sequence labeling
 
/   /   /
 
/   /   /
 
/   /  

引用本文

导出引用
鉴 萍,宗成庆. 基于序列标注模型的分层式依存句法分析方法. 中文信息学报. 2010, 24(6): 14-23
JIAN Ping, ZONG Chengqing. Layer Based Dependency Parsing by Sequence Labeling Models. Journal of Chinese Information Processing. 2010, 24(6): 14-23

参考文献

[1] R. McDonald, K. Crammer and F. Pereira. Online large-margin training of dependency parsers[C]//Proc. of ACL. Ann Arbor, USA: 2005, 91-98.
[2] R. McDonald and F. Pereira. Online learning of approximate dependency parsing algorithms[C]//Proc. of EACL. Trento, Italy: 2006, 81-88.
[3] H. Yamada and Y. Matsumoto. Statistical dependency analysis with support vector machines[C]//Proc. of IWPT. Nancy, France: 2003, 195-206.
[4] J. Nivre and M. Scholz. Deterministic dependency parsing of English text[C]//Proc. of COLING. Switzerland: 2004, 64-70.
[5] R. McDonald and J. Nivre. Characterizing the errors of data-driven dependency parsing models[C]//Proc. of EMNLP-CoNLL. Prague, Czech: 2007, 122-131.
[6] F. Sha and F. Pereira. Shallow parsing with conditional random fields[C]//Proc. of NAACL. Edmonton, Canada: 2003, 213-220.
[7] 李珩, 朱靖波, 姚天顺. 基于SVM的中文组块分析[J]. 中文信息学报, 2004, 18(2): 1-7.
[8] Y. Wu, J. Yang and Y. Lee. Multilingual deterministic dependency parsing framework using modified finite Newton method support vector machines[C]//Proc. of EMNLP-CoNLL. Prague, Czech: 2007, 1175-1181.
[9] Y. Cheng, M. Asahara and Y Matsumoto. Multi-lingual dependency parsing at NAIST[C]//Proc. of CoNLL. New York City, USA: 2006, 191-195.
[10] J. Hall, J. Nilsson, J. Nivre, et. al. Single Malt or blended? A study in multilingual parser optimization[C]//Proc. of EMNLP-CoNLL. Prague, Czech: 2007, 933-939.
[11] M. P. Marcus, B. Santorini and M. A. Marcinkiewicz. Building a large annotated corpus of English: the Penn Treebank[J]. Computational Linguistics, 1993, 19(2): 313-330.
[12] J. Hall, J. Nivre and J. Nilsson. Discriminative classifiers for deterministic dependency parsing[C]//
Proc. of COLING-ACL. Sydney, Australia: 2006, 316-323.
[13] http://w3.msi.vxu.se/~nivre/research/Penn2Malt.html[CP/OL].
[14] A. Ratnaparkhi. MXPOST[CP/OL]. http://www.inf.ed.ac.uk/resources/nlp/local_doc/MXPOST.html.
[15] N. Xue, F. Xia, F. Chiou et, al. The Penn Chinese Treebank: phrase structure annotation of a large corpus[J]. Natural Language Engineering, 2005,11(2): 207-238.
[16] X. Duan, J. Zhao and B. Xu. Probabilistic models for action-based Chinese dependency parsing[C]//Proc.of ECML-PKDD. Warsaw, Poland: 2007, 559-566.
[17] J. Nivre, J. Hall and J. Nilsson. MaltParser: a data-driven parser-generator for dependency parsing[C]//Proc. of LREC. Genoa, Italy: 2006, 2216-2219.
[18] J. Nivre. Incrementality in deterministic dependency parsing[C]//Proc. of ACL. Barcelona, Spain: 2004, 50-57.
[19] R. McDonald. MSTParser[CP/OL]. http://www.seas.upenn.edu/~strctlrn/MSTParser/MSTParser.html.
[20] Y. Cheng, M. Asahara and Y. Matsumoto. Machine learning-based dependency analyzer for Chinese[C]//Proc. of ICCC. Mauritius: 2005, 66-73.
[21] M. Wang, K. Sagae and T. Mitamura. A fast, accurate deterministic parser for Chinese[C]//Proc. of COLING-ACL. Sydney, Australia: 2006, 425-432.
[22] Y. Goldberg and M. Elhadad. SplitSVM: fast, space-efficient, non-heuristic, polynomial kernel computation for NLP applications[C]//Proc. of ACL. Columbus USA: 2008, 237-240.
[23] J. Hall, J. Nilsson and J. Nivre. MaltParser in the CoNLL shared task 2007-Multilingual track[EB/OL]. http://maltparser.org/conll/conll07/.
[24] K. Sagae and A. Lavie. Parser combination by reparsing[C]//Proc. of NAACL. New York City, USA: 2006, 129-132.
[25] J. Nivre and R. McDonald. Integrating graph-based and transition-based dependency parsers[C]//Proc. of ACL. Columbus, USA: 2008, 950-958.
PDF(999 KB)

Accesses

Citation

Detail

段落导航
相关文章

/