ZHAO Guorong, WANG Wenjian. A Chinese Parsing Method Based on Interdependent and Structured Input and Output Spaces[J]. Journal of Chinese Information Processing, 2015, 29(1): 139-145.
A Chinese Parsing Method Based on Interdependent and Structured Input and Output Spaces
ZHAO Guorong, WANG Wenjian
School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China
Abstract: Chinese syntax has a complex structure and high-dimensional features, and the best reported Chinese parsing performance is still inferior to that achieved for Western languages. To improve the efficiency and accuracy of Chinese parsing, we propose a structural support vector machine (structural SVM) approach based on L2-norm soft-margin optimization. A joint feature function ψ(x,y) is constructed to map the information in an input sentence x and its syntactic tree y into a single feature space. Because the constituents of a Chinese sentence are strongly correlated, we use the parent node of each node in the phrase structure tree to enrich the structural information encoded in ψ(x,y). Experimental results on the PCTB benchmark dataset demonstrate that the proposed approach is effective and efficient compared with the classical structural SVM and the Berkeley Parser.
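As an illustration of how parent information can enrich ψ(x,y), the following is a minimal sketch, not the authors' implementation. It assumes that ψ(x,y) counts the production rules occurring in a phrase structure tree, a common choice in structural SVM parsing, and that each rule's left-hand side is annotated with the label of its parent node. The tree encoding, the names `psi` and `score`, the `ROOT` placeholder, and the example sentence are all hypothetical.

```python
from collections import Counter

# Assumed encoding: a tree is a nested tuple (label, child, child, ...),
# and a leaf is a plain string holding the word itself.

def label_of(node):
    """Return the label of an internal node, or the word of a leaf."""
    return node if isinstance(node, str) else node[0]

def psi(tree, parent="ROOT"):
    """Count parent-annotated production rules in a tree (hypothetical feature map)."""
    feats = Counter()
    if isinstance(tree, str):  # a leaf contributes no rule of its own
        return feats
    label, children = tree[0], tree[1:]
    rhs = " ".join(label_of(c) for c in children)
    # "A^P -> B C" means: node A, whose parent is P, expands to B C.
    feats[f"{label}^{parent} -> {rhs}"] += 1
    for child in children:
        feats.update(psi(child, parent=label))  # Counter.update adds counts
    return feats

def score(weights, tree):
    """Linear score w . psi(x, y) for a candidate tree."""
    return sum(weights.get(f, 0.0) * n for f, n in psi(tree).items())

# Hypothetical three-word example sentence.
tree = ("IP",
        ("NP", ("PN", "我")),
        ("VP", ("VV", "喜欢"), ("NP", ("NN", "猫"))))
features = psi(tree)
```

In this sketch the subject NP and the object NP yield distinct features (`NP^IP -> PN` and `NP^VP -> NN`), whereas without the parent annotation both would share the left-hand side `NP`. A structural SVM would then learn the weight vector used by `score` so that the correct tree outscores incorrect candidates by a margin.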