Abstract: Chinese functional chunks are defined as a series of non-overlapping, non-nested skeleton segments of a sentence, representing the implicit grammatical relations between the sentence-level predicates and their arguments. In this paper, we propose two statistical models for parsing the four main functional chunks in a sentence. The chunk boundary detection model builds SVM-based sub-models for detecting SP (subject-predicate) and PO (predicate-object) boundaries. The sequence labeling model formulates chunking as a sequence labeling problem and is based on the CRF algorithm. By introducing revision rules, we build a combined parsing model that integrates the advantages of both statistical models and achieves best F-scores of 82.93%, 86.58%, 78.46% and 86.64% for subject, predicate, object and adverb functional chunks, respectively. Experimental results show that complex clauses and serial verb structures are the main recognition difficulties.
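To make the sequence labeling formulation concrete, the following is a minimal sketch of how non-overlapping, non-nested chunks can be encoded as one tag per word and decoded back. The B-/I-/O tag scheme, the label names `SUBJ`, `PRED`, `OBJ` and `ADV`, the function names, and the example sentence are all assumptions made for illustration; the abstract does not specify the tag set or encoding actually used.

```python
from typing import List, Tuple

# Hypothetical chunk representation: (start, end_exclusive, label).
# Labels SUBJ / PRED / OBJ / ADV are illustrative names, not the paper's tag set.
Chunk = Tuple[int, int, str]


def chunks_to_tags(n_tokens: int, chunks: List[Chunk]) -> List[str]:
    """Encode non-overlapping chunks as one B-/I-/O tag per token."""
    tags = ["O"] * n_tokens
    for start, end, label in sorted(chunks):
        if not (0 <= start < end <= n_tokens):
            raise ValueError(f"chunk out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError("chunks must not overlap or nest")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def tags_to_chunks(tags: List[str]) -> List[Chunk]:
    """Decode a B-/I-/O tag sequence back into chunks."""
    chunks: List[Chunk] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing chunk
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            chunks.append((start, i, label))
            start, label = None, None
        # A stray I- tag with no open chunk is treated as a chunk start.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return chunks


# Made-up example sentence: "我们 / 提出 / 两 / 个 / 模型" (We propose two models).
tokens = ["我们", "提出", "两", "个", "模型"]
gold = [(0, 1, "SUBJ"), (1, 2, "PRED"), (2, 5, "OBJ")]
tags = chunks_to_tags(len(tokens), gold)
```

Under this encoding a sequence model such as a CRF predicts one tag per word, and the non-overlapping, non-nested property of functional chunks is what makes the per-word encoding lossless.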