Abstract: This paper proposes a chunk parsing scheme for Uyghur and extracts chunks from 3,000 annotated sentences. To reflect the characteristics of the Uyghur language, the feature set is augmented with additional features such as stems, affixes, and synonyms. A corpus of 3,000 annotated sentences is constructed, and cross-validation experiments at training/testing ratios of 9:1, 8:2, and 2:1 yield recall rates of 80.34%, 76.87%, and 66.76%, respectively.
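As an illustration of the evaluation setup, the following minimal sketch shows how a 3,000-sentence corpus divides under the three training/testing ratios and how a chunk-level recall rate can be computed. The function names, the `(start, end, label)` chunk representation, and the `NP`/`VP` labels are assumptions made for this example; the abstract does not specify the tag set or the exact matching criterion, so exact-match recall is used here as one common convention.

```python
from typing import Set, Tuple

# Hypothetical chunk representation: (start token index, end token index, label).
Chunk = Tuple[int, int, str]


def split_sizes(total: int, train_part: int, test_part: int) -> Tuple[int, int]:
    """Return (training size, testing size) for a train:test ratio."""
    train = total * train_part // (train_part + test_part)
    return train, total - train


def chunk_recall(gold: Set[Chunk], predicted: Set[Chunk]) -> float:
    """Fraction of gold chunks recovered with identical span and label."""
    if not gold:
        return 0.0
    return len(gold & predicted) / len(gold)


# Corpus size and ratios are taken from the abstract.
for train_part, test_part in [(9, 1), (8, 2), (2, 1)]:
    print(f"{train_part}:{test_part}", split_sizes(3000, train_part, test_part))

# Toy example (invented data): two of the three gold chunks are recovered.
gold = {(0, 2, "NP"), (2, 3, "VP"), (3, 5, "NP")}
predicted = {(0, 2, "NP"), (2, 3, "VP"), (3, 4, "NP")}
print(chunk_recall(gold, predicted))
```

Under these assumptions, the three ratios correspond to 2,700/300, 2,400/600, and 2,000/1,000 training/testing sentences.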