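Chinese inter-language has emerged alongside the international teaching of Chinese. As Chinese learning spreads worldwide, the volume of Chinese inter-language data keeps growing, and because of its distinctive patterns of language use it has become a unique resource for language information processing and intelligent language-learning assistance. Dependency parsing is an important step in language information processing, and dependency-annotated corpora of English inter-language have already been applied successfully. Existing Chinese inter-language corpora, however, pay little attention to syntax and lack a dependency annotation guideline that takes full account of the characteristics of Chinese inter-language. This paper focuses on building a dependency-annotated corpus of Chinese inter-language and examines the annotation guideline such a corpus requires. Drawing extensively on the internationally used Universal Dependencies scheme, it formulates a dependency annotation guideline for Chinese inter-language and applies it in annotation practice, producing an inter-language dependency corpus that covers the grammar points of Chinese language teaching.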
Abstract
Chinese inter-language has arisen together with the international teaching of Chinese. With the growth of Chinese language learning around the world, the volume of Chinese inter-language data has been expanding. Because of its distinctive language use, inter-language has become a unique resource for language information processing and intelligent language-learning assistance. In contrast to English inter-language, for which dependency-annotated corpora already exist, current Chinese inter-language corpora lack an annotation guideline for dependency syntax. Aiming to construct a dependency-annotated corpus of Chinese inter-language, this paper develops a new dependency annotation guideline for Chinese inter-language based on Universal Dependencies. The guideline is then applied to build a corpus of Chinese inter-language annotated with dependency structure that takes the characteristics of inter-language into account.
Keywords
Chinese inter-language /
dependency syntax /
annotation guideline
Key words
inter-language /
dependency grammar /
annotation guideline
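Funding
Beijing Language and Culture University institutional project (Fundamental Research Funds for the Central Universities) (18YBB20); Advanced Innovation Center for Language Resources project (TYZ19005); research funding from the National Language Resources Monitoring and Research Center for Print Media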