Automatic Generation of Chinese Vocabulary Test Questions

HU Renfen (胡韧奋)

Journal of Chinese Information Processing, 2017, Vol. 31, Issue (1): 41-49.
Natural Language Processing Applications

Abstract

To make the writing of Chinese vocabulary tests more efficient, this paper draws on the characteristics of the Chinese language and the needs of second-language teaching to attempt automatic generation of four vocabulary question types: word listening, multi-blank word selection, word ordering, and single-blank word selection. Together these types assess vocabulary knowledge across different kinds of linguistic information and different difficulty levels. For word-level features, a lexical knowledge base is built that covers pronunciation, written form, meaning, grammar, collocations, and learner errors. For sentence-level features, algorithms are implemented for automatic identification of grammar items and for sentence difficulty analysis. These features guide the choice of stem sentences, target words, and distractors. Through word and sentence selection and chunk composition, 7,263 vocabulary test questions of the four types are generated. Manual evaluation shows that this initial attempt performs well: about 58% of the questions are judged completely reasonable, and after slight manual adjustment the question acceptance rate reaches 75.7%.
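The single-blank word selection type can be illustrated with a minimal sketch. It is not the paper's implementation: the `WordEntry` record, the `pick_distractors` and `make_cloze` helpers, and the selection rule (same part of speech, difficulty within one level of the target) are all assumptions made for illustration, since the abstract does not state the actual criteria or the knowledge-base schema.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WordEntry:
    """Hypothetical knowledge-base record; the paper's real schema is not shown."""
    word: str
    pos: str    # part of speech, e.g. "v" or "n"
    level: int  # assumed difficulty level (1 = easiest)


def pick_distractors(target, lexicon, k=3, seed=0):
    """Pick k distractors with the target's POS and a level within 1 of it."""
    pool = [e for e in lexicon
            if e.word != target.word
            and e.pos == target.pos
            and abs(e.level - target.level) <= 1]
    if len(pool) < k:
        raise ValueError("not enough candidate distractors")
    return random.Random(seed).sample(pool, k)


def make_cloze(sentence_tokens, target, lexicon, k=3, seed=0):
    """Blank every occurrence of the target in a tokenized stem sentence."""
    if target.word not in sentence_tokens:
        raise ValueError("target word does not occur in the sentence")
    stem = "".join("____" if tok == target.word else tok
                   for tok in sentence_tokens)
    options = [target.word]
    options += [d.word for d in pick_distractors(target, lexicon, k, seed)]
    random.Random(seed).shuffle(options)  # deterministic option order per seed
    return {"stem": stem, "options": options, "answer": target.word}


# Toy lexicon, invented for this example only.
lexicon = [
    WordEntry("喜欢", "v", 1),
    WordEntry("学习", "v", 1),
    WordEntry("工作", "v", 1),
    WordEntry("参加", "v", 2),
    WordEntry("苹果", "n", 1),  # excluded: different part of speech
    WordEntry("讨论", "v", 4),  # excluded: difficulty too far from target
]
question = make_cloze(["我", "喜欢", "喝", "茶", "。"], lexicon[0], lexicon)
print(question["stem"])
print(question["options"])
```

Filtering candidates before sampling keeps the distractors plausible for the blank. A fuller system along the lines the abstract describes would presumably also consult collocation and learner-error information, which this sketch leaves out.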

Key words

second language acquisition / vocabulary test / automatic question generation

Cite this article

HU Renfen. Automatic Generation of Chinese Vocabulary Test Questions. Journal of Chinese Information Processing. 2017, 31(1): 41-49

Funding

State Language Commission "Twelfth Five-Year Plan" Scientific Research Program project (YB125-124)