Improving Chinese Automated Essay Scoring via Deep Language Analysis

WEI Si, GONG Jiefu, WANG Shijin, SONG Wei, SONG Ziyao

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 4: 111-123.
Special topic: Key Technologies of Educational Cognition for Human-like Intelligence

Abstract

Automated essay scoring with natural language processing is a significant and challenging research topic that has attracted attention from scholars in both artificial intelligence and education. Focusing on Chinese automated essay scoring, this paper proposes to build effective features through deep language analysis: applying high-performance spelling and grammatical error correctors to analyze language usage ability, using automatic rhetorical analysis and excellent-expression recognition to reflect language expression ability, and using fine-grained discourse quality analysis to evaluate the overall quality of an essay. The paper also proposes an adaptive hybrid scoring model that combines these linguistic analysis features with deep neural network encoding. Experiments on real Chinese student essay data show that 1) incorporating deep language analysis features effectively improves scoring performance, and 2) a grade- and topic-adaptive training strategy improves the model's transfer ability and prediction performance. Ablation experiments further analyze and explain the contribution of each feature type to scoring performance.

Key words

Chinese automated essay scoring / deep language analysis / adaptive hybrid scoring model

Cite this article

WEI Si, GONG Jiefu, WANG Shijin, SONG Wei, SONG Ziyao. Improving Chinese Automated Essay Scoring via Deep Language Analysis. Journal of Chinese Information Processing. 2022, 36(4): 111-123

Funding

National Key Research and Development Program of China (2018YFB1005105); National Natural Science Foundation of China (61876113)