A Survey of Automatic Error Correction of Chinese Text

Li Yunhan, Shi Yunmei, Li Ning, Tian Ying'ai

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 9: 1-18, 27.
Survey

Abstract

Text error correction is an important research direction in Natural Language Processing (NLP), with substantial application value in news release, book and periodical publishing, speech input, and Chinese character recognition. This paper provides a systematic overview of automatic error correction technology for Chinese text. Errors in Chinese text are divided into spelling errors, grammatical errors, and semantic errors, and the correction methods for each of these three types are reviewed. The datasets and evaluation methods used for automatic error correction of Chinese text are then summarized. Finally, prospects for the future development of automatic error correction of Chinese text are discussed.

Key words

automatic correction / spelling errors / grammatical errors / semantic errors / datasets / evaluation metrics

Cite this article

Li Yunhan, Shi Yunmei, Li Ning, Tian Ying'ai. A Survey of Automatic Error Correction of Chinese Text. Journal of Chinese Information Processing. 2022, 36(9): 1-18, 27


Funding

National Key Research and Development Program of China (2018YFB1004100)