An Approach to Chinese Idioms Reading Comprehension
面向中文成语的阅读理解方法研究

XU Jiawei 1,2, LIU Ruifang 1,2, GAO Sheng 1,2, LI Si 1,2

Journal of Chinese Information Processing (中文信息学报), 2021, Vol. 35, Issue (7): 118-125
Section: Machine Reading Comprehension


Abstract

In natural language processing, the global attention mechanism captures information by attending to all hidden states of the encoder, which helps the model make its predictions. However, when dealing with a complex linguistic phenomenon such as Chinese idioms, models are often misled by the particular context into wrong decisions. To help the model better perceive the grammatical functions of idioms in different contexts, this paper proposes an enhanced global attention mechanism: an extra attention factor is generated for each position to adjust the original global attention, which improves the model's ability to learn context-specific word senses. We integrate the enhanced global attention with the BERT language model to build a model for the Chinese cloze task and evaluate it on ChID, a recently released Chinese idiom cloze-test dataset. The results show that the proposed model achieves significant improvements over both the fine-tuned BERT model and the global attention model.
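
As a rough illustration of the idea described in the abstract, the sketch below shows one plausible way to adjust standard global attention over BERT hidden states with an extra per-position factor. The module name, the sigmoid gating form, and all shapes are assumptions made for illustration; this is not the authors' published formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EnhancedGlobalAttention(nn.Module):
    """Minimal sketch: global attention whose scores are rescaled by an
    extra, learned per-position factor (an assumed form, not the paper's)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)      # standard global attention score
        self.factor = nn.Sequential(                # extra attention factor per position
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
            nn.Sigmoid(),
        )

    def forward(self, hidden_states, attention_mask=None):
        # hidden_states: (batch, seq_len, hidden_size), e.g. the BERT encoder output
        scores = self.score(hidden_states).squeeze(-1)      # (batch, seq_len)
        factor = self.factor(hidden_states).squeeze(-1)     # (batch, seq_len), in (0, 1)
        scores = scores * factor                            # adjust the original global attention
        if attention_mask is not None:
            scores = scores.masked_fill(attention_mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)                 # (batch, seq_len)
        context = torch.bmm(weights.unsqueeze(1), hidden_states).squeeze(1)
        return context, weights                             # pooled vector + attention weights


# Usage sketch: pool dummy "BERT" hidden states into one vector per example,
# which could then be used to score each candidate idiom in the cloze task.
layer = EnhancedGlobalAttention(hidden_size=768)
dummy_states = torch.randn(2, 128, 768)                     # stands in for BERT output
dummy_mask = torch.ones(2, 128, dtype=torch.long)
pooled, attn = layer(dummy_states, dummy_mask)
print(pooled.shape, attn.shape)                             # torch.Size([2, 768]) torch.Size([2, 128])
```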

Keywords

reading comprehension / classification / attention mechanism / cloze-test

Cite this article

XU Jiawei, LIU Ruifang, GAO Sheng, LI Si. An Approach to Chinese Idioms Reading Comprehension. Journal of Chinese Information Processing, 2021, 35(7): 118-125.

References

[1] Taylor W L. "Cloze procedure": A new tool for measuring readability[J]. Journalism Quarterly, 1953, 30(4): 415-433.
[2] Fotos S S. The cloze test as an integrative measure of EFL proficiency: A substitute for essays on college entrance examinations?[J]. Language Learning, 1991, 41(3): 313-336.
[3] Jackendoff R. Foundations of language: Brain, meaning, grammar, evolution[M]. Oxford University Press, USA, 2002: 124-125.
[4] Sag I A, Baldwin T, Bond F, et al. Multiword expressions: A pain in the neck for NLP[C]//Proceedings of the International Conference on Intelligent Text Processing and Computational Linguistics. Springer, Berlin, Heidelberg, 2002: 1-15.
[5] Li Z, Huang J, Zhou Z, et al. LSTM-based deep learning models for answer ranking[C]//Proceedings of the 2016 IEEE 1st International Conference on Data Science in Cyberspace (DSC). IEEE, 2016: 90-97.
[6] Chen D, Bolton J, Manning C D. A thorough examination of the CNN/daily mail reading comprehension task[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL), 2016: 2358-2367.
[7] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of NAACL-HLT, 2019: 4171-4186.
[8] Zheng C, Huang M, Sun A. ChID: A Large-scale Chinese IDiom Dataset for Cloze Test[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 778-787.
[9] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 6000-6010.
[10] Adhikari A, Ram A, Tang R, et al. DocBERT: BERT for document classification[J]. arXiv preprint arXiv:1904.08398, 2019.
[11] Bahdanau D, Cho K H, Bengio Y. Neural machine translation by jointly learning to align and translate[C]//Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015. 2015.
[12] Rush A M, Chopra S, Weston J. A neural attention model for abstractive sentence summarization[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015: 379-389.
[13] Seo M, Kembhavi A, Farhadi A, et al. Bidirectional attention flow for machine comprehension[J]. arXiv preprint arXiv:1611.01603, 2016.
[14] Xu K, Ba J, Kiros R, et al. Show, Attend and Tell: Neural image caption generation with visual attention[C]//Proceedings of the 32nd International Conference on International Conference on Machine Learning, 2015:2048-2057.
[15] Luong M T, Pham H, Manning C D. Effective approaches to attention-based neural machine translation[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015: 1412-1421.
[16] Nie Yanzhi. Common idioms and variant idioms[J]. Journal of Jiangxi Normal University, 1992(2): 92-97. (in Chinese)
[17] Yao Pengci. On idioms of common origin[J]. Journal of Hangzhou University (Philosophy and Social Sciences), 1987(1): 98-106. (in Chinese)
[18] He K, Zhang X, Ren S, et al. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification[C]//Proceedings of the IEEE International Conference on Computer Vision, 2015: 1026-1034.
[19] Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[C]//Proceedings of the International Conference on Machine Learning. PMLR, 2015: 448-456.
[20] Cui Y, Che W, Liu T, et al. Pre-training with whole word masking for Chinese BERT[J]. arXiv preprint arXiv:1906.08101, 2019.
[21] Golik P, Doetsch P, Ney H. Cross-entropy vs. squared error training: A theoretical and experimental comparison[C]//Proceedings of Interspeech, 2013: 1756-1760.
[22] Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[J]. arXiv preprint arXiv:1502.03167, 2015.
[23] Shao Y, Sennrich R, Webber B, et al. Evaluating machine translation performance on Chinese idioms with a blacklist method[J]. arXiv preprint arXiv:1711.07646, 2017.