Chinese Abstractive Summarization with Local Context Augmentation via N-gram

YIN Baosheng, AN Pengfei

Journal of Chinese Information Processing ›› 2022, Vol. 36 ›› Issue (8): 135-143, 153.
Natural Language Understanding and Generation


Abstract

Abstractive document summarization based on sequence-to-sequence models has achieved good performance. Given the rich local contextual information carried by Chinese n-grams, this paper proposes NgramSum, a framework that integrates n-gram information into the neural architecture of an existing model to strengthen its perception of local context semantics. The framework takes the existing neural model as its backbone, extracts n-gram information from the local corpus, and introduces a local context perception augmentation module and a gate module to encode and aggregate this information, respectively. Experimental results on the NLPCC 2017 Chinese single-document summarization shared task dataset show that the framework effectively enhances strong sequence-to-sequence baselines at three levels, namely LSTM, Transformer, and pre-trained models, improving ROUGE-1/2/L by an average of 2.76, 3.25, and 3.10 percentage points over the baselines, respectively. Further experiments and analysis also demonstrate the robustness of the framework under different n-gram metrics.
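The abstract names two components, a local context perception augmentation module that encodes the extracted n-gram information and a gate module that aggregates it with the backbone's token representations, but it gives no implementation detail. What follows is a minimal, hypothetical PyTorch sketch of one way such a gated n-gram fusion layer could look; the class, parameter, and tensor names (NgramGateFusion, ngram_vocab_size, ngram_ids, and so on) are illustrative assumptions, not the authors' code.

import torch
import torch.nn as nn

class NgramGateFusion(nn.Module):
    """Hypothetical sketch: fuse n-gram context into token states through a gate.

    Assumed inputs:
      token_states: (batch, seq_len, d_model) hidden states from the backbone encoder
      ngram_ids:    (batch, seq_len, k) ids of the k candidate n-grams covering each token
      ngram_mask:   (batch, seq_len, k) 1 for real n-grams, 0 for padding
    """
    def __init__(self, ngram_vocab_size, d_model):
        super().__init__()
        self.ngram_emb = nn.Embedding(ngram_vocab_size, d_model, padding_idx=0)
        self.attn_proj = nn.Linear(d_model, d_model)   # scores n-grams against their token
        self.gate = nn.Linear(2 * d_model, d_model)    # decides how much n-gram context to admit

    def forward(self, token_states, ngram_ids, ngram_mask):
        ngram_vecs = self.ngram_emb(ngram_ids)                     # (B, L, k, d)
        # attention over the candidate n-grams attached to each token
        # (assumes every token is covered by at least one real n-gram)
        query = self.attn_proj(token_states).unsqueeze(2)          # (B, L, 1, d)
        scores = (query * ngram_vecs).sum(-1)                      # (B, L, k)
        scores = scores.masked_fill(ngram_mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)                    # (B, L, k)
        ngram_ctx = (weights.unsqueeze(-1) * ngram_vecs).sum(2)    # (B, L, d)
        # gate: element-wise interpolation of the token state and the n-gram context
        g = torch.sigmoid(self.gate(torch.cat([token_states, ngram_ctx], dim=-1)))
        return g * token_states + (1.0 - g) * ngram_ctx

# Toy usage with random inputs, only to show the expected tensor shapes.
fusion = NgramGateFusion(ngram_vocab_size=5000, d_model=256)
states = torch.randn(2, 10, 256)
ids = torch.randint(1, 5000, (2, 10, 4))
mask = torch.ones(2, 10, 4)
out = fusion(states, ids, mask)    # (2, 10, 256), same shape as the token states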

Key words

abstractive summarization / N-gram / local context perception augmentation / gate module

Cite this article

YIN Baosheng, AN Pengfei. Chinese Abstractive Summarization with Local Context Augmentation via N-gram. Journal of Chinese Information Processing, 2022, 36(8): 135-143, 153.


Funding

National Defense Technology Foundation Project (JSQB2017206C002)