Fusing Context Information and Key Information for Text Summarization

LI Zhixin1, PENG Zhi1, TANG Suqin1, MA Huifang2

Journal of Chinese Information Processing ›› 2022, Vol. 36 ›› Issue (1): 83-91.
Information Extraction and Text Mining

Abstract

A pressing problem in text summarization is how to accurately capture the core content of the source text. The current mainstream method uses an encoder-decoder architecture and obtains the required contextual semantic information through soft attention during decoding. However, since the encoder sometimes encodes too much information, the generated summary does not always capture the core content of the source text. To address this issue, this paper proposes a text summarization model based on a dual-attention pointer network. First, the model uses a dual-attention pointer fusion network, in which a self-attention mechanism collects key information from the encoder, while soft attention and a pointer network generate more coherent core content from the contextual information; fusing the two yields summaries that are both concise and coherent. Second, an improved coverage mechanism is applied to address the repetition problem and improve the accuracy of the generated summaries. In addition, scheduled sampling and reinforcement learning are combined into a new training procedure to optimize the model. Experiments on the CNN/Daily Mail and LCSTS datasets show that the proposed model performs on par with current mainstream models. Analysis of the experimental results shows that the model summarizes well while reducing repetition.
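The full formulation is given in the article itself; purely as an illustration, the sketch below shows one plausible way a single decoding step of a dual-attention pointer-generator could combine a self-attentive summary of the encoder states (the key information) with a coverage-aware soft-attention context (the context information) to form the generate-vs-copy mixture. The module names, dimensions, the simple attentive pooling standing in for self-attention, and the standard coverage penalty are all assumptions made for this sketch, not the authors' implementation of their fusion network or improved coverage mechanism.

```python
# A minimal, illustrative sketch (assumed names and dimensions, not the
# authors' code): one decoding step of a dual-attention pointer-generator
# with a coverage vector, as described in the abstract.
import torch
import torch.nn.functional as F
from torch import nn


class DualAttentionPointerStep(nn.Module):
    def __init__(self, hidden: int, vocab_size: int):
        super().__init__()
        self.key_score = nn.Linear(hidden, 1)           # attentive pooling standing in for self-attention
        self.ctx_score = nn.Linear(3 * hidden + 1, 1)   # soft attention with a coverage feature
        self.gen_gate = nn.Linear(3 * hidden, 1)        # generate-vs-copy switch (p_gen)
        self.vocab_proj = nn.Linear(3 * hidden, vocab_size)

    def forward(self, enc_states, dec_state, coverage, src_ids):
        # enc_states: (B, T, H), dec_state: (B, H), coverage: (B, T), src_ids: (B, T)
        B, T, H = enc_states.shape

        # Key information: pool the encoder states into a single summary vector.
        key_w = F.softmax(self.key_score(enc_states).squeeze(-1), dim=-1)      # (B, T)
        key_info = torch.bmm(key_w.unsqueeze(1), enc_states).squeeze(1)        # (B, H)

        # Context information: soft attention over the source, conditioned on the
        # decoder state, the key information, and the running coverage vector.
        dec_exp = dec_state.unsqueeze(1).expand(-1, T, -1)
        key_exp = key_info.unsqueeze(1).expand(-1, T, -1)
        feats = torch.cat([enc_states, dec_exp, key_exp, coverage.unsqueeze(-1)], dim=-1)
        attn = F.softmax(self.ctx_score(feats).squeeze(-1), dim=-1)            # (B, T)
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)          # (B, H)

        # Standard coverage penalty (See et al. style), computed before the update.
        cov_loss = torch.min(attn, coverage).sum(dim=-1).mean()
        coverage = coverage + attn

        # Pointer-generator mixture: generate from the vocabulary or copy from the source.
        fused = torch.cat([context, key_info, dec_state], dim=-1)              # (B, 3H)
        p_gen = torch.sigmoid(self.gen_gate(fused))                            # (B, 1)
        vocab_dist = F.softmax(self.vocab_proj(fused), dim=-1)                 # (B, V)
        final_dist = (p_gen * vocab_dist).scatter_add(1, src_ids, (1 - p_gen) * attn)
        return final_dist, coverage, cov_loss


if __name__ == "__main__":
    # Toy usage with random tensors, just to show the shapes involved.
    B, T, H, V = 2, 6, 8, 50
    step = DualAttentionPointerStep(H, V)
    dist, cov, loss = step(torch.randn(B, T, H), torch.randn(B, H),
                           torch.zeros(B, T), torch.randint(0, V, (B, T)))
    print(dist.shape, cov.shape, float(loss))
```

In the full model the decoder state would come from a recurrent decoder, and training would combine maximum likelihood with scheduled sampling and a reinforcement-learning reward, as summarized in the abstract; this sketch only shows how the two attention streams and the pointer mixture fit together.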

Key words

text summarization / neural network / attention mechanism / pointer network

Cite this article

LI Zhixin, PENG Zhi, TANG Suqin, MA Huifang. Fusing Context Information and Key Information for Text Summarization. Journal of Chinese Information Processing, 2022, 36(1): 83-91.

Funding

National Natural Science Foundation of China (61966004, 61663004, 61967002, 61866004, 61762078); Natural Science Foundation of Guangxi (2019GXNSFDA245018, 2018GXNSFDA281009); Guangxi Bagui Scholars Innovation Research Team Project