Fusing Context Information and Key Information for Text Summarization
LI Zhixin¹, PENG Zhi¹, TANG Suqin¹, MA Huifang²
1. Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, Guangxi 541004, China; 2. School of Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu 730070, China
|
|
Abstract In text summarization, the mainstream approach uses an encoder-decoder architecture and obtains the required contextual semantic information through soft attention during decoding. Because the encoder sometimes encodes too much information, the generated summary does not always capture the core content of the source text. To address this issue, this paper proposes a text summarization model based on a dual-attention pointer network. First, in the dual-attention pointer network, a self-attention mechanism collects key information from the encoder, while soft attention and a pointer network generate more coherent core content from the contextual information; fusing the two yields accurate and coherent summaries. Second, an improved coverage mechanism is applied to alleviate the repetition problem and improve the quality of the generated summaries. Meanwhile, scheduled sampling and reinforcement learning are combined into a new training method to optimize the model. Experiments on the CNN/Daily Mail and LCSTS datasets show that the proposed model performs as well as many state-of-the-art models.
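To make the fusion idea in the abstract concrete, the following is a minimal, hedged sketch of how a decoder-conditioned soft attention (context information) and a decoder-independent self-attention over the encoder outputs (key information) could be computed and fused before generation. This is not the authors' implementation; all module names, dimensions, and the concatenation-based fusion are illustrative assumptions.

```python
# Illustrative sketch only: soft attention yields a context vector conditioned
# on the decoder state; self-attention over encoder outputs yields a "key
# information" vector; the two are fused into one representation used for
# prediction. Dimensions, names, and the fusion layer are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttentionFusion(nn.Module):
    def __init__(self, enc_dim: int, dec_dim: int, attn_dim: int):
        super().__init__()
        # Additive (Bahdanau-style) soft attention over encoder states.
        self.w_enc = nn.Linear(enc_dim, attn_dim, bias=False)
        self.w_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)
        # Self-attention that scores encoder states independently of the decoder.
        self.self_score = nn.Sequential(
            nn.Linear(enc_dim, attn_dim), nn.Tanh(), nn.Linear(attn_dim, 1)
        )
        # Fuse the two context vectors into a single representation.
        self.fuse = nn.Linear(2 * enc_dim, enc_dim)

    def forward(self, enc_out: torch.Tensor, dec_state: torch.Tensor):
        # enc_out: (batch, src_len, enc_dim); dec_state: (batch, dec_dim)
        soft_scores = self.v(torch.tanh(
            self.w_enc(enc_out) + self.w_dec(dec_state).unsqueeze(1)))   # (B, L, 1)
        soft_attn = F.softmax(soft_scores, dim=1)
        context = (soft_attn * enc_out).sum(dim=1)        # context information

        self_attn = F.softmax(self.self_score(enc_out), dim=1)           # (B, L, 1)
        key_info = (self_attn * enc_out).sum(dim=1)       # key information

        fused = torch.tanh(self.fuse(torch.cat([context, key_info], dim=-1)))
        # fused: input to the generator; soft_attn: distribution a pointer
        # network could reuse for copying source tokens.
        return fused, soft_attn.squeeze(-1)

# Usage: fused, attn = DualAttentionFusion(256, 256, 128)(enc_out, dec_state),
# where enc_out has shape (batch, src_len, 256) and dec_state (batch, 256).
```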
|
Received: 12 May 2020
|
|
|
|
|
|
|
|