Abstract
Sequence-to-sequence learning based on the encoder-decoder architecture has become the mainstream approach to abstractive summarization in recent years. However, the traditional encoder cannot effectively encode the semantics of long documents: it learns only linear-chain information and ignores the hierarchical structure that documents possess. This hierarchy (character, sentence, document) helps an automatic summarization system judge the semantic content and importance of the different structural units within a document more accurately. To let the encoder capture this structural information, this paper encodes the document according to its hierarchy: word-level semantic representations are built first, and sentence-level semantic representations are then built from them. In addition, a semantic fusion unit is proposed to fuse the semantic information from the different levels of the input document into the final document representation used for summary generation. Experimental results show that, after adding the proposed hierarchical document reader and semantic fusion unit, system performance improves significantly on the ROUGE metrics.
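As a reading aid, below is a minimal sketch of the two ideas the abstract describes: a hierarchical document reader that builds word-level and then sentence-level representations, and a gated semantic fusion unit that merges the two levels into the document representation handed to the generator. The use of PyTorch, the GRU encoders, the mean-pooling step, and all class names and dimensions are illustrative assumptions, not the authors' implementation.

# A minimal sketch (not the authors' released code) of a hierarchical document
# reader plus a gated "semantic fusion unit". All names and sizes are assumptions.
import torch
import torch.nn as nn


class HierarchicalReader(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Word-level encoder: reads the characters/words inside each sentence.
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        # Sentence-level encoder: reads the sequence of sentence vectors.
        self.sent_rnn = nn.GRU(2 * hid_dim, hid_dim, batch_first=True, bidirectional=True)
        # Fusion gate: decides, per dimension, how much of each level to keep.
        self.fuse = nn.Linear(4 * hid_dim, 2 * hid_dim)

    def forward(self, docs):
        # docs: (batch, n_sents, n_words) integer token ids
        b, n_sents, n_words = docs.shape
        emb = self.embed(docs.view(b * n_sents, n_words))          # (b*S, W, E)
        word_states, _ = self.word_rnn(emb)                        # (b*S, W, 2H)
        # Mean-pool word states into one vector per sentence (pooling choice is an assumption).
        sent_vecs = word_states.mean(dim=1).view(b, n_sents, -1)   # (b, S, 2H)
        sent_states, _ = self.sent_rnn(sent_vecs)                  # (b, S, 2H)
        # Semantic fusion unit: gate between word-derived and sentence-level states.
        gate = torch.sigmoid(self.fuse(torch.cat([sent_vecs, sent_states], dim=-1)))
        doc_repr = gate * sent_vecs + (1 - gate) * sent_states     # (b, S, 2H)
        return doc_repr  # fed to an attentional decoder for summary generation


# Example: a batch of 2 documents, 3 sentences each, 5 tokens per sentence.
reader = HierarchicalReader(vocab_size=1000)
fake_docs = torch.randint(1, 1000, (2, 3, 5))
print(reader(fake_docs).shape)  # torch.Size([2, 3, 512])

The elementwise gate is one plausible way to realize "fusing different levels of semantic information"; the paper's actual fusion unit may combine the levels differently.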
Key words
document hierarchical structure / automatic text summarization / natural language generation
Funding
National Natural Science Foundation of China (61402314)