Abstract
Existing encoder-decoder models for multi-turn dialogue generation tend to produce generic, uniform responses. The Conditional Variational Autoencoder (CVAE) can effectively improve response diversity, but most CVAE-based models fail to capture long-range dependencies in the dialogue context, and existing methods cannot explicitly handle the difference between the context utterances and the source utterance. This paper combines the Transformer with the CVAE, using the Transformer to capture long-range dependencies in the dialogue so that the latent variable can learn a richer dialogue distribution. By encoding the context utterances separately, contextual information is directed toward the source utterance, and a gating mechanism controls the fusion of information between the context utterances and the source utterance, capturing the information that most strongly influences the response. Experiments show that the proposed model generates responses of higher diversity and better quality.
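To make the fusion step concrete, below is a minimal PyTorch sketch (not the authors' released code) of the gated mechanism the abstract describes: the context utterances and the source utterance are encoded separately, and a learned sigmoid gate decides, per position and dimension, how much contextual information flows into the source representation before it conditions the CVAE latent variable and decoder. The class name GatedContextFusion, the hidden size, the sigmoid gate form, and the assumption that both encodings share a sequence length are illustrative choices, not details taken from the paper.

import torch
import torch.nn as nn

class GatedContextFusion(nn.Module):
    """Fuse separately encoded context and source representations with a gate."""
    def __init__(self, d_model: int):
        super().__init__()
        # The gate is computed from the concatenated context and source states.
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, context: torch.Tensor, source: torch.Tensor) -> torch.Tensor:
        # context, source: (batch, seq_len, d_model) Transformer encoder outputs.
        g = torch.sigmoid(self.gate(torch.cat([context, source], dim=-1)))
        # g near 1 lets contextual information through; g near 0 keeps the source.
        return g * context + (1.0 - g) * source

# Usage: the fused representation would then condition the latent variable / decoder.
fusion = GatedContextFusion(d_model=512)
ctx = torch.randn(2, 20, 512)  # separately encoded context utterances
src = torch.randn(2, 20, 512)  # encoded source utterance
out = fusion(ctx, src)         # shape: (2, 20, 512)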
Keywords
response diversity / CVAE / separating context mechanism / Transformer
Funding
National Natural Science Foundation of China (61303155); 2019 Undergraduate Innovation Practice Project Fund of the Chinese Academy of Sciences (118900FA12)