1. Key Laboratory of System Control and Information Processing, Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China; 2. College of Management Information System, Antai College of Economics and Management, Shanghai Jiao Tong University, Shanghai 200240, China
Abstract: Conversation is an important research area in natural language processing, and its results have been widely applied. However, because of the huge number of Chinese characters and words, training a Chinese conversation model inevitably faces the problem of excessive model complexity. To solve this problem, this paper first converts the Chinese-character input of the conversation model into Pinyin and splits the Pinyin into three parts: initials, finals, and tones, thereby reducing the vocabulary size of the input. Then, the Pinyin information is combined into image form via an embedding-encoding method, and Pinyin features are extracted through a fully convolutional network (FCN) and a bidirectional Long Short-Term Memory (LSTM) network. Finally, a 4-layer Gated Recurrent Unit (GRU) network decodes the Pinyin features to address the long-term memory problem and produce the output of the conversation model. On this basis, an attention mechanism is added in the decoding stage so that the model's output corresponds better with the input. To evaluate the proposed Chinese conversation model, this paper builds a Chinese conversation database for the medical domain and tests the model on it using BLEU and ROUGE_L as evaluation metrics.
Abstract: Conversation is an important research field in natural language processing with wide applications. However, training a Chinese conversation model faces the problem of excessively high model complexity due to the large number of Chinese characters and words. To deal with this issue, this paper proposes to convert the Chinese input into Pinyin and divide it into three parts: initials, finals, and tones, thereby reducing the vocabulary size. The Pinyin information is then combined into image form using an embedding method, and the Pinyin features are extracted through a Fully Convolutional Network (FCN) and a bidirectional Long Short-Term Memory (LSTM) network. Finally, a 4-layer Gated Recurrent Unit (GRU) network decodes the Pinyin features to address the long-term memory problem and obtain the output of the conversation model. On this basis, an attention mechanism is added in the decoding stage so that the output corresponds better with the input. In the experiments, we set up a conversation database in the medical field and use BLEU and ROUGE_L as evaluation metrics to test our model on this database.
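The paper's preprocessing step, decomposing each Pinyin syllable into an initial, a final, and a tone, can be sketched as follows. This is a minimal illustration, not the authors' code; it assumes tone-numbered Pinyin input (e.g. "zhong1") and a greedy longest-prefix match against the standard Mandarin initials.

```python
# Sketch of splitting tone-numbered Pinyin syllables into the three parts
# the paper describes: initial (声母), final (韵母), and tone (声调).
# Assumption: input syllables carry a trailing tone digit, e.g. "hao3".

# Standard Mandarin initials, two-letter ones first so "zh"/"ch"/"sh"
# match before "z"/"c"/"s".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

def split_syllable(syllable):
    """Split one syllable, e.g. 'zhong1' -> ('zh', 'ong', 1)."""
    tone = int(syllable[-1])          # trailing digit 1-5 (5 = neutral tone)
    body = syllable[:-1]
    for ini in INITIALS:              # greedy longest-prefix match
        if body.startswith(ini):
            return ini, body[len(ini):], tone
    return "", body, tone             # zero-initial syllables such as 'an', 'er'

# Example: "ni3 hao3" (你好)
print([split_syllable(s) for s in "ni3 hao3".split()])
# -> [('n', 'i', 3), ('h', 'ao', 3)]
```

With this decomposition, the model's input alphabet shrinks from tens of thousands of characters to a few dozen initials and finals plus five tones, which is the vocabulary-reduction effect the abstract claims.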