Abstract
Insufficient use of context information is currently the main factor limiting the performance of multi-turn dialogue systems, and rewriting the user's current input based on the preceding dialogue is an important way to address this problem. The core of the rewrite task is pronoun resolution and ellipsis recovery. This paper proposes SPDR (Span Prediction for Dialogue Rewrite), a BERT-based pointer network that performs multi-turn dialogue rewriting by predicting, for every token in the user's current utterance, the start and end positions of the context span that should be filled in before that token. A new metric, sEMr, is also proposed to evaluate rewriting results. Compared with a pointer-generator network, the model improves inference speed by about 100% without degrading quality, and the SPDR model based on RoBERTa-wwm shows clear improvements on all five evaluation metrics.
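To make the span-prediction formulation above concrete, the following Python sketch (not taken from the paper; the function name assemble_rewrite, the sentinel convention (0, 0) for "insert nothing", and the toy tokenization are assumptions for illustration) shows how per-token (start, end) predictions over the dialogue context could be assembled into a rewritten utterance.

from typing import List, Tuple

def assemble_rewrite(context_tokens: List[str],
                     utterance_tokens: List[str],
                     spans: List[Tuple[int, int]]) -> List[str]:
    """Insert the predicted context span (if any) before each utterance token."""
    assert len(utterance_tokens) == len(spans)
    rewritten: List[str] = []
    for token, (start, end) in zip(utterance_tokens, spans):
        if end > start:  # a non-empty context span was predicted for this slot
            rewritten.extend(context_tokens[start:end])
        rewritten.append(token)
    return rewritten

# Toy ellipsis-recovery example (tokenization is illustrative only):
# context: "你 看 过 流浪地球 吗", current utterance: "好看 吗"
context = ["你", "看", "过", "流浪地球", "吗"]
utterance = ["好看", "吗"]
# Hypothetical model output: copy context span [3, 4) before "好看",
# and the sentinel (0, 0), meaning "insert nothing", before "吗".
spans = [(3, 4), (0, 0)]
print("".join(assemble_rewrite(context, utterance, spans)))  # -> 流浪地球好看吗

Because every output position only selects a span from the context rather than generating tokens one by one, this formulation can be decoded in a single non-autoregressive pass, which is consistent with the reported inference speedup.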
Key words
dialogue rewrite / pointer network / BERT
Funding
National Natural Science Foundation of China (62071189)