Multi-party Dialogue Character Identification Method Based on Multi-scale Self-attention Enhancement

ZHANG Yuyao, JIANG Yuru, ZHANG Yangsen

Journal of Chinese Information Processing, 2021, Vol. 35, Issue 5: 101-109.
Question Answering and Dialogue


Abstract

Character identification is a recently proposed natural language processing task for dialogue scenarios involving multiple parties: the goal is to map person mentions in a dialogue to specific person entities. The current best system for this task uses only a fairly simple encoder, with no adaptation to the characteristics of dialogue text. Building on that system, this paper proposes a method based on multi-scale self-attention enhancement, which uses self-attention at different scales to obtain a better information representation. First, large-scale global attention processes all of the dialogue information within a scene, preserving the global dialogue context. Then, small-scale local attention operates over a local window of the dialogue, capturing associations between nearby pieces of information. Finally, the information obtained at the different scales is fused to enhance the encoded representation. Experimental results on SemEval-2018 Task 4 demonstrate the effectiveness of the method: compared with the current best system, it improves the F1 score over all entities by 18.94%.
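The pipeline described in the abstract (global self-attention over the whole scene, windowed local self-attention, then fusion of the two scales) can be sketched roughly as below. This is a minimal illustration, not the paper's architecture: the single-head unprojected attention, the window size, and the concatenation-based fusion are all assumptions made for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, mask=None):
    # scaled dot-product self-attention over x: (seq_len, d)
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    if mask is not None:
        # positions where mask is False are excluded from attention
        scores = np.where(mask, scores, -1e9)
    return softmax(scores, axis=-1) @ x

def local_mask(n, window):
    # True where |i - j| <= window, i.e. tokens within the local range
    idx = np.arange(n)
    return np.abs(idx[:, None] - idx[None, :]) <= window

def multi_scale_attention(x, window=2):
    # large scale: every token attends to the whole scene
    global_out = self_attention(x)
    # small scale: each token attends only to its local neighborhood
    local_out = self_attention(x, local_mask(len(x), window))
    # fuse the two scales (concatenation chosen here for illustration)
    return np.concatenate([global_out, local_out], axis=-1)

x = np.random.randn(10, 8)   # 10 dialogue tokens, 8-dim encodings
out = multi_scale_attention(x)
print(out.shape)             # (10, 16)
```

In a real encoder the two attention outputs would typically pass through learned projections before fusion; the key idea retained here is that the same sequence is attended to at two scopes and the results are combined.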

Key words

character identification / multi-scale self-attention / global attention / local attention

Cite this article

ZHANG Yuyao, JIANG Yuru, ZHANG Yangsen. Multi-party Dialogue Character Identification Method Based on Multi-scale Self-attention Enhancement. Journal of Chinese Information Processing, 2021, 35(5): 101-109.


Funding

National Natural Science Foundation of China (61602044, 61772081)