BERT Based Joint Model for Intention Classification and Slot Filling

MA Tianyu1,2, QIN Jun1,2, LIU Jing1,2, TIE Jun1,2, HOU Qi1,2

Journal of Chinese Information Processing ›› 2022, Vol. 36 ›› Issue (8): 127-134.
Natural Language Understanding and Generation

Abstract

Spoken language understanding is an important part of natural language processing, and intent classification and slot filling are its two basic subtasks. Recent studies have shown that learning these two tasks jointly allows them to reinforce each other. This paper proposes a BERT-based joint model of intent classification and slot filling, in which an association network establishes a direct connection between the two tasks so that they can share information, improving performance on both. BERT is introduced to enrich the semantic representation of word vectors, which effectively alleviates the poor generalization that current joint models suffer from due to their small-scale training data. Experiments on the ATIS and Snips datasets show that the model significantly improves the accuracy of intent classification and the F1 score of slot filling.
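Since the abstract only names the model's components, a minimal sketch may help show how such a joint architecture typically fits together. The Python code below assumes PyTorch and the HuggingFace transformers library; the class name JointBertSketch and the interaction layer are hypothetical illustrations, a generic stand-in for the paper's association network, whose exact design this page does not describe.

# A minimal sketch of a BERT-based joint model for intent classification and
# slot filling, assuming PyTorch and HuggingFace transformers. The bidirectional
# interaction here (intent context broadcast to tokens, attention-pooled slot
# features fed back to the intent head) is an illustrative stand-in for the
# paper's association network, not its actual design.
import torch
import torch.nn as nn
from transformers import BertModel


class JointBertSketch(nn.Module):
    def __init__(self, num_intents: int, num_slots: int,
                 pretrained: str = "bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        hidden = self.bert.config.hidden_size
        self.dropout = nn.Dropout(0.1)
        # Slot head sees each token state concatenated with the sentence-level
        # intent context (one direction of the task association).
        self.slot_classifier = nn.Linear(hidden * 2, num_slots)
        # Intent head sees the [CLS] state concatenated with a slot-aware
        # summary of the token states (the other direction).
        self.intent_classifier = nn.Linear(hidden * 2, num_intents)
        self.slot_attn = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        tokens = self.dropout(out.last_hidden_state)   # (B, T, H)
        cls = self.dropout(out.pooler_output)          # (B, H)

        # Broadcast the intent context to every token position for slot filling.
        cls_expanded = cls.unsqueeze(1).expand_as(tokens)
        slot_logits = self.slot_classifier(
            torch.cat([tokens, cls_expanded], dim=-1))

        # Attention-pool the token states into a slot-aware summary that feeds
        # back into intent classification.
        scores = self.slot_attn(tokens).squeeze(-1)            # (B, T)
        scores = scores.masked_fill(attention_mask == 0, -1e9)
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)  # (B, T, 1)
        slot_summary = (weights * tokens).sum(dim=1)           # (B, H)
        intent_logits = self.intent_classifier(
            torch.cat([cls, slot_summary], dim=-1))

        return intent_logits, slot_logits

A model of this kind is usually trained by summing a sentence-level cross-entropy loss over intent labels and a token-level cross-entropy loss over slot labels, which is the mechanism that lets the two tasks share information and reinforce each other during learning.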

Key words

intent classification / slot filling / BERT / association network

Cite this article

MA Tianyu, QIN Jun, LIU Jing, TIE Jun, HOU Qi. BERT Based Joint Model for Intention Classification and Slot Filling. Journal of Chinese Information Processing. 2022, 36(8): 127-134

Funding

Major Technological Innovation Project of Hubei Province (2019ABA101); Applied Basic Frontier Project of the Wuhan Science and Technology Plan (2020020601012267); Fundamental Research Funds for the Central Universities, South-Central Minzu University (CZQ20012)