Abstract
Text semantic matching aims to identify the semantic relationship between a given pair of texts. For this task, existing models neglect potential semantic information beyond the Chinese characters themselves during encoding, and they do not consider the impact of label information on classification. This paper therefore proposes a text semantic matching method based on multi-knowledge of Chinese characters' glyph, pinyin, and sense together with label embedding. First, an information encoding layer encodes the glyph, pinyin, and sense knowledge of the Chinese characters. Next, an information integration layer produces a joint representation that fuses this multi-knowledge. Then, a label embedding layer combines the encoded classification labels with the joint representation to generate supervised label representations. Finally, a label prediction layer obtains an enhanced joint representation from both the textual and label views and makes the final prediction of the semantic relationship. Experimental results on multiple widely used datasets show that the proposed model outperforms multiple baseline models, verifying its effectiveness.
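To make the four-layer pipeline concrete, the following minimal PyTorch sketch mirrors the encoding, integration, label embedding, and label prediction steps described above. It is illustrative only: the vocabulary sizes, hidden width, the plain embedding lookups standing in for the pretrained language-model encoders, and the token-to-label attention used for fusion are all assumptions, not the authors' implementation.

```python
# Minimal sketch of the pipeline described in the abstract
# (encoding -> integration -> label embedding -> label prediction).
# NOT the authors' implementation: vocabulary sizes, hidden size, the
# nn.Embedding lookups replacing a pretrained Chinese language-model
# encoder, and the attention-based label fusion are illustrative choices.
import torch
import torch.nn as nn


class MultiKnowledgeLabelMatcher(nn.Module):
    def __init__(self, char_vocab=6000, glyph_vocab=6000, pinyin_vocab=1500,
                 sense_vocab=3000, hidden=256, num_labels=2):
        super().__init__()
        # Information encoding layer: separate lookups for character,
        # glyph, pinyin, and sense ids (stand-ins for the real encoders).
        self.char_emb = nn.Embedding(char_vocab, hidden)
        self.glyph_emb = nn.Embedding(glyph_vocab, hidden)
        self.pinyin_emb = nn.Embedding(pinyin_vocab, hidden)
        self.sense_emb = nn.Embedding(sense_vocab, hidden)
        # Information integration layer: fuse the four views into one
        # joint token representation.
        self.fuse = nn.Linear(4 * hidden, hidden)
        # Label embedding layer: one learned vector per classification label.
        self.label_emb = nn.Embedding(num_labels, hidden)
        # Label prediction layer: classify from the concatenation of the
        # text-level and label-level representations.
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def forward(self, char_ids, glyph_ids, pinyin_ids, sense_ids):
        # (batch, seq_len, hidden) joint representation of the sentence pair.
        joint = self.fuse(torch.cat([
            self.char_emb(char_ids), self.glyph_emb(glyph_ids),
            self.pinyin_emb(pinyin_ids), self.sense_emb(sense_ids)], dim=-1))
        text_repr = joint.mean(dim=1)                    # text-level view
        labels = self.label_emb.weight                   # (num_labels, hidden)
        # Attend from each token to the label embeddings to obtain a
        # label-aware view of the text.
        attn = torch.softmax(joint @ labels.T, dim=-1)   # (batch, seq, L)
        label_repr = (attn @ labels).mean(dim=1)         # label-level view
        return self.classifier(torch.cat([text_repr, label_repr], dim=-1))


# Toy usage: a batch of 2 "sentence pairs" of length 16 with random ids.
model = MultiKnowledgeLabelMatcher()
ids = lambda v: torch.randint(0, v, (2, 16))
logits = model(ids(6000), ids(6000), ids(1500), ids(3000))
print(logits.shape)  # torch.Size([2, 2])
```

In the paper's setting, the encoding layer would be a pretrained Chinese language-model encoder rather than plain embedding tables; the stand-ins above only keep the sketch self-contained.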
Key words
Chinese characters' glyph, pinyin, and sense-based multi-knowledge /
label embedding /
text semantic matching
Funding
National Natural Science Foundation of China (61936012); Key Research and Development Program of Shanxi Province (202102020101008); Shanxi Province "Four Batches" Science and Technology Innovation Program for Revitalizing Medicine (2022XM01)