BERTCA Based Semantic Relevance Model for News Entity and Text

XIANG Junyi1,2, HU Huijun1,2, LIU Maofu1,2, MAO Ruibin3

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 3: 109-119.
Information Retrieval and Question Answering

Abstract

Current search engines still emphasize surface form over semantics and cannot achieve a deep semantic understanding of search keywords and text, so semantic retrieval has become a pressing problem for search engines. To improve the semantic understanding ability of search engines, this paper proposes a method for computing semantic relevance. First, a corpus of 10,000 items is manually annotated with the semantic relevance between entities in financial news headlines and the news text. A BERTCA (Bidirectional Encoder Representation from Transformers Co-Attention) model for computing the semantic relevance between news entities and text is then built on this corpus: using the BERT pre-trained model, it jointly encodes the semantic information of the fine-grained entity and the coarse-grained text, and a co-attention mechanism then performs semantic matching between the entity and the text. The model not only computes the degree of relevance between a financial news entity and the news text, but also assigns a relevance category according to a relevance threshold. Experiments show that the model achieves an accuracy above 95% on the 10,000 annotated samples, outperforming current mainstream models, and concrete search examples further illustrate its strong performance.
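To make the matching pipeline concrete, below is a minimal PyTorch sketch of the kind of BERT-plus-co-attention relevance scorer the abstract describes, using the Hugging Face transformers library with "bert-base-chinese". The bilinear affinity layer, mean pooling, fusion layer, and the 0.5 threshold are illustrative assumptions, not the authors' exact BERTCA implementation.

```python
# Sketch of a BERT + co-attention entity-text relevance scorer.
# Layer shapes, pooling choices and the 0.5 threshold are assumptions,
# not the paper's exact BERTCA configuration.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer


class BERTCASketch(nn.Module):
    def __init__(self, pretrained="bert-base-chinese", hidden=768):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        # Bilinear affinity between entity tokens and text tokens.
        self.affinity = nn.Linear(hidden, hidden, bias=False)
        # Maps the fused representation to a relevance score in [0, 1].
        self.scorer = nn.Sequential(
            nn.Linear(4 * hidden, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )

    def forward(self, entity_inputs, text_inputs):
        # Token-level representations: fine-grained entity, coarse-grained text.
        ent = self.bert(**entity_inputs).last_hidden_state   # [B, m, d]
        txt = self.bert(**text_inputs).last_hidden_state     # [B, n, d]

        # Co-attention: affinity matrix and cross-attended summaries.
        aff = torch.bmm(self.affinity(ent), txt.transpose(1, 2))    # [B, m, n]
        ent2txt = torch.softmax(aff, dim=-1) @ txt                  # entity attends to text
        txt2ent = torch.softmax(aff.transpose(1, 2), dim=-1) @ ent  # text attends to entity

        # Mean-pool the raw and attended views, fuse, and score.
        fused = torch.cat(
            [ent.mean(1), ent2txt.mean(1), txt.mean(1), txt2ent.mean(1)], dim=-1
        )
        return torch.sigmoid(self.scorer(fused)).squeeze(-1)        # relevance in [0, 1]


if __name__ == "__main__":
    tok = BertTokenizer.from_pretrained("bert-base-chinese")
    model = BERTCASketch()
    entity = tok(["贵州茅台"], return_tensors="pt")
    text = tok(["贵州茅台发布2021年度业绩预告,净利润同比增长。"],
               return_tensors="pt", truncation=True, max_length=512)
    score = model(entity, text)
    # A threshold turns the continuous score into a relevance category
    # (the 0.5 cut-off is an assumed placeholder).
    print(score.item(), "related" if score.item() >= 0.5 else "unrelated")
```

Training such a scorer on the 10,000 annotated entity-text pairs would typically use a binary cross-entropy or regression loss on the relevance labels, with the decision threshold tuned on held-out data.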

Keywords

semantic relevance computing / BERT model / co-attention mechanism

Cite This Article

XIANG Junyi, HU Huijun, LIU Maofu, MAO Ruibin. BERTCA Based Semantic Relevance Model for News Entity and Text. Journal of Chinese Information Processing, 2022, 36(3): 109-119.


Funding

Joint Research Program of Shenzhen Securities Information Co., Ltd. (2018002); PLA-wide Common Information System Equipment Pre-research Project (31502030502)