Question Generation from Knowledge Base with Graph Transformer

HU Yue, ZHOU Guangyou

Journal of Chinese Information Processing ›› 2022, Vol. 36, Issue (2): 111-120.
Information Retrieval and Question Answering Systems


Abstract

Knowledge base question answering relies on a knowledge base to infer answers and requires a large number of annotated question-answer pairs, but building large-scale, accurate datasets is not only expensive but also constrained by domain and other factors. To alleviate the data annotation problem, the task of question generation from knowledge bases has attracted researchers' attention. This task uses knowledge base triples to automatically generate questions, but the questions that existing methods generate from a single triple are too short and lack diversity. To generate informative and diverse questions, this paper adopts two encoding layers, Graph Transformer and BERT, to strengthen the multi-granularity semantic representation of triples and capture background information. Experimental results on the SimpleQuestions dataset demonstrate the effectiveness of the proposed method.

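The core architectural idea in the abstract is that each element of a KB triple attends to the other elements, so the encoder produces context-aware representations before question decoding. A minimal illustrative sketch of that triple-level self-attention follows; this is not the authors' code, and the embedding size, toy random embeddings, and single-head attention are all assumptions made for demonstration:

```python
import numpy as np

# Illustrative sketch (not the paper's implementation): one self-attention
# step over a KB triple (subject, predicate, object), the core idea behind
# a Graph Transformer encoding layer. All dimensions are hypothetical.

rng = np.random.default_rng(0)
d = 8  # toy embedding size

# One KB triple, each element embedded as a d-dimensional vector.
triple = {
    "subject": rng.normal(size=d),
    "predicate": rng.normal(size=d),
    "object": rng.normal(size=d),
}
nodes = np.stack(list(triple.values()))  # shape (3, d)

def graph_attention(x: np.ndarray) -> np.ndarray:
    """Each node attends to every node, mixing in relational context."""
    scores = x @ x.T / np.sqrt(x.shape[1])           # (3, 3) scaled similarity
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # row-wise softmax
    return weights @ x                               # context-mixed node vectors

encoded = graph_attention(nodes)
print(encoded.shape)  # one context-aware vector per triple element
```

In the paper's full model a BERT encoding layer additionally supplies pretrained token-level semantics; here the random vectors merely stand in for those embeddings.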

Keywords

question generation / knowledge base / semantic representation / knowledge base question answering


Cite this article

HU Yue, ZHOU Guangyou. Question Generation from Knowledge Base with Graph Transformer. Journal of Chinese Information Processing. 2022, 36(2): 111-120


Funding

National Natural Science Foundation of China (61972173)