A Chinese-Tibetan Bilingual Knowledge Graph System in Tourism Domain

FENG Xiaolan, ZHAO Xiaobing

Journal of Chinese Information Processing, 2019, Vol. 33, Issue 11: 64-72.
Knowledge Representation and Knowledge Acquisition


Abstract

Tourism is one of the main sources of income in Tibetan regions. However, the Internet currently lacks intelligent Tibetan-language tourism information services, and Tibetan texts introducing scenic spots are scarce. Chinese-language tourism websites, by contrast, carry a large amount of information, but they cover different sets of scenic spots, their introductory texts are long, and they emphasize different aspects of the same scenic spot. To help speakers of different languages grasp knowledge about scenic spots quickly and accurately, this paper first acquires attribute knowledge about scenic spots in the Chinese tourism domain using a BLSTM neural network model, Wikipedia, and web crawlers (8 kinds of attributes according to the Chinese abstract; the published English abstract states 11). The acquired Chinese knowledge is then transferred to Tibetan through a Chinese-Tibetan tourism dictionary built with Wikipedia-based and other methods, with an average translation coverage of 70.44%. Finally, a Chinese-Tibetan bilingual knowledge graph for the tourism domain is constructed.
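The dictionary-based transfer step and the coverage figure can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not define "translation coverage", so the sketch takes it to be the fraction of term occurrences in the Chinese triples that have an entry in the dictionary, and the names `transfer_triples`, `toy_dictionary`, and `zh_triples`, as well as the toy dictionary entries, are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# A knowledge-graph fact as (entity, attribute, value).
Triple = Tuple[str, str, str]


def transfer_triples(
    triples: List[Triple], dictionary: Dict[str, str]
) -> Tuple[List[Triple], float]:
    """Map Chinese triples to Tibetan by exact dictionary lookup.

    Returns the triples whose three terms were all found, plus the
    coverage, assumed here to be (terms found) / (terms looked up).
    """
    transferred: List[Triple] = []
    looked_up = 0
    found = 0
    for triple in triples:
        # dict.get returns None when a term has no dictionary entry.
        translated = [dictionary.get(term) for term in triple]
        looked_up += len(translated)
        found += sum(term is not None for term in translated)
        if all(term is not None for term in translated):
            transferred.append((translated[0], translated[1], translated[2]))
    coverage = found / looked_up if looked_up else 0.0
    return transferred, coverage


# Toy Chinese-Tibetan entries, illustrative only (not the paper's dictionary).
toy_dictionary = {
    "布达拉宫": "ཕོ་བྲང་པོ་ཏ་ལ",
    "拉萨": "ལྷ་ས",
    "所在地": "ས་གནས",
}

# Two Chinese triples; "大昭寺" is deliberately missing from the toy dictionary.
zh_triples = [
    ("布达拉宫", "所在地", "拉萨"),
    ("大昭寺", "所在地", "拉萨"),
]

tibetan_triples, coverage = transfer_triples(zh_triples, toy_dictionary)
print(tibetan_triples)
print(f"coverage: {coverage:.2%}")
```

In this sketch a triple is carried over only when all three of its terms are covered, which keeps partially translated facts out of the Tibetan side of the graph. Whether the paper applies the same rule is not stated in the abstract.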

Key words

tourism domain relationship extraction / knowledge graph / BLSTM

Cite this article

FENG Xiaolan, ZHAO Xiaobing. A Chinese-Tibetan Bilingual Knowledge Graph System in Tourism Domain. Journal of Chinese Information Processing. 2019, 33(11): 64-72

Funding

Key Project of the State Language Commission of China (ZDI135-39)