Combination Methods of Chinese Character and Word Embeddings in Deep Learning
LI Weikang, LI Wei, WU Yunfang
Key Laboratory of Computational Linguistics (Peking University), Ministry of Education, Beijing 100871, China
Abstract This paper investigates the combination of Chinese character and word embeddings in deep learning. We conduct experiments on both shallow and deep combinations of word-level and character-level representations. To demonstrate the effectiveness of the combination, we present a compare-aggregate model for the task of question answering. Extensive experiments on the open DBQA data show that effectively combining characters and words significantly improves the system, achieving results comparable with state-of-the-art systems.
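The shallow combination mentioned in the abstract can be illustrated with a minimal sketch: one common approach is to concatenate a word's own vector with the average of its characters' vectors. The embedding tables, dimensions, and the averaging choice below are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

# Hypothetical toy embedding tables; dimensions and values are illustrative only.
rng = np.random.default_rng(0)
word_emb = {"北京": rng.standard_normal(4)}               # word-level vector
char_emb = {c: rng.standard_normal(4) for c in "北京"}     # character-level vectors

def shallow_combine(word):
    """Concatenate the word vector with the mean of its character vectors."""
    w = word_emb[word]
    c = np.mean([char_emb[ch] for ch in word], axis=0)
    return np.concatenate([w, c])   # combined representation, here of dimension 8

vec = shallow_combine("北京")
print(vec.shape)  # (8,)
```

A deep combination would instead feed the word and character sequences into separate encoders (e.g. LSTMs) and merge their hidden states inside the network rather than at the input layer.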
Received: 29 September 2017