Abstract
Neural machine translation (NMT) has achieved good results on the Mongolian-to-Chinese translation task. However, an NMT model learns its word vectors from the bilingual corpus alone, and the limited size of that corpus constrains the quality of the word representations. This paper incorporates prior information into NMT. First, word vectors trained on a large-scale monolingual corpus are used as the initial word vectors of the translation model, and part-of-speech features are added to the word vectors to alleviate the grammatical ambiguity of words. Second, the target vocabulary is usually restricted in size to reduce the computational complexity of the decoder and the training time, which produces a large number of out-of-vocabulary (OOV) words. To alleviate this problem, the word vectors with part-of-speech features are used to compute the similarity between words, and each OOV word is replaced by the most similar word in the target vocabulary. Experiments on the Mongolian-to-Chinese translation task show that the approach improves the BLEU score of the translations by 2.68 points.
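The OOV replacement step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the implementation, so everything here is an assumption made for illustration: the part-of-speech feature is appended as a one-hot vector, similarity is cosine similarity, and the names `with_pos_feature`, `cosine`, and `replace_oov`, the toy words, and the toy vectors are all hypothetical.

```python
import math


def with_pos_feature(vector, pos_tag, tagset):
    """Append a one-hot part-of-speech feature to a word vector (assumed encoding)."""
    one_hot = [1.0 if tag == pos_tag else 0.0 for tag in tagset]
    return list(vector) + one_hot


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def replace_oov(tokens, embeddings, vocabulary):
    """Replace each OOV token with the most similar in-vocabulary word.

    Tokens that are in the vocabulary, or that have no embedding at all,
    are kept unchanged.
    """
    candidates = [word for word in vocabulary if word in embeddings]
    result = []
    for token in tokens:
        if token in vocabulary or token not in embeddings or not candidates:
            result.append(token)
            continue
        nearest = max(candidates, key=lambda w: cosine(embeddings[token], embeddings[w]))
        result.append(nearest)
    return result


# Toy data (hypothetical): 2-dimensional vectors plus a noun/verb tag.
TAGSET = ["n", "v"]
embeddings = {
    "cat": with_pos_feature([1.0, 0.1], "n", TAGSET),
    "dog": with_pos_feature([0.9, 0.2], "n", TAGSET),
    "run": with_pos_feature([0.1, 1.0], "v", TAGSET),
    "sprint": with_pos_feature([0.2, 0.9], "v", TAGSET),
}
vocabulary = {"cat", "run"}  # restricted target vocabulary

replaced = replace_oov(["dog", "sprint", "cat"], embeddings, vocabulary)
print(replaced)
```

In this toy setup the two OOV words, `dog` and `sprint`, are mapped to the in-vocabulary words that share their part of speech and lie closest in the vector space. Appending the tag as extra dimensions is only one way to let part of speech influence the similarity; the paper may encode the feature differently.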
Key words
recurrent neural network /
out-of-vocabulary /
word embedding /
part-of-speech tagging
Funding
National Natural Science Foundation of China (61362028)