LI Peiyun, LI Maoxi, QIU Bailian, WANG Mingwen. Integrating BERT Word Embedding into Quality Estimation of Machine Translation[J]. Journal of Chinese Information Processing, 2020, 34(3): 56-63.
Integrating BERT Word Embedding into Quality Estimation of Machine Translation
LI Peiyun, LI Maoxi, QIU Bailian, WANG Mingwen
School of Computer and Information Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China
Abstract: The word embeddings of BERT encode semantic, syntactic and contextual information, and the pre-trained model can be fine-tuned for various downstream natural language processing tasks. We propose to introduce BERT into neural quality estimation of MT output by feeding its contextual word embeddings through stacked BiLSTMs (bidirectional long short-term memory networks) and concatenating the result with the existing quality estimation network at the output layer. Experiments on the CWMT18 datasets show that quality estimation is significantly improved by integrating the upper and middle layers of BERT, with the largest improvement obtained by average pooling the last four layers. Further analysis reveals that BERT chiefly helps the quality estimation task by better capturing the fluency of the translation.
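The pooling-and-concatenation step described in the abstract can be sketched as follows. This is a minimal illustration with random tensors standing in for real BERT hidden states and for the existing QE network's features, not the authors' implementation; the shapes (12 layers, hidden size 768, a hypothetical 512-dimensional QE feature vector) are assumptions for the sketch.

```python
import numpy as np

# Assumed shapes: BERT-base has 12 Transformer layers with hidden size 768.
num_layers, seq_len, hidden = 12, 6, 768
rng = np.random.default_rng(0)

# Stand-in for the per-layer BERT hidden states of a 6-token translation.
hidden_states = rng.standard_normal((num_layers, seq_len, hidden))

# Average pooling of the last four BERT layers (the best setting reported).
pooled = hidden_states[-4:].mean(axis=0)   # shape: (seq_len, hidden)

# In the paper these token vectors feed stacked BiLSTMs; here we use a toy
# mean-over-tokens summary before concatenating with the QE network's features
# at the output layer.
sentence_vec = pooled.mean(axis=0)         # shape: (768,)
qe_features = rng.standard_normal(512)     # hypothetical QE feature vector
combined = np.concatenate([sentence_vec, qe_features])
print(combined.shape)  # (1280,)
```

The combined vector would then be passed to the final quality-score predictor; the BiLSTM encoder over the pooled token vectors is the part this sketch abbreviates.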
[1] Zong C Q. Statistical natural language processing[M]. Beijing: Tsinghua University Press, 2013.
[2] Liu Y. Recent advances in neural machine translation[J]. Journal of Computer Research and Development, 2017, 54(6): 1144-1149.
[3] Li Y C, Xiong D Y, Zhang M. A survey of neural machine translation[J]. Chinese Journal of Computers, 2018, 41(12): 2734-2755.
[4] Specia L, Shah K, De Souza J G C, et al. QuEst - A translation quality estimation framework[C]//Proceedings of the ACL, 2013: 79-84.
[5] Shah K, Logacheva V, Paetzold G, et al. SHEF-NN: Translation quality estimation with neural networks[C]//Proceedings of the WMT, 2015: 342-347.
[6] Chen Z, Tan Y, Zhang C, et al. Improving machine translation quality estimation with neural network features[C]//Proceedings of the WMT, 2017: 551-555.
[7] Chen Z M, Li M X, Wang M W. Sentence-level machine translation quality estimation based on neural network features[J]. Journal of Computer Research and Development, 2017, 54(8): 1804-1812.
[8] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality[C]//Proceedings of the NIPS,2013: 3111-3119.
[9] Kim H, Jung H Y, Kwon H, et al. Predictor-estimator: Neural quality estimation based on target word prediction for machine translation[J]. ACM Transactions on Asian and Low-Resource Language Information Processing, 2017, 17(1): 1-22.
[10] Li M, Xiang Q, Chen Z, et al. A unified neural network for quality estimation of machine translation[J]. IEICE Transactions on Information and Systems, 2018, E101-D(9): 2417-2421.
[11] Ive J, Blain F, Specia L. deepQuest: A framework for neural-based quality estimation[C]//Proceedings of the COLING, 2018: 3146-3157.
[12] Wang J, Fan K, Li B, et al. Alibaba submission for WMT18 quality estimation task[C]//Proceedings of the WMT, 2018: 809-815.
[13] Fan K, Wang J, Li B, et al. "Bilingual Expert" can find translation errors[C]//Proceedings of the AAAI, 2019: 1-8.
[14] Sun X, Zhu C H, Zhao T J. A machine translation quality estimation algorithm incorporating translation knowledge[J]. Intelligent Computer and Applications, 2019, 9(2): 271-275.
[15] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[J]. arXiv preprint arXiv:1706.03762, 2017.
[16] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
[17] Liu N F, Gardner M, Belinkov Y, et al. Linguistic knowledge and transferability of contextual representations[J]. arXiv preprint arXiv:1903.08855, 2019.
[18] Peters M, Neumann M, Iyyer M, et al. Deep contextualized word representations[C]//Proceedings of the NAACL, 2018: 2227-2237.
[19] Radford A, Narasimhan K, Salimans T, et al. Improving language understanding with unsupervised learning[R]. Technical Report, OpenAI, 2018.