Abstract
Recurrent neural network models cannot directly extract the bidirectional semantic features of a sentence, and traditional word-embedding methods cannot effectively represent polysemy. To address these problems, this paper proposes a hybrid model based on ELMo (Embeddings from Language Models) and the Transformer for sentiment classification. First, the ELMo model, built on a bidirectional LSTM, generates word vectors that fuse word-level features with the contextual features of the sentence, producing different vectors for the different senses of a polysemous word. The ELMo vectors are then fed into a Transformer whose encoder and decoder are modified for sentiment classification. As a combination of a recurrent neural network and self-attention, the hybrid model extracts the semantic features of a sentence from two different perspectives, yielding more comprehensive and richer semantic information. Experimental results show that, compared with current mainstream methods, the proposed model improves classification accuracy by 3.52% on the NLPCC2014 Task2 dataset, and by 0.7%, 2%, 1.98%, and 1.36% on four sub-datasets of hotel reviews, respectively.
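To make the described pipeline concrete, the sketch below is a minimal illustrative reconstruction in PyTorch, not the authors' released code: a bidirectional LSTM stands in for ELMo's contextual embedder, and a stock nn.TransformerEncoder stands in for the paper's modified encoder/decoder; all layer sizes, the vocabulary size, and the mean-pooling classification head are assumptions made for the example.

```python
# Illustrative sketch only: contextual (ELMo-style) embeddings feeding a
# Transformer encoder, pooled into a sentence vector for sentiment logits.
import torch
import torch.nn as nn

class ElmoTransformerClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hidden_dim=256,
                 nhead=8, num_layers=2, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Stand-in for ELMo: a bidirectional LSTM contextualizes each token,
        # so a polysemous word receives a context-dependent vector.
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, num_layers=2,
                              batch_first=True, bidirectional=True)
        d_model = 2 * hidden_dim  # forward + backward hidden states
        # Stand-in for the paper's modified Transformer: an encoder stack
        # built on multi-head self-attention.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)   # (batch, seq, emb_dim)
        x, _ = self.bilstm(x)       # contextual vectors, ELMo-style
        x = self.encoder(x)         # self-attention over all token pairs
        x = x.mean(dim=1)           # mean-pool tokens into a sentence vector
        return self.classifier(x)   # sentiment logits

# Usage: classify a batch of two 10-token sentences (random ids for demo).
model = ElmoTransformerClassifier(vocab_size=5000)
logits = model(torch.randint(0, 5000, (2, 10)))
print(logits.shape)  # torch.Size([2, 2])
```

The structure mirrors the paper's rationale: the recurrent layer captures sequential context, while multi-head self-attention relates all token pairs directly, so the pooled sentence representation draws on both views of the sentence.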
Key words
sentiment analysis /
ELMo (embeddings from language models) /
Transformer model /
multi-head self-attention mechanism /
natural language processing
Funding
National Key Research and Development Program of China, Key Special Program on Cloud Computing and Big Data (2016YFB1001100, 2016YFB1001104); Young Scientists Fund of the National Natural Science Foundation of China (61702218)