Abstract
Entity-level sentiment analysis in the financial domain lacks sufficient annotated corpora, and general-purpose sentiment analysis models struggle to handle financial text effectively. To address these problems, this paper builds a million-scale corpus for entity-level sentiment analysis in the financial domain and annotates more than 5,000 financial sentiment words as a financial domain sentiment lexicon. Based on this dataset, we propose FinLexNet, an attention-based recurrent network combined with a financial lexicon, for fine-grained sentiment analysis of financial text. FinLexNet uses two LSTM networks: one extracts word-level semantic information, and the other extracts word-class-level information obtained by classifying words with the sentiment lexicon, so the model can effectively capture the features of financial domain words. In addition, so that financial sentiment words in the text receive more attention, we propose an attention mechanism based on the financial sentiment lexicon that captures the important sentiment information for each entity. Experiments on the constructed financial entity-level corpus show that FinLexNet outperforms the baseline models.
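The two lexicon-driven ideas in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the lexicon enters the attention computation, so the additive score bonus below is an assumption, not the authors' formulation. The toy lexicon, the function names, and the `bonus` parameter are all hypothetical, and the raw scores stand in for values that a real model would compute from LSTM hidden states and the entity representation.

```python
import math

# Hypothetical toy lexicon (word -> sentiment category). The paper's actual
# lexicon of 5,000+ annotated financial sentiment words is not reproduced here.
TOY_LEXICON = {
    "surge": "positive",
    "profit": "positive",
    "default": "negative",
    "loss": "negative",
}


def to_category_sequence(tokens, lexicon, other="other"):
    """Map each token to its lexicon category.

    This yields a word-class-level sequence of the kind that a second
    LSTM could consume alongside the ordinary word-level sequence.
    """
    return [lexicon.get(tok, other) for tok in tokens]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def lexicon_attention(tokens, scores, lexicon, bonus=1.0):
    """Attention weights with an additive bonus for lexicon words.

    Assumption: the lexicon acts as an additive bias on the raw scores.
    The placeholder `scores` would come from the network in a real model.
    """
    biased = [
        s + (bonus if tok in lexicon else 0.0)
        for tok, s in zip(tokens, scores)
    ]
    return softmax(biased)


tokens = ["company", "reports", "heavy", "loss"]
categories = to_category_sequence(tokens, TOY_LEXICON)
weights = lexicon_attention(tokens, [0.0, 0.0, 0.0, 0.0], TOY_LEXICON)
```

With equal raw scores, the only lexicon word in the sentence ("loss") receives the largest attention weight, which is the qualitative behavior the abstract describes.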
Key words
fine-grained sentiment analysis /
financial texts /
financial sentiment lexicon
Funding
National Natural Science Foundation of China (61876053, 62006062); Shenzhen Technology Research Project (JSGG20210802154400001); Shenzhen Basic Research Discipline Layout Project (JCY20210324115614039)