Negation Focus Identification via Bi-directional LSTM-CRF Model

SHEN Longxiang, ZOU Bowei, YE Jing, ZHOU Guodong, ZHU Qiaoming

Journal of Chinese Information Processing, 2019, Vol. 33, Issue (1): 25-34.
Language Analysis and Computation

Abstract

Negation is a common phenomenon in natural language text and plays a critical role in downstream applications of natural language processing, such as sentiment analysis and information extraction. Negation focus identification is a finer-grained negation analysis task: it aims to identify the text fragment modified and emphasized by a negation keyword. Treating negation focus identification as a sequence labeling task, we propose a bidirectional Long Short-Term Memory network with a Conditional Random Field layer (BiLSTM-CRF). The BiLSTM encoder exploits contextual information from both directions and captures global features of the sentence, while the CRF layer learns the dependencies between adjacent output tags. Experimental results on the *SEM2012 shared task dataset show that our approach achieves an accuracy of 69.58%, a 2.44% improvement over the previous state-of-the-art system.
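To make the architecture concrete, here is a minimal BiLSTM-CRF sketch in PyTorch. It is not the paper's implementation: the class name, layer sizes, binary focus/non-focus tag set, single-sentence decoder, and the omission of dedicated start/stop transition scores are all simplifying assumptions for illustration. The BiLSTM produces per-token emission scores over the tag set; the CRF layer adds a learned transition score between adjacent tags, is trained by minimizing the negative log-likelihood of the gold tag path, and predicts with Viterbi decoding.

```python
# Minimal BiLSTM-CRF sketch (illustrative; not the authors' code).
# Assumed setup: integer token ids in, one tag per token out,
# e.g., 1 = inside the negation focus, 0 = outside.
import torch
import torch.nn as nn

class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size, num_tags=2, emb_dim=100, hidden_dim=200):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # The bidirectional LSTM reads the sentence in both directions,
        # so each token representation carries left and right context.
        self.lstm = nn.LSTM(emb_dim, hidden_dim // 2,
                            bidirectional=True, batch_first=True)
        self.emit = nn.Linear(hidden_dim, num_tags)   # per-token emission scores
        # transitions[i, j]: score of moving from tag i to tag j.
        self.transitions = nn.Parameter(0.01 * torch.randn(num_tags, num_tags))

    def _emissions(self, tokens):                     # (batch, seq) -> (batch, seq, tags)
        hidden, _ = self.lstm(self.embedding(tokens))
        return self.emit(hidden)

    def loss(self, tokens, tags):
        """CRF negative log-likelihood of the gold tag sequences."""
        emissions = self._emissions(tokens)
        return (self._log_partition(emissions)
                - self._path_score(emissions, tags)).mean()

    def _path_score(self, emissions, tags):
        # Emission scores of the gold tags plus transitions between them.
        score = emissions.gather(2, tags.unsqueeze(2)).squeeze(2).sum(dim=1)
        return score + self.transitions[tags[:, :-1], tags[:, 1:]].sum(dim=1)

    def _log_partition(self, emissions):
        # Forward algorithm in log space: log-sum over all tag paths.
        alpha = emissions[:, 0]
        for t in range(1, emissions.size(1)):
            alpha = torch.logsumexp(alpha.unsqueeze(2) + self.transitions
                                    + emissions[:, t].unsqueeze(1), dim=1)
        return torch.logsumexp(alpha, dim=1)

    @torch.no_grad()
    def decode(self, tokens):
        """Viterbi decoding for one sentence (batch size 1)."""
        emissions = self._emissions(tokens)[0]        # (seq, tags)
        score, history = emissions[0], []
        for t in range(1, emissions.size(0)):
            total = score.unsqueeze(1) + self.transitions + emissions[t]
            score, best_prev = total.max(dim=0)       # best previous tag per tag
            history.append(best_prev)
        best = score.argmax().item()
        path = [best]
        for best_prev in reversed(history):
            best = best_prev[best].item()
            path.append(best)
        return list(reversed(path))

# Hypothetical usage: the token ids and decoded tags are made up for illustration.
model = BiLSTMCRF(vocab_size=10000)
tokens = torch.tensor([[12, 7, 431, 9, 55]])          # one tokenized sentence
print(model.decode(tokens))                            # e.g., [0, 0, 0, 1, 1]
```

The CRF layer is what lets the model prefer well-formed outputs, e.g., a single contiguous focus span rather than scattered focus tokens, which a per-token softmax over the BiLSTM states cannot enforce.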

Key words

negation focus / BiLSTM-CRF model / sequence labeling

Cite this article

SHEN Longxiang, ZOU Bowei, YE Jing, ZHOU Guodong, ZHU Qiaoming. Negation Focus Identification via Bi-directional LSTM-CRF Model. Journal of Chinese Information Processing. 2019, 33(1): 25-34

Funding

National Natural Science Foundation of China (61703293, 61672367); Jiangsu Provincial Science and Technology Plan (BK20151222)
