Journal of Chinese Information Processing, 2019, Vol. 33, Issue 8: 12-19, 35
Language Analysis and Computation

Multi-Layer Attention Network Based Chinese Implicit Discourse Relation Recognition

XU Sheng, WANG Tishuang, LI Peifeng, ZHU Qiaoming

Abstract

Chinese implicit discourse relation recognition is a challenging task; the difficulty lies in capturing the semantic information of the arguments. This paper proposes a Three-Layer Attention Network (TLAN) that simulates the human processes of bidirectional reading and repeated reading to recognize Chinese implicit discourse relations. First, each argument is encoded by a self-attention layer. Then, a fine-grained interactive attention layer simulates the bidirectional reading process to generate argument representations that incorporate interaction information, and an external memory of the argument pair is obtained through a nonlinear transformation. Finally, an attention layer equipped with this external memory simulates the repeated reading process and, guided by the argument-pair memory, produces the final representations of the arguments. Experimental results of implicit discourse relation recognition on the Chinese Discourse Treebank (CDTB) show that TLAN outperforms several strong baselines in both micro-F1 and macro-F1.
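The abstract describes a three-stage flow: self-attention encoding of each argument, fine-grained interactive attention between the two arguments, and a final attention pass guided by an external memory of the argument pair. The following is a minimal, hypothetical PyTorch sketch of that flow, not the authors' implementation: the class name TLANSketch, the layer sizes, the dot-product scoring functions, the mean-pooled memory construction, and the four-way relation classifier are all illustrative assumptions.

```python
# Minimal sketch of the three-layer attention flow described in the abstract.
# All hyperparameters and scoring choices below are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TLANSketch(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, num_heads=4, num_relations=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Layer 1: self-attention encodes each argument independently.
        self.self_attn = nn.MultiheadAttention(emb_dim, num_heads, batch_first=True)
        # External memory: nonlinear transformation of the (assumed mean-pooled) pair.
        self.memory_proj = nn.Sequential(nn.Linear(2 * emb_dim, emb_dim), nn.Tanh())
        # Assumed classifier over coarse-grained CDTB relation classes.
        self.classifier = nn.Linear(2 * emb_dim, num_relations)

    def encode(self, arg):
        x = self.embed(arg)                               # (B, T, D)
        h, _ = self.self_attn(x, x, x)                    # Layer 1: self-attention
        return h

    def interact(self, h1, h2):
        # Layer 2: interactive attention simulating bidirectional reading --
        # each token of Arg1 attends over Arg2 and vice versa.
        scores = torch.bmm(h1, h2.transpose(1, 2))        # (B, T1, T2)
        a1 = torch.bmm(F.softmax(scores, dim=-1), h2)     # Arg1 enriched with Arg2
        a2 = torch.bmm(F.softmax(scores.transpose(1, 2), dim=-1), h1)
        return a1, a2

    def memory_guided(self, rep, memory):
        # Layer 3: repeated reading guided by the external memory of the pair.
        weights = F.softmax(torch.bmm(rep, memory.unsqueeze(2)).squeeze(2), dim=-1)
        return torch.bmm(weights.unsqueeze(1), rep).squeeze(1)   # (B, D)

    def forward(self, arg1, arg2):
        h1, h2 = self.encode(arg1), self.encode(arg2)
        a1, a2 = self.interact(h1, h2)
        memory = self.memory_proj(torch.cat([a1.mean(1), a2.mean(1)], dim=-1))
        r1 = self.memory_guided(a1, memory)
        r2 = self.memory_guided(a2, memory)
        return self.classifier(torch.cat([r1, r2], dim=-1))


# Usage: two padded token-id tensors for the argument pair.
if __name__ == "__main__":
    model = TLANSketch(vocab_size=10000)
    arg1 = torch.randint(0, 10000, (2, 20))
    arg2 = torch.randint(0, 10000, (2, 25))
    print(model(arg1, arg2).shape)  # torch.Size([2, 4])
```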


Key words

discourse parsing / implicit discourse relation recognition / attention mechanism

Cite this article

XU Sheng, WANG Tishuang, LI Peifeng, ZHU Qiaoming. Multi-Layer Attention Network Based Chinese Implicit Discourse Relation Recognition. Journal of Chinese Information Processing. 2019, 33(8): 12-19,35


Funding

National Natural Science Foundation of China (61773276, 61772354, 61836007)