Abstract: Implicit discourse relation recognition is a challenging task, in that it is difficult to obtain semantically informative and interaction-informative argument representations. This paper proposes an implicit discourse relation recognition method based on the Graph Convolutional Network (GCN). With the arguments encoded by a fine-tuned BERT, the GCN takes the concatenation of the argument representations as its feature matrix and the concatenation of the attention-score matrices as its adjacency matrix. In this way, the argument representations are expected to be refined with self-attention and inter-attention information to improve implicit discourse relation recognition. Experimental results on the Penn Discourse Treebank (PDTB) show that the proposed method outperforms BERT on all four types of implicit discourse relations, and that it outperforms state-of-the-art methods on Contingency and Expansion with F1 scores of 60.70% and 74.49%, respectively.
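To make the described architecture concrete, the following is a minimal sketch (not the authors' released implementation) of the idea in the abstract, assuming the two arguments have already been encoded into token representations by a fine-tuned BERT: the token vectors of both arguments are stacked into a feature matrix, self- and inter-attention score matrices are stacked into an adjacency matrix, and a single GCN layer refines the features before classification. All names, dimensions, and the pooling choice are illustrative assumptions.

```python
# Sketch of a GCN over BERT-encoded arguments, with an adjacency matrix built
# from self-attention (within each argument) and inter-attention (between the
# two arguments) scores. Hyperparameters and helper names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


def scaled_dot_attention(q, k):
    """Attention-score matrix between two token sequences (rows sum to 1)."""
    return F.softmax(q @ k.transpose(-1, -2) / q.size(-1) ** 0.5, dim=-1)


class GCNLayer(nn.Module):
    """One graph-convolution step, Kipf & Welling style: H' = GELU(A_norm X W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # Row-normalize the adjacency matrix before propagating features.
        adj = adj / adj.sum(dim=-1, keepdim=True).clamp(min=1e-9)
        return F.gelu(self.linear(adj @ x))


class AttentionGCNClassifier(nn.Module):
    """Classify the discourse relation holding between two encoded arguments."""
    def __init__(self, hidden=768, num_classes=4):
        super().__init__()
        self.gcn = GCNLayer(hidden, hidden)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, arg1, arg2):
        # arg1: (n1, hidden), arg2: (n2, hidden) token representations from BERT.
        x = torch.cat([arg1, arg2], dim=0)                  # feature matrix
        a11 = scaled_dot_attention(arg1, arg1)              # self-attention, Arg1
        a22 = scaled_dot_attention(arg2, arg2)              # self-attention, Arg2
        a12 = scaled_dot_attention(arg1, arg2)              # inter-attention
        a21 = scaled_dot_attention(arg2, arg1)
        adj = torch.cat([torch.cat([a11, a12], dim=1),
                         torch.cat([a21, a22], dim=1)], dim=0)  # adjacency matrix
        h = self.gcn(x, adj)
        return self.classifier(h.mean(dim=0))               # pooled class logits


if __name__ == "__main__":
    # Random stand-ins for BERT token embeddings of the two arguments.
    arg1, arg2 = torch.randn(12, 768), torch.randn(15, 768)
    print(AttentionGCNClassifier()(arg1, arg2).shape)  # torch.Size([4])
```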