Abstract: This paper presents a novel neural architecture for transition-based dependency parsing that leverages fused multi-feature encoding. We model stack states with subtree representations, encoding structural dependency subtrees using a TreeLSTM. In particular, we propose an LSTM-based technique that encodes the history of parsed dependency arcs and parser states as global features. Finally, under the fused multi-feature encoding, the extracted local features and global features are combined to make each parsing decision. Experiments on the Chinese Penn Treebank (CTB5) show that our parser reaches 87.8% unlabeled and 86.8% labeled attachment accuracy with a greedy decoding strategy, effectively improving neural transition-based dependency parsing.
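The subtree encoding mentioned above can be illustrated with a child-sum TreeLSTM cell, which composes a head word's input vector with the hidden and cell states of its already-attached dependents, using one forget gate per child. The sketch below is a minimal numpy illustration of that cell; the dimensions, initialization, and class name are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ChildSumTreeLSTMCell:
    """Minimal child-sum TreeLSTM cell (numpy sketch).

    Hyperparameters and random initialization here are illustrative only.
    """
    def __init__(self, in_dim, hid_dim, seed=0):
        rng = np.random.default_rng(seed)
        def mat(r, c):
            return rng.normal(0.0, 0.1, (r, c))
        # input, forget, output, and update gates: W_* act on the input
        # vector, U_* act on children's hidden states
        self.W_i, self.U_i = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)
        self.W_f, self.U_f = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)
        self.W_o, self.U_o = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)
        self.W_u, self.U_u = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)
        self.hid_dim = hid_dim

    def forward(self, x, children):
        """x: input vector for the head word; children: list of (h, c) pairs
        produced by this cell for each dependent subtree."""
        h_sum = sum((h for h, _ in children), np.zeros(self.hid_dim))
        i = sigmoid(self.W_i @ x + self.U_i @ h_sum)
        o = sigmoid(self.W_o @ x + self.U_o @ h_sum)
        u = np.tanh(self.W_u @ x + self.U_u @ h_sum)
        c = i * u
        # one forget gate per child, conditioned on that child's hidden state
        for h_k, c_k in children:
            f_k = sigmoid(self.W_f @ x + self.U_f @ h_k)
            c = c + f_k * c_k
        h = o * np.tanh(c)
        return h, c

# Encode a tiny two-word subtree: two leaves attached to one head.
cell = ChildSumTreeLSTMCell(in_dim=4, hid_dim=3)
leaf_a = cell.forward(np.ones(4), [])
leaf_b = cell.forward(np.zeros(4), [])
root_h, root_c = cell.forward(0.5 * np.ones(4), [leaf_a, leaf_b])
```

The resulting `root_h` would serve as the subtree's representation on the stack; in the architecture described above, such local representations are then concatenated with the global history features before the parsing decision is scored.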