For semantic dependency parsing to be practical, domain adaptation, i.e. the ability of a model trained on a single domain to transfer to other domains, is crucial. In recent years, adversarial learning has achieved good results on domain adaptation, but it makes limited use of the unlabeled data in the target domain. This paper employs self-training to exploit the unlabeled data more efficiently and thereby complement adversarial learning. Since traditional self-training is inefficient and performs poorly, for cross-domain semantic dependency parsing this paper introduces a reinforcement-learning data selector and proposes a partial pseudo-annotation strategy. Experimental results show that the proposed model outperforms the baseline model.
Abstract
Domain adaptation is crucial to the practical application of semantic dependency parsing, and a recent solution is adversarial learning. To better utilize the unlabeled data in the target domain, we propose to combine adversarial learning with self-training, and design a strategy of data selection plus partial pseudo-annotation for domain adaptation of semantic dependency parsing. The experimental results show that the proposed method is superior to the baseline model.
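To make the described procedure concrete, the following is a minimal, illustrative sketch of self-training with a data selector and partial pseudo-annotation. It is not the authors' implementation: all names (parse_with_scores, CONFIDENCE_THRESHOLD, the selector callable, etc.) are assumptions, the parser is replaced by a stub that fabricates confidences, and the reinforcement-learning selector is approximated by a simple average-confidence heuristic.

```python
# Illustrative sketch only (not the paper's code): self-training for cross-domain
# semantic dependency parsing with a data selector and partial pseudo-annotation.
import random
from typing import Callable, List, Tuple

CONFIDENCE_THRESHOLD = 0.9   # assumed cut-off for keeping an individual arc
SELF_TRAIN_ROUNDS = 3        # assumed number of self-training iterations

Arc = Tuple[int, int, str, float]  # (head, dependent, label, confidence)


def parse_with_scores(parser, sentence: List[str]) -> List[Arc]:
    """Stand-in for a biaffine parser: returns one scored arc per adjacent word
    pair. Here the confidences are fabricated at random."""
    return [(i, i + 1, "dep", random.random()) for i in range(len(sentence) - 1)]


def partial_pseudo_annotate(arcs: List[Arc]) -> List[Arc]:
    """Partial pseudo-annotation: keep only high-confidence arcs instead of
    accepting or rejecting the whole sentence."""
    return [arc for arc in arcs if arc[3] >= CONFIDENCE_THRESHOLD]


def select_sentences(selector: Callable[[float], bool], candidates):
    """Data selection: in the paper this decision is learned with reinforcement
    learning; here it is approximated by an average-confidence heuristic."""
    selected = []
    for sentence, arcs in candidates:
        avg_conf = sum(a[3] for a in arcs) / max(len(arcs), 1)
        if selector(avg_conf):
            selected.append((sentence, partial_pseudo_annotate(arcs)))
    return selected


def self_train(parser, selector, source_data, target_unlabeled):
    """One possible self-training loop: parse target-domain text, select useful
    sentences, partially pseudo-annotate them, and add them to the training set.
    Retraining of the parser is omitted in this sketch."""
    train_data = list(source_data)
    for _ in range(SELF_TRAIN_ROUNDS):
        candidates = [(s, parse_with_scores(parser, s)) for s in target_unlabeled]
        train_data += select_sentences(selector, candidates)
    return parser, train_data


if __name__ == "__main__":
    target = [["他", "喜欢", "读书"], ["语义", "依存", "分析", "很", "有用"]]
    _, data = self_train(parser=None, selector=lambda c: c > 0.5,
                         source_data=[], target_unlabeled=target)
    print(f"collected {len(data)} pseudo-annotated sentences")
```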
Key words
semantic dependency parsing /
domain adaptation /
self-training
Funding
National Natural Science Foundation of China (61872402); Humanities and Social Sciences Planning Fund of the Ministry of Education (17YJAZH068); Beijing Language and Culture University project funded by the Fundamental Research Funds for the Central Universities (18ZDJ03)