A Survey of Chinese Zero Anaphora Resolution

JIANG Yuru, ZHANG Yuyao, MAO Teng, ZHANG Yangsen

Journal of Chinese Information Processing ›› 2020, Vol. 34 ›› Issue (3): 1-12.
Survey


Abstract

Zero anaphora has long been an active topic in linguistics, and zero anaphora resolution is an important task in natural language processing. Over more than two decades, researchers have proposed methods based on linguistic rules, machine learning, and deep learning, and have reported substantial results. This paper first introduces the basic concepts of zero anaphora. It then describes OntoNotes 5.0, the publicly available international evaluation resource for Chinese zero anaphora resolution, together with the evaluation metrics. Next, it systematically reviews and compares the methods used for Chinese zero anaphora resolution in China and abroad. Finally, it summarizes and analyzes the main factors constraining current research on zero anaphora resolution, which also point to possible directions for future work.

Key words

zero anaphora resolution / linguistic rules / machine learning / deep learning

Cite this article

JIANG Yuru, ZHANG Yuyao, MAO Teng, ZHANG Yangsen. A Survey of Chinese Zero Anaphora Resolution. Journal of Chinese Information Processing. 2020, 34(3): 1-12


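Funding

National Natural Science Foundation of China (61602044, 61772081); Promoting the Connotative Development of Universities: Graduate Science and Technology Innovation Project (5121911044)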