1. School of Software, Xinjiang University, Urumqi, Xinjiang 830008, China; 2. Net Center, Xinjiang University, Urumqi, Xinjiang 830046, China; 3. School of Humanities, Xinjiang University, Urumqi, Xinjiang 830046, China; 4. School of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China
Abstract: Focusing on the task of Uyghur noun phrase coreference identification, this paper proposes a Stacked Nonnegative Constrained Autoencoder (SNCAE) for anaphoricity determination based on semantic features. From an analysis of Uyghur noun phrase phenomena, 15 kinds of semantic features are extracted and fed into the SNCAE to learn deep semantic features. Finally, a Softmax classifier completes the identification task. Compared with a Support Vector Machine (SVM), accuracy on positive and negative instances increases by 8.259% and 4.158%, respectively; compared with a Stacked Autoencoder (SAE), it increases by 1.884% and 1.590%, respectively.
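The pipeline described in the abstract (15 semantic features, a nonnegativity-constrained autoencoder, then a Softmax classifier) can be illustrated with a minimal single-layer sketch. The abstract does not give the loss function, so the quadratic penalty on negative weights used here is an assumption based on one common formulation of nonnegativity-constrained autoencoders, not necessarily the authors' exact objective. The hidden size of 8, the penalty weight `alpha`, the random inputs, and all function names are hypothetical; no training loop is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def negative_weight_penalty(w, alpha):
    # Assumed form of the constraint: penalize only negative entries, quadratically.
    neg = np.minimum(w, 0.0)
    return 0.5 * alpha * np.sum(neg ** 2)

def ncae_loss(x, w_enc, b_enc, w_dec, b_dec, alpha=0.003):
    # Encode, decode, then add reconstruction error and the nonnegativity penalty.
    h = sigmoid(x @ w_enc + b_enc)
    x_hat = sigmoid(h @ w_dec + b_dec)
    recon = 0.5 * np.mean(np.sum((x_hat - x) ** 2, axis=1))
    penalty = (negative_weight_penalty(w_enc, alpha)
               + negative_weight_penalty(w_dec, alpha))
    return recon + penalty, h

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.random((4, 15))                      # 4 hypothetical noun phrases, 15 features each
w_enc = rng.normal(scale=0.1, size=(15, 8))  # hypothetical hidden size of 8
b_enc = np.zeros(8)
w_dec = rng.normal(scale=0.1, size=(8, 15))
b_dec = np.zeros(15)

loss, hidden = ncae_loss(x, w_enc, b_enc, w_dec, b_dec)

# Untrained Softmax layer over the hidden code: anaphoric vs. non-anaphoric.
w_cls = rng.normal(scale=0.1, size=(8, 2))
probs = softmax(hidden @ w_cls)
```

In a stacked version, the hidden code of one trained layer would serve as the input to the next, with the Softmax layer placed on top of the last code.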