Review
ZHU Shanshan, HONG Yu, DING Siyuan, YAN Weirong, YAO Jianmin, ZHU Qiaoming
2016, 30(5): 111-120.
Implicit discourse relation recognition is the task of automatically detecting the relation between two arguments that are not linked by an explicit connective. Previous studies show that linguistic features are effective for this task. However, state-of-the-art accuracy is merely 40%, owing to the lack of sufficient training data. To address this problem, this paper presents a novel implicit discourse relation recognition method based on training data expansion. First, we take part of the original training data as seed samples and use them to mine semantically and relationally parallel data from external data resources by means of "argument vectors". Second, we augment the original training data with the mined parallel data. Finally, we carry out implicit discourse relation classification experiments on the expanded data. Experimental results on the Penn Discourse Treebank (PDTB) show that our method improves classification accuracy over the baseline system by 8.41% overall and by 5.42% on average. Compared with the state-of-the-art system, it achieves a further improvement of 6.36%.
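Key words: implicit discourse relation; semantic vector; training data expansion; discourse analysis
Received: 2014-12-25 Finalized: 2015-03-27
Funding: National Natural Science Foundation of China (61373097, 61272259, 61272260, 90920004); Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (2009321110006, 20103201110021); Natural Science Foundation of Jiangsu Province (BK2011282); Natural Science Foundation of the Jiangsu Higher Education Institutions (11KJA520003); Natural Science Foundation of Suzhou (SH201212)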
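The data expansion procedure outlined in the abstract (seed samples, vector-based mining, augmentation) can be illustrated with a minimal sketch. The abstract does not specify how the argument vectors are built or how similarity is measured, so everything below is an assumption made for illustration: each argument is represented by the average of its word vectors, the two argument vectors are concatenated, cosine similarity is used for matching, and a mined pair inherits the relation label of its most similar seed. The function names, the `embeddings` dictionary, and the `threshold` value are hypothetical and are not taken from the paper.

```python
import math


def average_vector(tokens, embeddings, dim):
    """Average the word vectors of the tokens that have an embedding."""
    total = [0.0] * dim
    count = 0
    for tok in tokens:
        vec = embeddings.get(tok)
        if vec is None:
            continue  # skip out-of-vocabulary tokens
        for i in range(dim):
            total[i] += vec[i]
        count += 1
    return [x / count for x in total] if count else total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pair_vector(arg1, arg2, embeddings, dim):
    """Assumed 'argument vector': concatenation of the two averaged vectors."""
    return average_vector(arg1, embeddings, dim) + average_vector(arg2, embeddings, dim)


def expand_training_data(seeds, candidates, embeddings, dim, threshold=0.9):
    """Mine candidate argument pairs that closely match a labeled seed.

    seeds:      list of (arg1_tokens, arg2_tokens, relation_label)
    candidates: list of (arg1_tokens, arg2_tokens) from an external resource
    Returns the mined pairs, each labeled with its best-matching seed's relation.
    """
    seed_vectors = [
        (pair_vector(s1, s2, embeddings, dim), label) for s1, s2, label in seeds
    ]
    mined = []
    for arg1, arg2 in candidates:
        cand_vec = pair_vector(arg1, arg2, embeddings, dim)
        best_label, best_sim = None, threshold
        for seed_vec, label in seed_vectors:
            sim = cosine(cand_vec, seed_vec)
            if sim >= best_sim:  # keep only matches at or above the threshold
                best_label, best_sim = label, sim
        if best_label is not None:
            mined.append((arg1, arg2, best_label))
    return mined
```

Under these assumptions, the expanded training set is simply `seeds + expand_training_data(seeds, candidates, embeddings, dim)`, which would then be passed to whatever classifier is trained on the original data. The threshold trades the amount of mined data against label noise; a suitable value would have to be tuned empirically.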