Abstract
Compound sentence relations are the semantic relations that hold between clauses. Among the many classification systems for compound sentence relations, the compound sentence trichotomy and the HIT-CDTB taxonomy are two representative ones, and converting relation categories between such systems can support downstream tasks such as machine translation. Building on the pretrained models ERNIE-Gram and TinyBERT, with principal component analysis (PCA) embedded, this paper proposes a three-stage hybrid model for compound sentence relation recognition that converts relations between the trichotomy and the HIT-CDTB system. In experiments, the conversion accuracy reaches 77.60% from the trichotomy to HIT-CDTB and 89.17% in the reverse direction.
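The abstract names the ingredients of the pipeline (a pretrained encoder, PCA, a relation classifier) without spelling out the three-stage architecture, so the following is only a minimal sketch of that general recipe. The checkpoint name `nghuyong/ernie-gram-zh`, the toy label set, and the logistic-regression head are illustrative assumptions, not the authors' exact configuration.

```python
import torch
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

# Any BERT-style Chinese encoder works for this sketch; "nghuyong/ernie-gram-zh"
# is one public ERNIE-Gram checkpoint (an assumption, not the paper's exact model).
tokenizer = AutoTokenizer.from_pretrained("nghuyong/ernie-gram-zh")
encoder = AutoModel.from_pretrained("nghuyong/ernie-gram-zh")

def embed(sentences):
    """Encode compound sentences and return one [CLS] vector per sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        output = encoder(**batch)
    return output.last_hidden_state[:, 0, :].numpy()  # shape: (n, hidden_size)

# Toy training pairs: sentence -> trichotomy relation label (illustrative only).
train_sentences = [
    "虽然下雨，但是比赛照常进行。",   # adversative (转折)
    "尽管很累，他还是坚持跑完了。",   # adversative (转折)
    "因为堵车，所以他迟到了。",       # causal (因果)
    "由于准备充分，演讲很成功。",     # causal (因果)
]
train_labels = ["转折", "转折", "因果", "因果"]

# Pipeline in the spirit of the paper: encode -> PCA-reduce -> classify.
features = embed(train_sentences)
pca = PCA(n_components=2)  # real experiments would keep far more dimensions
reduced = pca.fit_transform(features)
classifier = LogisticRegression().fit(reduced, train_labels)

test = embed(["因为下雪，航班被取消了。"])
print(classifier.predict(pca.transform(test)))  # expected: ['因果']
```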
Keywords
compound sentence / ERNIE-Gram / TinyBERT
Funding
National Social Science Fund of China (19BYY092)