Building Chinese-English Bilingual Dataset for Political Argument Mining Tasks

ZHANG Xiaojun, ZHOU Jingshi

Journal of Chinese Information Processing, 2023, Vol. 37, Issue (10): 167-174.
Special Column on Computational Argumentation



Abstract

Constrained by the scarcity of training corpora, research on Chinese argument mining in the political domain has only just begun. Diplomatic rhetoric, foreign-affairs Q&A, and external publicity announcements contain rich and subtle political argumentation techniques, so political argument mining in the diplomatic domain has both practical significance and application value. Inspired by the Multilingual Diplomatic Dialogue Corpus currently under construction, we select part of its texts, annotate claims, premises, and the argumentative relations between them, and perform sentiment analysis on the argumentative sentences. The result is BiDAM, a preliminary Chinese-English bilingual dataset for political argument mining tasks. It comprises the Chinese and English transcripts of 200 Regular Press Conferences of the Ministry of Foreign Affairs, split into 1,536 diplomatic Q/A turns. We also demonstrate the usability of the dataset through examples.

Key words

political debate / multilingual diplomatic dialogue corpus / cross-lingual argument mining / dataset for argument mining tasks

Cite this article

ZHANG Xiaojun, ZHOU Jingshi. Building Chinese-English Bilingual Dataset for Political Argument Mining Tasks. Journal of Chinese Information Processing. 2023, 37(10): 167-174


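Funding

Xi'an Jiaotong-Liverpool University Research Development Fund (RDF-22-01-053); Open Project Fund of the Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies (2022B1212010005)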