Abstract
Fact verification is a challenging task that aims to verify a claim using multiple evidence sentences from a trustworthy corpus. To facilitate research, several fact verification datasets have been proposed, greatly accelerating the development of fact verification techniques. However, existing fact verification datasets are typically constructed via crowdsourcing, which inevitably introduces bias. Prior debiasing work on fact verification falls roughly into data-augmentation-based methods and weight-regularization-based methods: the former are inflexible, while the latter depend on uncertain outputs from the training stage. In contrast to prior work, this paper approaches the problem from the perspective of causality and proposes a debiasing method for fact verification based on counterfactual inference. We first design a causal graph for fact verification that models the cause-effect relationships among the claim, the evidence, their interaction, and the predicted result. Based on this causal graph, we then propose a debiasing method that removes the bias effect introduced by the claim through the total indirect effect. During training, we use multi-task learning to model the influence of each factor, and we evaluate the model on both biased and unbiased test sets. Experimental results demonstrate that our model achieves consistent performance improvements over baseline methods.
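As a rough illustration of the total-indirect-effect (TIE) idea described above: a claim-only branch captures the bias a model can exploit from the claim's wording alone, and its prediction is subtracted from the fused claim-plus-evidence prediction at inference time. The logits, labels, and the scaling factor `alpha` below are hypothetical, and the sketch omits the actual encoders trained via multi-task learning.

```python
def tie_debias(fused_logits, claim_only_logits, alpha=1.0):
    # Counterfactual-inference debiasing via the total indirect effect:
    # subtract the claim-only (bias) branch from the fused claim+evidence
    # prediction, so shortcuts learned from claim wording are cancelled.
    # alpha is a hypothetical knob scaling how much bias is removed.
    return [f - alpha * c for f, c in zip(fused_logits, claim_only_logits)]


LABELS = ["SUPPORTS", "REFUTES", "NOT ENOUGH INFO"]

# Hypothetical logits: the claim's phrasing biases both branches toward REFUTES.
fused = [2.0, 2.5, 0.5]        # claim + evidence branch (biased prediction)
claim_only = [0.2, 1.5, 0.3]   # claim-only branch (captures the bias)

debiased = tie_debias(fused, claim_only)  # -> [1.8, 1.0, 0.2]
prediction = LABELS[max(range(len(debiased)), key=debiased.__getitem__)]
print(prediction)  # SUPPORTS: the evidence-backed label wins after debiasing
```

Before subtraction the argmax of `fused` is REFUTES; removing the claim-only effect flips the decision to SUPPORTS, which is the intended behavior when the evidence, not the claim's surface form, should drive the verdict.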
Keywords
fact verification /
counterfactual inference /
debias
Funding
National Natural Science Foundation of China (62006218, 61902381); Youth Innovation Promotion Association of the Chinese Academy of Sciences (20144310, 2021100); Young Elite Scientist Sponsorship Program by the China Association for Science and Technology (YESS20200121); Lenovo-CAS Joint Laboratory Young Scientist Project