Abstract: Most errors in political news are semantic errors. By analyzing how political errors are expressed in news text, we summarize the types of political errors found in newspapers and build corresponding knowledge bases for detecting them. Based on the linguistic features of political news, we present a formal model for detecting political errors. A strategy that combines rules and statistics is used to proofread semantic errors in political news text. The results show that the method has good application prospects, with a recall rate of 65.5% and an accuracy of 80.5%.
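As a rough illustration of what combining rules with statistics can look like, the following minimal sketch runs a rule pass against a knowledge base of forbidden expressions and a statistical pass against collocation counts, then scores the flagged items. It is not the model presented in this paper: the knowledge-base entries, the collocation counts, the `min_count` threshold, and all function names are hypothetical placeholders, and the scoring function assumes that the reported accuracy corresponds to precision in the usual sense.

```python
from collections import Counter

# Hypothetical rule knowledge base: forbidden expression -> preferred expression.
FORBIDDEN = {"old name": "official name"}

# Hypothetical (title, person) collocation counts from a trusted corpus.
COLLOCATIONS = Counter({("minister", "Zhang"): 40, ("governor", "Li"): 25})


def rule_errors(text):
    """Rule pass: flag every forbidden expression that occurs in the text."""
    return [(term, fix) for term, fix in FORBIDDEN.items() if term in text]


def statistical_errors(pairs, min_count=1):
    """Statistical pass: flag collocations seen fewer than min_count times."""
    return [pair for pair in pairs if COLLOCATIONS[pair] < min_count]


def recall_precision(flagged, gold):
    """Recall = correct flags / true errors; precision = correct flags / all flags."""
    flagged, gold = set(flagged), set(gold)
    hits = len(flagged & gold)
    recall = hits / len(gold) if gold else 0.0
    precision = hits / len(flagged) if flagged else 0.0
    return recall, precision
```

In this sketch the rule pass catches errors that can be listed explicitly, while the statistical pass catches title and person pairings that the trusted corpus does not support; a real system would need far larger knowledge bases and a tuned threshold.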