Automatic Classification of Illegal Business Types for Clues to Case Sources

FAN Qin, LI Bing, WEN Liqiang, LI Weiping

Journal of Chinese Information Processing, 2023, Vol. 37, Issue 5: 157-164.
Section: Sentiment Analysis and Social Computing

Abstract

Case source clue management is the first step in industrial and commercial administrative law enforcement. As online reporting channels have been simplified, the number of case source clues has surged, and the current practice of assigning clues manually suffers from heavy workload, high error rates, and high labor costs. To reduce labor costs and improve classification accuracy, this paper takes case source clue data from a first-tier city as an example and explores classification algorithms based on deep learning models to identify the type of violation automatically. After model selection and an empirical study, the proposed algorithm achieves high overall classification accuracy and meets actual business needs. The results show that a classifier based on a deep learning model can effectively classify case source clues automatically, offering a reference for making social governance more intelligent and modern.

Keywords

case source clues / text classification / BERT model

Cite this article

FAN Qin, LI Bing, WEN Liqiang, LI Weiping. Automatic Classification of Illegal Business Types for Clues to Case Sources. Journal of Chinese Information Processing. 2023, 37(5): 157-164.

Funding

National Key Research and Development Program of China (2020YFC0833304)