Abstract
Charge prediction is an important task in intelligent justice: given the facts of a criminal case, it aims to automatically predict the charges that apply to the defendant. The criminal facts are an objective description of the case, and the semantic importance of each word in them differs depending on the charge being judged. Existing methods tend to ignore this semantic difference when modeling criminal facts, and they do not handle cases of cumulative punishment, in which one defendant faces several charges at once. In this paper, we incorporate the semantic differences of words into the attention mechanism when modeling criminal facts, and we convert multi-label charge prediction for cumulative-punishment cases into several independent single-label predictions. Experimental results show that both the semantic-difference-based modeling and the multi-label transformation strategy improve charge prediction, and the model reaches an F1 score of 88.0% on the dataset released for the 2018 China AI and Law Challenge (CAIL2018).
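The abstract does not spell out how the multi-label problem is decomposed. As a rough illustration only, the sketch below uses binary relevance, one common way to turn a multi-label task into independent single-label tasks: each charge gets its own 0/1 target, and the per-charge decisions are merged back into a charge set. The function names, the 0.5 threshold, and the fallback to the top-scoring charge are all assumptions made for this example, not details taken from the paper.

```python
from typing import Dict, List, Sequence, Set


def to_binary_tasks(label_sets: Sequence[Set[str]],
                    charges: Sequence[str]) -> Dict[str, List[int]]:
    """Turn multi-label targets into one 0/1 target vector per charge."""
    return {c: [1 if c in labels else 0 for labels in label_sets]
            for c in charges}


def merge_predictions(scores: Dict[str, float],
                      threshold: float = 0.5) -> Set[str]:
    """Combine independent per-charge scores into a predicted charge set."""
    predicted = {c for c, s in scores.items() if s >= threshold}
    # Assumption: every case carries at least one charge, so if no score
    # passes the threshold we fall back to the single highest-scoring charge.
    if not predicted and scores:
        predicted = {max(scores, key=scores.get)}
    return predicted


# Toy data: the second case is a cumulative-punishment case with two charges.
cases = [{"theft"}, {"theft", "robbery"}, {"fraud"}]
charges = ["theft", "robbery", "fraud"]
targets = to_binary_tasks(cases, charges)
```

Each entry of `targets` can then be used to train a separate single-label classifier, and `merge_predictions` reassembles their scores into a final set of charges for a case.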
Key words
charge prediction /
semantic differences /
multi-label
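Funding
National Social Science Fund of China (18BYY074); Key Research and Development Program of Shanxi Province (201803D121055); Shanxi Province Graduate Joint Training Base Talent Training Project (2018JD01)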