Research reports


Hierarchical Approaches to Text-based Offense Classification

This paper describes efforts to develop a machine-learning approach to assist with the classification of text-based offense descriptions. We introduce a new offense classification schema, the Uniform Crime Classification Standard (UCCS), and the Text-based Offense Classification (TOC) tool to classify offense descriptions. The UCCS schema draws from existing Department of Justice efforts, with the goal of better reflecting offense severity and improving offense type disambiguation. The TOC tool is a machine learning algorithm that uses a hierarchical, multilayer perceptron classification framework, built on 313,209 unique hand-coded offense descriptions from 24 states, to translate raw description information into UCCS codes. In a series of experiments, we test how variations in data processing and modeling approaches affect recall, precision, and F1 scores to assess their relative influence on the TOC tool’s performance. The code scheme and classification tool are collaborations between Measures for Justice and the Criminal Justice Administrative Records System.


Benchmarking the Criminal Justice Administrative Records System’s Data Infrastructure

This report presents the findings of a series of exercises conducted to benchmark the CJARS data infrastructure against other widely used sources of data on justice-involved populations (i.e., Uniform Crime Report, State Court Processing Statistics, National Prisoners Statistics Program, National Corrections Reporting Program, Annual Probation Survey, and Annual Parole Survey). This involved comparing counts of events and caseload characteristics reported in these data series to similar estimates produced using CJARS. Results indicated a high degree of alignment between the counts of events and caseload characteristics estimated using CJARS and those reported in the benchmark data series. The findings reviewed in this report provide substantial evidence in support of the efficacy of the CJARS data infrastructure.


Modernizing Person-Level Entity Resolution with Biometrically Linked Records

In this paper, we propose a novel approach to person-level record linkage in administrative data, a procedure and setting that is increasingly at the frontier of economic research. We build a supervised learning algorithm trained on fingerprint identifiers that act as an unbiased measure of true match status. Both the size and nature of the training data yield performance that substantially improves on existing literature, especially for women and minorities. We demonstrate the model's effectiveness in deduplication and record linkage applications, and its extensibility to populations that differ from the training data. Simulation exercises illustrate how matching performance affects internal validity, external validity, and statistical precision.

Criminal Disqualifications in the Paycheck Protection Program

In response to the COVID-19 pandemic, Congress created the Paycheck Protection Program (PPP) to support small businesses. However, a business is not eligible for the program if an owner of 20% or more of its equity has had certain disqualifying events in their history of involvement in the justice system. The goal of this report was to estimate the overall impact of these criminal history disqualifications. The report was also released as Census Bureau ADEP Working Paper ADEP-WP-2020-04.


Trends in Michigan Marijuana Offenses

In collaboration with the Michigan State Court Administrative Office, CJARS produced a brief report examining court case filing trends for marijuana-related offenses in the state.