
Algorithmic Fairness in Criminal Justice: The COMPAS Case

The use of algorithmic risk assessment tools in criminal justice has become a focal point in debates about fairness, accountability, and the ethics of artificial intelligence. This project examines the COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) recidivism risk assessment algorithm, which has been widely deployed across the United States to inform bail, sentencing, and parole decisions.

As Brian Christian explores in The Alignment Problem, ensuring that AI systems align with human values, particularly fairness and equity, is one of the most pressing challenges of our time. The COMPAS algorithm exemplifies this challenge: while designed to improve criminal justice outcomes through data-driven risk prediction, investigative journalism by ProPublica in 2016 revealed troubling racial disparities in its predictions. Specifically, the algorithm was found to disproportionately misclassify Black defendants as high-risk (false positives) while misclassifying White defendants as low-risk (false negatives).

📋 Project Overview

This project analyzes racial disparities in the COMPAS recidivism risk assessment algorithm using statistical methods. Specifically, we use bootstrap resampling to quantify differences in score distributions across racial groups, and permutation testing to formally test whether Black and White defendants who do not reoffend are classified as high-risk at significantly different rates, i.e., whether their false positive rates differ.

Contents:

  • data/: Datasets published by ProPublica (see the official GitHub repo)
  • eda.ipynb: Brief examination of potential bias of risk assessment scores in the datasets
  • analysis.ipynb: Main analysis notebook with fairness metrics and comparisons
  • results/: Generated visualizations and plots

📊 Key Findings

Mean COMPAS scores across race groups

The bootstrap results show a clear difference in mean COMPAS decile scores across race groups. Black defendants have a substantially higher average score than White defendants, with narrow and non-overlapping confidence intervals, reflecting a precise and sizeable difference. This upward shift in the score distribution implies that, for any fixed threshold, Black defendants are more likely to be classified as high risk.

IQR of COMPAS scores across race groups

The bootstrap estimates indicate that Black defendants have a larger interquartile range (IQR) of COMPAS decile scores than White defendants, suggesting greater dispersion in assigned risk scores. The confidence intervals for Black and White defendants show limited overlap, while intervals for smaller racial groups are wide owing to their limited sample sizes. Greater score dispersion places more mass in the upper tail of the score distribution, increasing the likelihood of exceeding a fixed high-risk threshold.
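
For concreteness, here is a minimal sketch of the percentile-bootstrap procedure behind these estimates. It assumes the ProPublica data are in data/compas-scores-two-years.csv with race and decile_score columns (as in ProPublica's published dataset); the replicate count, confidence level, and exact statistics shown are illustrative choices and may differ from analysis.ipynb.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

def bootstrap_ci(values, stat_fn, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap: resample with replacement, return the
    point estimate and a (1 - alpha) confidence interval."""
    values = np.asarray(values)
    boots = np.array([
        stat_fn(rng.choice(values, size=values.size, replace=True))
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(boots, [alpha / 2, 1 - alpha / 2])
    return stat_fn(values), (lo, hi)

def iqr(x):
    """Interquartile range: 75th minus 25th percentile."""
    q1, q3 = np.quantile(x, [0.25, 0.75])
    return q3 - q1

# Hypothetical path and column names; adjust to the files under data/.
df = pd.read_csv("data/compas-scores-two-years.csv")
for race, group in df.groupby("race"):
    scores = group["decile_score"].to_numpy()
    mean_est, (m_lo, m_hi) = bootstrap_ci(scores, np.mean)
    iqr_est, (i_lo, i_hi) = bootstrap_ci(scores, iqr)
    print(f"{race}: mean={mean_est:.2f} [{m_lo:.2f}, {m_hi:.2f}], "
          f"IQR={iqr_est:.1f} [{i_lo:.1f}, {i_hi:.1f}]")
```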

False Positive Rate Disparity

The permutation test reveals a significant disparity in false positive rates between racial groups. We reject the null hypothesis of no difference ($H_0: \Delta_\mathrm{FPR}=0$) with $p\approx0$, far below conventional significance thresholds ($\alpha = 0.05$).

The observed disparity of 0.214 means Black defendants are 21.4 percentage points more likely to be incorrectly classified as high-risk compared to White defendants among those who do not reoffend. This stark difference is extremely unlikely to occur by random chance, indicating systematic racial bias in COMPAS score distributions rather than statistical noise.
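
The following sketch shows one way to run such a permutation test. It assumes the same CSV and columns as above plus a two_year_recid outcome column, and treats a decile score of 5 or higher as "high risk" (a common cut-off in analyses of COMPAS); the threshold and permutation count used in analysis.ipynb may differ.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical path and column names; adjust to the files under data/.
df = pd.read_csv("data/compas-scores-two-years.csv")
df = df[df["race"].isin(["African-American", "Caucasian"])]

# The false positive rate is defined among defendants who did NOT reoffend.
non_recid = df[df["two_year_recid"] == 0]
high_risk = (non_recid["decile_score"] >= 5).to_numpy()  # assumed cut-off
labels = non_recid["race"].to_numpy()

def fpr_gap(race_labels):
    """FPR(Black) - FPR(White) among non-reoffenders."""
    fpr_black = high_risk[race_labels == "African-American"].mean()
    fpr_white = high_risk[race_labels == "Caucasian"].mean()
    return fpr_black - fpr_white

observed = fpr_gap(labels)

# Under H0, race labels are exchangeable: shuffle them and recompute the gap.
n_perm = 10_000
perm = np.array([fpr_gap(rng.permutation(labels)) for _ in range(n_perm)])
p_value = np.mean(np.abs(perm) >= abs(observed))  # two-sided p-value

print(f"observed FPR gap = {observed:.3f}, permutation p = {p_value:.4f}")
```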

🚀 Usage

Explore the dataset:

jupyter notebook eda.ipynb

Run the analysis:

jupyter notebook analysis.ipynb

📚 References

  • Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine Bias. ProPublica.
  • Christian, B. (2020). The Alignment Problem: Machine Learning and Human Values. W. W. Norton & Company.
  • ProPublica COMPAS analysis repository: https://github.com/propublica/compas-analysis

About

A brief empirical study of fairness in the COMPAS recidivism risk framework.
