This piece is a commentary on the article "The Age of Secrecy and Unfairness in Recidivism Prediction."
Rudin, Wang, and Coker (2020) present a sophisticated and thoughtful analysis of several issues central to the ongoing debate regarding the use of risk assessment tools in the criminal justice system, including whether black box, algorithmic risk assessment models are racially biased. Their framing of transparency as a type of procedural fairness in risk assessment is an important advance in the field. Further, their empirical evidence speaks to concerns regarding the use of proprietary models, such as the COMPAS, when nonproprietary approaches are available, and is valuable for that reason. Building upon these strengths, a handful of issues merit elaboration to support further discussion regarding the role, if any, of risk assessment tools (no matter how simple or complex) in the criminal justice system.
First, transparency is but one conceptualization of fairness. Even a very simple and transparent risk assessment model, especially one composed of relatively static (i.e., unchangeable) factors, such as age and criminal history, may be ‘unfair.’ Indeed, the use of age, criminal history, and other static variables may disadvantage people of color and may capitalize on or even exacerbate inequalities (Lowder, Morrison, Kroner, & Desmarais, 2019). Rudin and colleagues’ point that removing age and criminal history may reduce, if not eliminate altogether, the predictive utility of the model is duly noted. However, this is an empirical question that should be examined in future research. Beyond these static variables, many other variables available in criminal justice records reflect marginalization, because marginalized individuals are more likely to find themselves in contact with the justice system and, therefore, to have their data collected (Eckhouse, Lum, Conti-Cook, & Ciccolini, 2019). As a result, another important avenue of investigation may be to examine how machine learning (or other statistical methods) might reduce (or ideally eliminate) the perpetuation of systemic bias that is reflected in criminal justice records.
Second, rather than eliminating variables such as age and criminal history altogether, a more nuanced approach may be to explore models that use different operational definitions of criminal history. Indeed, Rudin and colleagues, like many others, have measured criminal history and recidivism using metrics that reflect presumed innocence rather than established guilt. Specifically, Rudin and colleagues operationalize ‘criminal history’ as prior arrests and ‘recidivism’ as charges incurred during the follow-up period. Yet, in our criminal justice system, an individual is innocent until proven guilty (i.e., convicted). It is well documented that racial and ethnic minorities are more likely to be arrested for behaviors that would not result in arrest for those of other racial and ethnic backgrounds. As a result, many argue for criminal history to be operationalized using prior convictions. Although this metric is not bias-free, at least the presumption of innocence no longer applies. Using filed charges is perhaps less problematic than using arrests, as it reduces false positives associated with arrest and false negatives associated with convictions (Spohn & Holleran, 2002), but it may still be subject to biases and concerns regarding innocence. Testing diverse measures of criminal history and recidivism may help identify models that minimize the impact of biases that exist in the data while broader reform efforts aim to minimize biases in policing and case processing practices themselves.
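As a rough illustration of what testing diverse measures of criminal history could look like in practice, the following Python sketch treats two operationalizations (prior arrests versus prior convictions) as single-variable risk scores and reports a rank-based AUC overall and within each group. Everything in it is an assumption made for illustration: the records are invented toy data, and the field names (`prior_arrests`, `prior_convictions`, `reconvicted`, `group`) and the helper `compare_measures` are hypothetical and are not drawn from Rudin and colleagues' analysis or from any real data set.

```python
from collections import defaultdict


def auc(scores, labels):
    """Rank-based AUC: chance that a positive case scores above a negative one.

    Ties count as half a win. Returns None when one class is absent.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy records, invented purely for illustration.
records = [
    {"group": "A", "prior_arrests": 4, "prior_convictions": 1, "reconvicted": 0},
    {"group": "A", "prior_arrests": 2, "prior_convictions": 2, "reconvicted": 1},
    {"group": "A", "prior_arrests": 5, "prior_convictions": 3, "reconvicted": 1},
    {"group": "A", "prior_arrests": 3, "prior_convictions": 0, "reconvicted": 0},
    {"group": "B", "prior_arrests": 1, "prior_convictions": 1, "reconvicted": 1},
    {"group": "B", "prior_arrests": 0, "prior_convictions": 0, "reconvicted": 0},
    {"group": "B", "prior_arrests": 2, "prior_convictions": 2, "reconvicted": 1},
    {"group": "B", "prior_arrests": 1, "prior_convictions": 0, "reconvicted": 0},
]


def compare_measures(records, measures, outcome):
    """Compute AUC for each candidate measure, overall and per group."""
    by_group = defaultdict(list)
    for r in records:
        by_group[r["group"]].append(r)

    results = {}
    for m in measures:
        row = {"overall": auc([r[m] for r in records],
                              [r[outcome] for r in records])}
        for g, rows in sorted(by_group.items()):
            row[g] = auc([r[m] for r in rows], [r[outcome] for r in rows])
        results[m] = row
    return results


results = compare_measures(
    records, ["prior_arrests", "prior_convictions"], outcome="reconvicted"
)
for measure, row in results.items():
    print(measure, row)
```

A real evaluation would go well beyond this: it would use fitted models rather than raw counts, held-out data, and additional fairness criteria such as calibration and error rates by group. The sketch only shows how swapping the operational definition of a predictor or outcome can be made an explicit, testable step.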
Third, it is worth considering what might happen if risk assessment tools were abolished from the criminal justice system. A return to the prior status quo (that is, decision-making in the absence of the results of risk assessment tools) would not necessarily result in accurate, transparent, and racially just decisions. Instead, as Rudin and colleagues suggest, abolishing the use of algorithmic risk assessment models would have “similar disadvantages to COMPAS.” Judges’ decisions are arguably no more transparent, accurate, or consistent than the results of risk assessment tools. In fact, quite the opposite may be true. For more than 50 years, researchers have been examining and comparing the accuracy of unstructured and statistical predictions of human behavior, including crime and violence. Memorably, Ennis and Litwack (1974) compared clinical predictions of future violent crime proffered by psychiatrists in court to “flipping coins in the courtroom” because they were so frequently biased and inaccurate. Since then, there have been dozens of investigations comparing unstructured judgments by clinicians, judges, and others with statistical predictions. Taken as a whole, this work shows that statistical predictions of future criminal and violent behavior are significantly more accurate than clinical ones. Further, there are similarly decades of research documenting judicial biases in sentencing and release decisions. So, abolishing risk assessment from the criminal justice system quite likely would result in less accurate, less transparent, and more racially biased decisions.
Fourth, some clarification seems to be needed regarding the role of risk assessment in criminal justice decision-making. Most risk assessment tools were developed to inform and not replace judicial discretion. Few experts, stakeholders, or even tool developers suggest that the results of risk assessment tools be used to make the ultimate in/out decisions, whether at the time of pretrial decision-making, sentencing, parole, or otherwise. Instead, most agree that risk assessment tools should provide information to support (not replace) decision-making, and that magistrates, judges, and other criminal justice decision makers should still consider the characteristics and circumstances of each case, as appropriate. Wisconsin v. Eric Loomis (2016) asserts that scores produced by risk assessment tools, and the COMPAS specifically, may not be the determinative factor in decisions of release. As such, even if an entirely transparent and easily understood risk assessment model is developed, judges should be considering the results as one piece of information to inform their decision-making. Moreover, the extent to which the results of risk assessment tools are actually considered by judges in case decision-making has been called into question, suggesting that risk assessment tools may be having less of an impact than assumed (or even feared) by many.
Fifth, and finally, the COMPAS has been at the center of much of the debate regarding the use of risk assessment tools in the criminal justice system, perhaps too much so, some of us would argue. Hundreds of other risk assessment tools exist and are used at diverse decision points across the criminal justice system. A number of these tools are nonproprietary, and their contents and scoring methods are readily available on websites, in manuals, or upon request. These tools vary in the number, type, and content of their items, as well as in their approach, from simple checklists to more complicated algorithms. Instead of continuing to focus on concerns regarding the COMPAS in particular, efforts that examine other risk assessment tools that do not suffer from COMPAS-related critiques, including those concerning transparency and complexity, may be better suited to advance conversations regarding fairness in risk assessment and the role risk assessment tools should play in the criminal justice system.
References

Eckhouse, L., Lum, K., Conti-Cook, C., & Ciccolini, J. (2019). Layers of bias: A unified approach for understanding problems with risk assessment. Criminal Justice and Behavior, 46, 185–209. https://doi.org/10.1177/0093854818811379
Ennis, B. J., & Litwack, T. R. (1974). Psychiatry and the presumption of expertise: Flipping coins in the courtroom. California Law Review, 62, 693–752. https://doi.org/10.15779/Z38XX8W
Lowder, E. M., Morrison, M. M., Kroner, D. G., & Desmarais, S. L. (2019). Racial bias and LSI-R assessments in probation sentencing and outcomes. Criminal Justice and Behavior, 46, 210–233. https://doi.org/10.1177/0093854818789977
Rudin, C., Wang, C., & Coker, B. (2020). The age of secrecy and unfairness in recidivism prediction. Harvard Data Science Review, 2(1).
Spohn, C., & Holleran, D. (2002). The effect of imprisonment on recidivism rates of felony offenders: A focus on drug offenders. Criminology, 40, 329–358. https://doi.org/10.1111/j.1745-9125.2002.tb00959.x
Wisconsin v. Eric Loomis, 881 N.W.2d 749 (Wis. 2016).
This article is © 2020 by Sarah L. Desmarais. The article is licensed under a Creative Commons Attribution (CC BY 4.0) International license (https://creativecommons.org/licenses/by/4.0/legalcode), except where otherwise indicated with respect to particular material included in the article. The article should be attributed to the authors identified above.