Gender Equality is an Inequality in Risk Assessment

The PCRA strongly predicts arrests for both genders but overestimates women’s likelihood of recidivism. This is the bottom line of a recently published article in Law and Human Behavior. Below is a summary of the research and findings as well as a translation of this research into practice.

Featured Article | Law and Human Behavior | 2016, Vol. 40, No. 5, 580-593

Gender, Risk Assessment, and Sanctioning: The Cost of Treating Women Like Men


Jennifer Skeem, University of California, Berkeley
John Monahan, University of Virginia
Christopher Lowenkamp, Administrative Office, U.S. Courts, Washington, DC


Increasingly, jurisdictions across the United States are using risk assessment instruments to scaffold efforts to unwind mass incarceration without compromising public safety. Despite promising results, critics oppose the use of these instruments to inform sentencing and correctional decisions. One argument is that the use of instruments that include gender as a risk factor will discriminate against men in sanctioning. On the basis of a sample of 14,310 federal offenders, we empirically test the predictive fairness of an instrument that omits gender, the Post Conviction Risk Assessment (PCRA). We found that the PCRA strongly predicts arrests for both genders, but overestimates women’s likelihood of recidivism. For a given PCRA score, the predicted probability of arrest, which is based on combining both genders, is too high for women. Although gender neutrality is an obviously appealing concept, it may translate into instrument bias and overly harsh sanctions for women. With respect to the moral question of disparate impact, we found that women obtain slightly lower mean scores on the PCRA than men (d = .32, 99% CI = .29-.35, or 87% overlap in scores); this difference is wholly attributable to men’s greater criminal history, a factor already embedded in sentencing guidelines.


Keywords: risk assessment, gender, test bias, disparities, sentencing
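The "87% overlap" figure in the abstract can be recovered from the reported effect size. The following is a minimal sketch, assuming that men's and women's PCRA scores are each normally distributed with equal variance (an assumption of this sketch, not something the article states). Under that assumption, the overlapping coefficient of the two distributions is 2Φ(−|d|/2), where Φ is the standard normal CDF. The function names are illustrative only.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def overlap_from_d(d: float) -> float:
    """Overlapping coefficient of two equal-variance normal distributions
    whose means differ by d standard deviations (Cohen's d).
    ASSUMPTION: normality and equal variance, as stated in the lead-in."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)


# Point estimate and the two ends of the reported 99% confidence interval.
for d in (0.29, 0.32, 0.35):
    print(f"d = {d:.2f} -> overlap = {overlap_from_d(d):.1%}")
```

For d = .32 this gives an overlap of roughly 87%, which matches the figure quoted in the abstract.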

Summary of the Research

Some have argued that it is possible to reduce mass incarceration without concurrently increasing crime rates. “A principal strategy proposed to accomplish this goal is the incorporation of risk assessment throughout the sanctioning process. More specifically, advocates envision three ways that risk assessment can reduce jail and prison populations without increasing the crime rate. First, risk assessment can inform decisions about whether an offender has a sufficiently high likelihood of again committing crime to justify a period of incapacitation…. Second, risk assessment can inform decisions about whether an offender has a sufficiently low likelihood of again committing crime to justify an abbreviated period of incapacitation or, in the case of supervised probation or parole, no incapacitation at all…. Finally, risk assessment can inform the type and intensity of correctional interventions designed to reduce an offender’s likelihood of again committing crime” (pp. 580-581).

“Although many extoll the potential contribution of risk assessment to reducing mass incarceration without increasing crime, others are equally adamant in opposition to incorporating risk assessment in the sanctioning process. The principal concern is that any benefits in terms of reduced rates of incarceration achieved through the use of risk assessment will be offset by costs to social justice claimed to be inherent in the risk assessment enterprise. On the basis of a sample of over 34,000 offenders, Skeem and Lowenkamp (in press) have examined this claim with respect to race and one well-known risk assessment instrument—the federal PCRA. There were two relevant findings. First, the PCRA was free of predictive bias—the instrument predicted re-arrest for both African American and White offenders strongly and with similar form (i.e., a given PCRA score roughly corresponded to a given probability of recidivism, across race). Second, on average, African American offenders obtained modestly higher PCRA scores than did White offenders (mostly because of higher scores on the criminal history scale). Although these differences do not reflect test bias, some uses of the PCRA could have disparate impact on African American offenders” (p. 581).

“A staple in criminology is the fact that women participate in violent crime at a lower rate than men do. Although the PCRA omits gender as an explicit risk factor, it includes potential correlates of gender (e.g., history of violent offending). Whether the PCRA is subject to gender bias is an empirical question” (p. 581).

“In this study, we use a cohort of male and female federal offenders to empirically examine the relationships among gender, risk assessment, and recidivism. In the federal system, risk assessment is not used to inform sentencing decisions. Instead, the PCRA is used to inform decisions designed to reduce risk—that is, to identify whom to provide with relatively intensive services (i.e., higher risk offenders) and what to target in those services (i.e., variable risk factors). The PCRA was developed by the Administrative Office of the U.S. Courts (i.e., Probation and Pretrial Services Office) and is administered post conviction on intake to a term of supervised release or probation. Given that the PCRA is well-validated and includes major risk factors tapped by many other risk assessment instruments, these federal data are well-suited for addressing three aims with broader implications:

  1. To what extent is the instrument—and the risk factors it includes—free of predictive bias? We hypothesize that there will be little evidence of test bias by gender.
  2. To what extent does the instrument yield average score differences between gender groups that are relevant to disparate impact? We hypothesize that women will obtain lower PCRA scores than men.
  3. Which risk factors contribute the most to mean score differences between men and women? Given past research and the PCRA’s scoring system, we expect criminal history to contribute the most to these differences” (p. 583).

“Our results make two major points. First, given that risk assessment is relevant only to utilitarian sanctioning goals of crime control, turning a blind eye to the effects of gender on recidivism can translate to discrimination against women. The PCRA—like many other risk assessment instruments and all sentencing guidelines—omits gender. We found that the PCRA overestimates women’s likelihood of recidivism. This is true even though the PCRA strongly predicts recidivism for both women and men and gender does not moderate the relation between the PCRA and recidivism. Given a particular PCRA score, men are 1.53 times more likely than are women to be arrested for any crime and are 2.27 times more likely to be arrested for a violent crime. Second, setting aside the issue of test bias, men obtained slightly higher average scores on the PCRA than did women, which is entirely a function of men’s greater criminal history. Given that criminal history is emphasized in most sentencing guidelines, it is not clear that use of the PCRA would increase any disparate impact on men” (p. 589).

Translating Research into Practice

“Our findings as a whole are inconsistent with legal scholars’ categorical arguments that the use of risk assessment instruments to inform sanctioning decisions will have discriminatory effects. We have tested the predictive fairness of the PCRA with respect to two factors that this instrument (like many others) omits—gender and race. Unlike race, we found that gender matters for predictive fairness. Specifically, the PCRA strongly predicts re-arrest for both Black and White offenders and for both men and women. But a given PCRA score has the same meaning across groups—that is, same probability of recidivism—only for Black and White offenders. Unless gender-specific recidivism rates are considered when interpreting PCRA scores (see the following text), the instrument will overestimate women’s probability of recidivism” (p. 590).

“Our general findings indicate that research can inform legal debate about the use of risk assessment in sanctioning. Demographic factors can relate differently to risk assessment, and risk is directly relevant to the sanctioning goal of preventing new offenses (despite its irrelevance to moral blameworthiness). Lumping together race and gender as risk factors for recidivism—and then avoiding both—can be costly” (p. 590).

“Results suggest that using a gender-neutral instrument like the PCRA to inform sanctioning decisions risks discriminating against women. The simple way to avoid overestimating women’s likelihood of recidivism is to interpret PCRA scores in a gender specific manner. That is, to explicitly acknowledge that women who score between 10 and 12 on the PCRA, for example, do not present the same “moderate” risk of recidivism as do men who score within the same range on that instrument. Instead, moderate risk women have a 37% chance of recidivism, whereas moderate-risk men have a 52% chance of recidivism. As Frase et al. (2015) argued, even if gender is a characteristic that an offender cannot control, it may be fair to give women reduced sanctions reflecting their lower risk—as long as sanctions for men ‘are proportionate to their current offenses and prior convictions’ (p. 102). At a minimum, ‘policy makers need to wrestle with the fact that practices that are. . . gender neutral, as well as those that are overtly prejudiced, can produce injustice’” (p. 591).
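The arithmetic behind gender-specific interpretation can be sketched briefly. The 37% and 52% rates for the moderate band (scores of 10 to 12) come from the passage above. The share of women in that band is not reported in this summary, so `share_women` below is a purely hypothetical value chosen for illustration, and `pooled_rate` is an illustrative helper rather than anything defined by the PCRA.

```python
# Recidivism rates for the "moderate" PCRA band, as quoted in the passage above.
RATE_WOMEN = 0.37
RATE_MEN = 0.52


def pooled_rate(share_women: float) -> float:
    """Rate that a gender-neutral reading assigns to everyone in the band:
    the weighted average of the two group-specific rates."""
    if not 0.0 <= share_women <= 1.0:
        raise ValueError("share_women must be between 0 and 1")
    return share_women * RATE_WOMEN + (1.0 - share_women) * RATE_MEN


# HYPOTHETICAL share of women in the band; not a figure from the article.
share_women = 0.20
pooled = pooled_rate(share_women)

print(f"pooled rate:            {pooled:.2f}")
print(f"overestimate for women: {pooled - RATE_WOMEN:.2f}")
print(f"underestimate for men:  {RATE_MEN - pooled:.2f}")
```

Whatever share is assumed, the pooled rate falls between 37% and 52%, so a single combined estimate overstates risk for women whenever the band contains any men.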

Other Interesting Tidbits for Researchers and Clinicians

“First, this study is among the first of its kind, so the generalizability of its results to other contexts is unclear. Surprisingly, few risk assessment instruments have been tested for predictive bias and mean score differences by gender. Additional research is needed to establish the extent to which our findings generalize from federal- to state-level offenders and from the PCRA to other risk assessment instruments. The LSI-R—which excludes gender, like the PCRA—has been most heavily studied in nonfederal correctional contexts. A recent meta-analysis indicated that the predictive utility of the LSI-R is similar for men and women. Although this lends confidence to our basic results for the PCRA, it is unclear whether LSI-R (like the PCRA) over predicts women’s recidivism. More importantly, it is unclear whether instruments that include gender as a risk factor are less subject to over prediction than those that exclude gender. Third, data on interrater reliability in scoring the PCRA for this study are not available. Although some risk domains may have been scored more accurately than others, all officers who complete the PCRA must complete a certification process that has been shown to yield reliable scores” (p. 589).

“The effect of risk assessment on gender disparities in sanctioning will vary not only as a function of the instrument used, but also as a function of the baseline sanctioning context: Risk assessment, compared to what? Given that criminal history is virtually always considered in sentencing decisions, it is not clear that risk assessment would exacerbate any gender disparities. As noted by Frase et al. (2015), ‘criminal history scores make up one of the two most significant determinants of the punishment an offender receives [the other being the gravity of the conviction offense] in a sentencing guidelines jurisdiction’ and ‘prior convictions are taken into account by all U.S. sentencing systems’ (p. 7). However, because criminal history may be operationalized in myriad ways—and sanctioning decisions extend beyond sentencing—research is needed to identify any conditions under which risk assessment contributes to gender disparities in sanctioning. The effect of a given instrument on such disparities will depend on what practices are being replaced” (p. 591).

Join the Discussion

As always, please join the discussion below if you have thoughts or comments to add!

Authored by Amanda Beltrani

Amanda Beltrani is a graduate student in the Forensic Psychology Master's program at John Jay College of Criminal Justice in New York. Her professional interests include forensic assessments, specifically criminal matter evaluations. Amanda plans to continue her studies in a doctoral program after completing her Master's degree.

1 Comment

  • I am the developer of the CARE2 violence risk assessment for youth. I found that not only is the scoring different according to gender, but the items used in a prediction equation are also different for males and females. There has been cross-validation of the CARE2, reinforcing the idea that male and female violence are different and have different courses. The hypothesis is that the lifelong course of behavior problems and violence may not apply to women or girls. There is some European research that supports the hypothesis that a lifelong course of chronic violence is more applicable to males than females.

    Dr. Kathryn Seifert
