Examining the Validity of Risk Assessments in Predicting General, Nonsexual Violence in Sexual Offenders
This study, published in Law and Human Behavior, has implications for the accurate assessment of institutional violence risk within a sex offender population. Below is a summary of the research and findings, as well as a translation of this research into practice.
Featured Article | Law and Human Behavior | 2018, Vol. 42, No. 1, 13-25
Predictive Validity of HCR-20, START, and Static-99 Assessments in Predicting Institutional Aggression Among Sexual Offenders
Joel K. Cartwright, North Carolina State University and RTI International, Research Triangle Park, North Carolina
Sarah L. Desmarais, North Carolina State University
Justin Hazel and Travis Griffith, California Department of State Hospitals–Coalinga, Coalinga, California
Allen Azizian, California Department of State Hospitals–Coalinga, Coalinga, California, and California State University, Fresno
Sexual offenders are at greater risk of nonsexual than sexual violence. Yet, only a handful of studies have examined the validity of risk assessments in predicting general, nonsexual violence in this population. This study examined the predictive validity of assessments completed using the Historical-Clinical-Risk Management-20 Version 2 (HCR-20; Webster, Douglas, Eaves, & Hart, 1997), Short-Term Assessment of Risk and Treatability (START; Webster, Martin, Brink, Nicholls, & Desmarais, 2009), and Static-99R (Hanson & Thornton, 1999) in predicting institutional (nonsexual) aggression among 152 sexual offenders in a large secure forensic state hospital. Aggression data were gathered from institutional records over 90-day and 180-day follow-up periods. Results support the predictive validity of HCR-20 and START, and to a lesser extent, Static-99R assessments in predicting institutional aggression among patients detained or civilly committed pursuant to the sexually violent predator (SVP) law. In general, HCR-20 and START assessments (specifically, the HCR-20 Clinical subscale scores and START Vulnerability total scores) demonstrated greater predictive validity than Static-99R assessments across types of aggression and follow-up periods.
Keywords: HCR-20, Static-99R, START, risk assessment, sexual offender
Summary of the Research
“Structured risk assessment protocols have become integral components of the criminal justice system, as part of efforts to mitigate the potential risk to public safety, as well as strategies designed to rehabilitate offenders. One subgroup of offenders for which this is especially true is sexual offenders. Indeed, many jurisdictions mandate the use of risk assessment instruments with sexual offenders to assist in making decisions regarding placement, treatment, and other management concerns within the institutions. As a result, the use of instruments designed to inform assessments of institutional aggression among sexual offenders is common practice in correctional and forensic psychiatric settings” (p. 13).
“Accordingly, much empirical attention has been focused on testing the psychometric properties and establishing the validity of assessments completed using these instruments for predicting sexual violence in this population. Overall, findings of the extant research provide overwhelming support for the validity of risk assessment tools in predicting sexual recidivism . . . In contrast, less empirical attention has focused on the assessment of risk for general (nonsexual) aggression among sexual offenders, and institutional aggression specifically, which is the focus of the present study” (p. 14).
“While the importance of assessing risk for sexual recidivism is without question, sexual offenders demonstrate higher rates of nonsexual violent recidivism than rates of sexual recidivism . . . Many sexual offenders are hospitalized for very long periods, particularly under SVP laws. Thus, risk of aggression posed by sexual offenders against staff, peers, and property within the institution is a pressing safety and risk management concern. Consequently, the accurate assessment of institutional violence risk within sexual offenders would benefit case management and treatment, as well as assist in decisions regarding supervision and release. However, there has been limited evaluation of the validity of assessments completed using tools designed to forecast general (nonsexual) violence risk among sexual offenders” (p. 14).
“Meta-analytic research shows that many violence risk assessment instruments can have good validity in predicting violence. Research also shows that instruments developed for predicting violent offending perform better than those developed to predict sexual or general offending. When risk for general violence is assessed among sexual offenders, tools designed to evaluate risk for sexual recidivism, such as the Static-99R, are frequently used rather than tools designed to evaluate risk for general (nonsexual) violence. However, the HCR-20 and, most recently, the START are two risk assessment instruments now used to assess risk for general violence in this population” (p. 14).
“This study examines the predictive validity of the HCR-20 and START assessments, as well as Static-99R assessment, in predicting institutional aggression in a sample of 152 male sexual offenders. Our specific aims were to: (a) examine the distribution of HCR-20, START, and Static-99R assessment scores and risk estimates; (b) evaluate concordance among the HCR-20, START, and Static-99R assessments; and (c) test the predictive validity of the HCR-20, START, and Static-99R assessments in predicting institutional aggression over 90 and 180 days” (p. 15).
“Overall, almost a quarter of the sample engaged in some form of aggression during the 90-day follow-up period and over a third of the sample at 180-day follow-up . . . Across both START and HCR-20 assessments, very few patients were rated as high risk for violence (4.1% and 6.5%, respectively). Using the HCR-20, the majority were rated as low risk (67.4%); few were rated as moderate (26.1%) or high (6.5%). START final risk estimates showed a similar pattern of results: most participants were rated as low risk (83.7%), followed by moderate (12.2%) and high (4.1%). The Static-99R risk classifications demonstrated an inverse pattern of results, with more than half of participants classified as high risk (54.5%). Approximately one third (32.1%) were classified as moderate risk by the Static-99R, and relatively few (13.4%) as low” (p. 17-19).
“Associations were moderate to strong between START Strength total scores and Vulnerability total scores, and between HCR-20 subscale and total scores. Conversely, START Strength and Vulnerability total scores were weakly associated with Static-99R scores, if at all. The HCR-20 and START risk estimates showed moderate agreement. There were no instances in which a patient was identified as high risk on the HCR-20 or the START and low risk on the other instrument, and both HCR-20 and START assessments showed similar distributions of violence risk estimates” (p. 19).
“Over the 90-day follow-up period, HCR-20 total scores predicted all forms of aggression except physical aggression toward objects. The HCR-20 Clinical subscale score was the most predictive of the HCR-20 subscales, predicting all forms of aggression. All three HCR-20 subscale scores and total score predicted both any aggression and verbal aggression. START Strength total scores predicted all forms of aggression with the exception of physical aggression toward objects. START Vulnerability total scores predicted all forms of aggression. Static-99R total scores predicted any aggression and verbal aggression, but not physical aggression toward others or toward objects” (p. 19).
“We found significant discrimination among participants classified as low compared with moderate and high risk on the HCR-20 for any aggression, verbal aggression, and physical aggression toward others. For example, those rated high risk on the HCR-20 were almost 20 times more likely, and those rated as moderate risk were over 4 times more likely, to engage in any aggression compared with those rated as low risk. For physical aggression toward objects, there was only significant discrimination between those classified as low versus high risk, but not moderate versus high risk. For the START assessments, there was significant discrimination among participants classified as low compared with moderate and high risk for any aggression and verbal aggression. To demonstrate, those rated as high risk on the START were almost 15 times more likely, and those rated as moderate risk were over 7 times more likely, to engage in any aggression compared with those rated as low risk. For physical aggression toward objects, significant discrimination was only found for those identified as low versus moderate risk. START violence risk estimates did not discriminate among participants in the prediction of physical aggression toward others. Statistics for Static 99-R risk categories could not be calculated due to empty cells” (p. 19-20).
“Over the 180-day follow-up period, HCR-20 Clinical and Risk Management subscale scores, as well as the HCR-20 total score, predicted all outcomes, with one exception: Historical subscale scores were not associated with physical aggression toward objects. START Vulnerability total scores showed strong predictive validity across outcomes. START Strength total scores predicted any aggression and verbal aggression, but not physical aggression toward others or toward objects. Static-99R total scores showed moderate associations with any, verbal, and physical aggression toward others, but not physical aggression toward objects” (p. 20-21).
“HCR-20 risk estimates predicted all forms of aggression during this time frame, with the greatest discrimination appropriately found between those estimated as low and high risk. To demonstrate, those estimated as high risk using the HCR-20 were over 20 times more likely to engage in any or verbal aggression, over 70 times more likely to engage in property damage, and almost 15 times more likely to engage in aggression toward others compared with those classified as low risk. In contrast, we did not find discrimination between those rated low versus high risk on the START. However, those estimated as moderate risk using the START were approximately 8 times more likely to engage in any or verbal aggression, over 12 times as likely to engage in physical aggression toward objects, and over 3 times more likely to engage in physical aggression toward others when compared with those classified as low risk. For the Static-99R, ORs were significant for only one comparison: those classified as high risk on the Static-99R were almost 5 times more likely to engage in verbal aggression than those classified as low risk” (p. 21).
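For readers less familiar with odds ratios (ORs), the "times more likely" comparisons above can be illustrated with a minimal sketch. The function name and the counts below are hypothetical and chosen only for illustration; they are not taken from the study. The sketch also shows why an OR cannot be computed when a cell of the table is empty, as was reported for the Static-99R risk categories at 90 days. Note that an OR compares odds, which is not the same as a ratio of probabilities.

```python
from typing import Optional


def odds_ratio(events_a: int, non_events_a: int,
               events_b: int, non_events_b: int) -> Optional[float]:
    """Odds of the outcome in group A relative to group B.

    Returns None when any cell of the 2x2 table is zero, because the
    ratio is then undefined (or infinite) without a correction.
    """
    cells = (events_a, non_events_a, events_b, non_events_b)
    if any(count == 0 for count in cells):
        return None
    odds_a = events_a / non_events_a
    odds_b = events_b / non_events_b
    return odds_a / odds_b


# Hypothetical counts, NOT from the study:
# high-risk group: 8 aggressive, 2 not; low-risk group: 10 aggressive, 40 not.
high_vs_low = odds_ratio(8, 2, 10, 40)
print(high_vs_low)  # 16.0

# An empty cell (no aggressive patients in the comparison group)
# leaves the odds ratio undefined.
print(odds_ratio(8, 2, 0, 50))  # None
```

In practice, researchers would also report a confidence interval or significance test for each OR; that step is omitted here to keep the sketch short.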
Translating Research into Practice
“Although we may have anticipated ceiling effects on the HCR-20 subscale and total scores, and START Vulnerability total scores, as well as floor effects for the START Strength total scores, in our relatively homogenous sample of male sexual offenders, this was not the case. Instead, assessments made use of the full range of possible scores and violence risk estimates for both HCR-20 and START ratings. This finding suggests that HCR-20 and START assessments may be useful for distinguishing between patients more or less likely to engage in aggressive behaviors even within a somewhat homogenous, high-risk population. They also suggest that HCR-20 and START assessments may be useful for informing supervision decisions and risk management strategies (e.g., identifying which patients require higher security levels)” (p. 21).
“Further, we found high rates of concordance between the results of HCR-20 and START assessments, but low rates of concordance among HCR-20 and START total scores with Static-99R total scores and high rates of discordance among HCR-20 and START total scores with Static-99R. These patterns of results are not surprising, given that both instruments were developed to predict violence risk over the short-to-medium term. In contrast, the Static-99R is designed to predict sexual recidivism over much longer time frames. Nonetheless, these findings indicate that the HCR-20 and START are measuring constructs and risks that are distinct from those measured by the Static-99R. And, as such, they support the use of the HCR-20 and START in addition to the use of the Static-99R in clinical practice with sexual offenders. Indeed, results of the predictive validity analyses provided stronger support for the use of the HCR-20 and START, compared with the Static-99R, in assessing risk for different forms of institutional aggression among sexual offenders” (p. 21-22).
“Consistent with prior research examining the predictive validity of the HCR-20, HCR-20 assessments performed well across outcomes. Like prior studies, however, the HCR-20 Historical subscale demonstrated the lowest levels of predictive validity of the HCR-20 assessment components and failed to predict physical aggression toward objects or others. In contrast, the HCR-20 Clinical subscale performed the best of the HCR-20 scales and predicted all forms of aggression at good or excellent levels. Generally, performance of HCR-20 assessments was greater for the prediction of aggression over the 180-day than 90-day follow-up period, demonstrating good to excellent predictive validity. Taken together, these findings add to the empirical evidence supporting the use of the HCR-20 for identifying violence risk over the medium term (i.e., 6 months) among sexual offenders” (p. 22).
“START assessments, including the Vulnerability and Strength total scores, as well as violence risk estimates, showed good to excellent validity in predicting any aggression and verbal aggression over both 90-day and 180-day follow-up periods. In fact, of all the assessments, the START Vulnerability total score outperformed any other HCR-20, START, or Static-99R subscale or total score. The extant literature varies on whether the START Strength or Vulnerability total scores perform better than the other, but the current results suggest greater validity of the Vulnerability than Strength total scores in the prediction of institutional aggression among sexual offenders. Strength total scores nonetheless demonstrated good validity in predicting any aggression, verbal aggression, and physical aggression, particularly over the 90-day follow-up period. This finding is consistent with the START’s intended 3-month assessment and prediction time frame and is similar to, if not slightly better than, findings reported in prior studies of START assessments in forensic psychiatric patients. Overall, findings support the use of the START in the assessment and management of risk for short-term institutional aggression among sexual offenders” (p. 22).
“Finally, the Static-99R assessments showed fair to good validity in predicting any aggression and verbal aggression, as well as physical aggression toward objects, but not physical aggression toward others. Further, although the Static-99R assessments demonstrated validity in predicting these forms of aggression, performance was consistently poorer compared with the performance of both the HCR-20 and START assessments. Prior research has found good validity of Static-99R assessments in predicting general aggression; however the majority of these studies have focused on community-based rather than institutional aggression, have aggregated sexual offenses with general offenses, and have investigated much longer follow-up periods. For these reasons, findings of the current study suggest that the Static-99R is most appropriately used for estimating sexual recidivism risk and that general violence risk assessment instruments, such as the HCR-20 or START, should be used to assess general aggression within sexual offenders” (p. 22).
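The excerpts above describe validity as "fair," "good," or "excellent" without naming the underlying statistic. A common choice in the risk assessment literature is the area under the ROC curve (AUC): the probability that a randomly chosen patient who was aggressive received a higher score than a randomly chosen patient who was not. The sketch below assumes that interpretation; the function name and scores are hypothetical and are not data from the study.

```python
from typing import Sequence


def auc(scores_aggressive: Sequence[float],
        scores_not_aggressive: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (aggressive, non-aggressive) pairs in which the
    aggressive patient has the higher score; ties count as half.
    """
    if not scores_aggressive or not scores_not_aggressive:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for a in scores_aggressive:
        for n in scores_not_aggressive:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(scores_aggressive) * len(scores_not_aggressive))


# Hypothetical total scores, NOT from the study.
aggressive = [12, 9, 7]
not_aggressive = [8, 5, 4, 3]
print(round(auc(aggressive, not_aggressive), 3))  # 0.917
```

An AUC of 0.5 means the scores do no better than chance at separating the two groups, and 1.0 means perfect separation. The cut-offs that label a value "good" or "excellent" vary by author.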
Other Interesting Tidbits for Researchers and Clinicians
“This study supports the validity of the HCR-20, START, and to a lesser extent, the Static-99R assessments in predicting institutional aggression among patients detained or civilly committed pursuant to the SVP law. Typically, risk assessment instruments have shown lower levels of predictive validity in field studies compared with development studies. However, this does not appear to be the case in this sample of SVPs” (p. 23).
“This study adds to the body of literature supporting the application of structured violence risk assessments across diverse populations in the criminal justice system, and sexual offenders specifically. Beyond the assessment of risk for sexual recidivism, our findings suggest that general violence risk assessment instruments, such as the HCR-20 or START, have a place in the assessment and management of sexual offenders. Indeed, results indicate that instruments designed to assess sexual recidivism risk, and the Static-99R in particular, are limited in their ability to assess risk for general (nonsexual) violence. This is in keeping with the recommendations of the Static-99R authors to administer the Brief Assessment for Recidivism Risk – 2002R (BARR-2002R) for predicting nonsexual violence among sexual offenders, though this is not always done in practice. Consistent with the risk-need-responsivity model, findings suggest that using the HCR-20 or START to identify general violence risk among sexual offenders would benefit case management and treatment, as well as assist in decisions regarding supervision and release. Yet, the contributions of such assessments to clinical practice and, ultimately, violence prevention among sexual offenders remain to be tested in future research” (p. 23).
Join the Discussion
As always, please join the discussion below if you have thoughts or comments to add! To read the full article, click here.
Authored by Becca Cheiffetz
Becca Cheiffetz is a master’s student in the Forensic Psychology program at John Jay College of Criminal Justice. She graduated in 2015 from Sam Houston State University with a BS in Psychology and plans to continue her studies in a Clinical/Forensic Psychology PhD program in the near future. Her professional interests include providing clinical evaluations and treatment for individuals in prison as a prison psychologist and conducting forensic assessments for defendants in criminal court.