The START is a better predictor of aggression and self-harm in women than in men. This is the bottom line of a recently published article in the International Journal of Forensic Mental Health. Below is a summary of the research and findings as well as a translation of this research into practice.
Featured Article | International Journal of Forensic Mental Health | 2015, Vol. 14, No. 2, 132-146.
Predictive Validity of the Short-Term Assessment of Risk and Treatability (START) for Aggression and Self-Harm in a Secure Mental Health Service: Gender Differences
Laura E. O’Shea, St. Andrew’s Academic Department, Northampton, United Kingdom; Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
Geoffrey L. Dickens, St. Andrew’s Academic Department, Northampton, United Kingdom; School of Social and Health Sciences, Abertay University, Dundee, United Kingdom
The START predicts aggressive outcomes and to some extent self-harm. However, it is not known whether gender moderates its performance. This study used routinely collected data to investigate the predictive ability of the START for aggression and self-harm in secure psychiatric patients. Utility of the START was examined separately for men and women. The START was a stronger predictor of aggression and self-harm in women than men. The specific risk estimates produced large effect sizes for the prediction of aggression and self-harm in women; none of the AUC values reached the threshold for a large effect size in the male sample.
risk assessment, START, gender, aggression, self-harm
Summary of the Research
“Given that risk assessments are frequently used to inform decisions about restrictive management interventions it is crucial to determine their effectiveness in all the groups to which they are applied. This is an important consideration as clinicians have to determine the relevance of evidence derived from validation studies to the individual case at hand when making risk judgments. However, samples in studies of the predictive validity of risk assessment tools have been primarily male. This has limited the detailed examination of whether, and the extent to which, their performance significantly differs as a function of gender” (p.132).
“Previous research has identified gender differences in reasons for engaging in self-harm; women more commonly endorsed statements describing self-harm as serving an avoidance or punishment function, while men endorsed statements about self-harm as an attention-getter or as a show of personal strength. The underlying differences in the factors predicting risk behaviours suggests that formal risk assessment schemes might perform differently as a function of gender. From a theoretical gendered perspective, it may be expected that this difference would manifest in the form of superior performance of risk assessment tools among males, due to the predominance of male samples in their development. It is therefore perhaps counterintuitive that recent research has suggested that structured professional judgment tools have at least equal efficacy in women compared with men” (p.133).
“The Short-Term Assessment of Risk and Treatability is a commonly used structured professional judgment tool, which was developed in the context of two frequent criticisms of such schemes. First, that previous risk assessment tools have focused exclusively on factors associated with increased risk while ignoring protective factors. Second, that risk schemes focus on aggression and violence despite the range of clinical issues facing psychiatric patients. The current study aims to establish whether the predictive efficacy of the START for inpatient aggression and self-harm differs as a function of gender, whilst controlling for significant covariate characteristics. We hypothesized that the START would perform best in women, due to increasing evidence that risk assessment tools perform more accurately among this group in inpatient settings” (p.133).
“The current study has provided the first evidence that the START is a better predictor of aggression and of self-harm for women than it is for men. Importantly, this was the case when significant potential confounders were controlled for including diagnosis, ethnicity, and previous relevant behavioral history. Women identified as being at elevated risk of violence were 36–52 times more likely to engage in physical aggression than those rated as low risk and those rated at elevated risk of self-harm or suicide were 3–7 times more likely to engage in the corresponding outcome” (p. 139).
Translating Research into Practice
These findings have direct implications for using the START in practice. “First, the START specific risk estimates are strong predictors of aggressive and self-harm outcomes for women; second, that the START is a moderate predictor of aggressive outcomes in men, but the START scores do not predict self-harm/suicidal behavior. As a result, practitioners may have a degree of added confidence in their START assessment rating if the subject in question is female” (p.140).
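The article expresses predictive strength as AUC values. For readers less familiar with that statistic, here is a minimal sketch of how an AUC can be computed from risk ratings. The ratings below are hypothetical (0 = low, 1 = moderate, 2 = high) and are not data from the study, and the function name `auc` is illustrative. The AUC is the probability that a randomly chosen patient who had the outcome received a higher rating than a randomly chosen patient who did not, with ties counted as half.

```python
def auc(ratings_with_outcome, ratings_without_outcome):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (outcome, no-outcome) pairs in which the patient
    with the outcome was rated higher; tied pairs count as half a win.
    """
    if not ratings_with_outcome or not ratings_without_outcome:
        raise ValueError("both groups need at least one rating")
    wins = 0.0
    for rated_event in ratings_with_outcome:
        for rated_no_event in ratings_without_outcome:
            if rated_event > rated_no_event:
                wins += 1.0
            elif rated_event == rated_no_event:
                wins += 0.5
    return wins / (len(ratings_with_outcome) * len(ratings_without_outcome))


# Hypothetical ratings, NOT taken from the article.
aggressive = [2, 2, 1, 2]         # patients who later showed aggression
not_aggressive = [0, 1, 0, 1, 2]  # patients who did not

example_auc = auc(aggressive, not_aggressive)
print(round(example_auc, 3))
```

An AUC of 0.5 corresponds to chance-level prediction and 1.0 to perfect separation. The cut-off the authors used for a "large" effect is not stated in this summary, so none is assumed here.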
“Positive predictive values for aggressive outcomes were moderate to large. This suggests that clinicians can be reasonably confident that those rated at elevated risk of engaging in aggression will do so and implement management strategies accordingly; however, some individuals identified as low risk are engaging in aggressive behaviors, suggesting that there may be additional risk factors that are not covered by the START” (p.141).
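To make the positive predictive value concrete, the sketch below computes it from a 2x2 table of ratings against outcomes. All counts are hypothetical and chosen only for illustration; they are not figures from the study, and the function name `predictive_values` is likewise illustrative. The positive predictive value is the share of patients rated at elevated risk who went on to show the behaviour; the negative predictive value is the share of patients rated low risk who did not.

```python
def predictive_values(true_pos, false_pos, false_neg, true_neg):
    """Return (PPV, NPV) from the four cells of a 2x2 table.

    true_pos:  rated elevated risk, behaviour occurred
    false_pos: rated elevated risk, behaviour did not occur
    false_neg: rated low risk, behaviour occurred
    true_neg:  rated low risk, behaviour did not occur
    """
    rated_elevated = true_pos + false_pos
    rated_low = true_neg + false_neg
    if rated_elevated == 0 or rated_low == 0:
        raise ValueError("each rating group needs at least one patient")
    ppv = true_pos / rated_elevated
    npv = true_neg / rated_low
    return ppv, npv


# Hypothetical counts, NOT taken from the article.
ppv, npv = predictive_values(true_pos=30, false_pos=10, false_neg=15, true_neg=45)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

In this made-up table, 15 of the 60 patients rated low risk still showed the behaviour, which mirrors the authors' point that some low-risk individuals nonetheless engage in aggression.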
“The current study adds to a growing body of evidence that risk assessment instruments provide more accurate predictions of inpatient aggression and self-harm for women than for men. Interpretation of the results of studies of the predictive accuracy of the START should always consider the proportion of females in the sample since they are likely to inflate the effect sizes detected. There is a lack of theoretical explanation for the repeated empirical finding that risk assessment tools perform better in women than men. When attempting to quantify the relevance of group-derived data to the case at hand, clinicians must determine the degree to which it is reasonable to present the individual as if it was a case from the validation sample” (p. 141).
“With the exception of self-harm in the female sample, Strength items were more potent predictors than the Vulnerability items, suggesting that interventions aimed at bolstering strengths may be more effective than those aimed at reducing vulnerabilities. This may run contrary to clinicians’ perceptions; the fact that the mean number of critical items identified was higher than the mean number of key items identified in both samples suggests that clinicians consider Vulnerabilities to be more important and are giving them more weight” (p.141).
Other Interesting Tidbits for Researchers and Clinicians
“Closer examination of individual START items revealed that those with the best predictive potency differed between men and women. S9 (impulse control) seemed a particularly relevant item for males, being among the most predictive items for all outcomes; DBT strategies such as behavioural analysis, distress tolerance skills, and emotion regulation have been suggested as possible treatment targets for impulsivity and may prove useful in reducing aggression and self-harm in males” (p.141).
“Interestingly, there was no correspondence between the key Strengths or critical Vulnerabilities identified in either sample, and the most predictive items for the relevant group. This suggests that, at least at a group level, the items which clinicians are identifying as most important to the case in hand are not those that demonstrate the greatest predictive ability. It is likely that clinicians are giving extra weight to items identified as key or critical when forming their specific risk estimates. If this is the case, the current analyses suggest that although the specific risk estimates have reasonable predictive efficacy among the female samples, the items may not be being considered in the optimal manner. It is reasonable that the items identified as the most potent predictors at the group level may not be the most relevant items in all individual cases, particularly when individuals have low scores on these items” (p.142).
Join the Discussion
As always, please join the discussion below if you have thoughts or comments to add!