You Can’t Pull a Fast One on the MMPI-2: Detection of Coached Malingering

Symptom coaching and malingering remain acute problems for forensic evaluators. However, a recent meta-analysis shows that the MMPI-2 validity scales remain sensitive enough to detect malingering even when respondents have been coached. This is the bottom line of a recently published article in Psychological Assessment. Below is a summary of the research and findings as well as a translation of this research into practice.


Featured Article | Psychological Assessment | 2021, Vol. 33, No. 8, 729-745

The Impact of Coaching on Feigned Psychiatric and Medical Symptoms: A Meta-Analysis Using the MMPI-2


Maria Aparcero, Fordham University
Emilie H. Picard, University of Virginia Health System
Alicia Nijdam-Jones, University of California
Barry Rosenfeld, Fordham University


Coaching individuals on test-taking strategies presents legal and ethical concerns and threatens the validity of psychological test score interpretations, which could lead to inaccuracies in clinical settings and injustices within the legal system. This meta-analysis examined the impact of coaching on the detection of symptom exaggeration or feigning on the MMPI-2. A total of 99 feigning studies (N = 19,536) comparing validity subscale scores between genuine and nongenuine (coached or non-coached) responders were analyzed. Potential moderating effects of control group, type of symptoms, publication status, financial incentive, and non-content validity screening were also examined regarding their impact on the effectiveness of coaching. Findings suggested that detection-based coaching (i.e., coaching regarding the presence of validity scales and detection avoidance strategies within the MMPI-2) improved individuals’ ability to elude detection by the MMPI-2 validity scales. Nonetheless, the MMPI-2 validity scales still generated moderate to very large effect sizes in detecting symptom exaggeration or feigning even in the context of coaching (range g = .89 to 1.95). The findings provide reassurance for detection efforts, indicating that while the effectiveness of the MMPI-2 is somewhat diminished, it remains useful in detecting non-genuine responders even in the context of coaching.


Keywords: coaching, feigning, malingering, forensic assessment, meta-analysis

Summary of the Research

“Coaching individuals on either how to feign symptoms or respond to a psychological test presents legal and ethical concerns and threatens the validity of psychological test score interpretations. If undetected, this distortion could lead to inaccuracies in clinical settings and injustices within the legal system.” (p. 729)

“The PICO criteria were used to define the inclusion criteria for this meta-analysis: Problem of interest (P), Intervention (I), Comparison (C), and Outcome (O). All studies included in this meta-analysis examined the MMPI-2 validity scales (P) using a simulation or known-groups research design (I). Studies had to compare validity scale scores (C) between genuine responders (either a non-clinical sample responding honestly or a clinical sample with genuine psychopathology) and non-genuine (coached or non-coached) responders who exaggerated or feigned symptoms. To calculate standardized mean difference effect sizes, studies needed to report the mean and standard deviation for at least one MMPI-2 validity scale (O). This meta-analysis only included studies conducted in English-speaking countries using the English language version of the MMPI-2 with adults (i.e., study participants were between 18 and 80 years old), allowing the comparison of scale scores (calculated using the same norms) across studies. When studies reported duplicate data (i.e., same samples and validity scales), only the most comprehensive study was included.” (p. 730)
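For readers less familiar with the effect sizes reported here, the sketch below shows how a standardized mean difference of the Hedges' g type can be computed from group means, standard deviations, and sample sizes. It is only an illustration under stated assumptions: the function name and the example values are hypothetical, the textbook pooled-SD formula with the usual small-sample correction is assumed, and the article may have used a different computational variant.

```python
import math


def hedges_g(mean_feign, sd_feign, n_feign, mean_genuine, sd_genuine, n_genuine):
    """Bias-corrected standardized mean difference (Hedges' g).

    Hypothetical helper for illustration; not taken from the article.
    Positive values mean the feigning group scored higher than the
    genuine group on the validity scale.
    """
    df = n_feign + n_genuine - 2
    if df <= 0:
        raise ValueError("need at least three participants across both groups")

    # Pooled standard deviation across the two groups.
    pooled_var = (
        (n_feign - 1) * sd_feign ** 2 + (n_genuine - 1) * sd_genuine ** 2
    ) / df
    pooled_sd = math.sqrt(pooled_var)
    if pooled_sd == 0:
        raise ValueError("pooled standard deviation is zero")

    # Cohen's d, then the approximate small-sample correction factor.
    d = (mean_feign - mean_genuine) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)
    return correction * d


# Made-up T scores: feigners average 100, genuine patients average 60,
# both with SD = 20 and n = 50 per group.
example_g = hedges_g(100.0, 20.0, 50, 60.0, 20.0, 50)
```

With these invented inputs the uncorrected difference is two pooled standard deviations, and the correction shrinks it only slightly because the groups are reasonably large. Note that sample sizes are needed in addition to the means and standard deviations named in the inclusion criteria.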

“The RVE analyses showed that detection-based coaching improved individuals’ ability to elude detection by the validity scales, while disorder-based coaching did not impact performance. Nonetheless, even when non-genuine responders were coached, the MMPI-2 still generated moderate to very large, and statistically significant effect sizes in differentiating genuine responders.” (p. 736)

“…some validity scales were more compromised by coaching than others, particularly after participants received detection-based coaching. Among the MMPI-2 scales, Fp seems to be the least impacted by detection-based coaching and slightly better in detecting nongenuine individuals who have received detection-based coaching, with a reduction in the overall effect size by about 10% with coaching. On the other hand, the F scale effect size decreased by 40% among participants who received detection-based coaching. However, because the F scale was superior among uncoached participants, the two scales became roughly comparable (g = 1.29 and 1.19, for Fp and F respectively) in the context of coaching. Nevertheless, both remained effective in detecting coached responders who exaggerated or feigned symptoms.” (p. 736)

Translating Research into Practice

“the following validity scales have been sufficiently researched to permit meta-analysis on them: F, Fb, Fp, FBS, Ds/Dsr, O–S, and F–K. These scales use different strategies to detect overreporting: (a) the F (Infrequency) and Fb (Back-Infrequency) scales use a quasi-rare-symptom strategy that identifies the endorsement of symptoms rarely present in the normative sample; (b) the Fp scale (Infrequency-Psychopathology) uses a rare-symptom strategy that detects symptomatology or impairment that is uncommon among genuine psychiatric patients; (c) the FBS (Fake-Bad Scale) and Ds/Dsr scales (Dissimulation scale and its abbreviated version) use an erroneous-stereotypes strategy that relies on the assumption that individuals who attempt to exaggerate or feign cognitive or psychiatric symptoms, respectively, are unlikely to distinguish between common misperceptions and genuine symptomatology; (d) the O–S index (Wiener, 1948) assesses the difference in endorsement of obvious and subtle items, with the overendorsement of symptoms clearly indicative of psychopathology (i.e., obvious symptoms) compared to subtle symptoms suggesting potential feigning or symptom exaggeration; and (e) the F–K index examines the raw score difference of F and K, with a considerably higher score on the F scale than on the K scale indicating maladjustment uncommon to psychiatric patients.” (p. 731)
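Of the strategies listed above, the F–K index is the simplest to express computationally, since the passage describes it as the raw score difference of F and K. The snippet below is a minimal sketch of that arithmetic only. The function name and example scores are hypothetical, and no cutoff is applied because the passage does not give one.

```python
def f_minus_k(f_raw: int, k_raw: int) -> int:
    """Return the F-K index: raw F score minus raw K score.

    Hypothetical helper for illustration. A larger positive value means
    F is considerably higher than K, the pattern the quoted passage
    associates with possible overreporting.
    """
    if f_raw < 0 or k_raw < 0:
        raise ValueError("raw scores cannot be negative")
    return f_raw - k_raw


# Made-up raw scores, not drawn from any study in the meta-analysis.
example_index = f_minus_k(25, 10)
```

How large a difference should raise concern depends on published cutoffs and the setting, which are outside what this summary reports.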

“clinicians need to be aware of the fact that genuine patients tend to have moderate to marked T score elevations on the validity scales. The Fp, which capitalizes on rare symptoms, appeared to be the most robust in minimizing the misclassification of bona fide patients as feigners” (p. 740)

“…most feigners had large T score elevations on the validity scales, with T scores around 100, except on the FBS scale in which they had T scores around 85. However, the elevations tended to be more modest among feigners who received detection-based coaching. The FBS scale, which capitalizes on erroneous stereotypes and was rationally derived by selecting items frequently endorsed by personal injury malingerers, appeared the least successful in detecting non-genuine responders. Its lack of success may be due to its narrow focus, the type of samples or symptoms examined across studies, or the development technique used, whose quality has been debated” (p. 740)

“In cases where feigning is suspected, clinicians should expect to see scale elevations but, especially in the context of detection-based coaching, they should also consider and weigh converging evidence collected through clinical interview and collateral sources.” (p. 740)

Other Interesting Tidbits for Researchers and Clinicians

“Meta-analysis used a Robust Variance Estimation (RVE) method because feigning studies often have multiple comparison groups (i.e., more than one honest or feigning group), yielding multiple effect sizes for each validity scale within the same study. The RVE method adjusts the standard errors of statistically dependent effect sizes as well as the degrees of freedom of small samples (i.e., small number of studies), making this method particularly useful in feigning research. Additionally, the RVE method allowed us to examine moderator effects, providing a clearer understanding of the study characteristics that may impact the effectiveness of coaching in improving individuals’ ability to feign symptoms and evade detection.” (p. 731)
