Is cross-examination an effective means to educate jurors on valid and reliable evidence?


Judges, attorneys, and jurors were largely insensitive to variations in scientific quality. This is the bottom line of a recently published article in Law and Human Behavior. Below is a summary of the research and findings as well as a translation of this research into practice.

Featured Article | Law and Human Behavior | 2019, Vol. 43, No. 6, 542-557

Variations in Reliability and Validity Do Not Influence Judge, Attorney, and Mock Juror Decisions About Psychological Expert Evidence


Jacqueline Austin Chorn, John Jay College of Criminal Justice and the Graduate Center, City University of New York
Margaret Bull Kovera, John Jay College of Criminal Justice and the Graduate Center, City University of New York


Objective: We tested whether the reliability and validity of psychological testing underlying an expert’s opinion influenced judgments made by judges, attorneys, and mock jurors. Hypotheses: We predicted that the participants would judge the expert’s evidence more positively when it had high validity and high reliability. Method: In Experiment 1, judges (N = 111) and attorneys (N = 95) read a summary of case facts and proffer of expert testimony on the intelligence of a litigant. The psychological testing varied in scientific quality; either there was (a) blind administration (i.e., the psychologist did not have an expectation for the test result) of a highly reliable test, (b) nonblind administration (i.e., the psychologist did have an expectation for the test result) of a highly reliable test, or (c) blind administration of a test with low reliability. In a trial simulation (Experiment 2), we varied the scientific quality of the intelligence test and whether the cross-examination addressed the scientific quality of the test. Results: The variations in scientific quality did not influence judges’ admissibility decisions nor their ratings of scientific quality nor did it influence attorneys’ decisions about whether to move to exclude the evidence. Attorneys’ ratings of scientific quality were sensitive to variations in reliability but not the testing conditions. Scientifically informed cross-examinations did not help mock jurors (N = 192) evaluate the validity or the reliability of a psychological test. Conclusion: Cross-examination was an ineffective method for educating jurors about problems associated with nonblind testing and reliability, which highlights the importance of training judges to evaluate the quality of expert evidence.


Keywords: cross-examination, decision making, jurors, expert evidence

Summary of the Research

“Since the 14th century, courts have sought the knowledge of experts to assist in rendering appropriate legal judgments, but the use and admission of psychological experts in courts can be controversial. The U.S. Supreme Court clarified standards for the admissibility of scientific evidence in federal courts in Daubert v. Merrell Dow Pharmaceuticals (1993). In Daubert, the Court rejected the general acceptance test established in Frye v. United States (1923) as the singular criterion for admitting expert testimony. Instead, under Daubert, expert evidence must meet two admissibility criteria: The evidence must be both relevant and reliable. With regard to the reliability criterion, the Court provided a list of four nonexhaustive factors that judges could use in their determination of reliability: whether the data can be falsified, whether the data have been subjected to peer review or publication, whether the known or potential error rates are available, and general acceptance of the findings in the relevant scientific community. With Daubert, the Court placed the responsibility of determining whether expert evidence is reliable squarely on the shoulders of trial judges, making them the gatekeepers for the admissibility of expert testimony” (p. 543).

“Although judges may make a good faith effort to distinguish between high- and low-quality expert testimony, they may lack the skills necessary to detect flaws in research. According to a national survey of state trial court judges, 91% of the state judges believed that it was appropriate for them to have the role of gatekeeper, yet only 52% opined that their education had sufficiently prepared them to perform these duties. Moreover, very few judges responding to this survey understood concepts central to the Daubert admissibility criteria, including error rates and falsifiability” (p. 543).

“In another test of whether judges make competent judgments about the validity of expert evidence, Florida state court judges read about research underlying expert testimony in a hostile work environment case. The descriptions of the research varied its internal validity (valid, missing control group, confound, or experimenter bias) and peer reviewed status (published in a peer review journal or not). Judges ruled whether they would admit the testimony in court, provided justifications for their decision, and evaluated the quality of the expert testimony. Irrespective of study validity, judges admitted the evidence at the same rates. Moreover, when providing justifications for their decisions, judges very rarely mentioned the presence of an internal validity threat” (p. 543).

“It is possible that attorneys could aid judges’ evaluations of expert evidence through their arguments accompanying a motion to exclude the expert evidence. However, this assistance from attorneys would be likely only if they were sensitive to variations in the methodological quality of research. Yet when attorneys evaluated the same research summaries that were provided to judges, they provided more positive ratings of the evidence when the testimony was generally accepted within the scientific field than when it was not, irrespective of the study’s validity. In addition, 95% of attorneys stated that if they were the opposing attorney, they would file a motion to exclude the testimony. Thus, evidence validity did not affect attorneys’ decisions to file a motion. Rather most attorneys stated that they would routinely file to exclude expert testimony that did not favor their case as part of their typical trial strategy. This pattern of effects suggests that motions to exclude expert evidence are unlikely to provide information that will assist judges in their gatekeeping role” (p. 543).

“When judges and attorneys fail to exclude invalid or unreliable science, jurors become responsible for critically evaluating scientific evidence and assigning the appropriate weight to that testimony when rendering a verdict. Most laypeople, like judges and attorneys, are not trained in scientific reasoning and struggle with the concepts necessary to evaluate scientific quality. For example, without proper instruction, people did not recognize that results from small samples are more likely to be erroneous than results from large samples and did not consider issues of sample representation when determining whether to generalize from a sample to the population. In the legal setting, mock jurors were sensitive to some internal validity threats such as missing control groups. However, they were unable to recognize more sophisticated threats to internal validity like confounds and nonblind testing. In addition, mock jurors were less critical of scientific evidence when it was presented in the context of a trial than when they reviewed it outside of a trial context, which suggests that jurors assumed evidence admitted at trial is valid” (p. 543).

“We designed two studies to test several remaining empirical questions. Studies to date have tested whether legal decision makers can evaluate scientific validity accurately but not whether they are sensitive to variations in the reliability of psychological tests. Are judges and attorneys knowledgeable about reliability? If so, will judges make admissibility decisions that reflect that knowledge? Will attorneys be more likely to file motions to exclude expert testimony based on assessments conducted by psychologists who are not blind to the hypothesis being tested by the administrator or when the psychological test used has low reliability? Will both judges and attorneys formulate cross-examination questions that reflect their understanding of reliability and validity? Would a scientifically sophisticated cross-examination help jurors evaluate the reliability and validity of psychological testing? If not, might cross-examination by a judge (rather than an attorney) increase jurors’ motivation to process expert evidence, so when combined with a sophisticated cross-examination, jurors are both motivated and able to recognize flawed testing? Or might sophisticated cross-examination merely undermine the credibility and trustworthiness of experts, irrespective of the reliability of their testimony, without helping evaluators identify unreliable evidence?” (p. 544).

“[T]he results of our second study contribute to our understanding regarding the effectiveness of Daubert’s safeguards. In the first study, we found that judges admitted both unreliable and invalid expert evidence at high rates, which is concerning when we consider that attorneys’ ability to expose methodological issues relevant to validity and reliability during cross-examination may not matter if jurors cannot be trained to evaluate scientific quality through cross-examination. In the second study, we found that cross-examination may not function as a method of scientifically training jurors to reason about reliability and nonblind testing. This study adds to prior findings that jurors have trouble identifying sophisticated threats to scientific validity like confounds and nonblind testing without assistance. And although cross-examination has been successfully used in two studies to educate jurors about missing control groups which arguably should be more familiar to a lay population than some of these other methodological flaws, cross-examination has not worked to improve decision-making about construct validity or nonblind testing, as was evident in the present study. Similarly, opposing expert testimony has not successfully helped jurors evaluate confounds but showed some slight ability to sensitize jurors to missing control groups. Unfortunately, scientific studies do not only suffer from simple, easily identifiable threats so we must identify interventions that help jurors and other legal decision makers spot these flaws” (p. 556).

Translating Research into Practice

“If cross-examinations do not act as a form of methodological training for jurors, the judge’s role as the gatekeeper for admissible evidence (i.e., to keep unreliable evidence from being presented at trial) becomes significantly more important. A small percentage of judges do appear to craft questions to experts that would help to elicit information about reliability and validity during questioning of the expert. However, these percentages were far from 100% and this research did not provide judges with answers to their questions, so we cannot answer whether the responses from experts would effectively educate judges. As such, the legal field should begin to examine ways for judges and attorneys to receive continuing legal education about scientific evidence, perhaps through continuing legal education courses or professional manuals” (p. 556).

“In addition to continuing legal education for judges, research should continue to assess viable methods of training jurors to reason scientifically. To this end, a deeper understanding is needed to understand how jurors evaluate scientific quality and to identify the factors that may be effective in educating jurors. Rarely is science ever “good” or “bad.” How do jurors weigh evidence when there are both good and bad components present (e.g., the study is reliable, but lacks validity)? No empirical study is ever perfect, but when scientists choose whether to accept an article for publication, they weigh the strengths and limitations of a study’s methodology. Can we expect untrained laypeople to do the same? Further, we also need to understand whether jurors can appropriately apply judgments about scientific quality in their evaluations of trial evidence and strength. If we can sensitize jurors to scientific quality, can they then translate those skills into a verdict? In other words, can jurors use validity assessments to improve decision making or can they only be trained to differentiate between valid and invalid science? If jurors are unable to adjust critical trial judgments after identifying substandard expert evidence, then additional effort should be concentrated on equipping judges with the skills necessary to keep inferior science out of court” (p. 556).

Other Interesting Tidbits for Researchers and Clinicians

“It is, of course, possible that our scientifically informed cross-examination was simply ineffective at educating jurors about scientific quality. Potentially, jurors can be trained to reason about science with cross-examination, but this particular operationalization of a scientifically informed cross-examination may have been ineffective at helping jurors to evaluate evidence scientifically. During direct examination, the expert explained that good psychological tests are both valid and reliable and simply introduced the topics of reliability and validity to the jury. On cross-examination, the expert elaborated on these issues by responding to questions about the validity and reliability of the intelligence test employed by the expert. Furthermore, the expert made a direct link between the concepts of validity and reliability and the intelligence test during this cross-examination, a practice that has increased the influence of expert testimony in other studies. Specifically, the expert stated whether the intelligence test was of excellent or average reliability and explicitly admitted that the test could be invalid due to experimenter bias because she only tested the IQ of children the school officials thought might be in need of remedial classes or she stated that the test was not invalid due to experimenter bias because she did not know the students’ predicted intelligence before she administered the test and examined all students regardless of their predicted intelligence. It is unclear what else could be incorporated during cross-examination to provide jurors with the tools necessary to evaluate methodological concepts within the confines of a cross-examination” (p. 555).
