This is archived material from the Federal Bureau of Investigation (FBI) website. It may contain outdated information and links may no longer function.

Does the Confession Criterion in Case Selection Inflate Polygraph Accuracy Estimates? (Forensic Science Communications, July 2002)

July 2002 - Volume 4 - Number

Research and Technology

Does the Confession Criterion in Case Selection Inflate Polygraph Accuracy Estimates?

Donald J. Krapohl
U.S. Department of Defense Polygraph Institute
Fort Jackson, South Carolina

Kendall W. Shull
Chief (Retired)
Polygraph Unit
Federal Bureau of Investigation
Washington, DC

Andrew A. Ryan
Research Division
U.S. Department of Defense Polygraph Institute
Fort Jackson, South Carolina

Abstract | Introduction | Method: Cases | Instrumentation | Scoring Method | Results | Discussion | References


Abstract

Many polygraph field studies have relied on confessions as verification of ground truth, a criterion that some critics argue leads to an overestimation of polygraph accuracy. This is because there is a relationship between polygraph results and the likelihood that a suspect will confess. Confessions come from interrogations, which follow failed polygraphs. If a guilty person fails the polygraph, an interrogation is initiated, which might yield a confession. If a guilty person passes the polygraph, there is no interrogation, no confession, and little chance the polygraph error will be uncovered. This would suggest that among guilty suspects, there could be qualitative group differences between confession and nonconfession cases. The biasing effect of this confession criterion has not yet been resolved. In this study, a comprehensive sample of field polygraph cases from a large U.S. government polygraph program was examined to uncover differences in polygraph detectability between guilty suspects who confessed and guilty suspects who did not confess but were caught by other means. The present data revealed no differences between the groups. This manuscript does, however, correct errors published elsewhere regarding law enforcement polygraph and investigative practices in the field.


Introduction

Among forensic disciplines, none is as controversial as using the polygraph to detect deception. The use of the polygraph to uncover criminal and security-related behaviors now spans seven decades and has been the center of heated debate for virtually the entire period. There are many facets to the debate, but the most frequent issue centers on the accuracy of the comparison question technique, the most common polygraph technique in the field. Critics have charged that the comparison question technique (formerly known as the control question technique) lacks validity and argue that the empirical evidence is, at best, incomplete. Proponents agree that more research is needed, but argue that the preponderance of the available field data points to an accuracy of about 90 percent.

Critics are not as comfortable with the available field studies as are the proponents. It is well known that the method by which cases are selected for a study affects the outcome of the study and that some methods are better than others. Polygraph critics contend that existing research supporting polygraphy has systematically stacked the deck in favor of higher accuracy. The biggest culprit, according to some (Ben-Shakhar et al. 1982; Lykken 1998; Patrick and Iacono 1991), is the confession criterion. The confession criterion allows polygraph cases to be selected for research based on the confession of the examinee. This use of the confession criterion may bias the types of cases used in a field study. The confession criterion could inflate accuracy estimates in detecting deception because of the way comparison question technique field studies are typically conducted. This is explained in the following paragraphs.

To test the efficacy of the comparison question technique, it is necessary to have confirmed cases, that is, polygraph recordings from a group of examinees for whom ground truth has been unquestionably established. Ground truth is easy to determine in laboratory studies because experimenters assign examinees their roles of guilt or innocence. In the field, on the other hand, examinees arrive for polygraph appointments with self-assigned roles, usually not known to anyone except themselves and their collaborators. Therefore, experimenters must resort to other means to determine ground truth in field studies.

In polygraph field research, the use of the confession criterion is fairly common. The confessions of examinees are the most readily available confirmations, but this is where the problem begins. Guilty examinees typically do not spontaneously confess their crimes or deceptions to polygraph examiners or investigators. They are far more likely to acknowledge their acts during an interrogation. However, in standard polygraph practice, the only examinees who are interrogated are the ones who have failed the polygraph examination. If a guilty person manages to pass the examination, there probably would be no confession because there would have been no interrogation. Therefore, data sets consisting only of confession-confirmed cases might contain merely those where deception was most apparent in the test charts. Cases where the polygraph was fooled would not be found in the sets. As Iacono (1991) points out:

“Because polygraphers seldom discover ground truth except as a consequence of post-test confessions, and because diagnoses evaluated in this way are almost invariably verified as correct, the typical experienced examiner will accumulate a personal record of almost unblemished accuracy (p. 202).”
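The selection mechanism that critics describe can be illustrated with a small simulation. This is only a sketch of the argument in its most extreme form, not a model of any real data set: the function name, the 80 percent detection rate, and the 50 percent confession rate are all hypothetical, and the sketch assumes that only examinees who fail the polygraph are ever interrogated.

```python
import random

def simulate_confession_sampling(n_guilty=10_000, detection_rate=0.80,
                                 confession_rate=0.50, seed=0):
    """Return (true detection rate, detection rate among confession-confirmed cases)."""
    rng = random.Random(seed)
    detected = 0            # guilty examinees who failed the polygraph
    confirmed = 0           # guilty examinees confirmed by confession
    confirmed_detected = 0  # confirmed cases in which the polygraph was correct
    for _ in range(n_guilty):
        failed = rng.random() < detection_rate
        detected += failed
        # Assumption: only examinees who fail are interrogated, so only they confess.
        confessed = failed and rng.random() < confession_rate
        if confessed:
            confirmed += 1
            confirmed_detected += failed
    return detected / n_guilty, confirmed_detected / confirmed

true_rate, confirmed_rate = simulate_confession_sampling()
print(f"true detection rate among all guilty examinees: {true_rate:.3f}")
print(f"detection rate in confession-confirmed sample:  {confirmed_rate:.3f}")
```

Under these assumptions every confession-confirmed case is a correct polygraph decision by construction, whatever the true detection rate is. Whether real field samples behave this way is the empirical question at issue.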

A similar problem exists with misdiagnosed innocent examinees. If an innocent examinee fails a polygraph examination, he or she almost never confesses, even when interrogated. Unless evidence surfaces that someone else was actually guilty, the case remains unconfirmed and, therefore, would not be selected for accuracy studies. Iacono (1991) adds that cases are closed when an examinee has a deceptive outcome on the polygraph, thereby cutting off the possibility of the discovery of disconfirming information. This policy would reduce the likelihood of an agency ever uncovering the true guilty party and discovering the polygraph error.

Horvath (1977) was the first to investigate the possible relationship between confessions and polygraph accuracy. He drew a sampling of verified and unverified polygraph cases from the files of criminal suspects at a large police agency. He used an equal number of deceptive and nondeceptive cases from the verified and unverified categories, with a total of 112 cases used in the study. The cases were selected randomly to fill the cells, and the criterion for verification was the confession of an examinee. This inculpated the examinee and exculpated others being polygraphed for the same crime. The cases were subjected to blind analysis by ten field examiners who worked in law enforcement. Horvath did not find any differences in the scorers’ decisions with verified and unverified cases. These results led him to cautiously conclude that confession cases did not enjoy better discernment by polygraph examiners, although he recommended further investigation.

Raskin et al. (1988) evaluated all of the U.S. Secret Service polygraph cases for a 2½-year period and found 76 cases where ground truth was established independently of the polygraph results. Raskin used a two-step process in case confirmation: there had to be a confession that inculpated or exculpated the examinee, and there had to be independent physical evidence consistent with the confession. To investigate the possible effects of the confession criterion, Raskin added 20 unconfirmed cases to the set. The 96 cases were then scored manually by U.S. Secret Service polygraph examiners who did not know the ground truth for any of the cases. Raskin reported that the average polygraph scores of confession-confirmed guilty cases and unconfirmed guilty cases differed by approximately 20 percent: in cases where examinees confessed, the scores were an average of 20 percent more in the deceptive direction than in cases that were decided as deceptive but unconfirmed. At face value, these findings supported the argument that the confession criterion yields inflated accuracy estimates because the confession cases appeared easier to diagnose. However, the Raskin conclusions were mitigated by the finding that the unconfirmed guilty cases had scores 63 percent beyond the threshold needed to make a conclusive decision. In other words, the effect was statistically significant but practically trivial.

In their experimental design, Raskin et al. (1988) attempted to control the sampling bias among the innocent cases by requiring that each confirmed innocent case be part of a multiple-suspect investigation in which the culprit was found or that the crime be determined not to have taken place. In that way, any false-positive outcomes could be discovered without biasing the sample. This study has adopted Raskin’s safeguard.

It is prudent to agree with Iacono (2000) who suggested that this safeguard, by itself, might still have two possible weaknesses. First, if a polygrapher knew the outcome of other suspects’ tests, it is not unreasonable that this knowledge could influence how subsequent examinations are interpreted. In the perfect field study, all of the suspects would be polygraphed separately by polygraphers who did not know the number of suspects or the outcomes of the other polygraph examinations. In that way, the polygraph decisions could not be affected by examiner expectancies, one source of variability shown to influence polygraph scoring (Elaad et al. 1994). To control this potential scoring bias in the present study, an automated analysis method was applied that relies on measurements of tracing features rather than on the semiobjective scoring system used in the field. This approach, described later, avoids the confounding influence of examiner expectancies on chart interpretation.

A second potential source of selection bias of innocent cases, Iacono (2000) suggests, is that polygraphers who believe so strongly in their results do not usually test any further suspects in a case once one has failed the polygraph examination. If the failed suspect is actually innocent, subsequent investigative resources can be misdirected, resolution of the case can become more difficult, and the polygraph error can become less likely to be discovered. However, when the polygrapher correctly identifies a suspect, the decisions of nondeception would be confirmed for previous cases. Therefore, when the testing examiner makes the right decision, confirmation is more likely to arise.

There are two assumptions in Iacono’s (2000) hypothesis that bear closer scrutiny. First is the assertion that polygraphers believe in their exams so strongly that they usually stop testing other suspects once one has failed. It should be noted that polygraphers in the U.S. federal government are not empowered to choose whom to polygraph or not to polygraph. These decisions rest in the hands of investigators, managers, and prosecutors whose distance from the polygraph makes them less vulnerable to the errors of such blind acceptance. It is also worthy of note that it would be quite uncommon for any state or local law enforcement agency in the United States to delegate the decision of whom to polygraph to its staff polygraphers. Thus, Iacono’s (2000) assumption in this regard does not apply to examinations conducted in the United States.

The notion that polygraphing stops after an examinee is found deceptive, regardless of who decides whether or not to continue, also communicates an incomplete understanding of law enforcement investigative practices. At the heart is the misapprehension that law enforcement agencies act as though all crimes have a single culprit, that there are no coconspirators or partners that might also be on the list of suspects. In the real world, the decision to stop polygraphing depends on whether investigators are satisfied that all of the perpetrators have been identified, not on whether the polygrapher caught one. Iacono’s (2000) assumption is incorrect on this aspect, as well.

Patrick and Iacono (1991) also examined the sampling bias issue in a field study carried out on police cases from Vancouver, British Columbia. Beginning with 402 possible cases, they pared it to a sample of 89 cases where ground truth was verified to what Patrick and Iacono characterized as “maximum certainty”: 37 were innocent, and 52 were guilty. Among the 52 guilty, according to the Patrick and Iacono criteria, no false negatives were found in their exhaustive review of the evidence. They found that ground truth as determined by examiner-verified cases did not match ground truth as determined by their own strict confirmation criteria. Examiners were far more lenient in their judgments for confirmation of their own work. For example, an examinee was called deceptive on his polygraph examination, and during the post-test interrogation he admitted to committing a crime, though not the specific crime covered in the relevant test questions. The examiner still labeled the case as confirmed. Patrick and Iacono asserted that many comparison question technique field studies are based on just these types of data in which accuracy in detection of deception is inflated by the generous criteria that polygraph examiners afford themselves. Patrick and Iacono overcame this shortcoming through a more rigorous verification process, in addition to blind scoring of the charts to remove extrapolygraphic sources of information. They found that the polygraph decisions were 98 percent correct with guilty examinees, even with their criteria. Patrick and Iacono also reported that post-test confessions were related to highly negative (deceptive) scores. Correct classification of the innocent cases in the Patrick and Iacono study was near chance levels with blind scorers. Though accuracy was far lower than that achieved by the original examiners, the researchers proposed that the blind scoring results of those 37 cases were representative of polygraphy in the field.

Honts (1996) conducted a partial replication of the Patrick and Iacono (1991) study with a smaller data set but developed an innovative approach to test for the biasing effects of the confession criterion. Honts devised a scaling system that quantified the level of confirmation for the cases. The assumptions of the confession criterion bias would lead to the expectation that polygraph scores (and hence, decisions) would be related to the degree in which the criminal cases produced independent evidence. Honts’ results suggested that there was no effect on polygraph scores for the level of confirmation of ground truth; there was no meaningful effect for the confession criterion. His data also confirmed the high accuracy of guilty cases that Patrick and Iacono (1991) reported but found much better accuracy with the innocent cases than those from the Patrick and Iacono sample.

Honts suggested that the Patrick and Iacono (1991) study may have been an outlier because other similar studies (Honts and Raskin 1988; Raskin et al. 1988) found comparable accuracy for guilty and innocent examinees. As one explanation for the discrepant findings, Honts suggested that criterion contamination may have been an issue in the Patrick and Iacono (1991) study, a factor Honts stated had been controlled in the other research. In polygraph studies, criterion contamination can take place when an examinee’s intention to deceive is captured by the examination, though not specifically in response to the relevant question at hand. Honts used the example taken from the Patrick and Iacono study where one of the polygraphers shared the following details: a suspect had been given a relevant question: “Did you steal the diamond ring?” The examinee was found deceptive on the polygraph examination and was confronted. He denied that he had stolen the ring but admitted that his brother had. The examinee’s part in the crime was only that he had sold the stolen ring. According to Honts (1996), Patrick and Iacono reported this as a false-positive error because the examinee was called deceptive, even though he was not guilty of the specific act named in the relevant question. Honts argued otherwise, pointing to the examinee’s intention to deceive about the ring theft. The issue is the subject of contentious debate even today. Readers wanting the full flavor are directed to the relevant chapters on polygraphy in Faigman et al. (1997).

The unbalanced accuracies in the Patrick and Iacono (1991) study may also have been the consequence of how the polygraph was applied to criminal investigation by that polygraph agency. In some settings the polygraph is used more generally to simply determine who should remain on the list of suspects. In other words, the appearance of unfair polygraph outcomes in Vancouver may have arisen because the decision rules were set so that no guilty examinees would escape, at the cost of some percentage of innocent examinees remaining on the suspect list. In the end, this method concentrates the suspect pool so that investigative resources can be more wisely invested. Because the polygraph was not used to incarcerate or convict the suspects, there was a relatively small cost to a false-positive outcome that might spring from biased decision rules. Those innocent examinees were on the suspect list before being polygraphed, and the polygraph examination merely failed to remove them from that list. In view of the potential harm to the community that can arise from a false-negative decision, especially when speaking of violent offenders, it may be that some police agencies adjust the decision rules to ensure those examinees are correctly classified, even when it means retaining some innocent examinees on the list. However, very different decision rules may be appropriate for other circumstances where there are more dire consequences for false-positive outcomes. For example, a far more balanced approach is warranted when polygraph evidence is used in courts of law.

Getting around suspected sampling problems has not been easy, and to date no mutually satisfying solution has been reported in the literature. Lykken (1998) proposed a novel approach to the investigation of accuracy of field polygraphy. He suggested that the FBI could use its own polygraph examiner staff, employing the polygraph in its usual manner, but set aside the results of the examinations and take no action; that is, conduct no interrogations. At some later date, a panel would try to verify ground truth from all available evidence and compare it to blind scorings of the polygraph data.

Though it is interesting from an academic viewpoint and would help answer the question of polygraph accuracy, Lykken’s is not a practical proposal because the FBI is not likely to be persuaded to ignore one of its forensic tools when there are serious crimes to solve. A less intrusive approach is proposed here, at least with regard to deceptive examinees, and it begins with this assumption: if the confession criterion causes a bias in the sampling of field cases, there should be qualitative group differences in scores and decisions between guilty confessors and guilty nonconfessors. Guilty confessors are those who would collectively have their deceptions more apparent on the polygraph charts. This would be consistent with Patrick and Iacono’s (1991) report that confessions corresponded with more deceptive scores in most cases. If there are no significant differences in polygraph scores or results between confessors and nonconfessors, the impact of the confession criterion is likely to be relatively small and support the conclusions of Raskin et al. (1988), Horvath (1977), and Honts (1996). The present study was designed to test these two alternatives.



Method

Cases

Data collected from the U.S. Army Criminal Investigation Detachment Polygraph Division were used in this study. Criminal Investigation Detachment cases were selected because of the uniform procedures, high standards, and multiple levels of quality control implemented by that organization. Examiners in the Criminal Investigation Detachment have conducted polygraph exams throughout the United States and the world, wherever U.S. Army service members are assigned. About 20 field examiners and two quality control supervisors staff the Criminal Investigation Detachment Polygraph Division at any given time. All have field investigative experience, have at least a four-year college degree, are federally trained and certified, and meet continuing education requirements.

There are two important features of the U.S. Army Criminal Investigation Detachment investigative practices that merit comment. In that system, only those suspects who are the focus of the investigations are asked to submit to the polygraph examination. The polygraph is not used in a dragnet fashion. Also, all suspects are routinely confronted and interrogated by a Criminal Investigation Detachment criminal investigator a number of days before the polygraph examination is scheduled. Those who acknowledge the crime to the investigator are usually not polygraphed. It is these two pre-polygraph processes that might cause an increase of the proportion of guilty examinees in that polygraph population, and a decrease of the proportion of those predisposed to confess, more than in other systems with less examinee filtering.

From August 1996 through March 1998, U.S. Department of Defense Polygraph Institute researchers reviewed all of the Criminal Investigation Detachment’s polygraph cases for which ground truth confirmation could be found, beginning with cases conducted after January 1, 1995. The time period for the sampling was January 1, 1995, through February 3, 1997, when the last case meeting selection criteria was available to the researchers. During this period 3,349 polygraph examinations were conducted in criminal cases. Of these, 2,010 (60.0 percent) were calls of deception indicated, 884 (26.4 percent) were no deception indicated, and 455 (13.6 percent) were no opinion (inconclusive). There were 1,146 cases of examinee confessions, and no reports of false confessions.
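The outcome percentages above can be rechecked from the reported counts. The short sketch below does only that arithmetic; the variable names are illustrative and not part of the original study.

```python
# Counts reported for the sampling period (January 1, 1995, through February 3, 1997).
outcomes = {
    "deception indicated": 2010,
    "no deception indicated": 884,
    "no opinion (inconclusive)": 455,
}
total_examinations = 3349

# The three outcome categories should account for every examination.
assert sum(outcomes.values()) == total_examinations

# Percentage share of each outcome, rounded to one decimal place as in the text.
shares = {name: round(100 * count / total_examinations, 1)
          for name, count in outcomes.items()}
for name, share in shares.items():
    print(f"{name}: {share} percent")
```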

Also reviewed were the investigative files for those polygraph cases that are maintained separately from the polygraph files and include details of all of the investigative and laboratory findings. Confirmation of the polygraph cases required at least one of the following: an unrecanted confession of the examinee, an unrecanted confession from someone who exculpated the examinee, evidence that the crime under investigation was never committed such as when missing property was discovered to have been innocently misplaced instead of stolen, forensic evidence such as urinalysis or surveillance tapes that substantiated the truth, or suspects led investigators to where they had hidden evidence or the stolen property. Eyewitness testimony, prosecutorial decisions, or judicial outcomes did not rise to the level of sufficient confirmation. Because, in the Criminal Investigation Detachment system, polygraph and other investigative measures were conducted concurrently rather than sequentially, discovery of evidence was somewhat more independent of the polygraph outcomes than in a system where the polygraph is used either very early or very late in the investigative process.

For consistency, polygraph examinations using a common testing format were selected for this study. The cases had to be single-issue examinations in which the U.S. Department of Defense Polygraph Institute zone comparison technique (U.S. Department of Defense Polygraph Institute 1992) was employed. Single-issue examinations are those in which a lie to one relevant question means the examinee lied to all relevant questions, or if truthful to one, was truthful to all. By U.S. Department of Defense Polygraph Institute standards, a minimum of three repetitions (charts) of the questions is required. If more than three charts were collected, only the first three complete charts were used in the study. By limiting the data in this fashion, the inconclusive rate for the samples was likely to have increased (Senter et al. submitted for publication), but it was seen as necessary to standardize the quantity of data from each case.

There were 704 examinations that met the criteria for polygraph format, scope (single issue), and a minimum number of charts. From that group, the authors obtained an in-depth sampling of 177 confirmed guilty cases where a confession was obtained from the examinee and 61 cases where the examinee did not confess, but other evidence established guilt. Of the 177 confessor cases, 28 had other supporting forensic evidence, and 149 were confession-confirmed only.

The complete review of the archived Criminal Investigation Detachment cases included a search for confirmed innocent cases meeting these criteria. For this study, an additional criterion was imposed on the innocent cases consistent with Raskin et al. (1988): innocent cases had to come from multiple-examinee investigations in which the guilty party was discovered, or it was proven that the crime did not take place. Sixteen innocent cases were found to satisfy the multisuspect, scope, polygraph format, minimum chart, and ground truth criteria. Of these, five were theft cases in which the missing items were later discovered not to have been stolen, and the remaining cases were confirmed by the confession of someone other than the examinee. Examinee demographics for all cases meeting the selection criteria are found in Table 1.


Instrumentation

The Criminal Investigation Detachment polygraph program during this period used the Axciton computer polygraph (Axciton Systems, Incorporated, Houston, Texas) to record the traditional polygraph channels. There are two pneumographic sensors to register breathing, a standard blood pressure cuff for changes in blood volume, and finger electrodes for electrodermal activity. Data are digitized and available for offline analysis.

Scoring Method

This study avoided the original examiners’ scorings and decisions, which may have been prejudiced to some unknown extent by extrapolygraphic sources of information such as case facts or the examinees’ gestures and verbal behaviors (Iacono and Patrick 1987). The interest was in determining just how diagnostic the physiological data were when these extrapolygraphic sources of information were excluded. A scoring method developed at the U.S. Department of Defense Polygraph Institute for this type of polygraph format, called the objective scoring system, was chosen (Dutton 2000; Krapohl and McManus 1999). The objective scoring system uses physiological tracing features previously shown to be most diagnostic: respiration line length, electrodermal response amplitude, and blood volume amplitude (Kircher and Raskin 1988). Feature sizes for the relevant and comparison questions were converted into ratios where the measurement of each relevant question was divided by the measurement taken of the matched comparison question. The resultant ratios were compared to a chart of empirically developed thresholds for score assignment (Table 2). The scores were summed, and the totals were used for making a veracity decision.
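The ratio-and-threshold procedure can be sketched in a few lines. The cut points, the score values, and the direction of the mapping below are hypothetical placeholders chosen only to show the mechanics; the actual thresholds are those in Table 2, which are not reproduced here, and the real system may treat each physiological feature differently.

```python
from bisect import bisect_right

# HYPOTHETICAL ratio cut points and scores, for illustration only (not Table 2).
# Assumed direction: a larger relevant/comparison ratio earns a more negative score.
RATIO_CUTS = [0.5, 0.8, 1.25, 2.0]   # ascending cut points
SCORES = [2, 1, 0, -1, -2]           # one score per interval between cut points

def score_pair(relevant, comparison):
    """Score one relevant question against its matched comparison question."""
    ratio = relevant / comparison
    return SCORES[bisect_right(RATIO_CUTS, ratio)]

def total_score(pairs):
    """Sum the scores over all (relevant, comparison) measurement pairs."""
    return sum(score_pair(r, c) for r, c in pairs)

# Three made-up measurement pairs with ratios of 3.0, 1.0, and 0.4.
example_pairs = [(3.0, 1.0), (1.0, 1.0), (0.4, 1.0)]
print(total_score(example_pairs))
```

Keeping the thresholds in a lookup table separate from the scoring function mirrors the description above, in which the ratios are computed first and then compared to a chart.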

The objective scoring system scores for a three-chart polygraph examination have a potential range of -108 to +108. This system allows users to set their own cutting scores based on their tolerance for risk. The U.S. Department of Defense Polygraph Institute cutting scores of ±6 were used here: +6 or greater were categorized as no deception indicated, and -6 or lower were categorized as deception indicated. Scores strictly between -6 and +6 were called inconclusive. These cutting scores produced decision accuracy at about 90 percent with the U.S. Department of Defense Polygraph Institute zone comparison technique (Krapohl and McManus 1999). The proportion of agreement between the trichotomous decisions of the objective scoring system and human blind scorers averaged 0.69 in that study.
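The trichotomous decision rule described above is simple enough to state as code. The function name and the parameter `cutoff` are illustrative; the default of 6 reflects the cutting scores reported in the text, and the range check uses the stated limits of -108 to +108.

```python
def classify(total_score, cutoff=6):
    """Map a summed objective scoring system total to one of three decisions."""
    if not -108 <= total_score <= 108:
        raise ValueError("a three-chart total must lie between -108 and +108")
    if total_score >= cutoff:
        return "no deception indicated"
    if total_score <= -cutoff:
        return "deception indicated"
    return "inconclusive"

for score in (-20, -6, -5, 0, 5, 6):
    print(score, classify(score))
```

Exposing the cutting score as a parameter matches the point that users may set their own cutting scores according to their tolerance for risk.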

Though the objective scoring system was designed to be performed manually in the field, the process was automated here to assure reliability. The three diagnostic features were measured automatically by a software package called Extract, version 3.0, developed for the U.S. government (Harris 1999). All of the examinations in the sample had been conducted two years prior to the development of the objective scoring system; therefore, this scoring method had no influence on polygraph decisions by the original examiners or quality control personnel.


Results

Decision accuracies for each of the four groups are found in Table 3. Tests of proportions were conducted for each group to determine whether their accuracies exceeded chance expectancy of 0.50. In the first evaluation, decision errors and inconclusive decisions were both counted as errors. Each of the guilty groups produced detection rates above chance levels: confession only (z=5.49, p<.01), confession plus evidence (z=4.91, p<.01), and evidence only (z=4.23, p<.01). The detection rate for the 16 innocent cases was not greater than chance (z=1.50, p>.01). Tests of proportions that excluded inconclusive decisions found all four groups to have detection accuracy greater than chance: confession only (z=7.78, p<.01), confession plus evidence (z=4.91, p<.01), evidence only (z=5.63, p<.01), and innocent (z=3.33, p<.01).
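A test of a proportion against chance can be sketched as a one-sample z test. The sketch assumes the normal approximation with the standard error computed under the chance value of 0.50, and it reports a one-sided p value; neither detail is stated in the text. The count of 11 correct decisions is likewise an assumption, chosen only because it reproduces the reported z = 1.50 for the 16 innocent cases; the actual counts are those in Table 3.

```python
from math import erfc, sqrt

def proportion_z_test(correct, n, chance=0.5):
    """One-sample z test of an observed proportion against a chance value."""
    p_hat = correct / n
    standard_error = sqrt(chance * (1 - chance) / n)
    z = (p_hat - chance) / standard_error
    p_one_sided = 0.5 * erfc(z / sqrt(2))  # upper-tail area of the standard normal
    return z, p_one_sided

# Assumed count: 11 of 16 correct (inconclusive decisions counted as errors).
z, p = proportion_z_test(correct=11, n=16)
print(f"z = {z:.2f}, one-sided p = {p:.3f}")
```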

Figure 1. Mean scores with standard error of measurement bars for confession only, confession plus evidence, evidence only, and innocent cases. The three guilty groups show negative mean scores, and the innocent group shows a positive mean score.

The objective scoring system scores were evaluated for the three guilty groups, and a one-way ANOVA was calculated as a function of the group using scores as the dependent measure. The group effect was not significant (F[2, 235] = 0.58, p>.01). Figure 1 displays the mean scores, along with the standard error of measurement bars, for the three guilty groups and the one innocent group. The mean scores and standard deviations for the four groups are found in Table 4.
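For readers who want to see how the F ratio and its degrees of freedom arise, the following is a minimal pure-Python sketch of a one-way ANOVA. The toy data are invented solely to exercise the function; only the group sizes (149, 28, and 61) come from the case counts reported earlier.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Toy data (not from the study), used only to show the calculation.
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_ratio, df_b, df_w)

# Degrees of freedom implied by the three guilty group sizes in this study.
sizes = (149, 28, 61)
print(len(sizes) - 1, sum(sizes) - len(sizes))
```

The second print statement gives 2 and 235, matching the degrees of freedom reported for the group effect.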

Because there were no differences among the scores of the three guilty groups, those data were combined and a point-biserial correlation was conducted. Innocence was coded as 1 and guilt as 0. The correlation (r=0.43) was significant (t [252] = 7.65, p<.01).
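The significance test for a correlation converts r to a t statistic with n - 2 degrees of freedom. The sketch below applies that standard conversion to the rounded value r = 0.43 and the combined sample of 238 guilty and 16 innocent cases; the function name is illustrative.

```python
from math import sqrt

def t_from_r(r, n):
    """Convert a correlation coefficient to a t statistic with n - 2 degrees of freedom."""
    df = n - 2
    t = r * sqrt(df / (1 - r * r))
    return t, df

t_stat, df = t_from_r(r=0.43, n=238 + 16)
print(f"t({df}) = {t_stat:.2f}")
```

With r entered as 0.43, the result is roughly 7.6 on 252 degrees of freedom, close to the reported 7.65. A gap of that size is what one would expect if r was rounded to two decimals before reporting.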


Discussion

The present findings are consistent with the conclusions of Horvath (1977), Raskin et al. (1988), and Honts (1996). A liberal estimate of the effect size for these data, based on the one-way ANOVA, is quite small and negative because of the small value of the F ratio: ω² = -.015 (Keppel 1991). Taken in context with most of the other literature on the issue, this evidence should offer some reassurance to those who wish to undertake field research on the polygraph and the comparison question technique. However, the present conclusions are restricted to data that came from sources with practices similar to those of the U.S. Army Criminal Investigation Detachment.

The conclusions in the present data are at odds with the Patrick and Iacono (1991) findings. Both the present study and that of Patrick and Iacono (1991) used extensive field samples taken from law enforcement agencies, high-confirmation criteria, and independent analysis of the polygraph recordings, although there were significant methodological differences that limit what could be said about the discrepant findings. Patrick and Iacono relied on a semiobjective field-scoring system performed by human blind scorers, while the present study used an objective and automated method of scoring the data not available to Patrick and Iacono when their work was published. Also, the polygraph was not used as a last-ditch method of solving cases with the agency this study sampled, as Patrick and Iacono described the practice in their report. Therefore, it may have been easier to uncover ground truth for a larger proportion of cases in this study. The present study had the benefit of larger and possibly more homogenous samples, a more consistent polygraph testing protocol that had been monitored by quality control oversight, and digitized physiological data. And, while it should be noted that Patrick and Iacono’s (1991) polygraph examiners used state-of-the-art examination procedures in the early 1980s when their data were collected in Vancouver, this study acknowledges that the practices of the more dispersed U.S. federal polygraph program in the 1990s are probably different.

The goal of this study was to determine whether there were differences in scores and decisions attributable to the confession criterion. Though none were found in this study, the confession criterion remains a potential source of contamination in undercontrolled studies. The present data demonstrate, however, that it is an overstatement to broadly assert that the confession criterion is a contaminant in a study. It is more defensible to state that the confession criterion is suspect when it leads to samples of cases with nonrepresentative data, such as those with scores more extreme than those of the population as a whole. It should be relatively straightforward for researchers to collect and report such evidence, as others have done, so that skewed data can be recognized.


Ben-Shakhar, G., Lieblich, I., and Bar-Hillel, M. An evaluation of polygraphers’ judgments: A review from a decision theoretic perspective, Journal of Applied Psychology (1982) 67(6):701-713.

Dutton, D. W. Guide for performing the objective scoring system, Polygraph (2000) 29(2):177-184.

Elaad, E., Ginton, A., and Ben-Shakhar, G. The effects of prior expectations and outcome knowledge on polygraph examiners’ decisions, Journal of Behavioral Decision Making (1994) 7:279-292.

Faigman, D. L., Kaye, D. H., Saks, M. J., and Sanders, J. eds. Modern Scientific Evidence: The Law and Science of Expert Testimony. West, St. Paul, Minnesota, 1997.

Harris, J. C. Extract. Johns Hopkins University, Applied Physics Laboratory, 1999.

Honts, C. R. Criterion development and validity of the CQT in field application, Journal of General Psychology (1996) 123(4):309-324.

Honts, C. R. and Raskin, D. C. A field study of the validity of the directed lie control question, Journal of Police Science and Administration (1988) 16:56-61.

Horvath, F. The effect of selected variables on interpretation of polygraph records, Journal of Applied Psychology (1977) 62(2):127-136.

Iacono, W. G. Can we determine the accuracy of the polygraph tests? In: Advances in Psychophysiology. J. R. Jennings, P. K. Ackles, and M. G. H. Coles, eds. Jessica Kingsley, London, 1991, 4:202-208.

Iacono, W. G. The detection of deception. In: Handbook of Psychophysiology, 2nd ed., J. T. Cacioppo, L. G. Tassinary, and G. G. Berntson, eds. Cambridge University, New York, 2000.

Iacono, W. G. and Patrick, C. J. What psychologists should know about lie detection. In: Handbook of Forensic Psychology, I. B. Weiner and A. Hess, eds. Wiley, New York, 1987.

Keppel, G. Design and Analysis: A Researcher’s Handbook. Prentice Hall, Englewood Cliffs, New Jersey, 1991.

Kircher, J. C. and Raskin, D. C. Human versus computerized evaluations of polygraph data in a laboratory setting, Journal of Applied Psychology (1988) 73(2):291-302.

Krapohl, D. J. and McManus, B. An objective method for manually scoring polygraph data, Polygraph (1999) 28(3):209-222.

Lykken, D. T. A Tremor in the Blood: Uses and Abuses of the Lie Detector. Plenum, New York, 1998.

Patrick, C. J. and Iacono, W. G. Validity of the control question polygraph test: The problem of sampling bias, Journal of Applied Psychology (1991) 76(2):229-238.

Raskin, D. C., Kircher, J. C., Honts, C. R., and Horowitz, S. W. A Study of the Validity of Polygraph Examinations in Criminal Investigation. Final report to the National Institute of Justice, Grant No. 85-IJ-CX-0040, 1988.

Senter, S. M., Dollins, A. B., and Krapohl, D. J. Comparison of Utah and DoDPI scoring accuracy: Equating veracity decision rule, chart rule, and number of data channels used. (submitted for publication).

U.S. Department of Defense Polygraph Institute. Zone Comparison Test. Fort McClellan, Alabama, 1992.