

The following is the text of Department of Defense Polygraph Institute report DODPI94-R-0009, Psychophysiological Detection of Deception Accuracy Rates Obtained Using the Test for Espionage and Sabotage. This report, dated August 1995, is available from the Defense Technical Information Center (DTIC) as report #ADA330774, and was also reprinted in the American Polygraph Association quarterly, Polygraph, Vol. 27, No. 1 (1998), pp. 68-73.

Note, however, that while the DTIC report is 48 pages long, the Polygraph reprint is only six pages. As the DTIC report was not available for comparison purposes, it is not clear whether the Polygraph reprint actually provides the full text of the DTIC report.

The text presented here is taken from the Polygraph reprint. Page numbers are provided between curly braces for citation purposes. Corrections have been added in brackets.


{68}

Psychophysiological Detection of Deception Accuracy Rates Obtained Using the Test for Espionage and Sabotage

Department of Defense Polygraph Institute
Research Division Staff

Abstract

Previous research conducted by the Department of Defense Polygraph Institute (DoDPI) indicated that the decisions of examiners who administered the Test for Espionage and Sabotage (TES) were significantly more accurate at identifying programmed guilty examinees than were the decisions of examiners who administered either of two Counterintelligence Scope Polygraph (CSP) formats. The new format differs from previous security screening formats in that: (a) the number of issues being tested is reduced; (b) the number of repetitions of the questions used to calculate question scores is restricted to three; (c) between-test stimulation is eliminated; (d) the order of questions within the question sequence cannot be altered; (e) each relevant question is compared to the same comparison questions; (f) the pretest is brief, more standardized, and follows a logical sequence of information presentation; and (g) the Directed Lie Comparison (DLC) questions eliminate many of the problems associated with Probable Lie Comparison (PLC) questions. The procedures utilized during this study were identical to those in the previous study, but only the TES format was utilized. The replication was done in order to further validate the accuracy of the examiners' decisions in identifying programmed guilty and innocent examinees when the TES format was administered. The data collected in this study were evaluated using the new criteria developed from the previous study. Ten certified examiners from the Office of the Secretary of the Air Force conducted 88 examinations. The examiners had been trained to administer the TES and had been utilizing the TES when conducting security examinations. Ninety-eight percent of the innocent examinees and 83.3% of the programmed guilty examinees were correctly identified.

Keywords: counterespionage, DLC, detection of deception, directed lie comparison question, espionage, polygraph, PDD, screening, TES

Acknowledgements: Sheila D. Reed, Ph.D., served as principal investigator throughout planning, data collection, and drafting of this manuscript. We appreciate the support of the Office of the Secretary of the Air Force (OSAF) polygraph program, and specifically Bruce Thompson and Jim Morrison. Thanks also to Edith Andreasen, Richard Baird, Ray Brafford, Debbie Habel, Michael Rhodes, Donald Schupp, Ed Stoval, James Vaughan, Michael Walker, Harrison Wright, Earl Taylor, Sam Braddock, Gordon Barland, Jeff St. Cyr, Linda Knickerbocker, and Joan Harrison Woodard. This research was supported by funds from the Department of Defense Polygraph Institute's project DoDPI93-P-0045. The views expressed in this article do not reflect the official policy or position of the Department of Defense or the U.S. Government. Submit reprint requests to Andrew B. Dollins, Ph.D., Department of Defense Polygraph Institute, Building 3195, Fort McClellan, Alabama 36205-5113 [Note: the Institute has since moved to Fort Jackson, South Carolina; see its website at http://www.dodpoly.org.]

{69}

The accuracy of decisions for determining deception in a mock screening situation, using three psychophysiological detection of deception (PDD) formats, has been compared [Department of Defense Polygraph Institute (DoDPI) Research Division Staff, 1997]. Two of the formats were counterintelligence scope polygraph (CSP) formats; one in which probable lie comparison (PLC) questions are asked and the other in which directed lie comparison (DLC) questions are asked. The third format was the test for espionage and sabotage (TES) in which: (a) the number of issues being tested is reduced; (b) the number of repetitions of the questions used to calculate question scores is restricted to three; (c) between-test stimulation is eliminated; (d) the order of questions within the question sequence is constant; (e) each relevant question is compared to the same comparison questions; (f) the pretest is brief, more standardized, and follows a logical sequence of information presentation; and (g) DLC questions are asked in place of the standard PLC questions.

The decisions of the examiners who administered the TES format were significantly more accurate (83.3%) at identifying the programmed guilty (PG) examinees than were the decisions of the examiners who administered either the CSP-PLC (55.6%) or the CSP-DLC (58.6%) format. There were no significant differences among the accuracies of the examiners' decisions at identifying the innocent examinees.

This study replicated the procedures utilized in the previous study, but only the TES format was administered. In addition, the data were evaluated using a scoring method that was developed using the data collected during the previous study (DoDPI Research Division Staff, 1997).

Methods

Examinees

Eighty-eight examinees were recruited by a local employment agency under contract to the Department of Defense Polygraph Institute and were paid $30.00 for their participation. Individuals who met the following criteria were excluded from participation: (a) less than 19 or more than 60 years of age, (b) not in good health, (c) pregnant, or (d) did not have the equivalent of a high school diploma. Thirty-four male (mean age = 27.2 years, SD = 9.7) and 54 female (mean age = 27.8 years, SD = 10.2) examinees were scheduled for testing. Thirty-three examinees were PG.

Examiners

Ten certified examiners (9 males and 1 female) were selected by and from the Office of the Secretary of the Air Force (OSAF) to conduct the examinations. Selection of the examiners was determined by the agencies. Although examiner selection was not random (selection criteria generally involve availability and experience), the examiners were considered representative of the CSP examiner population. The examiners had been trained in the administration of the TES and, for one month, had been conducting government examinations using the format. Examiners conducted two practice examinations before conducting an examination for the project. Each examiner completed two 4-hour examinations (morning and afternoon) on four days and one 4-hour examination on one day, for a total of nine examinations each. The examiners were not given any information regarding the base rates. They did not receive feedback regarding the accuracy of their decisions until the end of the study, and they were blind as to whether the examinee was PG.

{70}

Apparatus

The examiners used standard analog field polygraphs manufactured by either Lafayette or Stoelting. Standard respiratory, electrodermal, and cardiovascular responses were recorded. The electrodermal component was operated in the manual mode. The examinations were conducted individually in large (6.2 m x 6.2 m) rooms in a building located on Fort McClellan. The scenarios used to program examinees guilty were enacted in another building located approximately two miles from the examination building. There were no video recording devices or one-way mirrors in the examination rooms. The examinations were audiotaped.

Scenarios

The PG examinees enacted one of four mock scenarios. Each scenario was representative of one of the four relevant questions. The espionage scenario required one examinee to steal a classified document from an office and to give the document to a second examinee. The second examinee received the document and placed it inside a vehicle located in the parking lot. Examinees who enacted the sabotage scenario stole either a classified document or a classified computer disk. The examinee either put the document through a paper shredder or cut the disk into pieces with a pair of scissors. An examinee who enacted the unauthorized contact scenario was asked to meet with a German agent who was sitting in a car in the parking lot. The agent requested that the examinee obtain some classified information to be given to the agent at a later time. During the enactment of the unauthorized disclosure scenario, the scenario setter was called out of his office midway through briefing the examinee regarding some classified computer information. A third person, who appeared to be fixing a window screen, entered the office and engaged the examinee in conversation regarding what the examinee had been told. All PG examinees received $100.00 as payment for their participation in the "crime." In addition, all PG examinees wrote a statement indicating that "for the purposes of this project" they had engaged in espionage, sabotage, unauthorized contact, or unauthorized disclosure, depending on which scenario they enacted.

Scoring and Decision Criteria

Scoring procedures developed during a previous study (DoDPI Research Division Staff, 1997) were used to evaluate the data. If the original decision was conclusive--significant responding (SR) or no significant responding (NSR)--then the decision was final. If a conclusive decision could not be made, then the physiological responses to the first presentation of each relevant question were reevaluated by comparing them only to the physiological responses to the first presentation of the second comparison question. If, after the rescore, a conclusive decision was not possible, then the test was considered inconclusive (INC).
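
The decision flow described above can be restated as a short sketch. This is only illustrative: the report gives the order of decisions but not the numerical cutoffs that make a score "conclusive," so the scoring functions below are hypothetical placeholders.

```python
# Illustrative sketch of the scoring/decision flow described above.
# score_all_charts() and rescore_first_presentations() are hypothetical
# placeholders; the report does not publish the underlying numerical cutoffs.

def subtest_decision(score_all_charts, rescore_first_presentations):
    """Return 'SR', 'NSR', or 'INC' for one subtest."""
    decision = score_all_charts()
    if decision in ("SR", "NSR"):
        # The original decision was conclusive, so it is final.
        return decision
    # Rescore: responses to the first presentation of each relevant question,
    # compared only to the first presentation of the second comparison question.
    decision = rescore_first_presentations()
    if decision in ("SR", "NSR"):
        return decision
    return "INC"  # Still not conclusive: the test is considered inconclusive.
```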

Procedures

During each session, ten examinees were given information regarding the research project, their participation, and the PDD examination. If they agreed to participate, they signed a consent form indicating that they were voluntarily participating in the research project. The examinees were taken in groups of two either to another building to be programmed guilty, or to the testing site. The PG examinees received information regarding the purpose of the scenario they were to participate in and signed an additional consent form indicating that they agreed to participate in the scenario. After

{71}

they enacted one of the scenarios, they were transported to the testing site. The transportation of the examinees to the testing site was timed so the examiners were not able to discern which examinees were innocent and which were programmed guilty.

The examinations were conducted according to guidelines provided to the examiners. Each examiner provided a numeric score and, based on that score, a decision (SR, INC, or NSR) for each test. An NSR decision concluded the subtest. If the decision was INC, the examiner briefly discussed the questions with the examinee to determine if the examinee understood the questions. Then, the test was administered again. If, based on the data from the second test, the examiner's decision was INC, then the decision for that subtest was INC. When the examiner rendered an SR decision, the examiner confronted the examinee with the results.

Programmed guilty examinees were instructed to confess their guilt if they were confronted by the examiner, but not to reveal any details of their activities. Once a PG examinee confessed, the examination was concluded. However, an innocent examinee who responded significantly to the relevant questions--a false positive (FP) decision--was questioned by the examiner to determine if there was a legitimate real-world explanation for the examinee's physiological responses to the relevant questions. The examiner recorded any information provided by the examinee and concluded the examination. Two examiners, otherwise not involved with the study, independently evaluated the information obtained from the examinees who received FP decisions. If the two examiners agreed that the information was significant enough to justify the examinee's physiological responding--a false positive decision with justification (FPWJ)--then that examinee's data were not included in the original data analyses.

If the decision for the first subtest was either NSR or INC, the examiner conducted the second subtest. If, however, the decision for the first subtest was SR, then the second subtest was not conducted. All of the examinees tested during a session were debriefed simultaneously. Examinees who participated in mock scenarios returned the $100.00.
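
Read together, the preceding paragraphs define the flow of a full examination. The sketch below restates that flow; administer_test() is a hypothetical placeholder for running and scoring one test of a subtest, and none of the names come from the report.

```python
# Illustrative sketch of the examination flow described above.
# administer_test() is a hypothetical placeholder that runs one test of a
# subtest and returns 'SR', 'NSR', or 'INC'.

def run_subtest(administer_test):
    decision = administer_test()
    if decision == "INC":
        # The examiner briefly discusses the questions, then administers the
        # test again; a second INC stands as the subtest result.
        decision = administer_test()
    return decision

def run_examination(administer_test):
    first = run_subtest(administer_test)
    if first == "SR":
        # SR on the first subtest: the examinee is confronted and the second
        # subtest is not conducted.
        return [first]
    # NSR or INC on the first subtest: the second subtest is conducted.
    second = run_subtest(administer_test)
    return [first, second]
```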

Data reduction and analyses

The data from 82 examinees were included in the analyses. The remaining six examinees were excluded for the following reasons: One PG examinee confessed to the examiner prior to the examination, one PG examinee was unable to understand the instructions of the scenario setter, two examinations were incomplete, and two FPWJ examinees were excluded.

If the scoring of the physiological responding during an initial test resulted in an INC decision and a second test was conducted, only the result of the second test was included in the analyses, unless otherwise indicated. The percentages of innocent examinees and PG examinees correctly identified were calculated. Chi-square tests were conducted to determine if the numbers of correct decisions in identifying innocent and PG examinees were significantly different from chance. The significance criterion was set at .05.

{72}

Results

Excluding the one inconclusive decision, 98% of the innocent examinees and 83.3% of the PG examinees were correctly identified. The numbers of correct decisions, inconclusive decisions, and errors made by the examiners are presented in Table 1. Both the number of innocent examinees correctly identified [χ²(1, N = 51) = 47.08, p < .001] and the number of PG examinees correctly identified [χ²(1, N = 30) = 13.33, p < .001] were significantly greater than chance. When the two FPWJ examinees are included in the analysis of the accuracy of decisions identifying innocent examinees, 94.3% of the innocent examinees were correctly identified. Including the FPWJs, the number of innocent examinees correctly identified was significantly greater than chance [χ²(1, N = 53) = 41.68, p < .001]. There were a total of five (10%) innocent examinees and five (17%) PG examinees for whom the initial test results were inconclusive. After retesting, the results remained inconclusive for only one innocent examinee (1.9%).

Table 1

Number of Correct Decisions, Inconclusive (INC) Decisions, and Errors Made by the Examiners in Identifying Programmed Guilty and Innocent Examinees

                        Decisions
Role           Correct      INC      Errors
--------------------------------------------
Innocent         50*         1          1
Guilty           25*         0          5
Note: Analyses tested whether each distribution was significantly different from change [sic].
* p < .001.
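
As a check on the reported statistics, the counts in Table 1 reproduce both the percentages and the chi-square values given in the text. The sketch below assumes a 50/50 chance baseline over conclusive decisions; that expected-frequency assumption is not stated in the report, but it matches the published values.

```python
# Sanity check on the reported results, assuming a 50/50 chance baseline over
# conclusive decisions (an assumption; the report does not state the expected
# frequencies, but this reproduces the published chi-square values).
from scipy.stats import chisquare

cases = {
    "Innocent (FPWJ excluded)": (50, 1),   # correct, errors (from Table 1)
    "Programmed guilty":        (25, 5),
    "Innocent (FPWJ included)": (50, 3),   # the two FPWJ errors added back
}

for label, (correct, errors) in cases.items():
    n = correct + errors
    stat, p = chisquare([correct, errors])  # expected frequency = n/2 per cell
    print(f"{label}: {correct}/{n} = {100 * correct / n:.1f}% correct, "
          f"chi2(1, N = {n}) = {stat:.2f}, p = {p:.2g}")
```

Running the sketch reproduces the figures in the text: 98.0% and χ² = 47.08 for innocent examinees, 83.3% and χ² = 13.33 for PG examinees, and 94.3% and χ² = 41.68 when the FPWJ examinees are included.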

Discussion

Excluding the one inconclusive decision, 98% of the innocent examinees and 83.3% of the PG examinees were correctly identified. These results mirror the findings of the previous study, with respect to the accuracy of decisions obtained using the TES format. Although many questions remain regarding the generalizability of the TES format to field situations, the TES format appears to have greater validity than the format currently used by the federal government.

Further testing is required to answer some of the questions raised by the current and the previous studies (DoDPI Research Division Staff, 1997): (a) does the caveat "during this project" affect the accuracy of the decisions identifying innocent and PG examinees, (b) does the effect of the question caveat impact the generalizability of the format to field situations, (c) does the reduced number of relevant issues addressed during the test contribute to the increase in the accuracy of identifying PG examinees, and (d) will the results generalize when different issues are addressed and different relevant questions are utilized?

Reference

Department of Defense Polygraph Institute Research Division Staff (1997). A Comparison of Psychophysiological Detection of Deception Accuracy Rates Obtained Using the Counterintelligence Scope Polygraph (CSP) and the Test for Espionage and Sabotage (TES) Question Formats. Polygraph, 26 (2), 79-106.
