deaths was the one she worked. To a layperson, this might suggest that Gilbert was clearly responsible for the deaths, but on its own it would not be sufficient to secure a conviction; indeed, it might not be enough to justify even an indictment. The problem is that it may be just a coincidence. The job of the statistician in this situation is to determine just how unlikely such a coincidence would be. If the answer is that the likelihood of such a coincidence is, say, 1 in 100, then Gilbert might well be innocent; and even 1 in 1,000 leaves some doubt as to her guilt; but with a likelihood of, say, 1 in 100,000, most people would find the evidence against her to be pretty compelling.
To see how hypothesis testing works, let's start with the simple example of tossing a coin. If the coin is perfectly balanced (i.e., unbiased or fair), then the probability of getting heads is 0.5.* Suppose we toss the coin ten times in a row to see if it is biased in favor of heads. Then we can get a range of different outcomes, and it is possible to compute the likelihood of different results. For example, the probability of getting at least six heads is about 0.38. (The calculation is straightforward but a bit intricate, because there are many possible ways you can get six or more heads in ten tosses, and you have to take account of all of them.) The figure of 0.38 puts a precise numerical value on the fact that, on an intuitive level, we would not be surprised if ten coin tosses gave six or more heads. For at least seven heads, the probability works out at 0.17, a figure that corresponds to our intuition that seven or more heads is somewhat unusual but certainly not a cause for suspicion that the coin was biased. What would surprise us is nine or ten heads, and for that the probability works out at about 0.01, or 1 in 100. The probability of getting ten heads is about 0.001, or 1 in 1,000, and if that happened we would definitely suspect an unfair coin. Thus, by tossing the coin ten times, we can form a reliable, precise judgment, based on mathematics, of the hypothesis that the coin is unbiased.
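For readers who want to check the coin-toss figures, here is a minimal sketch in Python. It assumes a fair coin and independent tosses, and adds up the probabilities of every outcome with the required number of heads or more. The function name `prob_at_least` is introduced here purely for illustration.

```python
from math import comb


def prob_at_least(k: int, n: int = 10, p: float = 0.5) -> float:
    """Probability of k or more heads in n independent tosses, with P(heads) = p."""
    # comb(n, j) counts the ways to place exactly j heads among n tosses;
    # each such arrangement has probability p**j * (1 - p)**(n - j).
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


for k in (6, 7, 9, 10):
    print(f"P(at least {k} heads in 10 tosses) = {prob_at_least(k):.3f}")
# P(at least 6 heads in 10 tosses) = 0.377
# P(at least 7 heads in 10 tosses) = 0.172
# P(at least 9 heads in 10 tosses) = 0.011
# P(at least 10 heads in 10 tosses) = 0.001
```

Rounded to two figures, these are the values 0.38, 0.17, 0.01, and 0.001 quoted above.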
In the case of the suspicious deaths at the Veterans Affairs Medical Center, the investigators wanted to know if the number of deaths that occurred when Kristen Gilbert was on duty was so unlikely that it could not be merely happenstance. The math is a bit more complicated than for the coin tossing, but the idea is the same. Table 1 gives the data the investigators had at their disposal. It gives numbers of shifts, classified in different ways, and covers the eighteen-month period ending in February 1996, the month when the three nurses told their supervisor of their concerns, shortly after which Gilbert took a medical leave.
Table 1. The data for the statistical analysis in the Gilbert case.
Altogether, there were 74 deaths, spread over a total of 1,641 shifts. If the deaths are assumed to have occurred randomly, these figures suggest that the probability of a death on any one shift is about 74 out of 1,641, or 0.045. Focusing now on the shifts when Gilbert was on duty, there were 257 of them. If Gilbert was not killing any of the patients, we would expect there to be around 0.045 × 257 = 11.6 deaths on her shifts, i.e., around 11 or 12 deaths. In fact there were more: 40, to be precise. How likely is this? Using mathematical methods similar to those for the coin tosses, statistician Gehlbach calculated that the probability of having 40 or more of the 74 deaths occur on Gilbert's shifts was less than 1 in 100 million. In other words, it is unlikely in the extreme that Gilbert's shifts were merely "unlucky" for the patients.
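The arithmetic can be sketched in a few lines of Python. This is not a reconstruction of Gehlbach's actual calculation, whose details are not given here. It is a simplified model, adopted only for illustration, in which each of the 74 deaths falls independently on one of Gilbert's shifts with probability 257/1,641, her share of all shifts. The variable names and the helper `binomial_tail` are likewise introduced only for this example.

```python
from math import comb

# Figures quoted in the text.
TOTAL_SHIFTS = 1641
TOTAL_DEATHS = 74
GILBERT_SHIFTS = 257
GILBERT_DEATHS = 40

# Chance of a death on any one shift, and the number of deaths we would
# expect on Gilbert's shifts if nothing unusual were going on.
death_rate = TOTAL_DEATHS / TOTAL_SHIFTS
expected_deaths = death_rate * GILBERT_SHIFTS


def binomial_tail(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Illustrative assumption (not necessarily the method used in the case):
# each death lands on a Gilbert shift with probability equal to her
# share of the shifts, independently of the other deaths.
gilbert_share = GILBERT_SHIFTS / TOTAL_SHIFTS
p_value = binomial_tail(GILBERT_DEATHS, TOTAL_DEATHS, gilbert_share)

print(f"Death rate per shift: {death_rate:.3f}")
print(f"Expected deaths on Gilbert's shifts: {expected_deaths:.1f}")
print(f"P(40 or more of the 74 deaths on her shifts): {p_value:.1e}")
print("Less than 1 in 100 million?", p_value < 1e-8)
```

Under this simplified model the probability comes out far below the 1 in 100 million threshold, in line with the conclusion reported above.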
The grand jury decided there was sufficient evidence to indict Gilbert; presumably the statistical analysis was the most compelling evidence, but we cannot know for sure, as a grand jury's deliberations are not public knowledge. She was accused of four specific murders and