A Probability Puzzle
From Randomness by Deborah J Bennett:
“If a test to detect a disease whose prevalence is one in a thousand has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person’s symptoms or signs?”
For extra credit: what percentage of the physicians, residents, and fourth year medical students at a prominent medical school who were asked this question got it right?
Extra, extra credit: why is it critically important that doctors be able to get this one right? Give one example.
This is an honor system non-open book test.
Answers in comments, please. Will highlight correct answers in a subsequent post. Hat tip to Nassim Taleb in Fooled by Randomness for citing Bennett’s test.
Answers here.







I test 1000 people. 1 has the disease. 1000 X .05 = 50 people are false positives (strictly 999 X .05 = 49.95, which rounds to 50)
Therefore 51 people will test positive
1/51 of the positives are true
1/51 X 100 = 1.96078% or a little less than 2% of the positive tests are true positives.
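The head count above can be checked without rounding. This is only a sketch: the function name is made up for illustration, and it assumes, as the comment does, that the test has no false negatives.

```python
from fractions import Fraction

def positive_counts(population, prevalence, false_positive_rate):
    """Expected true and false positives in a screened population.

    Assumption (not stated in the puzzle): no false negatives, so every
    sick person tests positive.
    """
    sick = population * prevalence
    healthy = population - sick
    true_pos = sick
    false_pos = healthy * false_positive_rate
    return true_pos, false_pos

true_pos, false_pos = positive_counts(1000, Fraction(1, 1000), Fraction(5, 100))
share_true = true_pos / (true_pos + false_pos)

print(float(false_pos))   # expected false positives among the 999 healthy people
print(share_true)         # exact share of positives that are true positives
print(float(share_true))  # the same share as a decimal
```

Without the rounding the exact share is 20/1019, slightly more than 1/51 but still a little under 2%.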
For extra credit: what percentage of the physicians, residents, and fourth year medical students at a prominent medical school who were asked this question got it right?
This answer has to be a WAG since it involves unknowables such as training, arrogance and workload of the medical staff but my gut is that about 30% will take the time to think the problem through and get the correct result.
Extra, extra credit: why is it critically important that doctors be able to get this one right? Give one example.
This directly affects the overall cost of health care in a huge way. Assume that it costs $10,000 to cure a patient who presents positive. Not an unlikely assumption. Assume further that the 50 false positive patients do not suffer negative effects from their treatment that require further medical treatment, and that they do not litigate as a result of the unnecessary treatment. This is a highly improbable assumption made for the sake of simplicity.
The true cost to cure 1 patient is $10,000.
The cost to cure that one patient and treat the 50 false positives is $510,000.
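The arithmetic behind those two figures, as a small sketch. The $10,000 per treatment and the 50 false positives are the comment's own assumptions, and the variable names are illustrative.

```python
COST_PER_TREATMENT = 10_000  # dollars; the comment's assumed cost per treated patient
TRUE_POSITIVES = 1           # patients who actually have the disease
FALSE_POSITIVES = 50         # healthy patients treated because of a false positive

# Everyone who tests positive gets treated, so the bill covers all 51 patients.
total_cost = COST_PER_TREATMENT * (TRUE_POSITIVES + FALSE_POSITIVES)
unnecessary_cost = COST_PER_TREATMENT * FALSE_POSITIVES

print(total_cost)        # cost of treating all 51 positives
print(unnecessary_cost)  # portion spent on patients who were never sick
```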
Corroborating diagnostic tests should be done.
This is an honor system non-open book test.
In the interest of full disclosure I should note that I worked for 3 years in a VA hospital.
I was not on the medical staff, however; I was a CIO.
Posted by: Dennis Shanley | October 07, 2007 at 05:58 PM
I also came up with the 2% answer.
There is another interesting application for this kind of statistics: The beloved war on terror. The chance of a random person being a terrorist is hopefully less than 1/1000. Imagine you manage to build some automated system which somehow claims to spot suspicious behavior, known faces, or miscreants by some other clever scheme.
These systems all have a non-negligible error rate. If you're really lucky, you might push that one down to less than 1%.
Now do the math again, assuming a 1/100000 terrorist rate and 1% false positives. No wonder I read that one trial for such a system got terminated.
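Doing that math as a sketch: the function name and the `hit_rate` parameter are hypothetical, and the default of 1.0 assumes the system never misses a real terrorist, which is the most generous case for it.

```python
def chance_flag_is_real(base_rate, false_positive_rate, hit_rate=1.0):
    """Probability that a flagged person really belongs to the target group.

    hit_rate is the share of real targets the system flags; 1.0 is an
    assumption made here, not a figure from the comment.
    """
    flagged_real = base_rate * hit_rate
    flagged_innocent = (1 - base_rate) * false_positive_rate
    return flagged_real / (flagged_real + flagged_innocent)

p = chance_flag_is_real(base_rate=1 / 100_000, false_positive_rate=0.01)
innocent_per_real = (1 - p) / p

print(f"{p:.6f}")                  # chance that a flagged person is a real hit
print(f"{innocent_per_real:.0f}")  # innocent people flagged per real hit
```

Under these assumptions roughly one flag in a thousand is a real hit, which is about twenty times worse than the 2% in the medical puzzle.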
Posted by: otmar | October 07, 2007 at 03:22 PM
Extra, extra credit: The answer is philosophical in nature and has to be nuanced. The question is a long-standing one: in hypothesis testing, which hypothesis should be treated as the "null hypothesis"? In other words, which error is more serious and hence should be controlled? For example, Matt suggests that the Red Cross has concluded, for operational reasons, that it is acceptable to send 50 people for a disheartening series of tests to protect the blood supply. Suppose now we are told that the false positives predominantly affect one biological group, such as a gender or a racial group. Will that decision still stand to reason? Or suppose the situation is internment in the US during WWII. A nation has to live with the effects of a callous operational decision to accept a large number of false positives.
This example should suggest two things: it is not clear what the "null hypothesis" should be, and the Type I error is very large compared to the underlying probability. So the medical community should work to improve the tests and bring the false positive rate down to the low level of the prevalence of the condition.
Posted by: Aswath | October 07, 2007 at 10:28 AM
I'm going to agree with 2% (or 1/51 ≈ 1.96% :) ). It's not an independence problem because, while there will be 50 false positives out of 1000, I'm assuming there will be 1 true positive out of 1000 (or else what's the point of the test) and that no false negatives were specified. This means that of the 51 people flagged as having the disease out of 1000, 1 has the disease.
I won't venture a guess as to what percentage of doctors actually got this right since I'm not totally convinced I have it right either.
Posted by: Mike Kowalchik | October 06, 2007 at 05:26 PM
an open answer test... just to add pressure?
My answer is 2%. logic: in a population of 1000, 5% positives yields a subpopulation of 50, so the normal infection rate of 1 in 1000 becomes 1 in 50, thus 2%. could be 1 in 51, but usually test answers don't make you do percentages on numbers like 51 : )
Contests and puzzles are a great way to reward regular readers with a weekend treat.
extra credit (or maybe partial credit if i blew the first part)
i would hope it breaks on the 80 20 rule. getting into med school requires a bunch of analytic thinking so i will take the 80 side.
extra extra credit. having 20% of the people who regularly make life and death decisions get this wrong would be very bad.
Posted by: rob | October 06, 2007 at 04:14 PM
95% chance. The question didn't state "What is the chance that a person WILL BE found to have a positive result AND actually have the disease..." instead it simply asks what is the chance that "a person found to have a positive result actually has the disease."
Posted by: Gabe | October 06, 2007 at 03:41 PM
Yeah, it's about 2%, as the previous commenter mentioned. This is Bayesian statistics at its best.
51 out of 1000 will receive a positive, but only one of those 51 is actually infected.
But when your doctor sits you down and tells you that you came back with a positive result, and you ask "How accurate is the test?", she'd almost certainly say "It's about 95% accurate."
I'd bet that 80% or more of the doctors would answer this way.
Posted by: jb | October 05, 2007 at 11:08 PM
I am never good at handling problems described in colloquial terms, and of course I am worse when the solution is described informally. In my opinion this is worse in the case of probability. It is interesting to me that a book that highlights the unintuitive nature of probability uses such informal reasoning to solve "puzzlers".
I think this particular puzzle is not fully specified because it does not give the false negative rate for the test in question. Matt and the author of the book tacitly assume that this rate is 0. But I question whether that is valid.
If P denotes the event that the result is positive and D the event that the patient has the disease, we are given $P(P \mid D^c) = 0.05$ and $P(D) = 0.001$. Let $P(P \mid D) = x$. What is required is to compute $P(D \mid P)$.
$P(P) = P(P \cap D) + P(P \cap D^c) = P(P \mid D)\,P(D) + P(P \mid D^c)\,P(D^c) = 0.001x + 0.04995$.
After some manipulation we can derive $P(D \mid P) = x/(x + 49.95)$.
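The manipulation is a single application of Bayes' rule, written here with the same symbols (x is the unknown chance that a sick patient tests positive):

```latex
\begin{align*}
P(D \mid P) &= \frac{P(P \mid D)\,P(D)}{P(P)} \\
            &= \frac{0.001\,x}{0.001\,x + 0.04995} \\
            &= \frac{x}{x + 49.95}
\end{align*}
```

With no false negatives (x = 1) this gives 1/50.95, about 1.96%. If the test caught only half of the true cases (x = 0.5, a purely illustrative value), it would give 0.5/50.45, about 0.99%. So the missing false negative rate can only push the answer below 2%.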
Posted by: Aswath | October 05, 2007 at 10:51 PM
If the false positives are not correlated in any way with the true positives, then anyone who tests positive has almost no greater chance of actually having the disease than someone who tested negative (given the 1000:1 odds of actually having the disease). This seems counterintuitive, of course. I'm a bit rusty at the precise way of calculating this, but the answer would be slightly greater than 1/1000, not 1/51. This is a classic "independence" problem. Now if the question were posed such that you were looking at a false positive that you knew came from a pool of over 1000 tested candidates, you could make a different argument that makes the odds much higher. CAVEAT: it's been so long since I studied this kind of thing that I may be completely out to lunch.
Matt's discussion of false negatives vs. false positives is very important for a number of reasons. In my view, the medical profession as a whole has given insufficient thought to how to address the false positive issue with patients, leading to much more angst than is necessary when patients receive a positive test result -- invariably late on Friday -- and have to wait at least a couple of days to ask questions about it. ;-)
Posted by: Curtis Carmack | October 05, 2007 at 06:00 PM
Assume that the test is performed on everyone regardless of symptoms of the disease. Then out of every thousand people who receive the test, one has the disease and 999 do not. Further, assume that the test has no false negatives: anyone who actually has the disease gets a positive result. Then 1 out of every thousand tests is a true positive. The remaining 999 should be negative results, but the 5% false positive rate means that 49.95 (so round to 50) of these people will receive false positive results. Then out of our 1000 tests, 51 return positive results. But only one of these is a true positive, so the chance that a positive test identified someone who actually has the disease is 1/51 or about 2%.
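The same reasoning can be sanity-checked by simulating a large screened population. This is a rough sketch under the assumptions above (everyone is tested, no false negatives). The function name, the population size of one million, and the fixed seed are arbitrary choices for illustration.

```python
import random

def simulate_screening(n_people, prevalence, false_positive_rate, seed=0):
    """Count true and false positives in a randomly generated population."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same counts
    true_pos = 0
    false_pos = 0
    for _ in range(n_people):
        if rng.random() < prevalence:
            # Sick person; assumption: the test never misses a real case.
            true_pos += 1
        elif rng.random() < false_positive_rate:
            # Healthy person who is wrongly flagged by the test.
            false_pos += 1
    return true_pos, false_pos

true_pos, false_pos = simulate_screening(1_000_000, 0.001, 0.05)
estimate = true_pos / (true_pos + false_pos)
print(f"{estimate:.4f}")  # should land near 0.02, with some sampling noise
```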
I would guess that less than 50% of the physicians, etc. at this medical school got the correct answer. Here's an example of why it's important to know the chance of a positive result actually indicating the disease: the HIV test used by the Red Cross blood drive in my area is vulnerable to a certain type of false positive. Some people will _always_ get a positive result on this test, even though the test available from a doctor will correctly show that they are HIV negative. The staff at the blood drive explain to people who test positive that they need to go to a doctor and get a real HIV test, because the test they just got a positive result on has a high false positive rate - but that they should never bother to donate blood again because they will consistently fail the HIV test. I imagine that many people have panicked upon hearing this test result. However, the Red Cross continues to use the same test, probably because it combines low cost with a very low false negative rate. In this case it may be justified to trade a high false positive rate for a low false negative rate, because a false positive merely requires a second test but a false negative would spread HIV through transfusions.
Posted by: Matt Crawford | October 05, 2007 at 05:02 PM