A frightening study
We know from recent experience that political experts don’t have a very good track record, but we’d all like to think that a doctor’s opinion is a fact. This scary old story is from Michael Lewis’ new book The Undoing Project.
Way back in 1968, Lew Goldberg published a study of how consistent doctors were in diagnosing ulcers from x-rays. Here’s how Lewis tells the story:
“The researchers then asked the doctors to judge the probability of cancer in ninety-six different individual stomach ulcers, on a seven-point scale from “definitely malignant” to “definitely benign.” Without telling the doctors what they were up to, they showed them each ulcer twice, mixing up the duplicates randomly in the pile so the doctors wouldn’t notice they were being asked to diagnose the exact same ulcer they had already diagnosed….
“More surprisingly, the doctors’ diagnoses were all over the map: The experts didn’t agree with each other. Even more surprisingly, when presented with duplicates of the same ulcer, every doctor had contradicted himself and rendered more than one diagnosis: These doctors apparently could not even agree with themselves. ‘These findings suggest that diagnostic agreement in clinical medicine may not be much greater than that found in clinical psychology— some food for thought during your next visit to the family doctor,’ wrote Goldberg. If the doctors disagreed among themselves, they of course couldn’t all be right— and they weren’t.”
The really interesting – and scary – part for me is that each doctor contradicted himself or herself at least once.
1968 was a long time ago. Imaging technology is much better now, and doctor training has probably improved. But my point is about experts: these were the experts of their time.
Now here’s the good part of the story if you’re a nerd like me. The same researchers asked the same doctors how they made their diagnoses. They took the doctors’ answers and wrote a simple computer algorithm to do the analysis the way the doctors said it was supposed to be done. The computer did better than even the best doctor in the group!
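For fellow nerds, the kind of algorithm Goldberg could have built from the doctors’ answers is essentially a weighted checklist: score each cue the doctors said they looked for, add up the weights, and compare to a threshold. Here’s a minimal sketch of that idea; the cue names, weights, and threshold are invented for illustration, not taken from the study:

```python
# Hypothetical sketch of a "model of the doctors": a linear scoring rule
# built from cues the doctors said they used. Cue names and weights are
# made up for illustration; they are not from Goldberg's study.

CUE_WEIGHTS = {
    "ulcer_size_large": 2.0,       # a worrying sign adds points
    "irregular_border": 1.5,
    "mass_present": 2.5,
    "regular_mucosal_folds": -1.5, # a reassuring sign subtracts points
}

def malignancy_score(cues: dict) -> float:
    """Sum the weights of whichever cues a human coder marked as present."""
    return sum(w for name, w in CUE_WEIGHTS.items() if cues.get(name))

def diagnose(cues: dict, threshold: float = 2.0) -> str:
    """Apply the same fixed rule to every case, every time."""
    return "likely malignant" if malignancy_score(cues) >= threshold else "likely benign"

# A human still has to read the x-ray and code the cues -- the computer
# only does the combining, which is exactly my point above.
case = {"ulcer_size_large": True, "irregular_border": True}
print(diagnose(case))  # scores 3.5 -> "likely malignant"
```

Notice the property that let the equation beat the doctors: given the same coded inputs, the rule returns the same answer every single time. Consistency, not medical insight, is what the algorithm had and the humans lacked.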
Lewis’ conclusion is “You could beat the doctor by replacing him with an equation created by people who knew nothing about medicine and had simply asked a few questions of doctors.” But he is also an expert, and there are two things he isn’t telling us: 1) how the programmers decided what method to use when the doctors disagreed, and 2) how the model got access to the data; back in 1968, computers read punch cards, not pictures. I suspect a human read the pictures and coded them for the computer, so that human actually had a hand in the diagnosis.
You could conclude that the doctors didn’t follow their own prescription, or that the algorithm was crowd-sourced wisdom and therefore better than what any one doctor would do. You could also say we still need experts to come up with the rules which become the programs which do the diagnoses. The latter was true then, but modern artificial intelligence, fed with big data, is all about generating the rules from the data itself. That’s how cars learned to drive themselves.
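The “generating rules from the data” idea can be shown in miniature: instead of asking experts for the weights, learn them from labeled examples. Here’s a toy sketch using a perceptron-style update in plain Python; the cases and labels are fabricated for illustration, not real medical data:

```python
# Toy illustration of learning a rule from data instead of asking an expert.
# Each fabricated case is (features, label): features are 0/1 cue indicators
# and the label is 1 for malignant, 0 for benign.

cases = [
    ((1, 1, 0), 1),
    ((1, 0, 0), 1),
    ((0, 1, 0), 0),
    ((0, 0, 1), 0),
    ((1, 1, 1), 1),
    ((0, 0, 0), 0),
]

weights = [0.0, 0.0, 0.0]
bias = 0.0

def predict(x):
    """Same linear rule as before, but with learned weights."""
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else 0

# Perceptron learning: nudge the weights toward each example we got wrong.
for _ in range(20):
    for x, y in cases:
        err = y - predict(x)
        if err:
            bias += err
            weights = [w + err * xi for w, xi in zip(weights, x)]

print(all(predict(x) == y for x, y in cases))  # True: the learned rule fits this toy data
```

No expert supplied the weights here; the program found them by correcting its own mistakes. Scale the data up by a few million cases and the features up to raw pixels, and you have the shape of how today’s systems learn without being handed the rules.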
But surely we need experts (nerds) to write the programs that process the data that make the rules that do the diagnoses… but for how long?