I have been reading the book Super Crunchers by Ian Ayres. The bottom line is that statistics can do better than experts.
We have been living this reality in medicine for the best part of 20 years. Want advice on how to treat a rare disease? We used to go to an expert at one of the great teaching hospitals, but now we are more likely to use the Internet. On PubMed we can search for Randomized Controlled Trials in that disease and discover that treatment A is significantly better than treatment B.
It isn't only in medicine that the figures outperform the expert. Ayres begins by telling the story of Orley Ashenfelter, who was sceptical about wine experts. Looking back over successful vintages and the weather at the time of their growth, he produced a regression formula in which high average summer temperature and low harvest-time rainfall predicted a good vintage. When he made predictions for the 1989 and 1990 vintages based on his formula, the experts scoffed. But he was proved correct, and today most wine investors follow his formula.
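For readers who want to see what a regression formula of this kind looks like in practice, here is a minimal sketch in Python. It is not Ashenfelter's formula: the function names, the weather figures and the scoring rule are all made up for illustration, and the scores are generated from an arbitrary rule purely so that the fit can be checked against it.

```python
def fit_least_squares(rows, targets):
    """Fit targets ~ b0 + b1*x1 + b2*x2 + ... by solving the normal equations."""
    # Prepend a constant 1.0 to each row so the first coefficient is the intercept.
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coeffs, summer_temp_c, harvest_rain_mm):
    """Apply the fitted formula to one vintage."""
    return coeffs[0] + coeffs[1] * summer_temp_c + coeffs[2] * harvest_rain_mm


# HYPOTHETICAL data: (average summer temperature in C, harvest rainfall in mm).
weather = [(15.0, 120), (16.5, 200), (17.2, 80), (16.0, 60), (17.5, 150), (15.8, 180)]
# HYPOTHETICAL quality scores, generated from an arbitrary rule (not real wine data).
scores = [2.0 + 0.6 * t - 0.004 * r for t, r in weather]

coeffs = fit_least_squares(weather, scores)
print("intercept, temperature, rainfall:", [round(c, 4) for c in coeffs])
print("predicted score for a warm, dry year:", round(predict(coeffs, 17.0, 100), 2))
```

With real data the scores would not sit exactly on a line, and the fitted coefficients would only be estimates. The point of the sketch is simply that a positive coefficient on temperature and a negative one on rainfall is all that "a formula" means here.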
Bill James did the same for baseball. Baseball scouts claim to have an eye for a good player and watch hundreds of high school and college games to identify a future star. James derived a formula that he said would predict who would succeed in major league baseball. Once again statistics beat the experts.
What has changed is the availability of huge databases to guide decision making. The size of these databases is enormous. They are not measured in gigabytes but in terabytes or even petabytes (a million gigabytes). The entire Library of Congress consists of 20 terabytes of text. In contrast, Wal-Mart's data warehouse comprises 570 terabytes. Data mining is able to produce business decisions like the refusal of rental car companies to offer a service to people with poor credit scores because they are more likely to have an accident. Airlines, when a flight is cancelled, no longer offer the next seat to frequent flyers as a reward for loyalty, but to the customer whose continued business is calculated to be at greatest risk. The "No Child Left Behind" Act requires schools to adopt teaching methods supported by rigorous data analysis. In some cases this means adopting lessons where every word is scripted and statistically vetted.
Apart from just analysing correlations, the super crunchers have introduced the randomized controlled trial into business. Without your consent, you may be taking part in one right now. Say a manufacturer of cornflakes is concerned about packet design. He might produce the identical product save for "More Fiber" printed in red at the top left-hand corner. The two packet designs are sent out at random to different stores and the manufacturer can compare how quickly each disappears from the shelf.
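The cornflakes trial can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the store names, the sales figures and the function names are invented, and I have chosen a simple permutation test to judge whether the difference could be due to chance. A real manufacturer might well analyse the results differently.

```python
import random
from statistics import mean


def assign_designs(store_ids, seed=0):
    """Randomly split the stores into two equal-sized arms."""
    rng = random.Random(seed)
    shuffled = list(store_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def permutation_p_value(control, treatment, rounds=5000, seed=0):
    """Share of random relabelings whose gap in mean sales is at least as big as the observed gap."""
    rng = random.Random(seed)
    observed = abs(mean(treatment) - mean(control))
    pooled = list(control) + list(treatment)
    n = len(control)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[n:]) - mean(pooled[:n]))
        if gap >= observed:
            hits += 1
    return hits / rounds


# HYPOTHETICAL stores; randomization decides which packet design each one receives.
stores = [f"store_{i}" for i in range(20)]
plain_stores, fiber_stores = assign_designs(stores)

# HYPOTHETICAL weekly packet sales per store, invented for illustration only.
plain_sales = [102, 98, 110, 95, 101, 99, 104, 97, 100, 103]
fiber_sales = [108, 112, 105, 115, 109, 111, 107, 113, 106, 110]

p_value = permutation_p_value(plain_sales, fiber_sales)
print("mean difference:", mean(fiber_sales) - mean(plain_sales))
print("permutation p-value:", p_value)
```

The random assignment is what makes this a trial and not just a correlation: because chance alone decides which stores get the "More Fiber" packet, a difference in sales cannot be put down to the manufacturer having picked the busier stores.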
It gets more complicated than that, but the next step is for consumers to game the system. Once we are aware of what is going on, we should be able to turn it to our advantage.
Is there no place for the expert then? The wise expert will use this new technology and add value to it, by recognizing the flaws in clinical trials, just as I have in pointing out how the manufacturers cheat in their trials of supposedly new drugs against chlorambucil in CLL. You have to be aware of the tricks that are played in RCTs. However, gone are the days when we can just say, "Lies, damned lies and statistics." We need to understand statistics and make them work for us rather than the opposition.