Friday, July 30, 2010

Super Crunchers

I have been reading the book Super Crunchers by Ian Ayres. The bottom line is that statistics can do better than experts.

We have been living this reality in medicine for the best part of 20 years. Want advice on how to treat a rare disease? We used to go to an expert at one of the great teaching hospitals, but now we are more likely to use the Internet. On PubMed we can search for Randomized Controlled Trials in that disease and discover that treatment A is significantly better than treatment B.

It isn't only in medicine that the figures outperform the expert. Ayres begins by telling the story of Orley Ashenfelter who was sceptical about wine experts. Looking back over successful vintages and the weather at the time of their growth, he produced a regression formula based on high average summer temperature and low harvest-time rainfall that correlated with a good vintage. When he made predictions for the 1989 and 1990 vintages based on his formula, the experts scoffed. But he was proved correct, and today most wine investors follow his formula.

Bill James did the same for baseball. Baseball scouts claim to have an eye for a good player and watch hundreds of high school and college games to identify a future star. James derived a formula that he said would predict who would succeed in major league baseball. Once again statistics beat the experts.

What has changed is the availability of huge databases to guide decision making. The size of these databases is enormous. They are not measured in gigabytes but in terabytes or even petabytes (a million gigabytes). The entire Library of Congress consists of 20 terabytes of text. In contrast, Wal-Mart's data warehouse comprises 570 terabytes. Data mining produces business decisions such as the refusal of rental car companies to serve people with poor credit scores, because they are more likely to have an accident. When a flight is cancelled, airlines no longer offer the next seat to frequent flyers as a reward for loyalty, but to the customer whose continued business is calculated to be at greatest risk. The "No Child Left Behind" Act requires schools to adopt teaching methods supported by rigorous data analysis. In some cases this means adopting lessons where every word is scripted and statistically vetted.

Apart from just analysing correlations, the super crunchers have introduced the randomized controlled trial into business. Without consent, you may be taking part in one right now. Say a manufacturer of cornflakes is concerned about packet design. He might produce the identical product save for "More Fiber" printed in red at the top left hand corner. Packets are sent out randomly to different stores and the manufacturer can compare how quickly each disappears from the shelf.
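For readers who like to see the mechanics, the cornflakes experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the store numbers, the sales figures and the function names (assign_designs, permutation_p_value) are all hypothetical, and the permutation test is one simple way of asking whether a difference between the two packet designs could be down to chance. It is not taken from Ayres's book.

```python
import random
from statistics import mean


def assign_designs(store_ids, rng):
    """Randomly split the stores into two equal-sized groups."""
    shuffled = list(store_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def permutation_p_value(control, treated, rng, n_permutations=5000):
    """Two-sided permutation test on the difference in mean sales.

    Returns the observed difference (treated minus control) and the
    fraction of random relabellings that give a difference at least
    as large in absolute value.
    """
    observed = mean(treated) - mean(control)
    pooled = list(control) + list(treated)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical: 40 stores, half get the plain packet, half "More Fiber".
plain_stores, fiber_stores = assign_designs(range(40), rng)

# Hypothetical weekly sales per store; a real trial would use till data.
plain_sales = [rng.gauss(100, 10) for _ in plain_stores]
fiber_sales = [rng.gauss(115, 10) for _ in fiber_stores]

lift, p_value = permutation_p_value(plain_sales, fiber_sales, rng)
print(f"Mean extra packets sold per store: {lift:.1f}")
print(f"Permutation p-value: {p_value:.4f}")
```

The important step is the random assignment. Because chance alone decides which store gets which packet, any systematic difference in sales can reasonably be attributed to the design rather than to the kind of store that happened to receive it.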

It gets more complicated, but the next thing is for consumers to game the system. Once we are aware of what is going on we should be able to turn the thing to our advantage.

Is there no place for the expert then? The wise expert will use this new technology and add value to it, by recognizing the flaws in clinical trials, just as I have in pointing out how the manufacturers cheat in their trials of supposedly new drugs against chlorambucil in CLL. You have to be aware of the tricks that are played in RCTs. However, gone are the days when we can just say, "Lies, damned lies and statistics." We need to understand statistics and make them work for us rather than the opposition.


Anonymous said...

Terry: I'm looking for the blog you wrote on lymphomas; someone I know was just Dxed. Is there a way I can search for it?

john liston

Anonymous said...

Never mind Terry, I figured it out

john liston

Mike said...

"...recognizing the flaws in clinical trials, just as I have in pointing out how the manufacturers cheat in their trials of supposedly new drugs against chlorambucil in CLL. You have to be aware of the tricks that are played in RCTs."

Would you please enlighten us in detail how the manufacturers cheat in their trials? It would be interesting to read, as a number of CLL people are on trials or considering trying one. Thanks.

Terry Hamblin said...

The classic way is to choose chlorambucil as the comparator since it is licensed to treat CLL. Then they choose an inadequate dose, say 40 mg/sq m/month. Fludarabine and alemtuzumab were both licensed on this basis. With bendamustine they were more subtle, giving one drug in mg/sq m and the other in mg/kg, but instead of choosing actual weight, they chose ideal weight, which in most cases led to underdosing of chlorambucil. We know from successive MRC trials that chlorambucil dose is critical. When given at its optimal dose it has the same effect as fludarabine but is less toxic.

The other thing they do is to refuse to ask the right questions. Is ofatumumab better than rituximab? There are no head-to-head comparisons. Fludarabine v cladribine? FCR v PCR? FCR lite v FCR? Why would they? Somebody would have to lose. Since most trials are paid for by the manufacturers, there is no incentive to do these trials. In the UK the NHS is paying for a trial that compares low-dose rituximab with the full dose. But even so the manufacturers have set up a 'spoiler' trial to reduce recruitment to the NHS trial.