By Bruce Ramshaw, MD

“Moneyball,” a book by Michael Lewis that was later made into a movie starring Brad Pitt, describes the success of applying the principles of data science to develop a winning strategy in baseball. Data science is a transferable skill, so why hasn’t Moneyball happened in health care? You would think that if data science can be used to win more games in baseball, it could be used to lower costs and improve outcomes. Data science is about measurement and improvement. In baseball, the measurement to be improved is runs scored on the lowest possible budget, which produces the most wins per dollar spent.

Similarly, if we want a sustainable health care system, we should measure and improve the value of care we provide, resulting in lower costs and better outcomes over time. If you can measure something, you can improve it. But if something is not being measured, it can’t be improved, and we’re not measuring the value of care in health care, in any organization, in any health care system in the world.

For data science to work, there are basic rules. First, data requires “context,” or a definable process. Attempting to apply data science without context doesn’t work. In sports, context is provided by the specific set of rules for a particular game, like baseball: nine players, three outs, three strikes, nine innings, etc. The insights gained from applying data science tools to baseball will not carry over unchanged to a different sport such as American football, with 11 players, four quarters, four downs, etc.

In health care, context means defining each whole patient care process. The specific patient and treatment factors and outcome measures collected will be different for different types of patient care processes. For example, outcome measures used to define the value of care for a breast cancer process will not be the same as those used for a ventral hernia process.

Another principle of data science is that it should be applied to measure and improve the outcomes that matter most. In baseball, what matters most for improving the value of team performance is combining salaries (financial measures) with factors that result in the most runs and wins (e.g., on-base percentage). Applying data science to measure and increase the number of pitches thrown will likely not help win more games.

We’re not typically measuring outcomes that matter in health care. We tend to measure things that are easy to measure, such as whether antibiotics are given before surgery, rather than the factors that improve the value of care the most. We document these easy-to-measure factors, often because of perverse financial incentives or penalties, without measuring to see what effect they have on outcomes. To truly measure value, we should be combining financial measures with outcome measures that matter in the context of each definable whole patient care process. Until we do, we can’t lower costs and improve patient outcomes at the same time.
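As a rough illustration of what combining financial and outcome measures within one defined care process could look like, the sketch below computes a simple value ratio for each patient episode and averages it per process. The field names, the 0-to-100 outcome score and the ratio itself (outcome points per $1,000 spent) are assumptions made for illustration, not a method described in this article.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Episode:
    """One patient's pass through a single defined care process (hypothetical fields)."""
    process: str          # e.g. "ventral hernia"
    outcome_score: float  # assumed 0-100 composite of outcomes that matter to patients
    total_cost: float     # assumed total cost of the whole care process, in dollars


def value_per_thousand_dollars(episode: Episode) -> float:
    """Assumed value measure: outcome points per $1,000 spent."""
    if episode.total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return episode.outcome_score / (episode.total_cost / 1000)


def mean_value_by_process(episodes: list[Episode]) -> dict[str, float]:
    """Average value within each care process; different processes are never pooled."""
    grouped: dict[str, list[float]] = {}
    for ep in episodes:
        grouped.setdefault(ep.process, []).append(value_per_thousand_dollars(ep))
    return {process: mean(values) for process, values in grouped.items()}
```

Grouping by process before averaging reflects the point about context: a value figure for a breast cancer process is not comparable with one for a ventral hernia process.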

For over a century, baseball used data the same way health care does today. Baseball statistics were originally built on one set of static measurements, such as batting average, runs and runs batted in (RBIs). These statistics were invented in 1845, and presented in the “box score” for each game. The more these old statistics were examined, the less sense they made. They were not the best measures of player and team value, so they didn’t give the best insight into how to score more runs and win more games.

In health care, we also use static measurements that don’t measure value well. Take wound infection, for example. Every hospital in the United States reports wound infection based on the CDC definition: superficial, deep or organ space. But when we asked patients who had wound infections what they thought, they said the CDC definition was not very helpful. Patients thought measuring wound infections by the invasiveness of the treatment and the length of time required to heal their wounds was a much better measurement. When we looked at the data for wound infections after open ventral hernia repairs, the patients were right: Some superficial infections took months or years to heal, requiring invasive surgical procedures, whereas some deep infections were resolved with a single course of oral antibiotics. We can learn to apply better measurements in health care.
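The contrast between the two ways of measuring a wound infection can be sketched in a few lines. The example below records both the CDC depth category and the patient-centered measures (most invasive treatment and time to heal), then ranks infections by the latter. The treatment scale, field names and ordering rule are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from enum import IntEnum


class CdcDepth(IntEnum):
    """Depth categories named in the text."""
    SUPERFICIAL = 1
    DEEP = 2
    ORGAN_SPACE = 3


class Treatment(IntEnum):
    """Hypothetical invasiveness scale, lowest to highest."""
    ORAL_ANTIBIOTICS = 1
    IV_ANTIBIOTICS = 2
    BEDSIDE_DRAINAGE = 3
    SURGERY = 4


@dataclass
class WoundInfection:
    patient_id: str
    cdc_depth: CdcDepth
    most_invasive_treatment: Treatment
    days_to_heal: int


def patient_burden_key(infection: WoundInfection) -> tuple[int, int]:
    """Assumed ordering: invasiveness of treatment first, then time to heal."""
    return (int(infection.most_invasive_treatment), infection.days_to_heal)


def rank_by_burden(infections: list[WoundInfection]) -> list[WoundInfection]:
    """Sort from highest to lowest patient-reported burden."""
    return sorted(infections, key=patient_burden_key, reverse=True)
```

Under this ordering, a superficial infection that needed surgery and took a year to heal ranks above a deep infection that resolved with one course of oral antibiotics, which is the opposite of what the depth category alone would suggest.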

It wasn’t until the 1970s that Bill James, a writer and night watchman at a Stokely Van Camp pork and beans cannery, began to question the status quo of baseball statistics. In 1977, James published a periodical called the “1977 Baseball Abstract: Featuring 18 Categories of Statistical Information That You Just Can’t Find Anywhere Else.”

James developed new ways to measure baseball success and found that runs scored were highly correlated with wins. He developed weighted correlations that led to a formula that generated what he called “runs created.” As he developed momentum, he met with a small group of friends, including Sports Illustrated writer Dan Okrent, at La Rotisserie Française restaurant in New York City. That is where the concept of “Rotisserie” baseball was born. This has developed into a fantasy sports industry, which is worth nearly $10 billion annually.
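The article does not give the formula, but the basic version of runs created as it is commonly cited multiplies a hitter's times on base by total bases and divides by opportunities. The sketch below implements that basic version only; James published several later refinements, and the function name and parameter names here are illustrative.

```python
def runs_created(hits: int, walks: int, total_bases: int, at_bats: int) -> float:
    """Basic runs created as commonly cited: (H + BB) * TB / (AB + BB).

    This is the simplest published form, not necessarily the exact
    version the article has in mind.
    """
    opportunities = at_bats + walks
    if opportunities <= 0:
        raise ValueError("at_bats + walks must be positive")
    return (hits + walks) * total_bases / opportunities


# Made-up season line for illustration only.
estimate = runs_created(hits=150, walks=50, total_bases=250, at_bats=500)
print(f"Estimated runs created: {estimate:.1f}")
```

The formula rewards getting on base and advancing runners together, which is why a measure such as on-base percentage ends up mattering more than batting average alone.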

At that time, the only people interested in these new baseball measurements were the fans. As James continued to develop better measurements, there was one other group that showed interest: player agents. The agents wanted more statistics that validated the value of their clients, the professional baseball players, to justify negotiating larger salaries.

Interestingly, the group of people who showed no interest in these better measurements and the application of data science to baseball were the owners and managers of the teams. The people most invested in the outcomes of the games had no interest in changing how they used their data and managed their teams. James, working with a company called STATS Inc., tried to persuade teams that they should use the new measures he had developed. Teams just weren’t interested.

Part of the problem was that baseball already had its data company, Elias Sports Bureau. The company had the contract for managing all of baseball’s statistics. As with the current generation of electronic health records in health care, baseball at that time did not think there was any need to change. The company certainly did not want to admit or believe that the statistics it was paid to collect and publish were poor indicators of player and team value. There was no appetite or incentive for innovation or improvement.

The status quo was not challenged again until two entrepreneurs from the financial industry took what they had learned about applying data science to financial derivatives and realized they could do the same thing in baseball. They started a company called AVM (Advanced Value Matrix) Systems in 1994, and approached teams to see if they could consult and apply their data science methods to baseball.

Change did not come easily. It wasn’t until the Oakland A’s were sold to a more frugal ownership group that there was enough financial pressure to make changes to the status quo. The inequities in baseball budgets rose to the level where some teams could afford the best individual players and others could not. Change usually only occurs when the pain of the status quo rises to a level greater than the discomfort of making a change.

When the new owners refused to match the salary offers for star players who were plucked away by wealthy teams like the Yankees, the A’s management, led by general manager Billy Beane, felt the pressure to change how they operated. Billy had read every one of Bill James’ “Baseball Abstract” publications, and he discovered that baseball was not using data appropriately.

Paul DePodesta was an intern for the Cleveland Indians when Billy met him. Paul graduated from Harvard University with a degree in economics, but his real passion was the intersection between economics and psychology, a discipline now called behavioral economics. Paul had recently met the Wall Street traders-turned-baseball data gurus during one of their initial sales calls and he was intrigued. Soon after that, Billy Beane hired Paul, and Paul convinced Billy to hire AVM Systems. With the help of Paul and AVM Systems, Billy began to apply data science to the Oakland A’s. Moneyball was the result.

Today our health care system’s use of data is similar to where baseball was in the 1990s. We’ve learned many wrong lessons. From reductionist tools, like prospective randomized controlled trials, we’ve learned to apply treatments that seem best for the average patient to all patients, regardless of differences in each local environment and the biologic variability of patient subpopulations. We’ve learned that becoming a doctor entitles us to rely on our training and experience, without appropriate data, to make treatment recommendations. Probably most harmful of all, we’ve allowed health care leaders to continue to push the growth and volume model despite the harm done not only to patients, but to doctors and other caregivers as well.

The financial constraints and inequities in health care are worsening and are contributing to more and more harm for patients, employers and, in some cases, even doctors themselves. Tragically, there are reports in the United States of young people dying because they can’t afford insulin. Doctors are dying by suicide at a rate greater than in the general population.

A main challenge in making the necessary changes in health care is letting go of the pride and the belief that we (doctors, hospitals, insurers, even patients sometimes) know what is best for any given situation. Billy Beane describes the mindset required to make this change in the book: “The hardest thing … is there is a certain pride, or lack of pride to do this right.” Letting go of beliefs and the way we’ve always done things is hard and uncomfortable. But discomfort is a normal and necessary part of learning, and transformation can’t occur without changing our mindsets and the structure for how we care for patients and manage data.

There is a major difference between applying data science to baseball and to health care. Ultimately, baseball is a competitive sport—it’s about winning, beating another team. When other major league teams learned to apply data science to their organizations, the advantage for the Oakland A’s was diminished. In fact, just two years after the A’s had tied the Yankees for the most wins during the 2002 season with one of the lowest budgets in baseball, the Boston Red Sox won their first World Series in almost 100 years using the same principles of data science. This data-driven effort was led by Theo Epstein, the new general manager, and Bill James, who was hired by Boston’s owner, John Henry, in 2003.

In health care, we should not be competing. We should be focused on a goal that aligns all of us: improving the value of care for all patients with any disease or health problem. When we align around the goal of value and work collaboratively to improve value for patients, we can apply one of the most important tools of data science: the ensemble model for learning. If every clinical team in each local environment were to implement a value-based continuous learning model and then network the learnings from each clinical team, we could improve value forever.
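One very simple way to picture networking the learnings from many clinical teams is to pool each team's local estimate, weighted by how many patients it is based on. The sketch below shows that sample-size-weighted average, which is only one basic form of combining estimates. The class, field names and numbers are hypothetical, and a real learning network would need far more than this (risk adjustment, shared definitions, privacy safeguards).

```python
from dataclasses import dataclass


@dataclass
class LocalEstimate:
    """One clinical team's estimate for the same defined care process (hypothetical)."""
    team: str
    n_patients: int         # number of patients behind the estimate
    estimated_value: float  # e.g. the team's measured value of care for the process


def pooled_estimate(estimates: list[LocalEstimate]) -> float:
    """Combine local estimates, weighting each team by its patient count."""
    total_patients = sum(e.n_patients for e in estimates)
    if total_patients <= 0:
        raise ValueError("at least one team must contribute patients")
    weighted_sum = sum(e.n_patients * e.estimated_value for e in estimates)
    return weighted_sum / total_patients


# Made-up numbers for illustration only.
network = [
    LocalEstimate("Team A", n_patients=100, estimated_value=4.0),
    LocalEstimate("Team B", n_patients=300, estimated_value=6.0),
]
print(pooled_estimate(network))
```

Weighting by patient count keeps a small team's noisy result from dominating, while still letting every team contribute to the shared picture.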

Data science is real, but very different from the reductionist science paradigm we’ve been taught and are functioning under in health care today. Until we feel that the pain of continuing to suffer in this reductionist status quo is worse than the discomfort of learning and applying a new data science paradigm, as baseball did with Moneyball, we will continue to suffer the consequences. I believe the inequities and harm resulting from our current system structure are reason enough to commit to making this change now.


Dr. Ramshaw is a general surgeon and data scientist in Knoxville, Tenn., and a managing partner at CQInsights. You can read more from him on his blog: www.bruceramshaw.com/blog.