[Author’s Note: Only after reading the book did I discover that an updated version has been recently published, so this review just covers the original version.]
Anyone who wishes to understand predictive analytics (PA) but doesn’t want to get bogged down in the math and computer programming should read Eric Siegel’s book Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.
Siegel accomplishes three things that few writers on technical subjects do:
- He explains his subject clearly and with a minimum of jargon
- He uses plenty of specific and practical examples
- He enjoys himself, allowing his readers to do the same
A Worthwhile Preface
I recommend reading the book’s Preface, a section people often skip. It gives you a feeling for both the author and his enthusiasm for his topic. He notes, “Everyone craves the power to see the future; we are collectively obsessed with prediction.”
Siegel uses the example of people who throw down good money for palm readers or who make a point of reading their horoscopes every day. But they are just the tip of the iceberg. Think about our day-to-day concerns. What’s the weather going to be? Where is the stock market going to go? How will my company perform next quarter? Will my current romantic relationship turn out well? What kind of job will I be able to get in another year?
Even at a more mundane level, we fixate on the future, wondering if the supermarket will be carrying the product we want, if our teammates will be throwing the basketball our way, if our new clothes will fit just right, if our boss will react positively to a comment we want to make during a meeting, etc. A large portion of our thoughts focus on trying to predict the outcomes of our (and other people’s) actions and decisions.
A Little Prediction
Predictions are so difficult to make accurately that some people consider them a fool’s errand. Why bother to try? For those prediction cynics, Siegel has good news that he calls The Prediction Effect: that is, “a little prediction goes a long way.” It is the foundation for his book.
The premise is that for many purposes, we don’t need total clairvoyance to benefit from predictions. Instead, we only need to do a little better at guessing—that is, a little better than the laws of probability would dictate, all else being equal.
In short, the goal is to get a leg up on chance itself:
Prediction seems to defy a Law of Nature: You cannot see the future because it isn’t there yet. We find a workaround by building machines that learn from experience. It’s the regimented discipline of using what we do know—in the form of data—to place increasingly accurate odds on what’s coming next.
In the Introduction of the book, Siegel expands on The Prediction Effect, using personal examples to show how powerful the effect can be in a range of areas, from health insurance and identity theft to spam filtering and music recommendations.
Then he notes how The Prediction Effect can help organizations, giving them “an entirely new form of competitive armament.”
The Art of Feeding Machines
Early on, Siegel discusses machine learning, easing the reader into this technical term in a nontechnical way. Rather than jumping straight into methods, he lays out a boatload of bullet points summarizing the ways in which various companies and industries have benefited from predictive analytics. This is where the subtitle of the book comes into play, as he breaks the examples into various categories such as “People Love, Work, Procreate, and Divorce” (e.g., how online dating sites use PA) and “People Get Sick and Die” (e.g., how health insurance companies predict whether people will perish within the next 18 months).
Once he’s raised reader expectations by establishing the practical power of PA, he tempers those expectations by citing some of the numbers underlying The Prediction Effect. Using the straightforward example of a direct mail campaign, he notes that boosting the effectiveness of such a campaign by just a couple of percentage points can result in a huge financial boon. “Predictions need not be accurate to score big value,” he concludes.
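To see why a small improvement matters, here is a minimal sketch of the arithmetic behind a direct mail campaign. Every figure in it (list size, response rates, profit per response, cost per piece) is a hypothetical value chosen for illustration, not a number from the book, and `campaign_profit` is a name I made up.

```python
def campaign_profit(num_mailed, response_rate, profit_per_response, cost_per_piece):
    """Net profit of a mailing: profit from responders minus total mailing cost."""
    responders = num_mailed * response_rate
    return responders * profit_per_response - num_mailed * cost_per_piece


# Hypothetical campaign: 1,000,000 pieces, $200 profit per response, $1.50 per piece.
# Baseline assumes a 1% response rate; the improved case assumes 3%.
baseline = campaign_profit(1_000_000, 0.01, 200.0, 1.50)
improved = campaign_profit(1_000_000, 0.03, 200.0, 1.50)

print(f"baseline profit: {baseline:,.0f}")
print(f"improved profit: {improved:,.0f}")
```

Under these assumed numbers, the mailing cost stays fixed while the profit from responders triples, so two extra percentage points of response multiply the net profit several times over.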
Defining Predictive Analytics
Siegel defines PA as “technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions.” Then he goes on to show why prediction is different from forecasting, a subject I’ve written about in a previous article.
He also describes why, with few exceptions, it is organizations rather than individuals who tend to benefit from PA (I’m not sure this will remain true in the future, but that’s a subject for another article). The short explanation is that “PA rocks the enterprise’s economies of scale.”
How to Apply PA
By the end of the Preface and Introduction, Siegel has already covered a lot of ground (and a considerable percentage of the book). Chapter One drills down into PA concepts, highlighting applications by breaking them into two components: what is predicted and what’s done with the prediction. I found this to be a terrific way to turn a potentially abstruse topic into one anyone can grasp.
He makes concepts as clear as possible, so I doubt it’s threatening even to mathphobes when he uses a few numbers to illustrate “how straightforward it is to calculate the sheer value resulting from The Prediction Effect.”
The Making of Models
Siegel sells the reader on the potential of PA before moving into the concept of predictive modeling. And, even then, he keeps the topic as jargon-free as possible. For example, he defines a PA model in this way:
A mechanism that predicts behavior of an individual, such as a click, buy, lie, or die. It takes the characteristics of an individual as input, and provides a predictive score as output. The higher the score, the more likely it is that the individual will exhibit the predicted behavior.
Then he slips into predictive model rules, keeping his diagrams about as nonthreatening as possible. Once again, though, he doesn’t dwell on the abstractions but, rather, demonstrates with real-world examples how PA works within organizations.
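The model definition quoted above (characteristics in, score out) can be sketched in a few lines. This is only an illustration: the feature names, weights, and the choice of a logistic formula are my assumptions, not anything Siegel specifies, and a real model would learn its weights from data.

```python
import math

# Hypothetical weights; a real model would learn these from training data.
WEIGHTS = {"visits_last_month": 0.4, "opened_last_email": 1.2}
BIAS = -2.0


def predictive_score(individual):
    """Map an individual's characteristics to a score between 0 and 1.

    A higher score means the predicted behavior (for example, a purchase)
    is considered more likely.
    """
    z = BIAS + sum(weight * individual.get(name, 0.0)
                   for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


inactive = predictive_score({"visits_last_month": 0, "opened_last_email": 0})
engaged = predictive_score({"visits_last_month": 5, "opened_last_email": 1})
print(inactive, engaged)
```

The engaged individual receives the higher score, which is all the definition asks of a model.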
Keeping It Real
In Chapter Two, Siegel tells how he became caught up in a viral news story about how PA had allowed the retail chain Target to “predict” that a man’s teenage daughter was pregnant before the man himself knew. In this way, Siegel explores some of the perils of PA (and of the media’s coverage of it) in a compelling narrative in which he serves as a character. He describes two camps: those who champion and benefit from PA, and those who are leery of its effects on privacy and personal data.
He sees no simple solutions to finding the right balance between the two points of view: “The world will continue struggling to impose order on the distribution of medical facts, financial secrets and embarrassing photos.”
The book covers a number of potentially problematic applications of PA, such as how some organizations are using personnel data in order to predict when individuals are getting ready to quit their jobs, and how the criminal justice system could potentially use data to predict issues such as a convicted criminal’s risk of recidivism.
Siegel shows that he’s well aware of the ethical challenges associated with using PA. In the end, he may be a PA booster, but he’s certainly not blind to the dangers.
The Data Effect
“Data always speaks,” writes Siegel. “It always has a story to tell, and there’s always something to learn from it.”
That assertion is at the heart of what he calls “The Data Effect.” In short, he believes that “data is always predictive.”
In an age when we are awash in ever-deepening oceans of data, this is an intimidating claim. PA experts will inevitably find patterns in the data. To most of humanity, the flood of data looks like useless slush, but to people of Siegel’s ilk, it is as valuable a resource as oil or iron.
The book contains a massive group of tables that the author touts as “Bizarre and Surprising Insights.” Although fun and informative to read, these struck me as overkill and, at least in the Kindle version of the book, an impediment to narrative flow.
However, he does a good job of seguing this data into a key PA point: that analysts care much less about causation than they do prediction. For example, it’s not as important to explain the correlation between shopping habits and debt repayment as it is to be able to predict the repayment of debts.
Another key point is that there’s no guarantee that significant correlations will continue forever. That is, just because shopping habits predict debt repayment today, we can’t assume the same will be true five years from now. In short, the insights from PA often have a shelf life. If companies want to leverage PA, they must periodically conduct new analyses to ensure the expected correlations still hold.
Climbing Decision Trees
“Predictive analytics (PA) serves as an antidote against the poisonous accumulation of micro-risks,” Siegel writes. He uses examples from Chase Bank to make his case.
One of the risks, in this case, is the “risk” of loan prepayment, because prepayment keeps banks from accruing interest payments. (Siegel briefly alludes to the subprime mortgage disaster, noting that “PA didn’t prevent the global financial crisis.” I would have liked to see him discuss this more fully–e.g., if most large banks used PA, then why couldn’t they see, as some others did, that the risks of defaults among their own loan portfolios were growing?–but he holds fast to his general claim that PA is not especially useful for adjusting the “absolute measurements of risk when a broad shift in the economic environment is nigh.”)
The PA technique that’s covered in greatest detail here is the “decision tree,” which is a way of illustrating a series of if-then statements along a variety of decision paths. Each path leads from the root of the tree (which, counter-intuitively, starts at the top of the chart, or at the left-hand side) and descends along various branches to a specific tree “leaf.” An analyst can attach probabilities to decision trees, as can be seen in the image below.
Too often, decision trees are made to seem more complex than they really are. In this case, Siegel does an artful job of making both the appearance and analysis easily understood. Alluding to a version of the tree above, he writes, “With only two factors taken into consideration [that is, income level and mortgage], we’ve identified a particularly risky pocket: higher interest mortgages that are larger in magnitude, which shows a whopping 36% chance of prepayment.”
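Because a decision tree is just a series of if-then statements ending in leaves, a tiny one can be written directly as code. The sketch below assumes the two factors in the quoted passage are interest rate and mortgage size. Only the 36% leaf comes from the quote; both thresholds and the other two leaf probabilities are hypothetical placeholders, and `prepayment_risk` is my own name.

```python
def prepayment_risk(interest_rate, mortgage_amount):
    """Walk a two-level decision tree and return the leaf's prepayment probability."""
    if interest_rate >= 0.08:            # hypothetical threshold for "higher interest"
        if mortgage_amount >= 180_000:   # hypothetical threshold for "larger" mortgages
            return 0.36                  # the risky pocket quoted in the review
        return 0.15                      # hypothetical leaf value
    return 0.05                          # hypothetical leaf value


print(prepayment_risk(0.09, 250_000))
print(prepayment_risk(0.09, 100_000))
print(prepayment_risk(0.05, 250_000))
```

Each call follows one path from the root to a single leaf, which is exactly how the chart is read.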
Once he’s covered the subject of decision trees, Siegel goes on to discuss a number of other technical terms, such as “training data” and “lift.” Since lift is basically the payoff associated with an effective PA model, it’s especially critical to anyone trying to evaluate the potential costs and benefits associated with investing in PA.
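Lift is commonly computed as the response rate among the individuals a model ranks highest, divided by the response rate across everyone. Here is a minimal sketch under that common definition; the scores and outcomes are made-up data, and `lift_at` is a hypothetical helper, not something from the book.

```python
def lift_at(scores, outcomes, fraction):
    """Lift for the top `fraction` of individuals when ranked by score.

    `outcomes` holds 1 where the behavior occurred and 0 where it did not.
    """
    n = len(scores)
    k = max(1, int(n * fraction))
    base_rate = sum(outcomes) / n
    if base_rate == 0:
        raise ValueError("lift is undefined when no one exhibits the behavior")
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(outcome for _, outcome in ranked[:k]) / k
    return top_rate / base_rate


# Made-up data: ten individuals, three of whom responded.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

top_fifth_lift = lift_at(scores, outcomes, 0.2)
print(round(top_fifth_lift, 2))
```

A lift above 1 means that contacting only the top-scored group finds responders at a higher rate than contacting people at random.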
The Ensemble Effect
In Chapter Five, Siegel discusses what he calls the “ensemble effect,” which occurs when multiple predictive models are joined together in such a way that they compensate for one another’s limitations. The result is a model that is better able than any of the component models to make predictions.
This abstract subject matter could turn into very dull stuff, but he wisely uses competitions–especially the one held by Netflix to develop a better way of making good movie recommendations to its customers–to humanize the idea and turn it into an interesting narrative. We learn about Kaggle, a platform for coordinating predictive modeling and analytics competitions, and about the various types of ensemble models that have been created in recent years.
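One simple way to combine models is a majority vote, which I use here as a stand-in for the more elaborate ensembles the book describes. The three "models" below are just hand-written lists of hypothetical predictions, each wrong on a different case, to show how the errors can cancel out.

```python
from collections import Counter


def majority_vote(votes):
    """Return the prediction that most of the component models agree on."""
    return Counter(votes).most_common(1)[0][0]


def accuracy(predictions, actual):
    """Fraction of predictions that match the actual outcomes."""
    return sum(p == a for p, a in zip(predictions, actual)) / len(actual)


# Hypothetical outcomes and predictions; each model errs on a different case.
truth   = [1, 0, 1, 1, 0, 1]
model_a = [0, 0, 1, 1, 0, 1]  # wrong on the first case
model_b = [1, 1, 1, 1, 0, 1]  # wrong on the second case
model_c = [1, 0, 0, 1, 0, 1]  # wrong on the third case

ensemble = [majority_vote(votes) for votes in zip(model_a, model_b, model_c)]

for name, preds in [("a", model_a), ("b", model_b), ("c", model_c), ("ensemble", ensemble)]:
    print(name, accuracy(preds, truth))
```

Because no two models make the same mistake in this toy data, the vote outperforms every individual model. Real ensembles rarely work out this neatly, but the compensating effect is the same idea.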
When people think about machine learning and predictive analytics, they tend to think about massive numbers being crunched inside computers. They’re less likely to think about text analytics and articulate answers to hard questions. Yet, Siegel takes us fairly deep into this subject, using IBM’s Watson (the computer that beat human champions on the game show Jeopardy) as the narrative hook.
Siegel admits that, despite his expertise in this field, he didn’t think the IBM team could pull off the creation of a Jeopardy-winning computer model:
Despite my own 20-odd years studying, teaching, and researching all things artificial intelligence (AI), I was a firm skeptic. But this task required a leap so great that seeing it succeed might leave me, for the first time, agreeing that the term AI is justified.
How is all this related to prediction? Because, in essence, Watson was designed to come up with many possible answers to any game-show question and then “predict” which one was the best one. This prediction became its final answer and led to its domination over the two former Jeopardy champions.
Watson wasn’t designed to emulate human thinking processes, but Siegel’s story did get me thinking about the relationship between “thinking” and “predicting.” It left me wondering how much of our own cognition is, essentially, prediction.
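The "generate many candidates, then predict the best one" idea reduces, at its simplest, to picking the candidate with the highest predicted confidence. The sketch below shows only that final selection step with made-up candidates and scores. It is in no way a description of how Watson is actually built.

```python
def best_answer(candidates):
    """Pick the (answer, confidence) pair with the highest predicted confidence."""
    return max(candidates, key=lambda candidate: candidate[1])


# Hypothetical candidate answers with made-up confidence scores.
candidates = [
    ("candidate A", 0.14),
    ("candidate B", 0.61),
    ("candidate C", 0.09),
]

answer, confidence = best_answer(candidates)
print(answer, confidence)
```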
Engineering the Art of Persuasion
Siegel demonstrates how PA can be leveraged not only to predict a person’s behavior but also to predict what influences that behavior.
This may seem like a subtle difference, but he shows just how crucial the distinction is. A company, for example, may be able to predict whether a customer is on the fence about renewing their contract (e.g., cell phone service) but still might not know exactly what will keep them from leaving. It turns out that sometimes making contact with the fence-sitter is the wrong way to go because it only encourages them to leave. It’s sometimes better to take another action, or take no action at all. PA can help organizations decide which course to take for any given individual.
The same principles of “persuasion modeling” can be used to influence potential voters, which is why President Obama hired over 50 analytics experts during his second campaign.
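The fence-sitter example can be sketched as a comparison of two predictions for the same person: the chance they stay if contacted versus the chance they stay if left alone. The customer names and probabilities below are hypothetical, and `choose_action` is my own illustrative helper rather than a method from the book.

```python
def choose_action(p_stay_if_contacted, p_stay_if_left_alone):
    """Contact a customer only when contact is predicted to improve retention."""
    if p_stay_if_contacted > p_stay_if_left_alone:
        return "contact"
    return "no contact"


# Hypothetical customers: (name, P(stay | contacted), P(stay | left alone)).
customers = [
    ("persuadable", 0.70, 0.50),    # contact helps
    ("sleeping dog", 0.40, 0.65),   # contact backfires
    ("sure thing", 0.95, 0.95),     # contact makes no difference
]

decisions = {name: choose_action(p_contact, p_alone)
             for name, p_contact, p_alone in customers}
print(decisions)
```

The point of the second prediction is visible in the middle row: a model that only predicted who is likely to leave would flag that customer for a call, while comparing the two predictions says to leave them alone.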
Predicting Future Predictions
Siegel closes his book with a scenario of how the world may look in the year 2020, when prediction models are even more pervasive than they are today. In this way, he summarizes the many areas that predictive analytics will influence, from entertainment services to advertising to health care. It’s a fascinating list that helps us recall all the territory he’s covered in the book. And it highlights the fact that predictive analytics is leading to a nearly invisible cultural revolution that will influence the lives of billions of people.
There are a few areas where I might quibble with Siegel’s opinions or editorial decisions in this book, but overall I think he pulls off an amazing feat: making a difficult subject not only comprehensible but downright compelling and fun. I highly recommend it.
BY MARK VICKERS
This is a review of the original book rather than the revised and updated edition.