In the concluding chapter of Everything Is Predictable, Tom Chivers asserts that “…everything decision-related is Bayesian. It simply describes the optimal way to integrate new information with your prior best guesses. When you look at it like that, so many things seem to make more sense.” If you’re in the business of quantifying consumer behavior, it’s quite a compelling statement.
Chivers provides numerous examples throughout his book demonstrating how Bayesian* thinking can be applied to decision-making both on a large scale and in everyday life. He provides a specific example related to political views (which we know can become quite heated) and the ever-present question of who is being more ‘irrational’. When we consider each debater’s prior beliefs about a given candidate’s trustworthiness or integrity, it’s easier to see how those beliefs may overshadow any new information they are exposed to. It’s human nature. It’s also Bayesian.
The author further theorizes that as long as information is presented in a familiar context, people are “pretty good at reasoning” – although some are better than others. Those who are slightly better than average at predicting the future have the ability and willingness to use previously available information, and are open to updating their predictions based on new information. Chivers would say that instead of seeing every issue as a new situation, people who “keep score” and use available information (existing or new) have an edge. Again, it’s Bayesian.
Even our brains process information using a Bayesian framework. How we perceive the world is filtered through what we expect to experience, and when an experience deviates from what we assume to be true, we seek out further information to shift our conclusion (or not). It’s why optical illusions exist, it’s why we can read typos without skipping a beat, and it’s even why we tend to become set in our ways as we get older. When we have a substantial amount of prior experience in a particular area, our brains fill in the gap, which becomes our perceived reality.
What is Bayes’ Theorem?
Formally, the Bayesian perspective says that we have pre-existing beliefs about what we expect, and we continually update those beliefs based on new information. Our beliefs may never be perfect, nor will our new information be “complete”. But when we combine the two, we end up with an estimate that is likely closer to reality than the one we held before we had the new information. This, Chivers points out, is where some (i.e. frequentists who shun subjective prior information) bristle at the thought of leaving behind long-standing null hypothesis testing (where the null hypothesis is that there is no real difference). While that approach requires researchers to either reject or fail to reject the null based on a calculated p-value (the probability that you’d see results at least as extreme as those observed if there was no actual difference), Bayes would say that we’re constantly making decisions under uncertainty and we’d be better off estimating how likely something is to occur instead of saying it either does or doesn’t occur.
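For readers who like to see the formula, the updating described here is usually written as follows. The symbols are our own shorthand and do not come from the book: H stands for a hypothesis (a belief) and D for the new data.

```latex
% Bayes' theorem: posterior = likelihood x prior / evidence
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)},
\qquad
P(D) = P(D \mid H)\, P(H) + P(D \mid \neg H)\, P(\neg H)
```

Here P(H) is the prior belief, P(D | H) is how likely the data would be if the belief were true, and P(H | D) is the updated (posterior) belief.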
Chivers provides several examples of those who are for and against (and some on the fence) using a Bayesian perspective to quantify the world around us. With regard to forming views about the world, he also notes that “…if we want to talk about probabilities – making decisions under uncertainty – we need the in-between numbers. More than that, we need – or, at least, we want – a mathematical framework for moving between those numbers, for changing our beliefs about something…There is no way to do decision theory with frequentist math.”
But rather than debate which is superior, it may be more useful to say that p-values and Bayes’ theorem answer different questions. While there are cases where you would reasonably use a p-value to understand how likely it is that you would see your data, given your hypothesis, there are other times when what we really want to understand is the inverse – the probability that a hypothesis is true, given the data. And this is where hierarchical Bayesian modeling is especially useful.
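To make the difference between the two questions concrete, here is a minimal Python sketch. All of the numbers are hypothetical and chosen only for illustration, and the function name posterior is our own. It shows that the probability of the data given a hypothesis can be high while the probability of the hypothesis given the data remains modest, because the prior matters.

```python
def posterior(prior, p_data_given_h, p_data_given_not_h):
    """Return P(H | data) using Bayes' theorem."""
    # Total probability of seeing the data, whether or not H is true.
    evidence = p_data_given_h * prior + p_data_given_not_h * (1 - prior)
    return p_data_given_h * prior / evidence


# Hypothetical inputs, for illustration only.
prior = 0.10               # P(H): belief before seeing the data
p_data_given_h = 0.80      # P(data | H)
p_data_given_not_h = 0.20  # P(data | not H)

p_h_given_data = posterior(prior, p_data_given_h, p_data_given_not_h)
print(f"P(data | H) = {p_data_given_h:.2f}")
print(f"P(H | data) = {p_h_given_data:.2f}")
```

With these assumed inputs, the data are quite likely if the hypothesis is true (0.80), yet the hypothesis itself is still less likely than not after seeing the data (roughly 0.31).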
How Bayes’ Theorem is Applied
Using prior information is a powerful tool for quantifying the probability of something occurring. As Chivers points out based on a long-standing example from 1959, it’s how we can say that if a friend has two children, and we know that at least one is a boy, the probability that both are boys is not 0.5, but 1/3, or about 0.33 (there are four equally likely combinations – boy/boy, girl/girl, boy/girl, girl/boy – but once we remove the girl/girl possibility, we’re left with only three, so the likelihood of your friend having two boys is one out of three). Although it feels awkward to say that the probability of the other child being a boy isn’t actually fifty-fifty, Chivers reminds us that it’s the prior knowledge that changes our belief about what is probable. The same is true for modeling consumer decision-making.
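The two-children example can be checked by simply listing the possibilities. The short Python sketch below assumes, as the example does, that boys and girls are equally likely and that the two births are independent; the variable names are our own.

```python
from fractions import Fraction
from itertools import product

# All equally likely two-child families: (older, younger).
families = list(product("BG", repeat=2))

# Condition on what we know: at least one child is a boy.
at_least_one_boy = [f for f in families if "B" in f]

# Among those, count the families where both children are boys.
both_boys = [f for f in at_least_one_boy if f == ("B", "B")]

p_two_boys = Fraction(len(both_boys), len(at_least_one_boy))
print(p_two_boys)  # 1/3
```

Removing the girl/girl family is the whole update: three families remain, and only one of them has two boys.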
When we are tasked with understanding how consumers make complex decisions (e.g. choosing among products with numerous features and various prices), we create a model of purchasing behavior from a series of hypothetical “shopping tasks” using hierarchical Bayesian modeling. We start with average preferences for each product feature (i.e. the priors) and then update them with each consumer’s own choices (i.e. new information) to arrive at estimates of individual-level preferences for each feature (i.e. the posteriors). While it’s true that individual-level information is dependent on the aggregate and vice versa, this process iterates thousands of times (thanks to modern advances in computing) until it reaches a stable solution – sharing information along the way and leaning more or less on the aggregate data depending on how varied individual choices are. It’s how we can reliably estimate individual-level preferences from a relatively small subset of consumer survey choices.
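The idea of leaning more or less on the aggregate can be illustrated with a deliberately simplified Python sketch. This is not the iterative estimation used in practice for choice data; it assumes a basic normal-normal model with known variances, and every name and number in it is hypothetical. It shows only the core behavior: an individual's estimate is pulled toward the group average, and the pull is stronger when individuals in the population are more alike.

```python
def shrunken_estimate(individual_mean, n_obs, noise_var, pop_mean, pop_var):
    """Posterior mean of one person's preference under a normal-normal model.

    Simplifying assumption: variances are known, so the posterior mean is a
    precision-weighted average of the person's own data and the group mean.
    """
    data_precision = n_obs / noise_var   # how informative the person's data are
    prior_precision = 1.0 / pop_var      # how tightly people cluster together
    weight_on_individual = data_precision / (data_precision + prior_precision)
    return (weight_on_individual * individual_mean
            + (1 - weight_on_individual) * pop_mean)


# Hypothetical respondent: own data suggest 3.0, the group average is 1.0.
similar_population = shrunken_estimate(3.0, n_obs=4, noise_var=4.0,
                                       pop_mean=1.0, pop_var=1.0)
varied_population = shrunken_estimate(3.0, n_obs=4, noise_var=4.0,
                                      pop_mean=1.0, pop_var=4.0)

print(similar_population)  # pulled strongly toward the group average
print(varied_population)   # stays closer to the respondent's own data
```

In a full hierarchical Bayesian model the group average and spread are themselves estimated from the data, which is why the process has to iterate.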
Once combined, individual feature preferences form estimated shares of preference with a wide range of applications, including market share estimation for specific groups of interest. Modeling capabilities have grown dramatically since software applications have embraced Bayesian analysis, and those capabilities are still expanding. What was once considered unreachable (i.e. estimating large scale consumer-level preferences) is now not only achievable, but mainstream.
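As a rough illustration of how individual preferences can be combined into shares, the Python sketch below sums nothing more than a total utility per product for each respondent, converts those utilities to choice probabilities with a logit rule, and averages across respondents. The logit rule is one common choice and is our assumption here; the function name and the utilities are hypothetical.

```python
import math


def shares_of_preference(respondent_utilities):
    """Average each respondent's logit choice probabilities across respondents.

    respondent_utilities: list of lists, one inner list per respondent,
    holding that respondent's total utility for each product.
    """
    n_products = len(respondent_utilities[0])
    totals = [0.0] * n_products
    for utilities in respondent_utilities:
        exps = [math.exp(u) for u in utilities]
        denom = sum(exps)
        for j, e in enumerate(exps):
            totals[j] += e / denom  # this respondent's probability of choosing j
    n_respondents = len(respondent_utilities)
    return [t / n_respondents for t in totals]


# Hypothetical utilities for three respondents and two products.
utilities = [
    [1.0, 0.0],  # prefers product A
    [0.0, 0.0],  # indifferent
    [0.0, 1.0],  # prefers product B
]
shares = shares_of_preference(utilities)
print(shares)
```

Restricting the list of respondents to a group of interest before calling the function gives that group's estimated shares.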
Chivers’ book sheds light on how wide-reaching the implications of Bayes’ theorem may be. Simply put, “Bayesian analysis is about the likelihood that something is true, given the information in front of us.” And that is precisely what we’re intending to model when we compile consumer choices and make sense of them through hierarchical Bayesian analysis and simulations. The author not only makes complex concepts accessible, but also has an uncommon ability to make the topic of probability engaging by weaving it through our everyday lives. If understanding consumer (or human) decision-making is of interest to you, reading Everything Is Predictable: How Bayesian Statistics Explain Our World is worth the time investment. You’ll likely enjoy it.
*Thomas Bayes was an eighteenth-century Presbyterian minister from a wealthy and connected family. He had too many resources and too much time on his hands, and he explored mathematical principles for fun. Although his ideas about probability weren’t published until after his death, his foundational formula remains today as one of the ways we quantify consumer decision-making.
About the Author:
Lynn Leszkowicz
Lynn Leszkowicz, PhD is Sr. Research Director, Marketing Sciences