Finding out what the 60-80% will do next

If you’re like most people, I’ll bet you can list your best customers and, conversely, those who haven’t made a purchase in years. But do you have a good handle on what the majority – the 60 to 80% of your customers in the middle – will do next?

Perhaps you wish you could afford to tap into Big Data strategies to learn more about them. Actually, you can afford to learn more now, because Big Data has nothing to do with it. All you need is a laptop, just three integers of data per customer, and ‘small data’ principles and tools that are available on any budget.

Apply small data principles to get a wealth of info

One popular method of segmenting your customers for marketing purposes is RFM analysis: Recency, Frequency, and Monetary value. According to this model, your best customers are those who have purchased recently and frequently, and who spend the most, whether on expensive products or on many of them. With each of the three dimensions scored at one of three levels, a 3x3x3 array can help you assess where all your customers stand.

[Figure: RFM Analysis]

As you can see, it is easy to identify your best customers (green in the upper right) and your worst (red in the lower left). But what about the vast middle? Are all sixes really equal? A seven (7) has been assigned to a customer with low recency but high frequency and monetary value. If this person was a frequent buyer but hasn’t purchased in a while, I’ll bet this person isn’t a customer anymore. They may not be as valuable as the score suggests.
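To see how much a single score can hide, here is a minimal Python sketch. It assumes each of the three dimensions is scored from 1 (low) to 3 (high) and that the cell score is the plain sum of the three, which matches the seven described above but is otherwise an assumption about how the grid was built. The function names are hypothetical.

```python
from itertools import product

LEVELS = (1, 2, 3)  # assumed scale: 1 = low, 2 = medium, 3 = high


def rfm_total(recency: int, frequency: int, monetary: int) -> int:
    """Sum the three scores (assumed scoring rule, not confirmed by the article)."""
    for score in (recency, frequency, monetary):
        if score not in LEVELS:
            raise ValueError("each score must be 1, 2, or 3")
    return recency + frequency + monetary


def cells_with_total(total: int) -> list:
    """List every (R, F, M) cell of the 3x3x3 grid that adds up to `total`."""
    return [cell for cell in product(LEVELS, repeat=3) if rfm_total(*cell) == total]


if __name__ == "__main__":
    # Very different customers end up with the same score.
    for r, f, m in cells_with_total(7):
        print(f"R={r} F={f} M={m}")
```

Under this assumed rule, a lapsed big spender (R=1, F=3, M=3) and a recent, frequent, low-value buyer (R=3, F=3, M=1) both land on seven, which is exactly the ambiguity the score cannot resolve.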

We can approach this problem in a different way. Professor David Schmittlein, Dean of MIT’s Sloan School of Management, co-authored a paper called ‘Counting Your Customers: Who Are They and What Will They Do Next?’ (Schmittlein, Morrison, & Colombo, 1987). It uses an advanced statistical model to estimate the probability that a customer will make another purchase or has stopped making purchases altogether. Although published in 1987, the technique has not been widely used because it requires maximum likelihood estimation, hypergeometric functions, and other advanced statistical computing techniques.

Lucky for us, the Wharton Customer Analytics Initiative (Wharton Customer Analytics Initiative, 2012) published a package (Wadsworth, 2012) that allows analysts without PhDs to apply these methods right on their own laptop computers. Even with tens of millions of customers, you can perform this analysis without any Big Data tools. With only three data points per customer, you can be ready to make data-driven decisions in less than a day. Called “Buy ‘Til You Die”*, the package is available through the R Project for Statistical Computing, a free software environment for statistical computing and graphics.
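The article does not spell out which three data points are meant. A common convention in models of this family is the number of repeat purchases (x), the time of the most recent purchase measured from the first purchase (t_x), and the total time the customer has been observed (T). The Python sketch below assumes that convention and whole days as the unit; the function name `summarize_customer` is hypothetical and is not part of the package.

```python
from datetime import date


def summarize_customer(purchase_dates, observation_end):
    """Reduce one customer's purchase history to three integers (x, t_x, T).

    x   = number of repeat purchases (distinct purchase days after the first)
    t_x = days from the first purchase to the most recent purchase
    T   = days from the first purchase to the end of the observation period
    """
    days = sorted(set(purchase_dates))  # several orders on one day count once
    if not days:
        raise ValueError("customer has no purchases")
    first, last = days[0], days[-1]
    if last > observation_end:
        raise ValueError("purchase recorded after the observation period")
    return len(days) - 1, (last - first).days, (observation_end - first).days


if __name__ == "__main__":
    history = [date(2013, 1, 10), date(2013, 2, 9), date(2013, 3, 11)]
    print(summarize_customer(history, date(2013, 6, 30)))  # (2, 60, 171)
```

Because each customer collapses to three small integers, even a very large customer base fits comfortably in memory on a laptop.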

*Note:  No consumers were harmed in the creation of this model. Life and death in “Buy ‘Til You Die” are defined by purchasing behavior, not vital organ function. 

The model makes certain statistical assumptions about how purchase rates and inactivity are distributed across customers, but the package includes methods to check whether the fitted model matches your data. After fitting the model, you can estimate the probability that a customer is still alive (active) and predict how many purchases they will make in the next year.
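To give a feel for what the probability of being alive looks like, the sketch below uses the BG/NBD model (Fader, Hardie, and Lee) rather than the model from the 1987 paper, because its formula needs no hypergeometric functions. This is a swapped-in, simpler relative, not the paper's method. The parameter values are illustrative placeholders, not estimates fitted to any data, and time is assumed to be measured in weeks.

```python
def p_alive_bgnbd(x, t_x, T, r, alpha, a, b):
    """P(customer is still active) under the BG/NBD model.

    x     = number of repeat purchases
    t_x   = time of the most recent purchase (same units as T)
    T     = total time the customer has been observed
    r, alpha, a, b = model parameters, normally found by maximum likelihood
    """
    if not 0 <= t_x <= T:
        raise ValueError("t_x must lie between 0 and T")
    if x == 0:
        # BG/NBD lets customers drop out only right after a repeat purchase,
        # so a customer with no repeat purchases is counted as alive.
        return 1.0
    ratio = (a / (b + x - 1)) * ((alpha + T) / (alpha + t_x)) ** (r + x)
    return 1.0 / (1.0 + ratio)


if __name__ == "__main__":
    # Placeholder parameters for illustration only (time unit: weeks).
    params = dict(r=0.243, alpha=4.414, a=0.793, b=2.426)
    recent = p_alive_bgnbd(x=4, t_x=50, T=52, **params)
    lapsed = p_alive_bgnbd(x=4, t_x=10, T=52, **params)
    print(f"bought 2 weeks ago:  {recent:.2f}")
    print(f"bought 42 weeks ago: {lapsed:.2f}")
```

Both customers have the same frequency, yet the one whose last purchase was long ago gets a far lower probability of still being active. That is the distinction a plain RFM total misses.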

For example, in the diagram below we have sorted customers from a low to a high probability that they are still active. The dark grey bar shows those customers who are fading from a high probability to a low probability. It may be effective to send promotions to these customers in order to prevent them from sliding further.


Other insights are available, as well:

  1. You can predict which customers will produce the most revenue in the next year.
  2. By running the model over every day in the past you can see how your base of active customers has changed over time, or how the rate of repeat purchases has changed, or how quickly customers become inactive.
  3. If you correlate the historical data above with your marketing efforts, you can see whether these efforts have affected purchasing behavior.
  4. The model will estimate how many purchases a new customer is likely to make in the next three years. You can use this as an input to customer lifetime value, which will help you determine a ceiling on new customer acquisition costs.
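The last point comes down to simple arithmetic, sketched below in Python. Everything here is a hypothetical illustration: the function name, the purchase forecast, the average order value, the margin, and the discount rate are all assumed inputs, and the calculation is one plain way to turn expected purchases into a discounted profit figure.

```python
def acquisition_cost_ceiling(expected_purchases_by_year, avg_order_value,
                             margin, annual_discount_rate):
    """Discounted profit expected from a new customer.

    Spending more than this to acquire the customer loses money, so the
    result acts as a ceiling on acquisition cost.
    """
    total = 0.0
    for year, purchases in enumerate(expected_purchases_by_year, start=1):
        profit = purchases * avg_order_value * margin
        total += profit / (1 + annual_discount_rate) ** year
    return total


if __name__ == "__main__":
    # Hypothetical forecast: 2.0, 1.0, and 0.5 purchases in years 1 to 3.
    ceiling = acquisition_cost_ceiling(
        expected_purchases_by_year=[2.0, 1.0, 0.5],
        avg_order_value=50.0,       # assumed average order, in dollars
        margin=0.30,                # assumed profit margin per order
        annual_discount_rate=0.10,  # assumed discount rate
    )
    print(f"Acquisition cost ceiling: ${ceiling:.2f}")  # about $45.30
```

In practice the per-year purchase counts would come from the fitted model, while the order value and margin come from your own books.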

This is valuable data on which to base your marketing decisions. And you don’t have to pay big bucks for it. A competent data scientist can tap resources like the R Project to mine the data you’ve got now, using the equipment you already have at your disposal. Try it!

Schmittlein, D. C., Morrison, D. G., & Colombo, R. (1987). Counting Your Customers: Who Are They and What Will They Do Next? Management Science, 33(1), 1-24.
Wadsworth, L. D. (2012). BTYD: Implementing Buy ‘Til You Die Models. Retrieved from

Wharton Customer Analytics Initiative. (2012, August 27). Wharton Customer Analytics Initiative Announces Buy ‘Til You Die (BTYD) Models Package Release. Retrieved September 11, 2013, from Press Releases: