
With everyone shopping like crazy during Black Friday and Cyber Monday, I figured it would be timely to discuss an easy way to personalize offers to customers at scale using a simple recommendation technique: market basket analysis.

I’ll cover a simple, real-world example and show you how straightforward this technique is to use and understand.

Here’s how I generally think about the commonly used techniques for recommending products, ranked from least to most complex (and, generally, from lowest to highest performance):

  1. Random – pick any old product to show to a user
  2. Trending – products that have shown a recent spike in activity
  3. Popularity – the most popular products over a longer time window
  4. Market basket analysis – products often bought together; works only if shoppers are purchasing multiple products
  5. Collaborative filtering – powerful when there is a lot of data showing what people just like you also purchased
  6. Factorization machines – very powerful when we have user and product information as well as data showing what people just like you also purchased

Oftentimes simple wins, and market basket analysis is only a bit more complex than simply recommending the most popular products. Let’s have a look at how it works.

Market Basket Analysis and Association Rules

Market basket analysis works by identifying combinations of products that are often purchased together and expressing them as association rules of the form “shoppers who buy X also tend to buy Y.” Imagine a grocery store where a shopper buys cereal and bananas. Market basket analysis might be employed here to suggest that the shopper also buy milk, since milk was commonly found alongside those items in past “baskets.”

Sometimes market basket analysis can lead to interesting combinations. For example, there’s a famous story of Osco Drug stores discovering that, between 5 PM and 7 PM, many consumers who bought diapers also bought beer. Were parents hoping to dull the pain of staying awake for long periods of time when raising a new baby? Or maybe it’s just a coincidence. Either way, this anecdote shows how unexpected association rules might be found from everyday data (source).

Market basket analysis works by using a computer to discover association rules. This process is aptly called association rule mining. Its results are easy to understand, but we need a few metrics.

Example

Imagine there is a store with a hundred customers and:

  • 15 bought cereal
  • 9 bought milk
  • 7 bought both

If we’re interested in discovering what to recommend to people who buy cereal, we need a bit more information to understand just how strong the link is between cereal and milk. What follows is an introduction to support, confidence, and lift, the metrics we use to understand and rank association rules.

Association rule: Bought cereal => Bought milk

  • Support: The support of a rule indicates how frequently the items in the rule occur together. In our example:
    Support = P(Cereal and Milk) = 7/100 = 0.07 = 7%
  • Confidence: The confidence of a rule indicates how often the consequent appears in transactions that contain the antecedent, that is, the probability of the consequent given the antecedent. In our example:
    Confidence = Support / P(Cereal) = 0.07 / 0.15 = 46.7%
  • Lift: The lift indicates the strength of a rule over the random co-occurrence of the antecedent and the consequent, given their individual support. It tells us how much the antecedent increases the probability of the consequent. In our example:
    Lift = Confidence / P(Milk) = 0.467 / 0.09 = 5.2

Any rule with a lift of less than 1 does not indicate a real cross-selling opportunity, no matter how high its support and confidence, because the antecedent actually makes the consequent less likely than it is among shoppers in general. A lift of exactly 1 means the two products are bought independently of each other. A lift over 1 can indicate a real opportunity, and since the lift in our example is so high (5.2), we would definitely want to recommend milk to a customer with cereal in their cart. (Resource)
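
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that uses only the counts from the example; the function name `rule_metrics` and its parameter names are hypothetical, chosen for illustration. Confidence is computed as the share of antecedent (cereal) buyers who also bought the consequent (milk).

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    support = n_both / n_total                     # P(antecedent and consequent)
    confidence = n_both / n_antecedent             # P(consequent | antecedent)
    lift = confidence / (n_consequent / n_total)   # confidence relative to P(consequent)
    return support, confidence, lift


# Counts from the example: 100 customers, 15 bought cereal, 9 bought milk, 7 bought both.
support, confidence, lift = rule_metrics(
    n_total=100, n_antecedent=15, n_consequent=9, n_both=7
)
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.2f}")

# A lift above 1 suggests cereal buyers are more likely than average to buy milk.
worth_recommending = lift > 1
```

Note that lift is symmetric: swapping the roles of cereal and milk changes the confidence (7/9, or about 77.8%) but leaves the lift unchanged.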

This is a vastly oversimplified example, but it shows the power of mining for these association rules. Think of market basket analysis as the ultimate “save your butt” recommender — have you ever purchased an electronic gizmo only to realize you forgot the necessary batteries? Or have you ever bought most of the ingredients for a recipe only to have to return to the store for the one thing you forgot? Market basket analysis solves exactly these problems from a shopper’s perspective.

From the shop owner’s perspective, market basket analysis is the ultimate last-stage recommender that looks at a shopping cart just before checkout and allows the owner to up-sell products to a buying customer as well as increase customer satisfaction by helping the customer avoid missing out on a smart purchase.
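
To make the last-stage recommender idea concrete, here is a rough sketch of how rules could be mined from past baskets and applied to a cart at checkout. Everything in it is an assumption made for illustration: the toy `baskets` data is invented, the helper names `mine_pair_rules` and `recommend` are hypothetical, and the brute-force counting only handles one-item antecedents. Production systems typically rely on algorithms such as Apriori or FP-Growth, which also handle multi-item rules.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy baskets, invented purely for illustration.
baskets = [
    {"cereal", "milk", "bananas"},
    {"cereal", "milk"},
    {"bread", "butter"},
    {"cereal", "bananas"},
    {"milk", "bread"},
    {"gizmo", "batteries"},
    {"gizmo", "batteries", "bread"},
    {"bread"},
]


def mine_pair_rules(baskets, min_support=0.1):
    """Brute-force one-to-one rules (A => B) with support, confidence and lift."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    # Ordered pairs, so that both A => B and B => A are counted.
    pair_counts = Counter(
        pair for basket in baskets for pair in permutations(sorted(basket), 2)
    )
    rules = []
    for (a, b), both in pair_counts.items():
        support = both / n
        if support < min_support:
            continue  # skip rare combinations
        confidence = both / item_counts[a]
        lift = confidence / (item_counts[b] / n)
        rules.append(
            {"antecedent": a, "consequent": b,
             "support": support, "confidence": confidence, "lift": lift}
        )
    return rules


def recommend(cart, rules, top_n=3):
    """Suggest items not already in the cart, ranked by the lift of matching rules."""
    best = {}
    for rule in rules:
        item = rule["consequent"]
        if rule["antecedent"] in cart and item not in cart and rule["lift"] > 1:
            best[item] = max(best.get(item, 0.0), rule["lift"])
    return sorted(best, key=best.get, reverse=True)[:top_n]


rules = mine_pair_rules(baskets)
print(recommend({"gizmo"}, rules))   # the forgotten-batteries case
print(recommend({"cereal"}, rules))
```

Only rules with a lift above 1 are used, so a product that merely happens to be popular (bread, in this toy data) is not suggested just because it shows up in many baskets.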

Conclusion? Don’t let your customers shop without this recommender. It’s a straightforward technique that works magically for many businesses.

Of Interest

Google’s Smart Fill Autocompletes Sheets Columns Using A.I.
In June, Google debuted Smart Fill, an A.I.-powered Google Sheets feature that automatically detects patterns, generates formulas, and autocompletes columns. As Google’s Ryan Weber explained in an interview with VentureBeat earlier this summer, Smart Fill relies on common patterns of data mappings (e.g., combining one column with another to derive an output column) and analyzes cells within a user’s spreadsheet to assist with data entry. For each document, it evaluates the relevance of formulas and knowledge from Google’s Knowledge Graph to a given column or sets of columns. Read more.

Hinton: “Deep Learning is Going to be Able to do Everything”
A.I. Pioneer Geoff Hinton has been working with deep learning since the 1980s, but its effectiveness was long limited by a lack of data and computational power. His belief in neural networks was contrarian but, ultimately, his steadfast conviction in the technique paid massive dividends and now it’s hard to find anyone who disagrees, he says in this interview by Technology Review.

Is Astrology Real? Students Used Data Science to Find Out
For their Data Mining class, a group of data science students analyzed thousands of tweets from popular horoscope Twitter accounts to predict the emojis associated with each horoscope, using the machine learning package Bertmoticon. While their model has its limitations and their approach to semiotics raises questions, it’s nice to see young data scientists getting creative! Have a read here.