by: Zank, CEO of Bennett Data Science
In business, as in life, change is a big part of our day-to-day. There’s not much we can, or even should, do to stop it. It’s inevitable, and it affects business. There are longer-term trends, like consumer spending; short-term changes, such as spikes and troughs around major holidays; and black swans we’re not prepared for, such as interest rate hikes or another Facebook investigation.
When people (your customers) change, their product consumption changes too. So, if you’re relying on predictive models to personalize service to your customers, it’s imperative to watch out for contamination of your models! As usual, I have some questions:
- What should be expected when one of these changes occurs?
- Is the data science model supposed to somehow anticipate change?
- How about sales on Christmas Day?
- Or traffic during the Super Bowl?
Some events are rather predictable. But a housing crisis or a major news event can be much more difficult to handle. For most unforeseen events, we can’t do much to insulate predictive elements from big changes in user behavior. Luckily, most of these types of events are short-lived.
Data science model freshness
Let’s look at what happens when seasons or trends change and what data scientists can do to be sure that predictive models stay accurate. To illustrate exactly how we maintain model “freshness”, I’ll show how most data science models are created.
Let’s take a step back and look at the data that models are built upon and assessed with. From a ten-thousand-foot view, it looks something like this:
Training
- Be proactive for expected changes to our training data
Models built to respond to expected changes like seasonality or holidays generally benefit from having access to the same times in prior years. These dates or corner seasons can be built into predictive models and handled in a date-aware manner. This means using time as a predictor in our model and going back a few years with the training data. I’m making a lot of simplifications here! Often it’s not feasible to go back years, because there can be too much or too little data.
- Be reactive to recent, unexpected changes
When unexpected changes occur and are reflected in the training data, we can update the model to account for them, or incorporate new data sources where necessary.
For drastic changes, it may become imperative to greatly reduce the look-back of the training data to just a few weeks. This can reduce the bias toward “how it used to be” and emphasize “how it is now”.
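To make the two approaches above concrete, here is a minimal sketch in Python. It is not how any particular production system does it. The holiday list, the function names `add_date_features` and `training_window`, and the toy sales numbers are all made up for illustration. The idea is simply that the proactive case uses a long look-back with time as a predictor, and the reactive case shrinks the look-back to a few weeks.

```python
from datetime import date, timedelta

# Hypothetical holiday list (month, day); a real project would use its own calendar.
HOLIDAYS = {(12, 25), (1, 1)}


def add_date_features(row_date):
    """Turn a date into simple time-based predictors."""
    return {
        "month": row_date.month,
        "day_of_week": row_date.weekday(),  # Monday is 0
        "is_holiday": (row_date.month, row_date.day) in HOLIDAYS,
    }


def training_window(rows, as_of, lookback_days):
    """Keep only rows dated within the look-back window that ends at as_of."""
    start = as_of - timedelta(days=lookback_days)
    return [r for r in rows if start < r["date"] <= as_of]


# Toy data: one made-up sales record per day for three years.
as_of = date(2023, 12, 31)
rows = [
    {"date": as_of - timedelta(days=i), "sales": 100 + i % 7}
    for i in range(3 * 365)
]

# Proactive: long look-back plus date features, so prior holidays are in the training data.
long_window = training_window(rows, as_of, lookback_days=3 * 365)
features = [add_date_features(r["date"]) for r in long_window]

# Reactive: after a drastic change, train on only the last few weeks.
short_window = training_window(rows, as_of, lookback_days=21)
```

In practice the right look-back is a judgment call: a long window gives the model more examples of each season, while a short window forgets the old behavior faster.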
Thanks for reading & talk soon!
-Zank
Zank Bennett is CEO of Bennett Data Science, a group that works with companies from early-stage startups to the Fortune 500. BDS specializes in working with large volumes of data to solve complex business problems, finding novel ways for companies to grow their products and revenue using data, and maximizing the effectiveness of existing data science personnel. https://bennettdatascience.com