Welcome to article four of our five-part series on Deploying Predictive Models. So far, I’ve introduced our proven 10-step data science methodology and shown how deployment is an important part of any organization’s data science roadmap. In part three, Demystifying Models, we looked at an example of a model we might want to deploy and untangled the “what is a model” question.
Today, in part four, we’ll dive into the final deployment considerations that teams must discuss to be successful when approaching a data science project.
Remember, success in this context is defined as moving a predictive model from concept to deployment, where it’s actually used, either in-house or by the world at large. If you get this right, you’ll be in the top 20-30% of such projects, as most fail to reach deployment at all!
Know Your Type
At a high level, there are two main types of predictions your predictive models will make:
- Online predictions
- Batch predictions
Online predictions are returned each time a user needs an answer. This is the type of prediction we discussed last week with our house-price model. "Online" has nothing to do with the internet; rather, it refers to predictions that are calculated in real time, each time the model is called.
Online predictions work well for simpler models that can respond with low latency (your users don't want to wait around for a product recommendation to show up on a page). They're also useful when the underlying data changes in real time, such as personalization for e-commerce sites.
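To make this concrete, here is a minimal sketch of online prediction. The model (a hypothetical linear house-price model standing in for the one from part three, with made-up coefficients) is loaded once, and a fresh prediction is computed on demand every time a user asks:

```python
# Hypothetical house-price model: coefficients and intercept are
# illustrative stand-ins, not values from the article's actual model.
COEFFICIENTS = {"sqft": 150.0, "bedrooms": 10_000.0}
INTERCEPT = 50_000.0

def predict_price(features: dict) -> float:
    """Compute one prediction in real time (an 'online' prediction)."""
    return INTERCEPT + sum(COEFFICIENTS[k] * features[k] for k in COEFFICIENTS)

# Each user request triggers its own low-latency call to the model:
price = predict_price({"sqft": 1800, "bedrooms": 3})
```

The key property is that nothing is precomputed: the model runs once per request, so it always sees the latest input data, and latency per call is what matters.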
Batch predictions are predictions that are calculated ahead of time and stored for later retrieval.
Batch predictions are best for complex models or cases where large throughput is required, such as sending millions of emails at a specific time of day. It would be much easier to pre-compute (batch-process) personalized email messages for a 6 AM send than to have an email service provider call an online predictor ten million times at 6 AM.
Batch predictions also work well for advertising or marketing campaigns where a huge amount of personalization must be done at once. Another good example is list ranking, where a group of salespeople needs to know which prospects to reach out to on a given day. In this case, a batch predictor would pre-compute lists of prospects each morning and store them for each sales team member.
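The list-ranking example above can be sketched as a simple batch job. Everything here is hypothetical (the prospect fields and the `score_prospect` logic are placeholders for a real model): all prospects are scored at once on a schedule, and the results are stored so that serving is just a fast lookup:

```python
def score_prospect(prospect: dict) -> float:
    """Placeholder scoring logic; a real model call would go here."""
    return prospect["recent_visits"] * 2.0 + prospect["past_purchases"] * 5.0

def run_batch(prospects: list[dict]) -> dict[str, float]:
    """Score every prospect in one pass and return a lookup table."""
    return {p["id"]: score_prospect(p) for p in prospects}

# Precompute once (e.g. in an early-morning scheduled job)...
scores = run_batch([
    {"id": "a", "recent_visits": 3, "past_purchases": 1},
    {"id": "b", "recent_visits": 0, "past_purchases": 4},
])

# ...then each salesperson's ranked list is a cheap sorted lookup,
# with no model call at request time:
ranked = sorted(scores, key=scores.get, reverse=True)
```

The trade-off is the mirror image of the online case: serving is nearly free, but the predictions are only as fresh as the last batch run, so the job must be scheduled against current data.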
Each prediction type requires different access to data and raises vastly different concerns for developers. Online predictions are computed in real time; batch predictions are computed in bulk ahead of time, but still require access to current data.
Choosing Your Prediction Type
The decision about which type of prediction to use must be made very early in the data science workflow, certainly before building predictive models. If not, teams may discover mid-project that the appropriate data are not available, forcing a rework of the pipeline, data warehouse, and model type(s). When this happens, it's often easier to give up and start over, making the project a statistic and, worse, wasting a lot of time and money.
I recommend having early conversations with your data science team. Make sure they understand which type of prediction is required, and confirm that the developers can supply the appropriate data for the type of predictive model you choose.
Next week, in my final article of this series, I’ll talk about the critical step of measuring the success of your predictive models. And to wrap things up, we will also include a deployment checklist to help you successfully deploy more predictive models and get real ROI from your data science investment. Stay tuned!
All the best,
How a Hollywood VFX Business Built a Tool to Fight COVID-19
In his more than two decades as a visual effects artist, Mike O'Neal has worked on blockbusters such as "Titanic" and "Life of Pi." But nothing quite prepared him for his latest gig: visually telling the story of a deadly virus and its insidious assault on human cells, along with the 69 potential drugs that could stop it. Read more about his process and his data visualizations here:
Researchers Propose a ‘Fairer’ Product Recommendation Algorithm
Several researchers have proposed an approach to mitigate what they characterize as an “unfairness problem” in product recommendation algorithms. They say their algorithm provides high-quality explainable recommendations on state-of-the-art real-world data sets, and that it reduces the recommendation unfairness in several key aspects. Read more about it here:
Here is a List of Resources for A.I.-Generated Art
Wondering how to make art using Artificial Intelligence? Visit this website for some amazing tools to generate A.I. art. I think you’ll be blown away by the plethora of options!