What I Learned Teaching my Computer to Play Tic-Tac-Toe


Good morning. This is David Albrecht, data scientist at Bennett Data Science, filling in for Zank.

The other day, I saw a post from someone explaining that their father went his entire life having never heard of Tic-Tac-Toe. It blew my mind. Can you imagine that?! Blasphemy, if you ask me.

As a kid, I played it a lot on restaurant napkins with my younger brother. It’s a funny little game – easy to explain and yet difficult to completely master. Sometimes he’d win and sometimes I would, but neither of us could ever really remember how to ensure a win (or at least a tie) each time. We’d forget how we got duped by that double-whammy four games ago or whether we should take the middle spot to start the game. We just couldn’t keep all the details in our heads.

Computers are the best at these sorts of tasks. They can be programmed to do the same things over and over and remember the past. For Tic-Tac-Toe, however, we’d need to be experts at the game to program a computer to be one as well.

Instead, and much more interestingly, we can create an expert.

Reinforcement Learning is a branch of Artificial Intelligence in which the computer learns the best move at every step of a process in order to eventually win. Winning is what we call receiving a “reward”.

Unlike more typical A.I. tasks, Reinforcement Learning requires the system to understand that there is a sequence of events that ends in a desired outcome.

Recently, this sort of A.I. has hit a major milestone by beating a top professional player in StarCraft II, a notoriously complex video game that requires careful thought in order to outsmart the opponent and finally win.

Not only useful for video games, Reinforcement Learning can be used to efficiently automate other tasks like stock market trading, computer resource management, traffic light control, and robotics, to name a few.

Some of the world’s most successful tech companies use it too. Did you know that Netflix uses artwork tailored to your preferences to try and get you to watch the movies they recommend? They do, and they use special types of Reinforcement Learning to do it!

They’re incredibly powerful algorithms.

The main drawback, however, is that we usually need data describing every possible state, action, and reward. That means that if you want to know what actions to take to sell your most expensive widget, the agent must have access to historical sales data that shows sales of the widget. Otherwise, there’s no way for it to learn! Luckily, in some cases, it’s possible to simulate that data.

How does it work?
We can teach a computer to play a simple game with multiple steps such as Tic-Tac-Toe. To do this, we take two players who start by playing random moves against each other. Over the course of many games, one of them, the “agent”, is going to learn how to play, and the other is going to continue playing randomly.
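The starting point described above, two players making random moves, can be sketched in a few lines of Python. This is only an illustration and is not taken from the repository linked at the end of this post. The names `WIN_LINES`, `winner`, and `play_random_game` are made up for this example, and the board is assumed to be a list of nine cells.

```python
import random

# All eight ways to get three in a row on a 3x3 board (cells indexed 0-8).
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play_random_game(rng):
    """Play one game in which both players pick random empty cells.

    Returns the final board and the result: 'X', 'O', or 'draw'.
    """
    board = [" "] * 9
    player = "X"  # X always moves first in this sketch
    while True:
        empty = [i for i, cell in enumerate(board) if cell == " "]
        if not empty:
            return board, "draw"
        board[rng.choice(empty)] = player
        if winner(board) == player:
            return board, player
        player = "O" if player == "X" else "X"


if __name__ == "__main__":
    rng = random.Random(0)
    results = [play_random_game(rng)[1] for _ in range(1000)]
    for outcome in ("X", "O", "draw"):
        print(outcome, results.count(outcome))
```

Running many such games is one way to simulate the data the agent needs: every finished game is a sequence of board positions with a known outcome.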

For each board position the agent encounters, called a “state”, we save an entry in a large table so that the agent can look up a state and remember the moves that resulted in winning or losing. The agent is given a reward for a win and a penalty for a loss. Informed by this data, it’ll continue to play the moves that resulted in a win and avoid the moves that tended to lead to a loss. Over time, the agent learns that the best first move is a corner or the center, and it effectively learns how to play a game with multiple steps.
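Here is a minimal sketch of that table-based learning, under some loudly stated assumptions. The agent always plays X and moves first, the table is keyed by (board position, move), and after each game every move the agent played is nudged toward the final reward (+1 win, -1 loss, 0 draw). That is a simple Monte Carlo-style update chosen for illustration, and it may differ from what our own code does. The function names and the values of `alpha` (learning rate) and `epsilon` (how often the agent tries a random move) are illustrative choices as well.

```python
import random
from collections import defaultdict

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def empty_cells(board):
    return [i for i, cell in enumerate(board) if cell == " "]


def choose_move(table, board, rng, epsilon):
    """Usually pick the highest-valued move; explore at random with probability epsilon."""
    moves = empty_cells(board)
    if rng.random() < epsilon:
        return rng.choice(moves)
    state = "".join(board)
    return max(moves, key=lambda m: table[(state, m)])


def play_game(table, rng, epsilon):
    """Play the agent (X, moves first) against a random opponent (O).

    Returns the (state, move) pairs the agent played and the final reward:
    +1 for a win, -1 for a loss, 0 for a draw.
    """
    board = [" "] * 9
    history = []
    player = "X"
    while empty_cells(board):
        if player == "X":
            move = choose_move(table, board, rng, epsilon)
            history.append(("".join(board), move))
        else:
            move = rng.choice(empty_cells(board))
        board[move] = player
        if winner(board):
            return history, (1.0 if player == "X" else -1.0)
        player = "O" if player == "X" else "X"
    return history, 0.0


def train(episodes=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Build the lookup table by playing many games against random play."""
    rng = random.Random(seed)
    table = defaultdict(float)  # (state, move) -> estimated value, starts at 0
    for _ in range(episodes):
        history, reward = play_game(table, rng, epsilon)
        # Reinforce: nudge every move the agent played toward the outcome.
        for key in history:
            table[key] += alpha * (reward - table[key])
    return table


def win_rate(table, games=2000, seed=1):
    """Fraction of games won by the agent when it stops exploring."""
    rng = random.Random(seed)
    wins = sum(play_game(table, rng, epsilon=0.0)[1] == 1.0 for _ in range(games))
    return wins / games


if __name__ == "__main__":
    table = train()
    print("win rate vs. random play:", win_rate(table))
```

The occasional random move matters: without it, the agent would only ever revisit the moves it already believes are good and would never discover better ones.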

As you can imagine, this process is similar to other systems of steps, such as traffic optimization, a robot moving and picking things up, and movie selections based on artwork.

This process is called Reinforcement Learning because we reinforce the actions that we want the agent to take. What’s most interesting here is that we don’t even have to be experts in playing Tic-Tac-Toe to teach the agent how to be an expert – it will learn by itself through trial and error, just like we would!

Once the agent becomes an expert, it can be used to make decisions at every step of the game or sales process so that, when it gets to the end of the game, it has the best chance of winning. Or the best chance of selling that widget!

Reinforcement Learning is one way we can learn to automate tasks when we have sufficient training data. Have you tried using Reinforcement Learning or another type of A.I. in your business? Please hit reply and let us know what you think.

If you’d like to see our simulation in action, you can see all the code here:
https://github.com/dpalbrecht/TicTacWhoa

Of Interest

If you read one thing this week…
Read this hierarchy of needs for A.I. projects. In clear terms, this post explains all the complexity required to launch data science projects. It’s important to understand all that goes into a deployed predictive model, as each step along the long road is tied to great time and expense.
https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007

Employees need to understand analytics to succeed
“[A study from] Deloitte says 67% of employees are not comfortable accessing data or using insights on company tools.” And: “The majority of companies today adopt a fragmented, siloed approach to analytics tools and data, which correlates with diminished business success.”
https://medium.com/@datagran/democratization-of-data-science-79ce64b4b98c

Here’s a huge pile of A.I. resources
Here, you’ll find links to a long list of Machine Learning Resources, many with explanations.
https://medium.com/mlait/a-z-machine-learning-resources-5a7e29d9c45c
