Tic-Tac-Toe: Simple Artificial Intelligence in C#

 

 

Introduction

I wrote this code to demonstrate how to program artificial intelligence (AI) into a two-player board game after seeing several requests for help on doing so in an online coding forum. I chose tic-tac-toe (also known as noughts and crosses) since virtually everyone knows the game and the rules are simple enough that we don’t need an elaborate analysis of game configurations. Despite being a simple game, the basic AI principles shown here can be applied to more complicated games such as checkers, Connect 4, go and even chess.

So what do we mean programming AI? Basically, we would like a computer player to ‘look’ at a game board and ‘intelligently’ decide where to move. There are of course many ways to do this. In this post we will explore a common method of searching for moves by building game trees that represent the space of possible moves (game configurations) and selecting a move which yields the best outcome.

Game trees can be quite large, particularly so for complex games, and as such can take quite a bit of time to construct and evaluate. Even tic-tac-toe, as simple as it is, has  255,168 possible games if we don’t take symmetries into account (26,830 games when disregarding symmetric games ). That may sound like quite a lot but it pales in comparison to chess which has an estimated 10123 games (now that’s really a lot!). Thankfully, computers are especially suited for dealing with these types of problems though it is interesting to note that for a long time it was believed that a computer would never be able to defeat humans at games like chess. Sadly (or perhaps not) for humans this is no longer the case.

 

Getting Started

I am going to assume that everyone knows how to play the game so I won’t bother explaining the rules (just in case you don’t read this). If you wish to run the game you will need the .NET 4.0 Framework installed on your machine. If you wish to edit the solution you will need Visual Studio 2010 or Visual Studio Express (I’m pretty sure MonoDevelop works too though I haven’t tested it). The source code for this project can be found here.

In order to follow along, you are going to need some object-oriented programming experience as I assume that the reader understands concepts like using abstract classes and inheritance. The code is written in C# but anyone with java, C++ or VB.NET experience should be okay. The primary focus here will be on the AI portions of the code but I will walk through much of the code.  The code shown here are mostly snippets and cannot be copied directly and expected to compile.  Use the link above to download the full source code if you wish to tinker with the code.

 

Creating the Board

We are going to need a game board so let’s start with that code (see snippet below). Internally, the board is modeled as a two-dimensional array of integers. A value of 0 represents an empty space, a value of 1 an ‘X’, and a value of 2 represents an ‘O’. The array indices represent the row and column respectively.  Positions on the board are referenced either by specifying the row and column or by specifying a position number. The convention used for position number is shown in figure 1 below. Note that while the internal representation of the board pieces is an int, this is not exposed to clients. Instead an enumeration representing the game pieces is used (enum Pieces) This keeps things clean and prevents coding accidents.


Figure 1: The positions on the tic-tac-toe board

 

You will notice that the Board class implements ICloneable. This is important as our computer player will need to test out moves on Board objects and we don’t want to muck around with the actual game board so we will use clones.

 

 Creating the Tic-Tac-Toe Game

With the board in place we can make the TicTacToe class which models the game itself. It is responsible for keeping track of each player’s turn and making moves on the board and ends the game when a player wins or a draw has been reached.  All the logic here is fairly straightforward.

 

 

When a move is attempted a check is made to ensure that the move is valid and if it is the move is made, otherwise an exception is thrown. The move class is shown below. As you can see it an encapsulation class used to store the move position and piece being moved.

 

 

Creating the Players

We will have two kinds of players, human and computer. The human player is responsible for getting a move from a live person while the computer player will search for moves from the game tree it constructs. We will create an abstract Player class to capture the shared properties and functionality of human and computer players. The code for the Player class is shown below.

 

You will notice that the Player class includes the event PlayerMoved. This is used to notify listeners when a move has been made. This is useful because the move selection for the players run in their own thread so that it does not block the main thread (and UI) while waiting for a player to select a move.

The code for HumanPlayer is shown below. All it does is wait for a click on the game board UI and then invokes the PlayerMoved event which is handled by the game TicTacToeForm class (see source code).

 

Computer Player

Now we are getting to the interesting part — getting the computer to make an ‘intelligent’ move. In order to accomplish this we are going to need some way of placing a value on how good a given move is.  For this purpose we will define an evaluation functione(m), that the computer player will use to see how favorable a particular move  m,  is. We will follow usual convention and choose e(m) to be large if a move is likely to result in a win and small if it is likely to result in a loss. With the evaluation function in place we can use the minimax algorithm to select a move that maximizes e(m). We will take a closer look at the minimax algorithm in a bit but for now let’s define the evaluation function.

Defining an Evaluation Function

There are many ways to construct an evaluation function that meets our criteria. The way I chose is fairly simple:  for each row, column and diagonal on the board, count the number of the players pieces (an ‘X’ or an ‘O’) that are present if it is possible for a win in that row, column or diagonal and subtract that from that value we get from the same evaluation for the opponent’s piece. Kind of a mouthful, I know. Have a look at figure 2 below which shows a sample calculation of e(m) with ‘X’ next to move.

 

 

Figure 2: For ‘X’, two pieces in the diagonal = 2
1 ‘X’ in second column = 1
For ‘O’, the ‘O’ in the third column = 1
e(m) = 3 – 1 = 2
This configuration is favorable to ‘X’

 

NOTE: There are certainly other ways to define the evaluation function for tic-tac-toe and you may wish to experiment doing that.

It is convenient to define a very large value for e(m) when a configuration represents a win for ‘X’ and a very small value when there is a loss for ‘X’. We are free to choose any sufficiently large number as long as we can guarantee that it is impossible  for non-winning configurations to yield the same e(m). Conceptually, I like to use ±∞. In this program I used double.MaxValue and double.MinValue respectively.  The evaluation function there has definite upper and lower bounds.  We are certain that -10 ≤ e(m) ≤ 10 so there is no danger of approaching double.MaxValue or double.MinValue in a non-winning configuration.

Note: The .NET framework has an implementation of infinity (double.PositiveInfinity and double.NegativeInfinity), that would also work,  I just happened to not use it.

Side Note: I would like to stress that it is extremely important to have an evaluation function that accurately estimates the ‘goodness’ or ‘badness’ of a position. This is not so difficult for tic-tac-toe because there is not much strategy to consider and we won’t need to look too far ahead to see whether we have chosen a good move or not. A game like chess however, is a completely different matter. A poor evaluation function will lead to a very bad computer player even in end-game scenarios where there are only a few moves to make. In more complex games, you may need to experiment a bit with fine-tuning the evaluation function to ensure that it reasonably characterizes how favorable a given configuration is.

Now that we have an evaluation function we are ready to describe the minimax algorithm (informally), which will enable our computer player to decide on an ‘intelligent’ move. What we will do is make a tree of possible moves down to a given depth. The depth represents how many levels of moves the computer will look ahead. A tree of depth of 2 contains all the possible moves the computer can make and all of the subsequent moves that the computer’s opponent can counter with. The tree’s root node will represent the initial board configuration. We will have two kinds of nodes, MAX nodes and MIN nodes. MAX node are used to represent configurations for which the computer must move and MIN nodes represent board configurations for which the computer’s opponent must move. Thus, the tree’s root node is always a MAX node. MAX nodes select the move that results in the MAXimum e(m) value from its children.

The children of the root node will represent configurations of all the possible moves that the computer can make.  These nodes are MIN nodes because they represent game configurations for which ‘O’ (the opponent) will move and as such they will select moves that MINimize e(m) values. We continue construction of the tree by generating children for these MIN nodes, which are of course MAX nodes since it will be the computer’s turn again in the board configurations represented. The tree construction continues in this manner until we reach the specified depth.

Let’s walk through an example using the configuration in figure 3 shown below. It  is the computer’s move who is ‘X’  and we are going to create a tree of depth of 2.

 


Figure 3: It is ‘X’s move. This configuration will be represented by the root node of our game tree.

 

 

This will be our root node, and since it represents a configuration for which the computer needs to move it is a MAX node. The computer can move in positions 3, 5, and 6. Each of the moves yield a new configuration and are represented as children of the root node. Because the child nodes represent configurations for which the opponent must move they are MIN nodes (see figure 4 below).

 


Figure 4: Game tree showing three children spawned from the root node representing an ‘X’ move to position 3, 5, and 6 respectively

 

We need to make another set of children for each of the min nodes to achieve the desired depth of 2. Since these children are the children of MIN nodes they will be MAX nodes. This is always the case. MAX nodes have MIN children and vice-versa. The graphic below shows the complete tree for depth 2.

 

 


Figure 5: A Tic-Tac-Toe game tree of depth 2

 

 

We can now begin to apply the minimax algorithm.  We do this bottom up. Figure 5 below shows the tree with the e(m) values calculated by the evaluation function for the bottom child nodes.

 


Figure 6: Tree showing e(m) calculation for the bottom-most children

 

With the e(m) values calculated for the bottom children each of the MIN nodes can select their best move. The best move for a MIN node is the child node that has the minimum value since that represents the best configuration for ‘O’.  The image below shows the tree after the MIN nodes have been assigned the values of their best child nodes. The values the MIN nodes have selected are shown circled to the right of the min nodes. These are displayed this way to distinguish selected values from values that are calculated directly from the evaluation function. Only nodes on the bottom level use the evaluation function to compute their values.


Figure 7: The MIN nodes are assigned the values representing their best move, which is the child node that has the minimum e(m) value.

Now the root node selects the maximum value from among it’s children. The node with e(m)= 0 is the maximum so that is selected by the root node as the best move for the computer to make (moving to position 6). The algorithm predicts a loss if a move to position 3 or 5 is made where e(m) = -∞ for those moves.

The code below implements the evaluation function that was described earlier.

 

 

Implementing MINIMAX

Like the HumanPlayer class our ComputerPlayer class will extend Player and need to implement the Move method. We will build our search tree using Node objects that will represent a particular move made. As discussed earlier, there are two types of Nodes, MAX and MIN nodes. In code, we define the MaxNode and MinNode classes that will represent MAX and MIN nodes respectively. We will make use of an abstract Node class to capture the common properties and functionality of MIN and MAX nodes. The code for the Node class is shown below.

 

All Node objects have a parent (with the exception of the root node of course) and zero or more children. Each Node has a value associated with it, either computed directly from the evaluation function or by selecting the maximum (for MAX Nodes) or minimum (for MIN Nodes) value of the children. The Node class also keeps track the move made to its parents board to generate its own board. There is also the BestMove property which enables a node to keep track of its best move. The FindBestMove method defined in the Node class constructs the game tree and drives the search for a best move. Don’t worry about the details now as we will see later how this all works when we do an example.

 

Let’s take a look at the MaxNode and MinNode classes:

 

You can see that the abstract Node class implements the search in the FindBestMove method while the concrete MaxNode and MinNode add some node-specific operations like sorting and creating children.

Now that we can create the game trees and select best moves we are ready to implement the ComputerPlayer class. The code for ComputerPlayer is shown below.

Like HumanPlayer, the search for a move begins when an instance of ComputerPlayer is sent the Move message. To get the ball rolling, the ComputerPlayer creates the root node of the tree (a MaxNode), which is given a reference to the current game board (well actually it is a clone of the current board — we do not want to mess up the actual game board). The evaluation function is set and the root node’s FindBestMove (recall that this is implemented in the abstract Node class) method is invoked with the default search depth of 3.

The first thing done in FindBestMove is to check that depth > 0. Since this is the first pass and depth = 3, it enters the if block and the GenerateChildren method is invoked and a child node is created for each empty position in the root node’s Board representing all the possible initial moves that the computer can make . Each of these children will be MinNodes since their parent is a MaxNode.

After the GenerateChildren method is called the EvaluateChildren method is invoked, which causes the child nodes to computer their Value via the evaluation function. Strictly speaking, the algorithm we described does not call for evaluating the nodes at this point. According to the algorithm, we are only required to evaluate nodes at depth 3 or any terminal nodes (nodes for which there is a completely full board or a board in which one player has won). I’ll go into a little more detail later as to why I chose to evaluate the children now, but for the sake of this example, let us assume that we do not find any winning nodes until we at least complete our tree.

Now we use a bit of recursion and for each child that the root node has that node’s FindBestMove method is invoked passing along a depth of 2. Child nodes are again created, level 2 nodes, which are are MaxNodes since they are the immediate descendants of the MinNodes from level 1. Again a recursive call to FindBestMove is invoked on each of the children this time with a depth of 1. For the last time, children are made, which are MinNodes, and the recursive call to FindBestMove is made with a depth of 0. This time, depth > 0 is false so FindBestMove simply returns doing nothing and control is returned to the foreach block which puts us back to the first level 2 Node (node 2_0).  Then SelectBestMove method is invoked on that node. Since the level 2 nodes are MaxNodes, the SelectBestMove method defined in MaxNode is used to select the maximum e(m) value from among its children. Control then returns to the parent of the first level 2 Node (node 1_0) we created. The series of illustrations below show how a tree of depth 3 is generated in the FindBestMove method.


Figure 8(a): Tree state after the first call to the FindBestMove method and after GenerateChildren is invoked but before first recursive call to FindBestMove

 


Figure 8(b): Tree state after first recursive call to FindBestMove with depth = 2 from the 1_0 node, after the GenerateChildren Method has been invoked but before the recursive call to FindBestMove again

 

 


Figure 8(c): Tree state after second recursive call to FindBestMove with depth = 1 from the 2_0 node after the GenerateChildren method is invoked but before subsequent recursive call to FindBestMove. The call to FindBestMove with depth = 0 from node 3_0 is then made but returns doing nothing.  The SelectBestMove is invoked and node 2_0 sets it’s value to the max value of its children (the value of its only child, node 3_0).

 


Figure 8(d): Tree state after recursive call to FindBestMove with depth = 1 from the 2_1 node after the GenerateChildren method is invoked but before subsequent recursive call to FindBestMove. The call to FindBestMove with depth = 0 from node 3_1 is then made but returns doing nothing.  The SelectBestMove is invoked and node 2_1 sets it’s value to the max value of its children (the value of its only child, node 3_1).

 

 

After the tree state in figure 8(d),  FindBestMove is invoked on node 1_1 and proceeds in the same manner as shown in steps (b) through (d) above but on node 1_1. Likewise, FindBestMove is invoked on node 1_2 and the tree is completed and the root node selects the maximum value from among the Level 1 children. The illustration below shows the final tree.

 


Figure 9: Final Tree of depth 3 with values for each nodes determined.

 

Now that we walked through the find best move method I want to explain why I decided to generate all the children and evaluate them first rather than proceeding one child at a time (a depth-first traversal). This is done in case we find that one of the children represents a move that wins the game and as such we need not proceed any further, since that node will be selected as best from its parent no matter what the other child nodes yield (we might do the same but no better).  Not only does this prevent us from generating the tree unnecessarily, it also ensures we select the shortest path to victory. For example, let’s say we have the configuration shown below and it is X’s turn (the computer).

 


Figure 9: This configuration could lead the computer selecting to block ‘O’ in position 2 rather than move to position 3 and win the game.

Obviously the computer should move to position 3 and win the game. However, with minimax it is possible that the computer selects to move to position 2 and block ‘O’ from winning. But why? Notice that if ‘X’ blocks ‘O’ at position 2, this also guarantees ‘X’ a win as that move gives ‘X’  two ways of winning, one across the diagonal and another in the third column. So no, matter what ‘O’ does next ‘X’ still wins. As far as the standard minimax algorithm is concerned these two moves are equally favorable.  All it cares about is winning. There isn’t any mechanism to select for shorter paths to victory. With the check for a winning node that we do in FindBestMove, we eliminate this from happening. This is good practice for two reasons. First, it cuts down the number of computations required by reducing the tree size (none of the winning nodes siblings will have children generated for them), which can be significant when the game is more complicated than tic-tac-toe. Second, if we have a less than perfect evaluation function that doesn’t accurately estimate the favorability, we may actually end up losing by selecting a move in which we expect to win several moves down the line.

 

Creating Players and Running the Game

So now that we have an ‘intelligent’ computer player capable of making its own moves let’s set up a game. In the Program class (see Program.cs) we create our players and run the game. The code snippet below shows how to create a human Vs computer match. As far as I can tell the computer player is impossible to beat even when the search depth is 3. We aren’t limited to human vs computer games.  You could also have human vs human (not fun) or you could reenact Wargames and set two computer players against one another.

 

 


Figure 10: Screenshot of the game

Final Thoughts

I hope that some of you found some things of interest in this post. Let me know if there are any questions or comments. I love getting feedback. For anyone who wants to take this code to the next level I would suggest looking at the following:

  • Try and define a different evaluation, maybe you could come up with a better one than I presented here.
  • Look into alpha-beta pruning.  This describes a method of ‘pruning’ the tree to reduce the tree size and allow the computer to look ahead further in a shorter amount of time. Again, not really required for tic-tac-toe but it will most certainly return dividends in more complex games.
  • Try coding AI for a harder game like Connect 4. In order to use the game tree ideas presented here in a more complicated game, you can use the code in the Node classes as is but you must change the evaluation function and add support for your own game board. Of course, you would probably want to implement alpha-beta pruning as well.
  • For the more adventurous, you could try and make a ComputerPlayer class that uses a neural network to select moves and in fact ‘learn’ (you will need to train this type of player with a given set of examples or from manually playing against it many times) from experience rather than using a static evaluation function.

 

 

Leave a Reply

  

  

  

Social Widgets powered by AB-WebLog.com.