A game can be thought of as a tree of possible future game states. For example, in Gomoku the game state is the arrangement of the board, plus information about whose move it is. The current state of the game is the root of the tree (drawn at the top). In general this node has several children, representing all of the possible moves that we could make. Each of those nodes has children representing the game state after each of the opponent's moves. These nodes have children corresponding to the possible second moves of the current player, and so on. The leaves of the tree are final states of the game: states where no further move can be made because one player has won, or perhaps the game is a tie. Actually, in general the tree is a graph, because there may be more than one way to get to a particular state. In some games (e.g., checkers) it is even possible to revisit a prior game state.
Suppose that we assign a value of positive infinity to a leaf state in which we win, negative infinity to states in which the opponent wins, and zero to tie states. We define a function evaluate that can be applied to a leaf state to determine which of these values is correct.
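To make this concrete, here is a small OCaml sketch of these definitions (an illustration only: the outcome type and the finite stand-in values for the infinities are assumptions, since no particular game is fixed here):

(* Sketch only: finite stand-ins for the win/loss "infinities",
   and an evaluate that applies to finished positions. *)
let w = 1000 and l = -1000

type outcome = Win | Loss | Tie   (* hypothetical result of a finished game *)

let evaluate (o : outcome) : int =
  match o with
  | Win  -> w     (* the current player has won *)
  | Loss -> l     (* the opponent has won       *)
  | Tie  -> 0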
If we can traverse the entire game tree, we can figure out whether the game is a win for the current player assuming perfect play: we assign a value to the current game state by recursively walking the tree. At leaf nodes we return the appropriate values. At nodes where we get to move, we take the max of the child values because we want to pick the best move; at nodes where the opponent moves, we take the min of the child values. This gives us the following pseudo-code procedure for minimax evaluation of a game tree.
fun minimax(n: node): int =
    if leaf(n) then return evaluate(n)
    if n is a max node
        v := L
        for each child of n
            v' := minimax (child)
            if v' > v, v:= v'
        return v
    if n is a min node
        v := W
        for each child of n
            v' := minimax (child)
            if v' < v, v:= v'
        return v
Consider the following game tree, where the leaves are annotated with W or L to indicate a winning or losing position for the current player (L < W), and interior nodes are labeled with + or - to indicate whether they are "max" nodes where we move or "min" nodes where the opponent moves. In this game tree, the position at the root of the tree is a losing position because the opponent can force the game to proceed to an "L" node:
        +
       / \
      /   \
     -     -
    / \   /|\
   +   L L W W
  / \
 W   L
We can see this by doing a minimax evaluation of all the nodes in the tree. Each node is labeled with its minimax value in red:
        L
       / \
      /   \
     L     L
    / \   /|\
   W   L L W W
  / \
 W   L
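The same computation can be carried out mechanically. The following OCaml sketch (an illustration, not part of the development above; the tree type, the maximize flag, and the finite stand-ins for W and L are all assumptions) builds the tree just pictured and checks that its minimax value is L:

(* A sketch: leaves carry the W/L outcomes, encoded with finite stand-ins. *)
let w = 1000 and l = -1000

type tree = Leaf of int | Node of tree list

(* Max levels and min levels alternate, starting with the player to move. *)
let rec minimax ~(maximize : bool) (t : tree) : int =
  match t with
  | Leaf v -> v
  | Node children ->
      let vs = List.map (minimax ~maximize:(not maximize)) children in
      if maximize then List.fold_left max min_int vs
      else List.fold_left min max_int vs

(* The game tree pictured above; the root is a max (+) node. *)
let example =
  Node [ Node [ Node [ Leaf w; Leaf l ];       (* the inner + node: W, L *)
                Leaf l ];
         Node [ Leaf l; Leaf w; Leaf w ] ]

let () = assert (minimax ~maximize:true example = l)   (* the root is a loss *)

Here the alternation between max and min levels is tracked by a flag that flips at each recursive call, rather than by labeling nodes with + and -.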
Usually expanding the entire game tree is infeasible because there are so many possible states. The solution is to search the tree only to a specified depth. The evaluate function (the static evaluator) is extended so that it returns a value between L and W for game positions that are not final positions. For game positions that look better for the current player, it returns larger numbers. When the depth limit of the search is reached, the static evaluator is applied to the node as if it were a leaf:
(* the minimax value of n, searched to depth d *)
fun minimax(n: node, d: int): int =
    if leaf(n) or d=0 return evaluate(n)
    if n is a max node
        v := L
        for each child of n
            v' := minimax (child,d-1)
            if v' > v, v:= v'
        return v
    if n is a min node
        v := W
        for each child of n
            v' := minimax (child,d-1)
            if v' < v, v:= v'
        return v
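Rendered as runnable OCaml, the depth-limited search might look like the sketch below. The types are assumptions: each position carries a static evaluation (exact for final positions, heuristic otherwise), a position with no children counts as a leaf, and whether we or the opponent moves is tracked with a maximize flag rather than stored in the node:

(* Sketch: each position stores its static evaluation and its successors. *)
type node = { value : int; children : node list }

(* the minimax value of n, searched to depth d *)
let rec minimax ~(maximize : bool) (n : node) (d : int) : int =
  if n.children = [] || d = 0 then n.value      (* final position or depth cutoff *)
  else
    let vs =
      List.map (fun c -> minimax ~maximize:(not maximize) c (d - 1)) n.children
    in
    if maximize then List.fold_left max min_int vs
    else List.fold_left min max_int vs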
For example, consider the following game tree searched to depth 3, where the static evaluator is applied to a number of nodes that are not leaves in the game tree:
         6
       /   \
      /     \
     6       L
    / \      |\
   6   8     L W
  /|\  |\      |\
 1 5 6 8 3     1 W
The value of the root of the tree is 6 because the current player can force the game to go to a "leaf" node (as defined by the depth cutoff) whose value is at least 6. Notice that by finding the value of the current position, the player also learns the best move to make: the move that transitions the game to the immediate child with maximum value.
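Picking the move itself is then just a matter of scoring each immediate child and keeping the best one. A small sketch (the best_move helper and its score argument are hypothetical; score would typically be the depth-limited search of the child with the opponent to move):

(* Sketch: keep the child whose score is largest; None if there are no moves.
   The score function is hypothetical, e.g. searching the resulting position
   with the opponent to move: fun c -> minimax ~maximize:false c (d - 1). *)
let best_move (score : 'a -> int) (children : 'a list) : 'a option =
  let better best c =
    let s = score c in
    match best with
    | Some (_, best_s) when best_s >= s -> best
    | _ -> Some (c, s)
  in
  List.fold_left better None children |> Option.map fst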
Designing the static evaluator is an art: a good static evaluator should be very fast, because it is the limiting factor in how quickly the search algorithm runs. But it also needs to capture a reasonable approximation of how good the current board position is, because it captures what the player is trying to achieve during play. In practice, game AI designers have found that it doesn't pay to build intelligence into the static evaluator when the same information can be obtained by searching a level or two deeper in the game tree.
How deeply should the tree be searched? This depends on how much time is available to do the search. Each increase in depth multiplies the total search time by about the number of moves available at each level.
       6
      / \
     /
    6 (a)
   / \
  6   8 (b)
 /|\  |\
1 5 6 8 ...
Once we have seen the node whose static evaluation is 8, we know that there is no point to exploring any of the rest of the children of the max node above it. Those children could only increase the value of the max node (b) above, but the min node above that (a) is going to have value at most 6 anyway. No matter what happens in the part of the tree under the ..., it can't affect the minimax value of the min node labeled 6. Avoiding searching a part of a tree is called pruning; this is an example of alpha-beta pruning.
In general the minimax value of a node is worth computing only if it lies within a particular range of values. We can capture this by extending the minimax function with a pair of arguments min and max. The new spec of minimax is that it always returns a value in the range [min, max]. For example, when evaluating the node (b) above, we can set max to 6 because there is no reason to find out about values greater than 6. There are corresponding cases where there is no reason to find out about values less than some minimum value. The min and max bounds are used to prune away subtrees by terminating the search of a subtree early: once a child node has been seen that pushes the node's value outside the range of interest, there is no point in exploring the rest of the children. This idea is captured by adding the tests if v > max return max and if v < min return min in the following code:
(* the minimax value of n, searched to depth d.
 * If the value is less than min, returns min.
 * If greater than max, returns max. *)
fun minimax(n: node, d: int, min: int, max: int): int =
    if leaf(n) or d=0 return evaluate(n)
    if n is a max node
        v := min
        for each child of n
            v' := minimax (child,d-1,...,...)
            if v' > v, v:= v'
            if v > max return max
        return v
    if n is a min node
        v := max
        for each child of n
            v' := minimax (child,d-1,...,...)
            if v' < v, v:= v'
            if v < min return min
        return v
Because we don't care about values less than min or greater than max, we also initialize the variable v to min in the max case and to max in the min case, rather than to L and W. Notice that if this procedure is invoked as minimax(n,d,L,W), it behaves just like the minimax procedure without min and max bounds, assuming that the static evaluator only returns values between L and W. Thus, a top-level search is invoked in this way so that we get the same answer as before pruning.
The only thing missing from our search algorithm now is to compute the right min and max values to pass down. Clearly we could safely pass down the same min and max received in the call, but then we wouldn't have achieved anything. Consider the max node case after we have gone around the loop: in general the variable v will be greater than min. In the recursive invocation of minimax there is no point in finding out about values less than or equal to v; they can't possibly affect the value of v that is returned. Therefore, instead of passing min down in the recursive call, we pass v itself. Conversely, in the min node case, we pass v in place of max:
(* the minimax value of n, searched to depth d.
 * If the value is less than min, returns min.
 * If greater than max, returns max. *)
fun minimax(n: node, d: int, min: int, max: int): int =
    if leaf(n) or d=0 return evaluate(n)
    if n is a max node
        v := min
        for each child of n
            v' := minimax (child,d-1,v,max)
            if v' > v, v:= v'
            if v > max return max
        return v
    if n is a min node
        v := max
        for each child of n
            v' := minimax (child,d-1,min,v)
            if v' < v, v:= v'
            if v < min return min
        return v
This is pseudo-code for minimax search with alpha-beta pruning, or simply alpha-beta search. We can verify that it works as intended by checking what it does on the example tree above. Each node is shown with the [min,max] range that minimax is invoked with. Pruned parts of the tree are marked with X.
                        6[L,W]
                         /  \
                      /        \
                    /            \
                6[L,W]          L[6,W]
                 /   \            | \
              /         \         |   X
        6[L,W]          6[L,6]  L[6,W]
          /|\             | \
       /   |   \          |  X
     /     |     \        |
1[L,W]  5[1,W]  6[5,W]  8[L,6]
In general the [min,max] bounds become tighter and tighter as you proceed down the tree from the root.
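For completeness, here is the same procedure as a runnable OCaml sketch, under the same assumptions as the earlier depth-limited sketch (positions carry their static evaluation, and the max/min distinction is a flag rather than a node property). It rebuilds the depth-3 example tree from above and checks that a top-level call with the full [L,W] range returns 6:

(* A sketch of alpha-beta search, under assumed types: each position carries
   its static evaluation, and children are the positions after each legal move
   (empty for final positions). *)
type node = { value : int; children : node list }

let l = -1000 and w = 1000    (* finite stand-ins for the loss/win values *)

(* minimax value of n searched to depth d, clamped to the range [min, max] *)
let rec minimax ~(maximize : bool) (n : node) (d : int) ~(min : int) ~(max : int) : int =
  if n.children = [] || d = 0 then n.value
  else if maximize then
    (* max node: the running value v is passed down as the new min bound *)
    let rec loop v = function
      | [] -> v
      | c :: cs ->
          let v' = minimax ~maximize:false c (d - 1) ~min:v ~max in
          let v = if v' > v then v' else v in
          if v > max then max else loop v cs
    in
    loop min n.children
  else
    (* min node: the running value v is passed down as the new max bound *)
    let rec loop v = function
      | [] -> v
      | c :: cs ->
          let v' = minimax ~maximize:true c (d - 1) ~min ~max:v in
          let v = if v' < v then v' else v in
          if v < min then min else loop v cs
    in
    loop max n.children

(* The depth-3 example tree from above. Interior static values are never
   consulted at this search depth, so they are set to 0. *)
let leaf v = { value = v; children = [] }
let example =
  { value = 0; children = [
      { value = 0; children = [
          { value = 0; children = [ leaf 1; leaf 5; leaf 6 ] };
          { value = 0; children = [ leaf 8; leaf 3 ] } ] };
      { value = 0; children = [
          leaf l;
          { value = 0; children = [ leaf 1; leaf w ] } ] } ] }

let () = assert (minimax ~maximize:true example 3 ~min:l ~max:w = 6)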
How effective is alpha-beta pruning? It depends on the order in which children are visited. If children of a node are visited in the worst possible order, it may be that no pruning occurs. For max nodes, we want to visit the best child first so that time is not wasted on the rest of the children exploring worse scenarios. For min nodes, we want to visit the worst child first (from our perspective, not the opponent's). There are two obvious sources of this information:
When the optimal child is selected at every opportunity, alpha-beta pruning causes all the rest of the children to be pruned away at every other level of the tree; only that one child is explored. This means that on average the tree can be searched twice as deeply as before: a huge increase in searching performance.
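One common way to obtain such an ordering (an assumption here, not something established above) is to use the static evaluator itself: sort the children by their static values before searching them, best-looking first at max nodes and worst-looking for us first at min nodes. A minimal sketch:

(* A sketch of one possible ordering heuristic: sort children by their static
   evaluation before searching them. 'value' is the static evaluation of a
   position, as in the earlier sketches. *)
type node = { value : int; children : node list }

let order_children ~(maximize : bool) (n : node) : node list =
  let cmp a b =
    if maximize then compare b.value a.value   (* best-looking replies first *)
    else compare a.value b.value               (* worst-looking (for us) first *)
  in
  List.sort cmp n.children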