Minimax algorithm

I have a simple question about the Minimax algorithm: for a tic-tac-toe game, for example, how do I define a utility function for each player? The algorithm doesn't derive it automatically, right? I have to hard-code the values into the game myself; it can't learn them on its own, can it?

+7
language-agnostic artificial-intelligence minimax
4 answers

No, minimax does not learn. It is a smarter version of brute-force search.

+10

Generally, you implement the utility function directly. In that case the algorithm does not "know" how to play the game; it uses the knowledge you explicitly hard-coded into the implementation.
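
As a concrete illustration, here is a minimal sketch (the names `utility` and `LINES` are invented, and the board representation is an assumption) of what that hard-coded knowledge looks like for tic-tac-toe:

```python
# Board is a tuple of 9 cells, each 'X', 'O', or None.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def utility(board, player):
    """+1 if `player` has completed a line, -1 if the opponent has,
    0 otherwise. Nothing here is learned: what counts as a 'winning
    line' is knowledge the programmer hard-coded."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return 1 if board[a] == player else -1
    return 0
```

Minimax itself only calls a function like this at the leaves; all game-specific knowledge lives inside it.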

However, one could use genetic programming (GP), or some equivalent technique, to derive a utility function automatically. In that case you would not need to encode any explicit strategy; instead, evolution would discover its own way of playing the game well.

You can either combine your minimax code and the GP code into a single (probably very slow) adaptive program, or run GP first, find a good utility function, and then plug that function into your minimax code just as you would any hand-coded function.

+3

Tic-Tac-Toe is small enough to search the game all the way to the end and assign 1 to a win, 0 to a draw, and -1 to a loss.
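
A sketch of what "running the game to the end" looks like, under the same assumptions as above (board as a 9-tuple of `'X'`/`'O'`/`None`; the helper names are invented):

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, to_move):
    """Value of `board` for 'X': +1 win, 0 draw, -1 loss,
    searching all the way to the terminal positions."""
    w = winner(board)
    if w is not None:
        return 1 if w == 'X' else -1
    if None not in board:                       # board full: draw
        return 0
    nxt = 'O' if to_move == 'X' else 'X'
    values = [minimax(board[:i] + (to_move,) + board[i + 1:], nxt)
              for i, cell in enumerate(board) if cell is None]
    return max(values) if to_move == 'X' else min(values)
```

With perfect play from both sides the empty board evaluates to 0, i.e. tic-tac-toe is a draw.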

Otherwise, you must provide a function that computes a heuristic value for a position. In chess, for example, material value is a big factor, but so are control of the center and how freely the pieces can move.

As for learning, you can attach weights to the various aspects of a position and try to optimize those weights by playing many games.
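
A sketch of such a weighted evaluation (the feature names and numbers below are invented for illustration; in practice they would come from the game being analysed):

```python
def evaluate(features, weights):
    """Linear combination of position features, e.g. in chess:
    (material balance, centre control, mobility)."""
    return sum(w * f for w, f in zip(weights, features))

# Invented numbers: up in material, slight centre and mobility edge.
score = evaluate((3.0, 1.0, 0.5), (1.0, 0.3, 0.1))
```

"Training" then means adjusting `weights` so the engine wins more self-play games, for example by hill-climbing or an evolutionary method.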

+2

How do I define a utility function for each player?

Carefully ;-) This article shows how a slightly flawed evaluation function (for example, one that does not look "deep" enough into the tree of possible plies, or one that misjudges the relative strength of some board positions) leads to a generally weak algorithm (one that loses more often).

it can't learn them on its own, can it?

No, it cannot. However, there are ways to make the computer learn the relative strength of board positions. For example, look into Donald Michie and his MENACE program: you will see how a stochastic process can learn to play with no a priori knowledge except the rules of the game. The fun part is that, while this can of course be implemented on a computer, a few hundred coloured beads and matchboxes are all it takes, thanks to the relatively small size of the game space and its various symmetries.

After learning such a cool way of teaching the computer to play, going back to MinMax for tic-tac-toe may seem less interesting. After all, MinMax is a relatively simple way of pruning a decision tree, which is hardly needed for tic-tac-toe's small game space. But if we must ;-) [back to MinMax] ...

We could peek into the "matchbox" associated with the next move only (i.e., without going deep at all) and use the fraction of beads associated with each square as an additional factor. We could then evaluate a traditional tree, but only go, say, 2 or 3 moves deep (a shallow look-ahead, at which most lines end in a draw or an unknown outcome) and score each next move with a simple -1 (loss), 0 (draw/unknown), +1 (win). By combining the bead fraction and the simple score (by addition, certainly not by multiplication), we can use MinMax in a fashion closer to how it is used in games where evaluating the full tree is impossible.
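
A sketch of the addition-based blend described above (the bead counts and the helper name `blended_score` are invented for illustration):

```python
def blended_score(bead_counts, move, shallow_value):
    """bead_counts: beads per candidate move in the matchbox for the
    current position; shallow_value: -1/0/+1 from a depth-limited
    minimax. Combined by addition, as suggested above."""
    total = sum(bead_counts.values())
    fraction = bead_counts[move] / total if total else 0.0
    return fraction + shallow_value

beads = {4: 6, 0: 3, 8: 1}   # centre square heavily reinforced
best = max(beads, key=lambda m: blended_score(beads, m, 0))
```

One reason addition behaves better than multiplication here: multiplying would discard the bead information entirely whenever the shallow minimax value is 0, which at this depth is the most common case.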

Bottom line: in the case of tic-tac-toe, MinMax only becomes really interesting (for example, for studying the effectiveness of a particular utility function) once we remove the deterministic nature of the game that comes with easy evaluation of the complete tree. Another way of making the game [mathematically] interesting is to play against an adversary who makes mistakes...

+2
