Tic-Tac-Toe minimax algorithm does not work with 4x4 card

So, I have been working on this project for the last 3 weeks. I managed to get the minimax function to work at an early stage for a 3x3 board, however, there were problems that arose when I tried to use it for a 4x4 board, namely a Java heap error. Since then, using Alpha beta pruning, I managed to knock down the number of minimax calls in the minimax function from aprox. From 59000 to 16000 to 11000, and then finally up to 8000 calls (this implies an initial minimax call for a board with one slot already filled). The problem now, however, is that the method just continues to work for 4x4 games. He simply continues to call himself non-stop, without errors, without result, nothing. Theoretically, as I see it, my function should work on arbitrary board sizes, the only problem was memory. Now, since I greatly reduced the greed of my function, I expected it to work. Well, this is for 3x3. However, this is not for 4x4. A brief explanation of what the function does: The function returns an array of size 2 containing the most favorable next transition of all possible subsequent moves along with the estimate expected from this move. The scoring system is simple. +10 for winning O, -10 for winning X, and 0 for a draw. A function is, of course, recursive. In it you will find certain shortcuts that reduce the number of necessary calls for yourself. For example, if X rotates, and the returned result is -10 (which is the best indicator for X), then we exit the loop, i.e. We stop observing other potential movements from this state. Here is the code for the class state:

private String [] state; //Actual content of the board private String turn; //Whose turn it is private static int n; //Size of the board public State(int n) { state = new String[n*n]; for(int i = 0; i < state.length; i++) { state[i] = "-"; } State.n = n; } public int[] newminimax47(int z) { int bestScore = (turn == "O") ? +9 : -9; //X is minimizer, O is maximizer int bestPos = -1; int currentScore; int lastAdded = z; if(isGameOver() != "Not Gameover") { bestScore= score(); } else { int i = 0; for(int x:getAvailableMoves()) { if(turn == "X") { //X is minimizer setX(x); currentScore = newminimax47(x)[0]; if(i == 0) { bestScore = currentScore; bestPos = x; if(bestScore == -10) break; } else if(currentScore < bestScore) { bestScore = currentScore; bestPos = x; if(bestScore == -10) break; } } else if(turn == "O") { //O is maximizer setO(x); currentScore = newminimax47(x)[0]; if(i == 0) { bestScore = currentScore; bestPos = x; if(bestScore == 10) break; } else if(currentScore > bestScore) { bestScore = currentScore; bestPos = x; if(bestScore == 10) break; } } i++; } } revert(lastAdded); return new int [] {bestScore, bestPos}; } 

Additional functions used by newminimax47 ():

isGameOver ():

 public String isGameOver() { if(n == 3) { //Rows 1 to 3 if((state[0] != "-") && (state[0] == state[1]) && (state[1] == state[2])) return (state[0] == "X") ? "X Won" : "O Won"; else if((state[3] != "-") && (state[3] == state[4]) && (state[4] == state[5])) return (state[3] == "X") ? "X Won" : "O Won"; else if((state[6] != "-") && (state[6] == state[7]) && (state[7] == state[8])) return (state[6] == "X") ? "X Won" : "O Won"; //Columns 1 to 3 else if((state[0] != "-")&&(state[0] == state[3]) && (state[3] == state[6])) return (state[0] == "X") ? "X Won" : "O Won"; else if((state[1] != "-") && (state[1] == state[4]) && (state[4] == state[7])) return (state[1] == "X") ? "X Won" : "O Won"; else if((state[2] != "-") && (state[2] == state[5]) && (state[5] == state[8])) return (state[2] == "X") ? "X Won" : "O Won"; //Diagonals else if((state[0] != "-") && (state[0]==state[4]) && (state[4] == state[8])) return (state[0] == "X") ? "X Won" : "O Won"; else if((state[6] != "-") && (state[6] == state[4]) && (state[4] == state[2])) return (state[6] == "X") ? "X Won" : "O Won"; //Checking if draw else if((state[0] != "-") && (state[1]!="-") && (state[2] != "-") && (state[3]!="-") && (state[4] != "-") && (state[5] != "-") && (state[6] != "-") && (state[7] != "-") && (state[8] != "-")) return "Draw"; else return "Not Gameover"; } else { //Rows 1 to 4 if((state[0] != "-") && (state[0] == state[1]) && (state[1] == state[2]) && (state[2] == state[3])) return (state[0] == "X") ? "X Won" : "O Won"; else if((state[4] != "-") && (state[4] == state[5]) && (state[5]==state[6]) && (state[6] == state[7])) return (state[4] == "X") ? "X Won" : "O Won"; else if((state[8] != "-") && (state[8] == state[9]) && (state[9]==state[10]) && (state[10] == state[11])) return (state[8] == "X") ? "X Won" : "O Won"; else if((state[12] != "-") && (state[12] == state[13]) &&(state[13] == state[14]) && (state[14] == state[15])) return (state[12] == "X") ? "X Won" : "O Won"; //Columns 1 to 4 else if((state[0] != "-") && (state[0] == state[4]) && (state[4] == state[8]) && (state[8] == state[12])) return (state[0] == "X") ? "X Won" : "O Won"; else if((state[1] != "-") && (state[1] == state[5]) && (state[5] == state[9]) && (state[9] == state[13])) return (state[1] == "X") ? "X Won" : "O Won"; else if((state[2] != "-") && (state[2] == state[6]) && (state[6] == state[10]) && (state[10] == state[14])) return (state[2] == "X") ? "X Won" : "O Won"; else if((state[3] != "-") && (state[3] == state[7]) && (state[7] == state[11]) && (state[11] == state[15])) return (state[3] == "X") ? "X Won" : "O Won"; //Diagonale else if((state[0] != "-") && (state[0] == state[5]) && (state[5] == state[10]) && (state[10] == state[15])) return (state[0] == "X") ? "X Won" : "O Won"; else if((state[12] != "-") && (state[12] == state[9]) && (state[9] == state[6]) && (state[6] == state[3])) return (state[0] == "X") ? "X Won" : "O Won"; //Pruefe ob Gleichstand else if((state[0] != "-") && (state[1] != "-") && (state[2] != "-") && (state[3]!="-") && (state[4] != "-") && (state[5] != "-") && (state[6] != "-") && (state[7] != "-") && (state[8] != "-") && (state[9] != "-") && (state[10] != "-") && (state[11] != "-") && (state[12] != "-") && (state[13] != "-") && (state[14] != "-") && (state[15] != "-")) return "Draw"; else return "Not Gameover"; } } 

Please excuse the stupidity of the isGameOver () method, it just checks the status of the board (i.e. Win, Draw, Game not Over)

GetAvailableMoves () method:

 public int[] getAvailableMoves() { int count = 0; int i = 0; for(int j = 0; j < state.length; j++) { if(state[j] == "-") count++; } int [] availableSlots = new int[count]; for(int j = 0; j < state.length; j++){ if(state[j] == "-") availableSlots[i++] = j; } return availableSlots; } 

This method simply returns an int array with all the next moves available (relative to the current state) or returns an empty array if no moves are available or the game has ended.

Score () method:

 public int score() { if(isGameOver() == "X Won") return -10; else if(isGameOver() == "O Won") return +10; else return 0; } 

setO (), setX () and revert ():

 public void setX(int i) { state[i] = "X"; turn = "O"; lastAdded = i; } public void setO(int i) { state[i] = "O"; turn = "X"; lastAdded = i; } public void revert(int i) { state[i] = "-"; if(turn == "X") turn = "O"; else turn = "X"; } 

My main method looks like this for a 3x3 game:

 public static void main(String args[]) { State s = new State(3); int [] ScoreAndRecommendedMove = new int[2]; s.setX(8); ScoreAndRecommendedMove = s.newminimax47(8); System.out.println("Score: "+ScoreAndRecommendedMove[0]+" Position: "+ ScoreAndRecommendedMove[1]); } 

In this game, X started the game with movement in position 8. The method in this case will return

 Score: 0 Position: 4 

Assuming that O, the most promising move is in position 4, and in the worst case, 0 (i.e. a draw).

The following image is intended to give an idea of ​​how newminimax47 () works. In this case, our starting state (board) is assigned the number 1. Note. The numbers indicate priority in creating the states in question. 1 was created before creation 2, 2 to 3, 3 was created to 4, etc. enter image here

In this scenario, the valuation and position eventually returned to state 1 will be

 Score: 0 Position: 6 

outgoing from state 8.

Note. The code you see is just fragments of the actual State class. These fragments themselves should allow you to recreate and work with the newminimax47 function without problems (at least for 3x3). Any errors that you may find are not really errors, they are simply not included in the fragments that I copied here, and the code should work without them. For example, the lastAdded variable in the setO and setX functions is not included in the fragments, but I just realized that you do not need it to work with the minimax function, so you can just comment on it.

+4
source share
2 answers

I played with your code and there is a little to say

error

First of all, there is a mistake. I don't think your code really works for a 3x3 board. The problem is that you revert move you add to the board. You do this at the end of the newminimax47 method exactly once, even if it moves to the board inside the for loop in the method you add. This means that the method call not only calculates something, but also changes the state of the board, and the rest of the code does not expect it.

So, remove revert where it is now, and as soon as you can:

 setX(x); currentScore = newminimax47(x)[0]; revert(x); 

it also means you don't need the lastAdded variable.

play

It's much easier to see what happens if you really play against your own algorithm. Add a method to the State class

 public void dump() { for (int y = 0; y < n; y++) { for (int x = 0; x < n; x++) { System.out.print(state[y * n + x]); } System.out.println(); } } 

and the main thing in you is something like

 public void play() { State s=new State(3); Scanner in = new Scanner (System.in); while (s.isGameOver().equals("Not Gameover")) { int[] options = s.getAvailableMoves(); s.dump(); System.out.println ("Your options are " + Arrays.toString(options)); int move = in.nextInt(); s.setX(move); int [] ScoreAndRecommendedMove=new int[2]; ScoreAndRecommendedMove=s.newminimax47(0); System.out.println("Score: "+ScoreAndRecommendedMove[0]+" Position: "+ ScoreAndRecommendedMove[1]); s.setO(ScoreAndRecommendedMove[1]); } s.dump(); } 

and you can play against him. On a 3x3 board, this works fine for me. Unfortunately, I calculated that calculating the first move at 4x4 takes about 48 hours on my computer.

data types

Your choice of data types is often a little ... weird. If you want to remember a single character, use char instead of String . If you want to return a yes / no solution, try using boolean . There are also some parts of the program that can be replaced with less code doing the same. But that was not your question, and so on ...

algorithm

So what happened with the minimax solution to this problem? Suppose the first four movements are X5, O8, X6 O7. Another possibility is to start the game with X5, O7, X6, O8. Another one is X6, O7, X5, O8. And finally, X6, O8, X5, O7.

All four of these possibilities for the first four turns of the game lead to the same game match. But minimax does not recognize that they are the same (basically there is no memory of parallel branches), so it will calculate all four of them. And the number of calculations of each state of the board will increase rapidly if you look deeper.

The number of possible games significantly exceeds the number of possible board states. Estimate the number of games: first there are 16 possible moves, then 15, then 14, 13, ... and so on. The rough estimate is 16!, Although minimax will not have to calculate all of them, because many of them will be ready before the 16th move.

Assessment of the number of game states: each square on the board can be either empty, or X, or O. So 3 ^ 16 boards. Not all of them are really valid boards, because the number of Xs on the board can be no more than one, and then the number of Os, but still it is close to 3 ^ 16.

sixteen! possible games are approximately one and a half million times more than 3 ^ 16 possible states of the board. This means that we roughly calculate each board half a million times at a time.

The solution is to start memorizing every game that you figure out. Each time a recursive function is called, first check to see if you know the answer, and if so, just return the old answer. This is a method called memoization .

memorization

I will describe how to add memoization using already created data structures (although I do not agree with them). To perform memoization, you need a collection in which you can quickly add and quickly search. A list (e.g. ArrayList ) will not do us any good. Quickly add values, but search very slowly in long lists. There are several options, but the simplest of them is HashMap . To use the HashMap , you need to create something that represents your state, and you can use it as a key. The easiest way is to simply make a String with all the X / O / - characters that represent your panel.

So add

 Map<String,int[]> oldAnswers = new HashMap<String,int[]>(); 

to your State object.

Then at the beginning of your newminimax47 method newminimax47 create a line that represents the state, and check if we know the answer:

  String stateString = ""; for (String field : state) stateString += field; int[] oldAnswer = oldAnswers.get(stateString); if (oldAnswer != null) return oldAnswer; 

Finally, when you compute a new answer at the end of newminimax47 , you must not only return it, but also save it on the map:

  int[] answer = {bestScore, bestPos}; oldAnswers.put (stateString, answer); return answer; 

With a note in place, I was able to play a 4x4 game against your code. The first move is still slow (20 seconds), but after that everything is calculated and very fast. If you want to speed it up, you can learn alpha beta pruning , but the improvement will not be almost the same as a memory. Another option is to use more efficient data types. This will not reduce the theoretical order of your algorithm, but it can still easily make it 5 times faster.

+5
source

As user3386109 explains, the problem is how many times you calculate everything. There are a few things that can help you, given that in an N-size grid:

  • the user cannot win if his characters are less than N, so you can use it in the isGameOver function as the first check.
  • The first thing you need to do is to prevent your opponent from winning if there is a chance that his next move will be winning.
  • Keep track of how many β€œX” and β€œO” are in each row and column and in two diagonals, increasing the number of counters at each step. If there is N-1 of the same symbol, the next one will be winning for you or your opponent.
  • That way, you can easily tell which one is best, because then:
    • If you have a winning move, you put a symbol there
    • check your opponent if he has N-1 characters in the same row / column / diagonal, you put him there
    • If your opponent has more characters than you have in some place, you even leave the place (this means +1 or +2, depending on who starts the game).
    • If this is not the case, you put your next character in the line / colum / diagonal, where you have more characters.
    • If you have the same number of characters in some places, you just put it where your opponent has more characters.
    • If you and your opponent are completely equal, just go for your own strategy (random will not be bad, I think :-))

If you really don’t need it (for example, as homework), I would not use recursion for this.

As a note: I do not consider it a good practice to have what is actually a logical function, return a string, and then compare it with a fixed value. The true / false value for the isGameOver function looks much better for me.

0
source

All Articles