Search Accidents

Question

Search Accidents

I am writing a Connect4 game with an adversary of the AI using adversarial search methods, and I came across a wall somewhat. I feel that I am not far from the solution, but that there may be a problem when I move on to the prospects (like: regarding which participant I base the assessment scores on), there is no minus sign somewhere or something like that .

The problem is that in the variations that I tried, the AI chooses not to block the player when the player has three in a row, but otherwise the AI plays the perfect game or that he prefers to block the player, even if he has a chance to win the game. It also seems to matter if the search depth is an even or uneven number, since AIs are pants slowed down by a 6-layer search, which pretty much suggests that something is wrong.

Search

The algorithm used is negamax with alpha-beta trimming and is implemented as follows:

private int Negamax(int depth, int alpha, int beta, Player player) { Player winner; if (Evaluator.IsLeafNode(game, out winner)) { return winner == player ? (10000 / depth) : (-10000 / depth); } if (depth == Constants.RecursionDepth) { return Evaluator.Evaluate(game, depth, player); } foreach (var move in moves) { int row; if (board.DoMove(move, player, out row)) { var value = -Negamax(depth + 1, -beta, -alpha, (Player)1 - (int)player); board.UndoMove(move, row, player); if (value > alpha) { alpha = value; if (player == Player.AI) { bestColumn = move; } } if (alpha >= beta) { return alpha; } } } return alpha; }

I do not suspect that the problem is in this function, but maybe.

Rating

I based the evaluation function on the fact that there are only 69 possible ways to get four in a row on a 7x6 board. I have a lookup table of about 350 elements, which contains hard-coded information for each column and row, which are combinations of wins that include the row + column. For example, for row 0 and column 0, the table looks like this:

 //c1r1 table[0][0] = new int[3]; table[0][0][0] = 21; table[0][0][1] = 27; table[0][0][2] = 61;

This means that column 0, row 0 is part of a combination of wins 21, 27 and 61.

I have a second table that contains for both players the number of stones in each of the winning combinations. When I make a move, I do the following:

 public bool DoMove(int column, Player p, out int row) { row = moves[column]; if (row >= 0) { Cells[column + row * Constants.Columns] = p; moves[column]--; var combinations = this.Game.PlayerCombinations[p]; foreach (int i in TerminalPositionsTable.Get(column,row)) { combinations[i]++; } return true; } else { return false; } }

The opposite, of course, is done for UndoMove .

So, after moving through column 0, row 0 on Player.Human , the table will be populated with a value of 1 at indices 21, 27 and 61. If I take one more step in the cell, which is also part of win combination 27, then the combination table players increases with an index of 27 to 2.

Hopefully I made this clear as he used the scoring function to quickly determine how close a player should score four in a row.

The evaluation function, where I suspect that the problem lies, is as follows:

 public static int Evaluate(Game game, int depth, Player player) { var combinations = game.PlayerCombinations[player]; int score = 0; for (int i = 0; i < combinations.Length; i++) { switch (combinations[i]) { case 1: score += 1; break; case 2: score += 5; break; case 3: score += 15; break; } } return score; }

So, I just go through 69 possible combinations of wins and add the amount to the points based on whether it is one stone, two in a row or three.

The part that I still confuse in all this competitive search is whether I have to care about which player makes the move? I mean, should I go to the player like I am here, or should I always evaluate the advice from the perspective of an AI player? I have tried many combinations of aiScore - humanScore , or just always look in terms of Player.AI , etc. But I was stumped, and every combination I tried was pretty wrong.

So:

Is the logic of my assessment solid at its core?
When do I need to "switch perspectives"?

Any help would be greatly appreciated.

Update

I implemented the Brennan suggestions below, and although it has improved significantly , for some reason it does not block three rows in any column, but two on the left and right, and only when the search depth is uneven. AI is unsurpassed even with a search depth, but only to a depth of 8 and above. Then he refuses to block again. This pretty suggests that I'm probably very close, but still have some important flaws.

Perhaps this is due to the fact that I set the column into which the AI should throw a stone, as Brennan commented, but I do not know when it is needed. Setting it only at a depth of 0 does not work.

Update 2

Edited the code as it now looks with Brennan's changes.

Update 3

Created a Github repository with full code. If you do not know how to work with Git, just download the zip file from here .

This is a .NET 4.0 project, and running it will create the negamax algorithm log files in the document / log directory. The solution also contains a test project that contains a test for each column of the column, regardless of whether the AI wants to block the player when the player has three in a row.

+7

c # .net artificial-intelligence

JulianR Jul 03 '10 at 0:35

source share

2 answers

If it does not block some combinations, it sounds as if you have a flaw in the table of possible victories.

I also see the problem in your evaluation function: it gives value to moves that have no hope of victory. Suppose you have xoo.x, you play o. Your routine says that it costs 15 points to play here when in fact it costs 0. Nobody needs any win template that already contains tiles from both players.

I found that when debugging this kind of debugger is of little importance, since it does not allow you to see the big picture well. Try writing every verified sample to the log file — place the actual drawing in the log.

+1

Loren pechtel Jul 03 '10 at 19:47

source share

Brennan vincent · Accepted Answer · 2010-07-03T02:03:59+0000

This stuff makes my brain hurt, so I'm not sure this answer is correct, but it does.

In non-hamax, the score is always evaluated relative to the player in motion. If this is a white move, then a high score is good for white. If this is a black move, then a high score is good for black. Therefore, if you have a node sheet, whether the + inf or -inf score will be determined not by whether the node is a win for white or black, but whether it is a victory for the player you are currently evaluating. Replace this:

 return winner == Player.AI ? (10000 / depth) : (-10000 / depth);

with this:

 return winner == player ? (10000 / depth) : (-10000 / depth);

A similar problem is observed in your evaluation function. Replace this:

 return player == Player.AI ? score : -score;

with this:

 return score;

Again, I'm not sure if this is correct. But I hope you try these two changes and let me know if this works. I'm very curious!

Search Accidents

More articles: