Using minimax search for card games with imperfect information

I want to use minimax search (with alpha-beta cropping) or, rather, non-max search so that the computer program plays a card game.

The card game consists of 4 players. Therefore, in order to be able to use minimax, etc., I simplify the game "I" against "others." After each "move", you can objectively read the current state assessment from the game itself. When all 4 players have placed a card, the highest wins them all - and the values ​​of the cards are counted.

As you do not know how a card is distributed between three other players, I thought that you should simulate all possible distributions (“worlds”) with cards that are not yours. You have 12 cards, the remaining 3 players have only 36 cards.

So, my approach is an algorithm where player is a number from 1 to 3, symbolizing three computer players who may need a program to search for moves. And -player means opponents, namely all the other three players.

 private Card computerPickCard(GameState state, ArrayList<Card> cards) { int bestScore = Integer.MIN_VALUE; Card bestMove = null; int nCards = cards.size(); for (int i = 0; i < nCards; i++) { if (state.moveIsLegal(cards.get(i))) { // if you are allowed to place this card int score; GameState futureState = state.testMove(cards.get(i)); // a move is the placing of a card (which returns a new game state) score = negamaxSearch(-state.getPlayersTurn(), futureState, 1, Integer.MIN_VALUE, Integer.MAX_VALUE); if (score > bestScore) { bestScore = score; bestMove = cards.get(i); } } } // now bestMove is the card to place } private int negamaxSearch(int player, GameState state, int depthLeft, int alpha, int beta) { ArrayList<Card> cards; if (player >= 1 && player <= 3) { cards = state.getCards(player); } else { if (player == -1) { cards = state.getCards(0); cards.addAll(state.getCards(2)); cards.addAll(state.getCards(3)); } else if (player == -2) { cards = state.getCards(0); cards.addAll(state.getCards(1)); cards.addAll(state.getCards(3)); } else { cards = state.getCards(0); cards.addAll(state.getCards(1)); cards.addAll(state.getCards(2)); } } if (depthLeft <= 0 || state.isEnd()) { // end of recursion as the game is finished or max depth is reached if (player >= 1 && player <= 3) { return state.getCurrentPoints(player); // player points as a positive value (for self) } else { return -state.getCurrentPoints(-player); // player points as a negative value (for others) } } else { int score; int nCards = cards.size(); if (player > 0) { // make one move (it player turn) for (int i = 0; i < nCards; i++) { GameState futureState = state.testMove(cards.get(i)); if (futureState != null) { // wenn Zug gültig ist score = negamaxSuche(-player, futureState, depthLeft-1, -beta, -alpha); if (score >= beta) { return score; } if (score > alpha) { alpha = score; // alpha acts like max } } } return alpha; } else { // make three moves (it the others' turn) for (int i = 0; i < nCards; i++) { GameState futureState = state.testMove(cards.get(i)); if (futureState != null) { // if move is valid for (int k = 0; k < nCards; k++) { if (k != i) { GameState futureStateLevel2 = futureState.testMove(cards.get(k)); if (futureStateLevel2 != null) { // if move is valid for (int m = 0; m < nCards; m++) { if (m != i && m != k) { GameState futureStateLevel3 = futureStateLevel2.testMove(cards.get(m)); if (futureStateLevel3 != null) { // if move is valid score = negamaxSuche(-player, futureStateLevel3, depthLeft-1, -beta, -alpha); if (score >= beta) { return score; } if (score > alpha) { alpha = score; // alpha acts like max } } } } } } } } } return alpha; } } } 

This works fine, but for depth 1 ( depthLeft=1 ) the program already needs to calculate 50,000 moves (placed cards) on average. This is too much, of course!

So my questions are:

  • Is the implementation correct? Can you imitate such a game? Regarding imperfect information, especially?
  • How can you improve the speed and workload algorithm?
  • Can I, for example, reduce the set of possible moves to a random set by 50% to improve speed, while maintaining good results?
  • I found the UCT algorithm to be a good solution (maybe). Do you know this algorithm? Can you help me realize it?
+6
source share
2 answers

Minimal search as you implement it is the wrong approach for games that have so much uncertainty. Since you do not know the distribution of cards among other players, your search will spend an exponential amount of time studying games that cannot occur with the actual distribution of cards.

I think the best approach is to start with good game rules when you have little or no information about the hands of other players. Such things as:

  • If you play first in the round, play your lowest card since you have little chance of winning the round.
  • If you play in the last round, play your lowest card, which will win the round. If you cannot win the round, then play your lowest card.

Ask your program not to worry about searching initially and just play by these rules, and assume that all other players will use these heuristics. Since the program monitors which cards are played by the first and last players of each round, it can create a table with information about the cards that the players probably have. For instance. 9 would have won this round, but player 3 did not play it, so he should not have cards 9 or higher. As information is gathered about each side of the player, the search space will ultimately be limited to the point at which minimax search for possible games can provide useful information about the next map for playback.

+3
source

I want to clarify the details that the accepted answer is really not included.

In many card games, you can try out unknown cards that your opponent might have, instead of generating all of them. You can take into account information, such as short suits and the likelihood that some cards will still play in this sample to evaluate the probability of each possible hand (each hand is a possible world that we will allow ourselves). Then you solve each hand using the perfect search for information. The best move in all of these worlds is the best result overall - with some caveat.

In games like Poker, this will not work very well - the game is dedicated to hidden information. You must balance your actions to keep hidden information about your hand.

But in games, such as trick-based card games, this works very well - especially because new information is being revealed all the time. Really good players have a good idea that everyone holds anyway. Thus, based on these ideas, reasonably strong Skat and Bridge programs were founded.

If you can completely solve the main world, it’s better, but if you can’t, you can use minimax or UCT to choose the best move in every world. There are also hybrid algorithms (ISMCTS) that try to integrate this process. Be careful with the claims here. Simple sampling approaches are easier to code - you should try a simpler approach before a more complex one.

Here are some scientific articles that will provide more information on when the sampling approach to imperfect information worked well:

Understanding the Success of Perfect Information Monte Carlo Sample in Searching for a Game Tree (This document analyzes when the approach to sampling is likely to work.)

Improving condition assessment, output and search in card games with a trick (This document describes the use of sampling in Skat)

Imperfect information in a complex computing game (This article describes sampling in Bridge)

Monte Carlo Tree Search Information Set (This article combines the UCT / Monte Carlo tree search and tree search to avoid problems in the first link.)

The problem with rule-based approaches in the accepted answer is that they cannot use computing resources that exceed the requirements needed to create the initial rules. In addition, rule-based approaches will be limited by the power of rules you can write. Search-based approaches can use the power of combinatorial search to produce a much stronger game than the author of the program.

+4
source

Source: https://habr.com/ru/post/926635/


All Articles