Artificial intelligence that can learn

I know the title of this question is a bit vague, so let me bring a concrete problem with me: every time I write a game, or a bot for a game, I use a state machine, a decision tree, or a behavior tree. The problem with these methods is that they require me to pre-program every condition the character/bot will encounter, so the moment the player does something unexpected that I didn't write a condition for, the bot loses.

Right now I'm working on a StarCraft bot (BWAPI) using state machines. I'm thinking of using one state machine per unit, plus one master state machine commanding the minions and deciding what needs to be done, but that still requires me to program everything by hand, and for a game like StarCraft that is impossible. The only way I can think of to make it learn is to use GP (genetic programming) to evolve these state machines.

For example: say there is a bridge on the map, and if 20 marines try to cross it at the same time there will be a big traffic jam. What methods can I use so that the bot learns from that mistake, so I don't need to pre-program a condition saying units must cross the bridge one by one?

EDIT: just because the question has the words StarCraft or bot in it does not automatically make it game-specific; the question also applies to robotics.

+8
language-agnostic algorithm artificial-intelligence
6 answers

To get anywhere, you first need to define an empirical measure of fitness for your bots. It has to be something much clearer than "a big traffic jam".

How do you score a bot?

What counts as a victory? Are there numerical indicators that your bot is "winning"? Settle that first; once you have a real way to rate one bot against any number of others, plug it in as the fitness function for your GP algorithm.
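
As a minimal sketch of what "plugging in a fitness function" might look like (every field name and weight below is a hypothetical placeholder, not anything BWAPI defines):

    def fitness(stats):
        """Score one finished game for a candidate bot.

        `stats` is a hypothetical summary of the game outcome;
        the weights are arbitrary and would need tuning.
        """
        score = 1000.0 if stats["won"] else 0.0       # winning dominates
        score += 2.0 * stats["enemy_units_killed"]    # reward aggression
        score -= 1.0 * stats["own_units_lost"]        # punish losses
        score += 0.01 * stats["minerals_gathered"]    # reward economy
        return score

The GP loop then ranks candidate state machines by this single number and breeds the highest-scoring ones.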

+7

You are asking two different questions here.

First, "what is an AI that can learn"? Machine learning is trying to answer this question. There are dozens of different tools for implementing machine learning, but there is no silver bullet for your application. You will need to think much harder about what AI needs to “learn” - what will be its input and what will it output?

Secondly, "how 20 marines can cross a bridge at the same time without turning into a cluster." You describe a group search for a path , which is another area of ​​AI called heuristic search . The solution to the path for several units (agents) at the same time has its own set of algorithms, but, as a rule, it understands the problem much better than your first. At the top of my head, you can use the approach to solve for a subset of units at the same time (depending on the width of the bridge) and move the element in each group at the same time. You can also try to decide for each individual marine vessel, starting from the closest to the other side of the bridge.

Googling those terms will get you far more information than I can fit into an answer, especially since your question is quite open-ended (with respect to learning systems).

+2

Since you are already using state machines, take a look at KB-NEAT.

It is a neuroevolution technique, i.e. it creates neural networks through evolution.

Also look at rtNEAT, which may come in handy. A typical NEAT implementation uses a generational approach: it runs a series of games, say a hundred, selects the best candidates, and creates offspring from them. (Another answer mentioned fitness, which is always used in an evolutionary approach, including here.) rtNEAT lets evolution happen in real time, i.e. during the game. (That requires a more complex fitness calculation, since it happens mid-game, when you don't yet know the outcome.)

Implementing it is actually not that hard, but the key to the technique is the genetic history, which is vital to the evolutionary process. (It is also what makes this technique so remarkable compared to earlier attempts at neuroevolution; the catch is that the inputs and outputs have to stay identical, which may not always be the case.)
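
If you want to experiment, a minimal generational loop might look like this, assuming the third-party neat-python package; `play_game` is a hypothetical harness around your StarCraft matches:

    import neat  # third-party package: neat-python

    def play_game(net):
        # Hypothetical harness: feed your unit's sensor vector to
        # net.activate(...) every frame and return the game score.
        # A constant stub keeps this sketch self-contained.
        return 0.0

    def eval_genomes(genomes, config):
        """Score one generation: one network per genome, one game each."""
        for genome_id, genome in genomes:
            net = neat.nn.FeedForwardNetwork.create(genome, config)
            genome.fitness = play_game(net)

    config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                         neat.DefaultSpeciesSet, neat.DefaultStagnation,
                         "neat_config.ini")  # fixed input/output sizes live here
    population = neat.Population(config)
    winner = population.run(eval_genomes, 100)  # evolve for ~100 generations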

Oh, and your traffic-jam problem could be solved either by a planner at a higher level, or the units could learn it themselves. With an input that includes the nearest friendly units and obstacles, and with the right fitness function, they might well discover that taking turns through the choke point quickly pays off. This is called emergent behavior, and the methods above have been shown to develop such behavior on their own.

Here is an implementation that I find very nice as a base for your own work:

http://nn.cs.utexas.edu/?windowsneat

The implementation above uses generations; I have not seen an rtNEAT implementation. But you can take a look at John Holland's book, Adaptation in Natural and Artificial Systems. Admittedly, it can be hard to read because it is very mathematical, but you can skip most of that and look at the algorithm proposals. They are common to all evolutionary algorithms, of which neuroevolution is a subfield, and the book contains the kind of algorithm that rtNEAT uses. (And if you are not familiar with the genetics vocabulary used in evolutionary algorithms, it defines precisely what a gene, allele, chromosome, phenotype, and genome are; you will meet those terms in the NEAT publications. NEAT uses a genome to describe an individual, which is just the set of chromosomes that together describe the phenotype, since the encoding is a bit more involved than in plain genetic algorithms and genetic programming.)

The home page of the technique is here:

http://www.cs.ucf.edu/~kstanley/neat.html

Here are the publications, in chronological order:

http://nn.cs.utexas.edu/keyword?stanley:ec02

http://nn.cs.utexas.edu/keyword?stanley:ieeetec05

http://nn.cs.utexas.edu/?kbneat

(KB-NEAT already uses rtNEAT in the publication above.)

The point is that you can basically take what you have, feed it into this neuroevolution technique, and evolve from there. It is a combination of domain-specific AI and machine-learning AI.

Oh, and a note: the evolutionary part is CPU-intensive, at least without rtNEAT. rtNEAT costs wall-clock time instead, since you have to play a lot of games against it before it learns anything. (KB-NEAT gives it a knowledge base to start from, though, obviously.) Once evolved, however, it is very, very fast, since computing the neural network's response is cheap. (It is a rather small graph, and no search is involved.)

Oh, and a second note: you need to think hard about the inputs and outputs. The outputs can be easy, since they are just the actions the units can carry out. But the inputs are whatever you want the units to see, and you cannot include everything; that would make the problem too hard for evolution, at least in realistic time. (Though in theory it would converge to the optimal solution given infinite time.)
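
To illustrate how small that input vector should stay, here is a sketch of a per-unit sensor encoding; every field and accessor is an invented example, nothing NEAT prescribes:

    def sense(unit, world):
        """Build a fixed-size input vector for one unit's network.

        Deliberately tiny: a few normalized scalars instead of the
        full game state, so evolution has a tractable search space.
        Every accessor below is a hypothetical stand-in for your API.
        """
        ally = world.nearest_ally(unit)
        enemy = world.nearest_enemy(unit)
        obstacle = world.nearest_obstacle(unit)
        return [
            unit.hp / unit.max_hp,                         # own health, 0..1
            min(unit.distance_to(ally) / 512.0, 1.0),      # crowding
            min(unit.distance_to(enemy) / 512.0, 1.0),     # threat distance
            min(unit.distance_to(obstacle) / 512.0, 1.0),  # terrain
        ]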

Oh, and a third note: you can evolve several brains for your units, and even a different brain for each unit type. The sky is the limit. Perhaps you want a brain for every tech level, yours or your enemy's. It takes extra time to evolve them, but the brains are small in memory, so the number of them is not a problem.

Ah, and a fourth note: this is a black-box technique. You cannot translate a brain back into an FSM, I'm afraid. The encoding inside the neural network is not human-readable, so you never know exactly how it works. There is thus a danger that you end up with something you like without being able to tell why it works. And you cannot easily transfer that knowledge to other agents. (Although you can, of course, use it as a basis for evolving new behavior for them.)

+1

Doesn't StarCraft's built-in AI already handle unit movement? E.g. just select 12 marines and tell them to move across the bridge, the way a human player would. Granted, that AI is pretty bad and nowhere near the swarm AI of StarCraft II. Still, I think there are many other problems in designing an AI before you need to worry about micromanaging each individual unit.

For example, knowing where and when to position your units is probably more important than figuring out how to move them there with 100% efficiency.

Personally, I think the best way to approach this game is with state machines. There are key stages in a game that have correct responses (for example: lurkers out? get science vessels; and so on). But instead of giving each individual unit its own state machine, focus on a larger object, such as a control group of units. I believe that simplifies things considerably, and you can still improve your micro AI later if necessary.
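
A minimal sketch of that control-group idea, with made-up states and thresholds:

    from enum import Enum

    class SquadState(Enum):
        GATHER = 1   # assemble at rally point
        ADVANCE = 2  # move toward objective
        ENGAGE = 3   # fight nearby enemies
        RETREAT = 4  # fall back when losing

    class SquadFSM:
        """One state machine per control group instead of per unit.

        The transition thresholds below are arbitrary placeholders.
        """
        def __init__(self):
            self.state = SquadState.GATHER

        def update(self, squad_strength, enemy_strength, at_rally):
            if self.state == SquadState.GATHER and at_rally:
                self.state = SquadState.ADVANCE
            elif self.state == SquadState.ADVANCE and enemy_strength > 0:
                self.state = SquadState.ENGAGE
            elif self.state == SquadState.ENGAGE:
                if squad_strength < 0.5 * enemy_strength:
                    self.state = SquadState.RETREAT
            elif self.state == SquadState.RETREAT and enemy_strength == 0:
                self.state = SquadState.GATHER
            return self.state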

0

I think actually making your characters learn is a bad idea. They will develop consciousness, break out of the computer, and try to strangle you in your sleep with the mouse cable.

OK, just kidding. I do think it is overkill, though. Evolving a character will not necessarily make it respond sensibly in every case, and it will most likely become very hard to work on, extend, and keep an overview of.

You may be interested in a goal-based approach instead:

http://www.media.mit.edu/~jorkin/gdc2006_orkin_jeff_fear.pdf

It is hard work, but in the end it will make the behavior of your guys much simpler and cleaner.
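
For flavor, here is a toy version of the goal-oriented planning idea from that paper (Orkin's planner in F.E.A.R. uses A* over actions; this sketch uses plain breadth-first search, and the actions and state keys are invented):

    from collections import deque

    def plan(state, goal, actions):
        """Tiny forward-chaining planner: find an action sequence whose
        effects turn `state` into one satisfying `goal`.

        `actions` is a list of (name, preconditions, effects) triples.
        """
        frontier = deque([(dict(state), [])])
        seen = set()
        while frontier:
            current, path = frontier.popleft()
            if all(current.get(k) == v for k, v in goal.items()):
                return path
            key = tuple(sorted(current.items()))
            if key in seen:
                continue
            seen.add(key)
            for name, pre, eff in actions:
                if all(current.get(k) == v for k, v in pre.items()):
                    nxt = dict(current)
                    nxt.update(eff)
                    frontier.append((nxt, path + [name]))
        return None

    # Invented example: how does a marine get across the bridge?
    actions = [
        ("wait_for_turn", {"bridge_clear": False}, {"bridge_clear": True}),
        ("cross_bridge", {"bridge_clear": True}, {"across": True}),
    ]
    print(plan({"bridge_clear": False, "across": False},
               {"across": True}, actions))
    # -> ['wait_for_turn', 'cross_bridge']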

0

Instead of giving you suggestions about specific algorithms or methods, I think it helps to start with the specific problem you are trying to solve. I play a lot of StarCraft, so I know what you mean by a "big traffic jam". But "big traffic jam" is just a label, and it is meaningless in the context of machine learning.

We can say that a bot in your particular domain learns from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.

Now we have to work on defining E, T, and P so that they are meaningful for the kind of problem you are likely to encounter in the game. For example, the class of tasks T might include moving units from one area to another as a group. That area might have features marking it as narrow, so that moving the units at the optimal group size is not possible. A performance measure P could then be, for example, unit flow: the number of marines moved per unit of time per unit of area, measured across the choke itself (in effect, a normalized sum of the individual unit movements). With experience E, you would maximize this flow.
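
As a concrete (and entirely made-up) instance of such a measure P:

    def unit_flow(crossings, duration_frames, choke_area):
        """Performance measure P: units crossed per unit time per unit area.

        `crossings` is how many marines made it across, `duration_frames`
        how long the maneuver took, and `choke_area` the width * depth
        of the choke in tiles. All units here are arbitrary choices.
        """
        if duration_frames == 0 or choke_area == 0:
            return 0.0
        return crossings / (duration_frames * choke_area)

    # Experience E: repeat the crossing task T, keep whatever policy
    # change increased the measured flow.
    baseline = unit_flow(crossings=20, duration_frames=900, choke_area=6)
    improved = unit_flow(crossings=20, duration_frames=600, choke_area=6)
    assert improved > baseline  # the learner should prefer the second policy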

Once you understand the problem a little better in these terms, you can start designing the system that best meets your requirements.

0
