Implementation of reinforcement learning in NetLogo (learning in multi-agent models)

I am going to implement a learning strategy for different types of agents in my model. Honestly, I still do not know what questions I should ask first or where to start.

I have two types of agents that I want to learn from experience. Each has a pool of actions, and each action yields a different reward depending on the specific situation that occurs. I am new to reinforcement learning methods, so any suggestions about what questions I should be asking myself are welcome :)

This is how I am going to formulate my problem:

  • Agents have a lifetime, and they track several indicators that matter to them. These indicators differ between agents: for example, one agent wants to increase A more than B, while the other wants B more than A.
  • States are the points in an agent’s lifespan where it has several options to choose from. I do not have a clear definition of states, because a given situation can occur several times or not at all: agents move around, so they may never encounter it.
  • A reward is an increase or decrease in an indicator that an agent can receive from an action in a particular state, and the agent does not know what the gain would have been had it chosen another action.
  • The gain is not constant, the states are not formally defined, and there is no formal transition from one state to another.

Given all this, is RL a suitable approach for my problem, and how should I set it up?



Yes, this sounds like the kind of problem RL can handle.


You will need to define a state space. In NetLogo, a state could be a combination of observable variables, for example current-patch X internal-variable-value X other-agents-present. Keep the combinations coarse and the state space small: the more distinct states there are, the more experience each agent needs before it learns anything useful.
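As a hedged sketch, a cross-product state like current-patch X internal-variable-value X other-agents-present could be encoded as a tuple and used as a Q-table key (the function name, bucket size, and variables here are assumptions for illustration):

```python
from collections import defaultdict

def encode_state(patch_id, internal_value, others_present, bucket=10):
    # Discretize the continuous internal variable into coarse buckets
    # so the cross-product state space stays small and hashable.
    return (patch_id, int(internal_value // bucket), bool(others_present))

q_table = defaultdict(float)      # (state, action) -> estimated value

s = encode_state(patch_id=42, internal_value=37.5, others_present=1)
q_table[(s, "forage")] += 0.1     # example of touching one table entry
```

Because the tuple is discrete, every situation the agent can encounter maps to a finite table entry, which is what makes a tabular method feasible here.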


Think carefully about the reward signal as well. If the final goal is reached only rarely, agents get almost no feedback to learn from. One common remedy is to reward intermediate progress: for example, instead of rewarding only a completed structure, give a small reward for each plank-in-place along the way. Shaped rewards like this make long tasks learnable, but make sure the intermediate rewards do not encourage behavior that never actually finishes the task.
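A hedged sketch of that reward-shaping idea, built around a hypothetical plank-in-place counter (the function names and reward values are assumptions, not the answer's actual code):

```python
# Sparse reward: the agent is only paid when the whole task is done,
# so it gets no learning signal for most of its lifetime.
def sparse_reward(planks_in_place, planks_needed):
    return 10.0 if planks_in_place == planks_needed else 0.0

# Shaped reward: small credit for each newly placed plank, plus the
# big bonus on completion, so intermediate progress is reinforced.
def shaped_reward(prev_planks, planks_in_place, planks_needed):
    bonus = 10.0 if planks_in_place == planks_needed else 0.0
    return 0.5 * (planks_in_place - prev_planks) + bonus
```

Keeping the per-plank credit much smaller than the completion bonus is one way to reduce the risk that agents farm intermediate rewards without finishing.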

Good luck!

Edit 2/7/2018: For anyone wondering how to implement RL in NetLogo in practice: one option is the python extension for NetLogo, which lets the model call out to Python code and exchange data with it. From there, tabular Q-learning is a reasonable first algorithm to try.
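A minimal tabular Q-learning sketch in Python, of the sort one might drive from NetLogo via the python extension (the class, parameter values, and action names are assumptions for illustration, not the answer's actual code):

```python
import random
from collections import defaultdict

class QLearner:
    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + discounted best next value.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

Each NetLogo tick, the model would pass the agent's encoded state to `choose`, apply the returned action, and then call `learn` with the observed reward and the new state.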

