Grid World
In Grid World, the agent, which follows policy $\pi_i$, aims to reach a goal while avoiding any collision with the opponent, which follows policy $\tau_i$.
Create the environment
from grid_world.env import *
env = GridWorld()
Reset the environment to its initial state
env.reset()
env.show()
env.show() prints the following:
1.2
...
...
△.○
Move the agent
There are five moves to choose from:
Move.UP
Move.RIGHT
Move.DOWN
Move.LEFT
Move.STANDBY
env.step(Move.RIGHT)
You can check the result by calling env.show() again.
The env.step() method returns a tuple of four values:
(<done>, <reward>, <state>, <actions>)
<done> is a boolean indicating whether the game has ended, and <reward> is an integer giving the reward the agent received.
The <state> return value contains
(agent_x, agent_y, opponent_x, opponent_y)
The <actions> return value contains
(agent_action, opponent_action)
For instance, env.step() could return:
(True, 30, (0, 3, 2, 3), (<Move.STANDBY: 4>, <Move.LEFT: 3>))
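The return tuple described above can be easier to work with when its fields are named. The following is a minimal sketch, not part of grid_world: StepResult, State, unpack_step and run_episode are hypothetical helper names, and the loop assumes that env.step() accepts a move and that the per-step rewards can simply be summed.

```python
from collections import namedtuple

# Hypothetical helpers (not part of grid_world). Field names mirror the
# tuple layout documented above.
StepResult = namedtuple("StepResult", ["done", "reward", "state", "actions"])
State = namedtuple("State", ["agent_x", "agent_y", "opponent_x", "opponent_y"])


def unpack_step(raw):
    """Wrap the raw 4-tuple returned by env.step() in named fields."""
    done, reward, state, actions = raw
    return StepResult(done, reward, State(*state), tuple(actions))


def run_episode(env, choose_move, max_steps=100):
    """Reset env, then step until the game ends or max_steps is reached.

    choose_move receives the previous StepResult (None before the first
    step) and returns the move to play next.
    Returns (total_reward, steps_taken); summing rewards is an assumption.
    """
    env.reset()
    total_reward = 0
    steps = 0
    last = None
    while steps < max_steps:
        last = unpack_step(env.step(choose_move(last)))
        total_reward += last.reward
        steps += 1
        if last.done:
            break
    return total_reward, steps
```

With the real environment, a call might look like run_episode(env, lambda last: Move.RIGHT), which keeps moving right until the game ends or the step limit is hit.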
Change the opponent policy
There are five policies to choose from:
Opponent.Policy.ONE
Opponent.Policy.TWO
Opponent.Policy.THREE
Opponent.Policy.FOUR
Opponent.Policy.FIVE
Set the policy using the following code (env being the GridWorld environment)
env.opponent.policy = Opponent.Policy.ONE
The following message is printed to the console
Opponent policy is Policy.ONE
Change the agent policy
To set a policy, your agent has to be a BprAgent; see the BprAgent class in grid_world/agent.py.
There are five policies to choose from:
BprAgent.Policy.ONE
BprAgent.Policy.TWO
BprAgent.Policy.THREE
BprAgent.Policy.FOUR
BprAgent.Policy.FIVE
Set the policy using the following code (env being the GridWorld environment)
env.agent.policy = BprAgent.Policy.ONE
The following message is printed to the console
BprAgent policy is Policy.ONE
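Since both sides have settable policies, one possible use is to try every agent policy against every opponent policy and compare the rewards. The sketch below only relies on the env.agent.policy and env.opponent.policy assignments shown above. sweep_policies is a hypothetical helper, not part of grid_world, and play_episode is a callable you supply that plays one game on env and returns its reward.

```python
def sweep_policies(env, agent_policies, opponent_policies, play_episode, episodes=1):
    """Return {(agent_policy, opponent_policy): mean reward over `episodes`}.

    Hypothetical helper: play_episode(env) must play one full game and
    return a numeric reward; how it steps the environment is up to you.
    """
    if episodes < 1:
        raise ValueError("episodes must be at least 1")
    table = {}
    for agent_policy in agent_policies:
        for opponent_policy in opponent_policies:
            # Same assignments as in the sections above.
            env.agent.policy = agent_policy
            env.opponent.policy = opponent_policy
            rewards = [play_episode(env) for _ in range(episodes)]
            table[(agent_policy, opponent_policy)] = sum(rewards) / len(rewards)
    return table
```

If BprAgent.Policy and Opponent.Policy are iterable enums (an assumption), they could be passed directly as agent_policies and opponent_policies; otherwise pass explicit lists such as [BprAgent.Policy.ONE, BprAgent.Policy.TWO].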
