Skip to content

Grid World

Grid World

In Grid World, the agent with $\pi_i$ aims to reach a goal, while avoiding any collision with the opponent with $\tau_i$.

Create the environment

from grid_world.env import *
env = GridWorld()

Reset the environment to initial state

env.reset()
env.show()

The env.show() will print out the following result

1.2
...
...
△.○

Move the agent

There are five moves to choose from

  • Move.UP
  • Move.RIGHT
  • Move.DOWN
  • Move.LEFT
  • Move.STANDBY
env.step(Move.RIGHT)

You can check the result by calling env.show() again.

The env.step() method will return the following values

(<done>, <reward>, <state>, <actions>)

<done> is a boolean value indicating if the game reaches an end, and <reward> is an integer showing the reward the agent gets.

The <state> return value contains

(agent_x, agent_y, opponent_x, opponent_y)

The <actions> return value contains

(agent_action, opponent_action)

For instance, env.step() could return the following values

(True, 30, (0, 3, 2, 3), (<Move.STANDBY: 4>, <Move.LEFT: 3>))

Change the opponent policy

There are five policies to choose from

  • Opponent.Policy.ONE
  • Opponent.Policy.TWO
  • Opponent.Policy.THREE
  • Opponent.Policy.FOUR
  • Opponent.Policy.FIVE

Set the policy using the following code (env being the GridWorld environment)

env.opponent.policy = Opponent.Policy.ONE

You will see the following message printing on the console

Opponent policy is Policy.ONE

Change the agent policy

To set a policy, your agent has to be a BprAgent, see BprAgent in grid_world/agent.py

There are five policies to choose from

  • BprAgent.Policy.ONE
  • BprAgent.Policy.TWO
  • BprAgent.Policy.THREE
  • BprAgent.Policy.FOUR
  • BprAgent.Policy.FIVE

Set the policy using the following code (env being the GridWorld environment)

env.agent.policy = BprAgent.Policy.ONE

You will see the following message printing on the console

BprAgent policy is Policy.ONE