
Update README.md

Maxime Chevalier-Boisvert · 7 years ago · commit 24f7678f57
1 changed file with 37 additions and 16 deletions
README.md

@@ -3,7 +3,7 @@
 There are other gridworld Gym environments out there, but this one is
 designed to be particularly simple, lightweight and fast. The code has very few
 dependencies, making it less likely to break or fail to install. It loads no
-external sprites/textures, and it can run at up to 5800 FPS on a quad-core
+external sprites/textures, and it can run at up to 6000 FPS on a quad-core i7
 laptop, which means you can run your experiments faster. Batteries are
 included: a known-working RL implementation is supplied in this repository
 to help you get started.
@@ -72,21 +72,42 @@ python3 pytorch_rl/enjoy.py --env-name MiniGrid-Empty-6x6-v0 --load-dir ./traine
 
 ## Design
 
-The environment is partially observable and uses a compact and efficient
-encoding, with just 3 inputs per visible grid cell. It is also easy to
-produce an array of pixels for observations if desired.
-
-Each cell/tile in the grid world contains one object, each object has an
-associated discrete color. The objects currently supported are walls, doors,
-locked doors, keys, balls, boxes and a goal square. The basic version of the
-environment has 5 possible actions: turn left, turn right, move
-forward, pickup/toggle to interact with objects, and a wait/noop action. The
-agent can carry one carryable item at a time (eg: ball or key). By default,
-only sparse rewards for reaching the goal square are provided.
-
-Extending the environment with new object types and dynamics should be
-very easy. If you wish to do this, you should take a look at
-the [gym_minigrid/minigrid.py](gym_minigrid/minigrid.py) source file.
+MiniGrid is built to support tasks involving natural language and sparse rewards.
+The observations are dictionaries, with an 'image' field, partially observable
+view of the environment, and a 'mission' field which is a textual string
+describing the objective the agent should reach to get a reward. Using
+dictionaries makes it easy for you to add additional information to observations
+if you need to, without having to force everything into a single tensor.
+If your RL code expects a tensor for observations, please take a look at
+`FlatObsWrapper` in 
+[gym_minigrid/wrappers.py](/gym_minigrid/wrappers.py).
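As a rough sketch of this observation structure (a mock with illustrative values, not the real environment's output; gym_minigrid builds these dicts itself):

```python
import numpy as np

# A mock of the observation dict described above -- not the real env.
# The array shape and mission string here are purely illustrative.
obs = {
    # Partially observable view: 3 encoded values per visible grid cell.
    'image': np.zeros((7, 7, 3), dtype='uint8'),
    # Natural-language objective for the current episode.
    'mission': 'get to the green goal square',
}

# Because observations are dicts, extra fields can be added later without
# breaking consumers that only read 'image' and 'mission'.
print(obs['image'].shape, obs['mission'])
```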
+
+The partially observable view of the environment uses a compact and efficient
+encoding, with just 3 input values per visible grid cell (147 values in total).
+If you want to obtain an array of RGB pixels instead, see the `getObsRender` method in
+[gym_minigrid/minigrid.py](gym_minigrid/minigrid.py).
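For the default view, those numbers work out as follows (the 7x7 view size and the meaning of the 3 values per cell are assumptions here; check the source file above for the authoritative encoding):

```python
import numpy as np

# Assumed default agent view: 7x7 cells, 3 values per cell
# (roughly: object type, object color, and an object state flag).
view_size, values_per_cell = 7, 3
encoded = np.zeros((view_size, view_size, values_per_cell), dtype='uint8')
print(encoded.size)  # 147 = 7 * 7 * 3
```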
+
+Structure of the world:
+- The world is an NxM grid of tiles
+- Each tile in the grid world contains zero or one object
+  - Cells that do not contain an object have the value `None`
+- Each object has an associated discrete color (string)
+- Each object has an associated type (string)
+  - Provided object types are: wall, door, locked_door, key, ball, box and goal
+- The agent can pick up and carry exactly one object (e.g. a ball or a key)
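A minimal sketch of that structure (hypothetical, simplified class names; the real classes live in [gym_minigrid/minigrid.py](gym_minigrid/minigrid.py)):

```python
# Hypothetical, simplified sketch of the world structure described above.
class WorldObj:
    def __init__(self, obj_type, color):
        self.type = obj_type   # e.g. 'wall', 'door', 'key', 'ball', 'box', 'goal'
        self.color = color     # discrete color name, e.g. 'yellow'

class Grid:
    def __init__(self, width, height):
        self.width, self.height = width, height
        # One slot per tile; tiles with no object hold None
        self.cells = [None] * (width * height)

    def get(self, x, y):
        return self.cells[y * self.width + x]

    def set(self, x, y, obj):
        self.cells[y * self.width + x] = obj

grid = Grid(8, 8)
grid.set(3, 3, WorldObj('key', 'yellow'))
print(grid.get(3, 3).type)  # key
print(grid.get(0, 0))       # None
```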
+
+Actions in the basic environment:
+- Turn left
+- Turn right
+- Move forward
+- Toggle (pick up or interact with objects)
+- Wait (noop, do nothing)
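In code, this basic 5-action space might look like the following (a hypothetical sketch; the actual names and ordering are defined in [gym_minigrid/minigrid.py](gym_minigrid/minigrid.py)):

```python
from enum import IntEnum

# Hypothetical sketch of the basic action set listed above.
class Actions(IntEnum):
    left = 0      # turn left
    right = 1     # turn right
    forward = 2   # move forward
    toggle = 3    # pick up or interact with the object in front
    wait = 4      # noop, do nothing

print(len(Actions))  # 5
```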
+
+By default, sparse rewards are provided for reaching a goal square, but you can
+define your own reward function by creating a class derived from `MiniGridEnv`.
+Extending the environment with new object types or actions should be very easy.
+If you wish to do this, take a look at the
+[gym_minigrid/minigrid.py](gym_minigrid/minigrid.py) source file.
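As an illustration, a sparse reward of this kind is often scaled so that faster solutions score slightly higher; a minimal sketch (the exact formula MiniGrid uses may differ):

```python
def sparse_reward(reached_goal, step_count, max_steps):
    # Zero everywhere except on reaching the goal; the bonus decays
    # with episode length so faster solutions score higher.
    if not reached_goal:
        return 0.0
    return 1.0 - 0.9 * (step_count / max_steps)

print(sparse_reward(False, 10, 100))
print(round(sparse_reward(True, 10, 100), 2))  # 0.91
```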
 
 ## Included Environments