@@ -87,11 +87,11 @@ produce an array of pixels for observations if desired.

Each cell/tile in the grid world contains one object, each object has an
associated discrete color. The objects currently supported are walls, doors,
-locked doors, keys, balls,boxes and a goal square. The basic version of the
-environment has just 4 possible actions: turn left, turn right, move
-forward and pickup/toggle to interact with objects. The agent can carry
-one carryable item at a time (eg: ball or key). By default, only sparse
-rewards for reaching the goal square are provided.
+locked doors, keys, balls, boxes and a goal square. The basic version of the
+environment has 5 possible actions: turn left, turn right, move
+forward, pickup/toggle to interact with objects, and a wait/noop action. The
+agent can carry one carryable item at a time (eg: ball or key). By default,
+only sparse rewards for reaching the goal square are provided.

Extending the environment with new object types and dynamics should be
very easy. If you wish to do this, you should take a look at
@@ -153,6 +153,24 @@ square the agent must get to. This environment is extremely difficult to
solve using classical RL. However, by gradually increasing the number of
rooms and building a curriculum, the environment can be solved.

+### Go to door environment
+
+Registered configurations:
+- `MiniGrid-GoToDoor-5x5-v0`
+- `MiniGrid-GoToDoor-6x6-v0`
+- `MiniGrid-GoToDoor-8x8-v0`
+
+<p align="center">
+<video autoplay loop>
+<source src="/figures/gotodoor-6x6.mp4" type="video/mp4">
+</video>
+</p>
+
+This environment is a room with four doors, one on each wall. The agent
+receives a textual (mission) string as input, telling it which door to go to
+(eg: "go to the red door"). It receives a positive reward for performing the
+`wait` action next to the correct door, as indicated in the mission string.
+
### Fetch environment

Registered configurations: