
Added go to door environment to README

Maxime Chevalier-Boisvert, 7 years ago
Parent
Commit
ad4358543a
2 changed files with 23 additions and 5 deletions
  1. README.md (+23, -5)
  2. figures/gotodoor-6x6.mp4 (binary)

+ 23 - 5
README.md

@@ -87,11 +87,11 @@ produce an array of pixels for observations if desired.
 
 Each cell/tile in the grid world contains one object, each object has an
 associated discrete color. The objects currently supported are walls, doors,
-locked doors, keys, balls,boxes and a goal square. The basic version of the
-environment has just 4 possible actions: turn left, turn right, move
-forward and pickup/toggle to interact with objects. The agent can carry
-one carryable item at a time (eg: ball or key). By default, only sparse
-rewards for reaching the goal square are provided.
+locked doors, keys, balls, boxes and a goal square. The basic version of the
+environment has 5 possible actions: turn left, turn right, move
+forward, pickup/toggle to interact with objects, and a wait/noop action. The
+agent can carry one carryable item at a time (eg: ball or key). By default,
+only sparse rewards for reaching the goal square are provided.
 
 Extending the environment with new object types and dynamics should be
 very easy. If you wish to do this, you should take a look at
@@ -153,6 +153,24 @@ square the agent must get to. This environment is extremely difficult to
 solve using classical RL. However, by gradually increasing the number of
 rooms and building a curriculum, the environment can be solved.
 
+### Go to door environment
+
+Registered configurations:
+- `MiniGrid-GoToDoor-5x5-v0`
+- `MiniGrid-GoToDoor-6x6-v0`
+- `MiniGrid-GoToDoor-8x8-v0`
+
+<p align="center">
+<video autoplay loop>
+<source src="/figures/gotodoor-6x6.mp4" type="video/mp4">
+</video>
+</p>
+
+This environment is a room with four doors, one on each wall. The agent
+receives a textual (mission) string as input, telling it which door to go to,
+(eg: "go to the red door"). It receives a positive reward for performing the
+`wait` action next to the correct door, as indicated in the mission string.
+
 ### Fetch environment
 
 Registered configurations:

BIN
figures/gotodoor-6x6.mp4
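
The Go to door rule described in the README text added by this commit can be sketched in a few lines of Python. This is a minimal illustration, not the repository's implementation: the `Action` numbering, the `go_to_door_reward` helper, the reward value of 1.0, the mission format "go to the <color> door", and reading "next to" as the four neighbouring cells are all assumptions made for the example.

```python
from enum import IntEnum


class Action(IntEnum):
    # Hypothetical numbering: the README names five actions but not their indices.
    LEFT = 0
    RIGHT = 1
    FORWARD = 2
    TOGGLE = 3  # pickup/toggle to interact with objects
    WAIT = 4    # wait/noop


def is_adjacent(agent_pos, door_pos):
    """True if the two cells share an edge (assumed meaning of 'next to')."""
    ax, ay = agent_pos
    dx, dy = door_pos
    return abs(ax - dx) + abs(ay - dy) == 1


def go_to_door_reward(action, agent_pos, doors, mission):
    """Return an assumed reward of 1.0 when the agent waits next to the
    door named in the mission string, and 0.0 otherwise.

    doors maps a color name to the (x, y) cell of that door.
    mission is assumed to look like "go to the red door".
    """
    if action != Action.WAIT:
        return 0.0
    target_color = mission.split()[-2]  # the word before "door"
    target_pos = doors[target_color]
    return 1.0 if is_adjacent(agent_pos, target_pos) else 0.0
```

For example, with doors at `{"red": (0, 2), "green": (3, 0)}` and the agent at `(1, 2)`, waiting under the mission "go to the red door" yields the reward, while waiting under "go to the green door", or taking any action other than `WAIT`, does not.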