8 years ago · db29cb23d6
--- a/README.md
+++ b/README.md
@@ -48,7 +48,7 @@ To run the standalone UI application, which allows you to manually control the a
 The environment being run can be selected with the `--env-name` option, eg:
 
 ```
-./standalone.py --env-name MiniGrid-Fetch-8x8-v0
+./standalone.py --env-name MiniGrid-Empty-8x8-v0
 ```
 
 Basic reinforcement learning code is provided in the `basicrl` subdirectory.
@@ -71,27 +71,47 @@ The environments listed below are implemented and registered in [simple_envs.py]
 ### Empty environment
 
 Registered configurations:
-- MiniGrid-Empty-8x8-v0
-- MiniGrid-Empty-6x6-v0
+- `MiniGrid-Empty-8x8-v0`
+- `MiniGrid-Empty-6x6-v0`
+
+This environment is an empty room, and the goal of the agent is to reach the
+green goal square, which provides a sparse reward. A small penalty is
+subtracted for the number of steps to reach the goal. This environment is
+useful, with small rooms, to validate that your RL algorithm works correctly,
+and with large rooms to experiment with sparse rewards.
 
 ### Door & key environment
 
 Registered configurations:
-- MiniGrid-Door-Key-8x8-v0
-- MiniGrid-Door-Key-16x16-v0
-- MiniGrid-Multi-Room-N6-v0
-- MiniGrid-Fetch-8x8-v0
+- `MiniGrid-Door-Key-8x8-v0`
+- `MiniGrid-Door-Key-16x16-v0`
+
+This environment has a key that the agent must pick up in order to unlock
+a goal and then get to the green goal square. This environment is difficult,
+because of the sparse reward, to solve using classical RL algorithms. It is
+useful to experiment with curiosity or curriculum learning.
 
 ### Multi-room environment
 
 Registered configurations:
-- MiniGrid-Multi-Room-N6-v0
+- `MiniGrid-Multi-Room-N6-v0`
 
-TODO: curriculum learning, include gif
+![Multi-room environment](/figures/multi-room.gif)
+
+This environment has a series of connected rooms with doors that must be
+opened in order to get to the next room. The final room has the green goal
+square the agent must get to. This environment is extremely difficult to
+solve using classical RL. However, by gradually increasing the number of
+rooms and building a curriculum, the environment can be solved.
 
 ### Fetch Environment
 
 Registered configurations:
-- MiniGrid-Fetch-8x8-v0
+- `MiniGrid-Fetch-8x8-v0`
+
+![Fetch environment](/figures/fetch-env.gif)
 
-Natural language observation ("mission").
+This environment has multiple objects of assorted types and colors. The
+agent receives a textual string as part of its observation telling it
+which object to pick up. Picking up the wrong object produces a negative
+reward.
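
Note on the Empty environment text added in this diff: it describes a sparse reward at the goal with a small penalty for the number of steps taken, but it does not give a formula. The sketch below is one possible reading of that sentence, not the repository's implementation. The function name `sparse_reward`, the `max_steps` cap, and the `penalty_scale` value are all assumptions made for illustration.

```python
def sparse_reward(reached_goal: bool, step_count: int, max_steps: int,
                  penalty_scale: float = 0.1) -> float:
    """Hypothetical sparse reward with a small step-count penalty.

    Assumed shape (not taken from the repository): the agent gets 0 until it
    reaches the goal, then 1 minus a penalty that grows with the fraction of
    the step budget it used.
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    if not reached_goal:
        # Sparse: nothing is paid out before the goal square is reached.
        return 0.0
    # Clamp so the penalty never exceeds penalty_scale.
    fraction_used = min(step_count, max_steps) / max_steps
    return 1.0 - penalty_scale * fraction_used
```

Under these assumptions a faster episode always scores higher than a slower one, while any successful episode still scores well above a failed one, which is the property the README sentence seems to be after.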
			
--- a/figures/fetch-env.png
+++ b/figures/fetch-env.png
--- a/figures/multi-room.gif
+++ b/figures/multi-room.gif
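
Note on the Multi-room text added in this diff: it says the environment can be solved by gradually increasing the number of rooms and building a curriculum. A minimal sketch of that idea follows. It is not code from this repository: `run_curriculum`, the `train_stage` callback, the success threshold, and the attempt limit are hypothetical names and values, and the default of six rooms assumes that `N6` in `MiniGrid-Multi-Room-N6-v0` denotes six rooms.

```python
from typing import Callable


def run_curriculum(train_stage: Callable[[int], float],
                   start_rooms: int = 2,
                   max_rooms: int = 6,
                   success_threshold: float = 0.8,
                   max_attempts_per_stage: int = 5) -> int:
    """Train on progressively larger room counts.

    train_stage(num_rooms) is a caller-supplied (hypothetical) function that
    trains the agent on an environment with num_rooms rooms and returns its
    success rate in [0, 1]. Returns the largest room count that was solved,
    or 0 if even the first stage failed.
    """
    solved = 0
    num_rooms = start_rooms
    while num_rooms <= max_rooms:
        for _ in range(max_attempts_per_stage):
            if train_stage(num_rooms) >= success_threshold:
                break  # stage solved, move on to a harder one
        else:
            # No attempt reached the threshold: stop the curriculum here.
            return solved
        solved = num_rooms
        num_rooms += 1
    return solved
```

The design choice here is to advance only after the current stage clears a success threshold, so the agent meets the harder layout with a policy that already knows how to open doors and move between rooms.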