2 years ago · e30599f258
--- a/.github/ISSUE_TEMPLATE/bug.md
+++ b/.github/ISSUE_TEMPLATE/bug.md
@@ -0,0 +1,27 @@
+---
+name: Bug Report
+about: Submit a bug report
+title: "[Bug Report] Bug title"
+
+---
+
+If you are submitting a bug report, please fill in the following details and use the tag [bug].
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**Code example**
+Please try to provide a minimal example to reproduce the bug. Error messages and stack traces are also helpful.
+
+**System Info**
+Describe the characteristics of your environment:
+ * Describe how `gym-minigrid` was installed (pip, docker, source, ...)
+ * What OS/version of Linux you're using. Note that while we will accept PRs to improve Windows support, we do not officially support it.
+ * Python version
+
+**Additional context**
+Add any other context about the problem here.
+
+### Checklist
+
+- [ ] I have checked that there is no similar [issue](https://github.com/Farama-Foundation/gym-minigrid/issues) in the repo (**required**)
			
--- a/.github/ISSUE_TEMPLATE/proposal.md
+++ b/.github/ISSUE_TEMPLATE/proposal.md
@@ -0,0 +1,33 @@
+---
+name: Proposal
+about: Propose changes that are not bug fixes
+title: "[Proposal] Proposal title"
+---
+
+
+
+### Proposal
+
+A clear and concise description of the proposal.
+
+### Motivation
+
+Please outline the motivation for the proposal.
+Is your feature request related to a problem? e.g., "I'm always frustrated when [...]".
+If this is related to another GitHub issue, please link here too.
+
+### Pitch
+
+A clear and concise description of what you want to happen.
+
+### Alternatives
+
+A clear and concise description of any alternative solutions or features you've considered, if any.
+
+### Additional context
+
+Add any other context or screenshots about the feature request here.
+
+### Checklist
+
+- [ ] I have checked that there is no similar [issue](https://github.com/Farama-Foundation/gym-minigrid/issues) in the repo (**required**)
			
--- a/.github/ISSUE_TEMPLATE/question.md
+++ b/.github/ISSUE_TEMPLATE/question.md
@@ -0,0 +1,12 @@
+---
+name: Question
+about: Ask a question
+title: "[Question] Question title"
+---
+
+
+### Question
+
+If you're a beginner and have basic questions, please ask on [r/reinforcementlearning](https://www.reddit.com/r/reinforcementlearning/) or in the [RL Discord](https://discord.com/invite/xhfNqQv) (if you're new please use the beginners channel). Basic questions that are not bugs or feature requests will be closed without reply, because GitHub issues are not an appropriate venue for these.
+
+Advanced/nontrivial questions, especially in areas where documentation is lacking, are very much welcome.
			
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,45 @@
+# Description
+
+Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
+
+Fixes # (issue)
+
+## Type of change
+
+Please delete options that are not relevant.
+
+- [ ] Bug fix (non-breaking change which fixes an issue)
+- [ ] New feature (non-breaking change which adds functionality)
+- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
+- [ ] This change requires a documentation update
+
+### Screenshots
+Please attach before and after screenshots of the change if applicable.
+
+<!--
+Example:
+
+| Before | After |
+| ------ | ----- |
+| _gif/png before_ | _gif/png after_ |
+
+
+To upload images to a PR -- simply drag and drop an image while in edit mode and it should upload the image directly. You can then paste that source into the above before/after sections.
+-->
+
+# Checklist:
+
+- [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with `pre-commit run --all-files` (see `CONTRIBUTING.md` instructions to set it up)
+- [ ] I have commented my code, particularly in hard-to-understand areas
+- [ ] I have made corresponding changes to the documentation
+- [ ] My changes generate no new warnings
+- [ ] I have added tests that prove my fix is effective or that my feature works
+- [ ] New and existing unit tests pass locally with my changes
+
+<!--
+As you go through the checklist above, you can mark something as done by putting an x character in it
+
+For example,
+- [x] I have done this task
+- [ ] I have not done this task
+-->
			
--- a/.github/stale.yml
+++ b/.github/stale.yml
@@ -0,0 +1,62 @@
+# Configuration for probot-stale - https://github.com/probot/stale
+
+# Number of days of inactivity before an Issue or Pull Request becomes stale
+daysUntilStale: 60
+
+# Number of days of inactivity before an Issue or Pull Request with the stale label is closed.
+# Set to false to disable. If disabled, issues still need to be closed manually, but will remain marked as stale.
+daysUntilClose: 14
+
+# Only issues or pull requests with all of these labels are checked if stale. Defaults to `[]` (disabled)
+onlyLabels:
+  - more-information-needed
+
+# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable
+exemptLabels:
+  - pinned
+  - security
+  - "[Status] Maybe Later"
+
+# Set to true to ignore issues in a project (defaults to false)
+exemptProjects: true
+
+# Set to true to ignore issues in a milestone (defaults to false)
+exemptMilestones: true
+
+# Set to true to ignore issues with an assignee (defaults to false)
+exemptAssignees: true
+
+# Label to use when marking as stale
+staleLabel: stale
+
+# Comment to post when marking as stale. Set to `false` to disable
+markComment: >
+  This issue has been automatically marked as stale because it has not had
+  recent activity. It will be closed if no further activity occurs. Thank you
+  for your contributions.
+
+# Comment to post when removing the stale label.
+# unmarkComment: >
+#   Your comment here.
+
+# Comment to post when closing a stale Issue or Pull Request.
+# closeComment: >
+#   Your comment here.
+
+# Limit the number of actions per hour, from 1-30. Default is 30
+limitPerRun: 30
+
+# Limit to only `issues` or `pulls`
+only: issues
+
+# Optionally, specify configuration settings that are specific to just 'issues' or 'pulls':
+# pulls:
+#   daysUntilStale: 30
+#   markComment: >
+#     This pull request has been automatically marked as stale because it has not had
+#     recent activity. It will be closed if no further activity occurs. Thank you
+#     for your contributions.
+
+# issues:
+#   exemptLabels:
+#     - confirmed
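
Review note on the timing in `.github/stale.yml`: with `daysUntilStale: 60` and `daysUntilClose: 14`, an issue labelled `more-information-needed` would be marked stale after 60 days without activity and closed 14 days after that if nothing else happens. The sketch below only illustrates that arithmetic. It assumes plain calendar-day counting, and `stale_timeline` is a hypothetical helper, not part of probot-stale.

```python
from datetime import date, timedelta
from typing import Tuple

# Values copied from the stale.yml in this commit.
DAYS_UNTIL_STALE = 60
DAYS_UNTIL_CLOSE = 14


def stale_timeline(last_activity: date) -> Tuple[date, date]:
    """Return (date marked stale, date closed), assuming no further activity."""
    marked = last_activity + timedelta(days=DAYS_UNTIL_STALE)
    closed = marked + timedelta(days=DAYS_UNTIL_CLOSE)
    return marked, closed


marked, closed = stale_timeline(date(2022, 1, 1))
print(f"marked stale: {marked}, closed: {closed}")
```

Under these assumptions an inactive issue stays open for 74 days in total.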
			
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -0,0 +1,19 @@
+name: build
+on: [pull_request, push]
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ['3.6', '3.7', '3.8', '3.9', '3.10']
+    steps:
+      - uses: actions/checkout@v2
+      - run: |
+           docker build -f py.Dockerfile \
+             --build-arg PYTHON_VERSION=${{ matrix.python-version }} \
+             --tag gym-minigrid-docker .
+
+      # TODO: Add and fix tests for pytest
+      # - name: Run tests
+      #   run: docker run gym-minigrid-docker pytest
			
--- a/.github/workflows/pre-commit.yml
+++ b/.github/workflows/pre-commit.yml
@@ -0,0 +1,17 @@
+# https://pre-commit.com
+# This GitHub Action assumes that the repo contains a valid .pre-commit-config.yaml file.
+name: pre-commit
+on:
+  pull_request:
+  push:
+    branches: [master]
+jobs:
+  pre-commit:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v2
+      - uses: actions/setup-python@v2
+      - run: pip install pre-commit
+      - run: pre-commit --version
+      - run: pre-commit install
+      - run: pre-commit run --all-files
			
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -0,0 +1,53 @@
+---
+repos:
+  - repo: https://github.com/python/black
+    rev: 22.3.0
+    hooks:
+      - id: black
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.1.0
+    hooks:
+      - id: codespell
+#        args:
+#          - --ignore-words-list=
+  - repo: https://gitlab.com/PyCQA/flake8
+    rev: 4.0.1
+    hooks:
+      - id: flake8
+        args:
+          - '--per-file-ignores=*/__init__.py:F401'
+#          - --ignore=
+          - --max-complexity=30
+          - --max-line-length=456
+          - --show-source
+          - --statistics
+  - repo: https://github.com/PyCQA/isort
+    rev: 5.10.1
+    hooks:
+      - id: isort
+        args: ["--profile", "black"]
+#  - repo: https://github.com/pycqa/pydocstyle
+#    rev: 6.1.1
+#    hooks:
+#      - id: pydocstyle
+#        args:
+#          - --source
+#          - --explain
+#          - --convention=google
+#        additional_dependencies: ["toml"]
+  - repo: https://github.com/asottile/pyupgrade
+    rev: v2.32.0
+    hooks:
+      - id: pyupgrade
+        args: ["--py37-plus"]
+#  - repo: local
+#    hooks:
+#      - id: pyright
+#        name: pyright
+#        entry: pyright
+#        language: node
+#        pass_filenames: false
+#        types: [python]
+#        additional_dependencies: ["pyright"]
+#        args:
+#          - --project=pyproject.toml
			
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,10 +0,0 @@
-language: python
-python:
-  - "3.5"
-
-# command to install dependencies
-install:
-  - pip3 install -e .
-
-# command to run tests
-script: ./run_tests.py
			
--- a/CODE_OF_CONDUCT.rst
+++ b/CODE_OF_CONDUCT.rst
@@ -0,0 +1,13 @@
+Farama Foundation is dedicated to providing a harassment-free experience for
+everyone, regardless of gender, gender identity and expression, sexual
+orientation, disability, physical appearance, body size, age, race, or
+religion. We do not tolerate harassment of participants in any form.
+
+This code of conduct applies to all Farama Foundation spaces (including Gist
+comments) both online and off. Anyone who violates this code of
+conduct may be sanctioned or expelled from these spaces at the
+discretion of the Farama Foundation team.
+
+We may add additional rules over time, which will be made clearly
+available to participants. Participants are responsible for knowing
+and abiding by these rules.
			
--- a/README.md
+++ b/README.md
@@ -10,8 +10,8 @@ laptop, which means you can run your experiments faster. A known-working RL
 implementation can be found [in this repository](https://github.com/lcswillems/torch-rl).
 
 Requirements:
-- Python 3.5+
-- OpenAI Gym
+- Python 3.7+
+- OpenAI Gym 0.25
 - NumPy
 - Matplotlib (optional, only needed for display)
 
@@ -132,8 +132,10 @@ compact and efficient encoding, with 3 input values per visible grid cell, 7x7x3
 These values are **not pixels**. If you want to obtain an array of RGB pixels as observations instead,
 use the `RGBImgPartialObsWrapper`. You can use it as follows:
 
-```
-from gym_minigrid.wrappers import *
+```python
+import gym
+from gym_minigrid.wrappers import RGBImgPartialObsWrapper, ImgObsWrapper
+
 env = gym.make('MiniGrid-Empty-8x8-v0')
 env = RGBImgPartialObsWrapper(env) # Get pixel observations
 env = ImgObsWrapper(env) # Get rid of the 'mission' field
@@ -323,7 +325,7 @@ object at split.
 
 ### Locked room environment
 
-Registed configurations:
+Registered configurations:
 - `MiniGrid-LockedRoom-v0`
 
 The environment has six rooms, one of which is locked. The agent receives
@@ -334,7 +336,7 @@ to solve with vanilla reinforcement learning alone.
 
 ### Key corridor environment
 
-Registed configurations:
+Registered configurations:
 - `MiniGrid-KeyCorridorS3R1-v0`
 - `MiniGrid-KeyCorridorS3R2-v0`
 - `MiniGrid-KeyCorridorS3R3-v0`
@@ -361,7 +363,7 @@ key is placed. This environment can be solved without relying on language.
 
 ### Unlock environment
 
-Registed configurations:
+Registered configurations:
 - `MiniGrid-Unlock-v0`
 
 <p align="center">
@@ -373,7 +375,7 @@ relying on language.
 
 ### Unlock pickup environment
 
-Registed configurations:
+Registered configurations:
 - `MiniGrid-UnlockPickup-v0`
 
 <p align="center">
@@ -385,7 +387,7 @@ locked door. This environment can be solved without relying on language.
 
 ### Blocked unlock pickup environment
 
-Registed configurations:
+Registered configurations:
 - `MiniGrid-BlockedUnlockPickup-v0`
 
 <p align="center">
			
--- a/benchmark.py
+++ b/benchmark.py
@@ -1,23 +1,24 @@
 #!/usr/bin/env python3
 
-import time
 import argparse
-import gym_minigrid
+import time
+
 import gym
-from gym_minigrid.wrappers import *
+
+from gym_minigrid.wrappers import ImgObsWrapper, RGBImgPartialObsWrapper
 
 parser = argparse.ArgumentParser()
 parser.add_argument(
     "--env-name",
     dest="env_name",
     help="gym environment to load",
-    default='MiniGrid-LavaGapS7-v0'
+    default="MiniGrid-LavaGapS7-v0",
 )
 parser.add_argument("--num_resets", default=200)
 parser.add_argument("--num_frames", default=5000)
 args = parser.parse_args()
 
-env = gym.make(args.env_name)
+env = gym.make(args.env_name, render_mode="rgb_array")
 
 # Benchmark env.reset
 t0 = time.time()
@@ -30,7 +31,7 @@ reset_time = (1000 * dt) / args.num_resets
 # Benchmark rendering
 t0 = time.time()
 for i in range(args.num_frames):
-    env.render('rgb_array')
+    env.render("rgb_array")
 t1 = time.time()
 dt = t1 - t0
 frames_per_sec = args.num_frames / dt
@@ -48,6 +49,6 @@ t1 = time.time()
 dt = t1 - t0
 agent_view_fps = args.num_frames / dt
 
-print('Env reset time: {:.1f} ms'.format(reset_time))
-print('Rendering FPS : {:.0f}'.format(frames_per_sec))
-print('Agent view FPS: {:.0f}'.format(agent_view_fps))
+print(f"Env reset time: {reset_time:.1f} ms")
+print(f"Rendering FPS : {frames_per_sec:.0f}")
+print(f"Agent view FPS: {agent_view_fps:.0f}")
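
Review note on `benchmark.py`: the commit leaves `--num_resets` and `--num_frames` with integer defaults but no `type=`. `argparse` returns command-line values as strings unless a type is given, so `range(args.num_frames)` would raise a `TypeError` whenever the flag is passed explicitly. A minimal sketch of one possible fix, assuming `type=int` is acceptable for both flags (a suggestion, not part of this commit):

```python
import argparse

parser = argparse.ArgumentParser()
# type=int converts CLI-supplied strings; the defaults are already ints.
parser.add_argument("--num_resets", type=int, default=200)
parser.add_argument("--num_frames", type=int, default=5000)

# Simulate `./benchmark.py --num_frames 100`.
args = parser.parse_args(["--num_frames", "100"])

# range() now works because num_frames is an int rather than the string "100".
frames = list(range(args.num_frames))
print(args.num_resets, args.num_frames, len(frames))
```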
			
--- a/gym_minigrid/__init__.py
+++ b/gym_minigrid/__init__.py
@@ -1,5 +1,3 @@
 # Import the envs module so that envs register themselves
-import gym_minigrid.envs
-
 # Import wrappers so it's accessible when installing with pip
-import gym_minigrid.wrappers
+from gym_minigrid import envs, wrappers
			
--- a/gym_minigrid/envs/blockedunlockpickup.py
+++ b/gym_minigrid/envs/blockedunlockpickup.py
@@ -1,6 +1,7 @@
 from gym_minigrid.minigrid import Ball
-from gym_minigrid.roomgrid import RoomGrid
 from gym_minigrid.register import register
+from gym_minigrid.roomgrid import RoomGrid
+
 
 class BlockedUnlockPickupEnv(RoomGrid):
     """
@@ -8,14 +9,14 @@ class BlockedUnlockPickupEnv(RoomGrid):
     in another room
     """
 
-    def __init__(self, seed=None):
+    def __init__(self, **kwargs):
         room_size = 6
         super().__init__(
             num_rows=1,
             num_cols=2,
             room_size=room_size,
-            max_steps=16*room_size**2,
-            seed=seed
+            max_steps=16 * room_size**2,
+            **kwargs,
         )
 
     def _gen_grid(self, width, height):
@@ -27,14 +28,14 @@ class BlockedUnlockPickupEnv(RoomGrid):
         door, pos = self.add_door(0, 0, 0, locked=True)
         # Block the door with a ball
         color = self._rand_color()
-        self.grid.set(pos[0]-1, pos[1], Ball(color))
+        self.grid.set(pos[0] - 1, pos[1], Ball(color))
         # Add a key to unlock the door
-        self.add_object(0, 0, 'key', door.color)
+        self.add_object(0, 0, "key", door.color)
 
         self.place_agent(0, 0)
 
         self.obj = obj
-        self.mission = "pick up the %s %s" % (obj.color, obj.type)
+        self.mission = f"pick up the {obj.color} {obj.type}"
 
     def step(self, action):
         obs, reward, done, info = super().step(action)
@@ -46,6 +47,7 @@ class BlockedUnlockPickupEnv(RoomGrid):
 
         return obs, reward, done, info
 
+
 register(
     id='MiniGrid-BlockedUnlockPickup-v0',
     entry_point='gym_minigrid.envs.blockedunlockpickup:BlockedUnlockPickupEnv'
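
Review note on `blockedunlockpickup.py`: the mission string moves from `%` formatting to an f-string, which should be a behavior-preserving change. The sketch below checks that on a hypothetical stand-in object. `SimpleNamespace` and the values `"blue"` and `"box"` are assumptions for illustration, not what `add_object` actually returns.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the environment object; only the two
# attributes used by the mission string are modelled.
obj = SimpleNamespace(color="blue", type="box")

# Old style (removed line) and new style (added line) side by side.
old_mission = "pick up the %s %s" % (obj.color, obj.type)
new_mission = f"pick up the {obj.color} {obj.type}"

print(old_mission == new_mission)
```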
			
--- a/gym_minigrid/envs/crossing.py
+++ b/gym_minigrid/envs/crossing.py
@@ -1,23 +1,23 @@
-from gym_minigrid.minigrid import *
-from gym_minigrid.register import register
-
 import itertools as itt
 
+from gym_minigrid.minigrid import Goal, Grid, Lava, MiniGridEnv, Wall
+from gym_minigrid.register import register
+
 
 class CrossingEnv(MiniGridEnv):
     """
     Environment with wall or lava obstacles, sparse reward.
     """
 
-    def __init__(self, size=9, num_crossings=1, obstacle_type=Lava, seed=None):
+    def __init__(self, size=9, num_crossings=1, obstacle_type=Lava, **kwargs):
         self.num_crossings = num_crossings
         self.obstacle_type = obstacle_type
         super().__init__(
             grid_size=size,
-            max_steps=4*size*size,
+            max_steps=4 * size * size,
             # Set this to True for maximum speed
             see_through_walls=False,
-            seed=None
+            **kwargs
         )
 
     def _gen_grid(self, width, height):
@@ -43,9 +43,9 @@ class CrossingEnv(MiniGridEnv):
         rivers = [(v, i) for i in range(2, height - 2, 2)]
         rivers += [(h, j) for j in range(2, width - 2, 2)]
         self.np_random.shuffle(rivers)
-        rivers = rivers[:self.num_crossings]  # sample random rivers
-        rivers_v = sorted([pos for direction, pos in rivers if direction is v])
-        rivers_h = sorted([pos for direction, pos in rivers if direction is h])
+        rivers = rivers[: self.num_crossings]  # sample random rivers
+        rivers_v = sorted(pos for direction, pos in rivers if direction is v)
+        rivers_h = sorted(pos for direction, pos in rivers if direction is h)
         obstacle_pos = itt.chain(
             itt.product(range(1, width - 1), rivers_h),
             itt.product(rivers_v, range(1, height - 1)),
@@ -65,11 +65,13 @@ class CrossingEnv(MiniGridEnv):
             if direction is h:
                 i = limits_v[room_i + 1]
                 j = self.np_random.choice(
-                    range(limits_h[room_j] + 1, limits_h[room_j + 1]))
+                    range(limits_h[room_j] + 1, limits_h[room_j + 1])
+                )
                 room_i += 1
             elif direction is v:
                 i = self.np_random.choice(
-                    range(limits_v[room_i] + 1, limits_v[room_i + 1]))
+                    range(limits_v[room_i] + 1, limits_v[room_i + 1])
+                )
                 j = limits_h[room_j + 1]
                 room_j += 1
             else:
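
Review note on `crossing.py`: the river-sampling hunk builds every candidate river, shuffles the list, keeps the first `num_crossings`, and sorts the survivors by orientation. The standalone sketch below reproduces that logic so the `sorted(generator)` rewrite can be checked in isolation. It makes two assumptions: `random.Random` stands in for `self.np_random`, and `v` and `h` are plain sentinel objects, since their definitions fall outside the hunks shown here.

```python
import random


def sample_rivers(width, height, num_crossings, rng):
    """Pick `num_crossings` river lines and split them by orientation."""
    v, h = object(), object()  # assumed sentinels for vertical / horizontal

    # Candidate rivers sit on every second interior row/column.
    rivers = [(v, i) for i in range(2, height - 2, 2)]
    rivers += [(h, j) for j in range(2, width - 2, 2)]

    rng.shuffle(rivers)
    rivers = rivers[:num_crossings]  # sample random rivers

    # sorted() accepts a generator directly; no intermediate list is needed.
    rivers_v = sorted(pos for direction, pos in rivers if direction is v)
    rivers_h = sorted(pos for direction, pos in rivers if direction is h)
    return rivers_v, rivers_h


rivers_v, rivers_h = sample_rivers(9, 9, 3, random.Random(0))
print(rivers_v, rivers_h)
```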
			
--- a/gym_minigrid/envs/distshift.py
+++ b/gym_minigrid/envs/distshift.py
@@ -1,6 +1,7 @@
-from gym_minigrid.minigrid import *
+from gym_minigrid.minigrid import Goal, Grid, Lava, MiniGridEnv
 from gym_minigrid.register import register
 
+
 class DistShiftEnv(MiniGridEnv):
     """
     Distributional shift environment.
@@ -10,21 +11,23 @@ class DistShiftEnv(MiniGridEnv):
         self,
         width=9,
         height=7,
-        agent_start_pos=(1,1),
+        agent_start_pos=(1, 1),
         agent_start_dir=0,
-        strip2_row=2
+        strip2_row=2,
+        **kwargs
     ):
         self.agent_start_pos = agent_start_pos
         self.agent_start_dir = agent_start_dir
-        self.goal_pos = (width-2, 1)
+        self.goal_pos = (width - 2, 1)
         self.strip2_row = strip2_row
 
         super().__init__(
             width=width,
             height=height,
-            max_steps=4*width*height,
+            max_steps=4 * width * height,
             # Set this to True for maximum speed
-            see_through_walls=True
+            see_through_walls=True,
+            **kwargs
         )
 
     def _gen_grid(self, width, height):
@@ -39,8 +42,8 @@ class DistShiftEnv(MiniGridEnv):
 
         # Place the lava rows
         for i in range(self.width - 6):
-            self.grid.set(3+i, 1, Lava())
-            self.grid.set(3+i, self.strip2_row, Lava())
+            self.grid.set(3 + i, 1, Lava())
+            self.grid.set(3 + i, self.strip2_row, Lava())
 
         # Place the agent
         if self.agent_start_pos is not None:
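
Review note on `distshift.py`: the lava loop only changes whitespace, but it helps to see which cells it covers. The sketch below lists the `(x, y)` cells for the default `width=9` and `strip2_row=2` from the constructor signature. `lava_cells` is a hypothetical helper that returns coordinates instead of calling `self.grid.set`.

```python
def lava_cells(width, strip2_row):
    """Return the (x, y) cells the lava loop would fill, in placement order."""
    cells = []
    for i in range(width - 6):
        cells.append((3 + i, 1))           # first lava strip, fixed on row 1
        cells.append((3 + i, strip2_row))  # second strip, on the configurable row
    return cells


cells = lava_cells(width=9, strip2_row=2)
print(cells)
```

With the defaults, both strips span columns 3 to 5, leaving three free columns on either side of the lava.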
			
--- a/gym_minigrid/envs/doorkey.py
+++ b/gym_minigrid/envs/doorkey.py
@@ -1,16 +1,16 @@
-from gym_minigrid.minigrid import *
+from gym_minigrid.minigrid import Door, Goal, Grid, Key, MiniGridEnv
 from gym_minigrid.register import register
 
+
 class DoorKeyEnv(MiniGridEnv):
     """
     Environment with a door and key, sparse reward
     """
 
-    def __init__(self, size=8):
-        super().__init__(
-            grid_size=size,
-            max_steps=10*size*size
-        )
+    def __init__(self, size=8, **kwargs):
+        if "max_steps" not in kwargs:
+            kwargs["max_steps"] = 10 * size * size
+        super().__init__(grid_size=size, **kwargs)
 
     def _gen_grid(self, width, height):
         # Create an empty grid
@@ -23,7 +23,7 @@ class DoorKeyEnv(MiniGridEnv):
         self.put_obj(Goal(), width - 2, height - 2)
 
         # Create a vertical splitting wall
-        splitIdx = self._rand_int(2, width-2)
+        splitIdx = self._rand_int(2, width - 2)
         self.grid.vert_wall(splitIdx, 0)
 
         # Place the agent at a random position and orientation
@@ -31,15 +31,11 @@ class DoorKeyEnv(MiniGridEnv):
         self.place_agent(size=(splitIdx, height))
 
         # Place a door in the wall
-        doorIdx = self._rand_int(1, width-2)
-        self.put_obj(Door('yellow', is_locked=True), splitIdx, doorIdx)
+        doorIdx = self._rand_int(1, width - 2)
+        self.put_obj(Door("yellow", is_locked=True), splitIdx, doorIdx)
 
         # Place a yellow key on the left side
-        self.place_obj(
-            obj=Key('yellow'),
-            top=(0, 0),
-            size=(splitIdx, height)
-        )
+        self.place_obj(obj=Key("yellow"), top=(0, 0), size=(splitIdx, height))
 
         self.mission = "use the key to open the door and then get to the goal"
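
Review note on `doorkey.py`: the new constructor fills in `max_steps` only when the caller has not supplied one, so a caller can override the default. The sketch below shows the same pattern with `dict.setdefault`, which does the check and the assignment in one call. `Base` is a hypothetical stand-in for `MiniGridEnv`, and its `max_steps=100` default is invented for illustration.

```python
class Base:
    """Hypothetical stand-in for MiniGridEnv; it only stores what it is given."""

    def __init__(self, grid_size, max_steps=100, **kwargs):
        self.grid_size = grid_size
        self.max_steps = max_steps
        self.extra = kwargs


class DoorKeySketch(Base):
    def __init__(self, size=8, **kwargs):
        # Same effect as the `if "max_steps" not in kwargs:` check in the diff.
        kwargs.setdefault("max_steps", 10 * size * size)
        super().__init__(grid_size=size, **kwargs)


default_env = DoorKeySketch()                     # uses 10 * 8 * 8
custom_env = DoorKeySketch(size=5, max_steps=50)  # caller's value wins
print(default_env.max_steps, custom_env.max_steps)
```

Passing `max_steps=...` explicitly together with `**kwargs`, as other environments in this commit do, would instead raise a `TypeError` for a duplicate keyword if the caller also supplied `max_steps`.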
			
 
				 
			
--- a/gym_minigrid/envs/dynamicobstacles.py
+++ b/gym_minigrid/envs/dynamicobstacles.py
@@ -1,35 +1,36 @@
-from gym_minigrid.minigrid import *
-from gym_minigrid.register import register
 from operator import add
 
+import gym
+
+from gym_minigrid.minigrid import Ball, Goal, Grid, MiniGridEnv
+from gym_minigrid.register import register
+
+
 class DynamicObstaclesEnv(MiniGridEnv):
     """
     Single-room square grid environment with moving obstacles
     """
 
     def __init__(
-            self,
-            size=8,
-            agent_start_pos=(1, 1),
-            agent_start_dir=0,
-            n_obstacles=4
+        self, size=8, agent_start_pos=(1, 1), agent_start_dir=0, n_obstacles=4, **kwargs
     ):
         self.agent_start_pos = agent_start_pos
         self.agent_start_dir = agent_start_dir
 
         # Reduce obstacles if there are too many
-        if n_obstacles <= size/2 + 1:
+        if n_obstacles <= size / 2 + 1:
             self.n_obstacles = int(n_obstacles)
         else:
-            self.n_obstacles = int(size/2)
+            self.n_obstacles = int(size / 2)
         super().__init__(
             grid_size=size,
             max_steps=4 * size * size,
             # Set this to True for maximum speed
             see_through_walls=True,
+            **kwargs
         )
         # Allow only 3 actions permitted: left, right, forward
-        self.action_space = spaces.Discrete(self.actions.forward + 1)
+        self.action_space = gym.spaces.Discrete(self.actions.forward + 1)
         self.reward_range = (-1, 1)
 
     def _gen_grid(self, width, height):
@@ -64,7 +65,7 @@ class DynamicObstaclesEnv(MiniGridEnv):
 
         # Check if there is an obstacle in front of the agent
         front_cell = self.grid.get(*self.front_pos)
-        not_clear = front_cell and front_cell.type != 'goal'
+        not_clear = front_cell and front_cell.type != "goal"
 
         # Update obstacle positions
         for i_obst in range(len(self.obstacles)):
@@ -72,13 +73,15 @@ class DynamicObstaclesEnv(MiniGridEnv):
             top = tuple(map(add, old_pos, (-1, -1)))
 
             try:
-                self.place_obj(self.obstacles[i_obst], top=top, size=(3,3), max_tries=100)
+                self.place_obj(
+                    self.obstacles[i_obst], top=top, size=(3, 3), max_tries=100
+                )
                 self.grid.set(*old_pos, None)
-            except:
+            except Exception:
                 pass
 
         # Update the agent's position/direction
-        obs, reward, done, info = MiniGridEnv.step(self, action)
+        obs, reward, done, info = super().step(action)
 
         # If the agent tried to walk over an obstacle or wall
         if action == self.actions.forward and not_clear:
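
Review note on `dynamicobstacles.py`: the obstacle-count check is only reformatted in this commit, but its behavior is easy to misread. Values up to `size / 2 + 1` pass through unchanged, while anything larger drops to `int(size / 2)`, which is below the largest accepted value. The sketch below isolates that branch; `clamp_obstacles` is a hypothetical helper name.

```python
def clamp_obstacles(n_obstacles, size):
    """Mirror the 'Reduce obstacles if there are too many' branch."""
    if n_obstacles <= size / 2 + 1:
        return int(n_obstacles)
    return int(size / 2)


# For the default size=8 the threshold is 8 / 2 + 1 = 5.
results = {n: clamp_obstacles(n, size=8) for n in (4, 5, 6, 10)}
print(results)
```

For `size=8`, requesting 5 obstacles yields 5, but requesting 6 yields 4. If that step down is unintended, it may deserve a follow-up outside this formatting commit.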
			
--- a/gym_minigrid/envs/empty.py
+++ b/gym_minigrid/envs/empty.py
@@ -1,25 +1,22 @@
-from gym_minigrid.minigrid import *
+from gym_minigrid.minigrid import Goal, Grid, MiniGridEnv
 from gym_minigrid.register import register
 
+
 class EmptyEnv(MiniGridEnv):
     """
     Empty grid environment, no obstacles, sparse reward
     """
 
-    def __init__(
-        self,
-        size=8,
-        agent_start_pos=(1,1),
-        agent_start_dir=0,
-    ):
+    def __init__(self, size=8, agent_start_pos=(1, 1), agent_start_dir=0, **kwargs):
         self.agent_start_pos = agent_start_pos
         self.agent_start_dir = agent_start_dir
 
         super().__init__(
             grid_size=size,
-            max_steps=4*size*size,
+            max_steps=4 * size * size,
             # Set this to True for maximum speed
-            see_through_walls=True
+            see_through_walls=True,
+            **kwargs
         )
 
     def _gen_grid(self, width, height):
			
--- a/gym_minigrid/envs/fetch.py
+++ b/gym_minigrid/envs/fetch.py
@@ -1,24 +1,22 @@
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import COLOR_NAMES, Ball, Grid, Key, MiniGridEnv
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				+
			
 
				 class FetchEnv(MiniGridEnv):
			
 
				     """
			
 
				     Environment in which the agent has to fetch a random object
			
 
				     named using English text strings
			
 
				     """
			
 
				 
			
 
				-    def __init__(
			
 
				-        self,
			
 
				-        size=8,
			
 
				-        numObjs=3
			
 
				-    ):
			
 
				+    def __init__(self, size=8, numObjs=3, **kwargs):
			
 
				         self.numObjs = numObjs
			
 
				 
			
 
				         super().__init__(
			
 
				             grid_size=size,
			
 
				-            max_steps=5*size**2,
			
 
				+            max_steps=5 * size**2,
			
 
				             # Set this to True for maximum speed
			
 
				-            see_through_walls=True
			
 
				+            see_through_walls=True,
			
 
				+            **kwargs,
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -26,11 +24,11 @@ class FetchEnv(MiniGridEnv):
 
				 
			
 
				         # Generate the surrounding walls
			
 
				         self.grid.horz_wall(0, 0)
			
 
				-        self.grid.horz_wall(0, height-1)
			
 
				+        self.grid.horz_wall(0, height - 1)
			
 
				         self.grid.vert_wall(0, 0)
			
 
				-        self.grid.vert_wall(width-1, 0)
			
 
				+        self.grid.vert_wall(width - 1, 0)
			
 
				 
			
 
				-        types = ['key', 'ball']
			
 
				+        types = ["key", "ball"]
			
 
				 
			
 
				         objs = []
			
 
				 
			
@@ -39,9 +37,9 @@ class FetchEnv(MiniGridEnv):
 
				             objType = self._rand_elem(types)
			
 
				             objColor = self._rand_elem(COLOR_NAMES)
			
 
				 
			
 
				-            if objType == 'key':
			
 
				+            if objType == "key":
			
 
				                 obj = Key(objColor)
			
 
				-            elif objType == 'ball':
			
 
				+            elif objType == "ball":
			
 
				                 obj = Ball(objColor)
			
 
				 
			
 
				             self.place_obj(obj)
			
@@ -55,28 +53,30 @@ class FetchEnv(MiniGridEnv):
 
				         self.targetType = target.type
			
 
				         self.targetColor = target.color
			
 
				 
			
 
				-        descStr = '%s %s' % (self.targetColor, self.targetType)
			
 
				+        descStr = f"{self.targetColor} {self.targetType}"
			
 
				 
			
 
				         # Generate the mission string
			
 
				         idx = self._rand_int(0, 5)
			
 
				         if idx == 0:
			
 
				-            self.mission = 'get a %s' % descStr
			
 
				+            self.mission = "get a %s" % descStr
			
 
				         elif idx == 1:
			
 
				-            self.mission = 'go get a %s' % descStr
			
 
				+            self.mission = "go get a %s" % descStr
			
 
				         elif idx == 2:
			
 
				-            self.mission = 'fetch a %s' % descStr
			
 
				+            self.mission = "fetch a %s" % descStr
			
 
				         elif idx == 3:
			
 
				-            self.mission = 'go fetch a %s' % descStr
			
 
				+            self.mission = "go fetch a %s" % descStr
			
 
				         elif idx == 4:
			
 
				-            self.mission = 'you must fetch a %s' % descStr
			
 
				-        assert hasattr(self, 'mission')
			
 
				+            self.mission = "you must fetch a %s" % descStr
			
 
				+        assert hasattr(self, "mission")
			
 
				 
			
 
				     def step(self, action):
			
 
				         obs, reward, done, info = MiniGridEnv.step(self, action)
			
 
				 
			
 
				         if self.carrying:
			
 
				-            if self.carrying.color == self.targetColor and \
			
 
				-               self.carrying.type == self.targetType:
			
 
				+            if (
			
 
				+                self.carrying.color == self.targetColor
			
 
				+                and self.carrying.type == self.targetType
			
 
				+            ):
			
 
				                 reward = self._reward()
			
 
				                 done = True
			
 
				             else:
			
--- a/gym_minigrid/envs/fourrooms.py
+++ b/gym_minigrid/envs/fourrooms.py
@@ -1,7 +1,5 @@
 
				 #!/usr/bin/env python
			
 
				-# -*- coding: utf-8 -*-
			
 
				-
			
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import Goal, Grid, MiniGridEnv
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				 
			
@@ -11,10 +9,10 @@ class FourRoomsEnv(MiniGridEnv):
 
				     Can specify agent and goal position, if not it set at random.
			
 
				     """
			
 
				 
			
 
				-    def __init__(self, agent_pos=None, goal_pos=None):
			
 
				+    def __init__(self, agent_pos=None, goal_pos=None, **kwargs):
			
 
				         self._agent_default_pos = agent_pos
			
 
				         self._goal_default_pos = goal_pos
			
 
				-        super().__init__(grid_size=19, max_steps=100)
			
 
				+        super().__init__(grid_size=19, max_steps=100, **kwargs)
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
 
				         # Create the grid
			
@@ -66,7 +64,8 @@ class FourRoomsEnv(MiniGridEnv):
 
				         else:
			
 
				             self.place_obj(Goal())
			
 
				 
			
 
				-        self.mission = 'Reach the goal'
			
 
				+        self.mission = "reach the goal"
			
 
				+        self.mission = "Reach the goal"
			
 
				 
			
 
				     def step(self, action):
			
 
				         obs, reward, done, info = MiniGridEnv.step(self, action)
			
--- a/gym_minigrid/envs/gotodoor.py
+++ b/gym_minigrid/envs/gotodoor.py
@@ -1,23 +1,22 @@
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import COLOR_NAMES, Door, Grid, MiniGridEnv
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				+
			
 
				 class GoToDoorEnv(MiniGridEnv):
			
 
				     """
			
 
				     Environment in which the agent is instructed to go to a given object
			
 
				     named using an English text string
			
 
				     """
			
 
				 
			
 
				-    def __init__(
			
 
				-        self,
			
 
				-        size=5
			
 
				-    ):
			
 
				+    def __init__(self, size=5, **kwargs):
			
 
				         assert size >= 5
			
 
				 
			
 
				         super().__init__(
			
 
				             grid_size=size,
			
 
				-            max_steps=5*size**2,
			
 
				+            max_steps=5 * size**2,
			
 
				             # Set this to True for maximum speed
			
 
				-            see_through_walls=True
			
 
				+            see_through_walls=True,
			
 
				+            **kwargs
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -25,18 +24,18 @@ class GoToDoorEnv(MiniGridEnv):
 
				         self.grid = Grid(width, height)
			
 
				 
			
 
				         # Randomly vary the room width and height
			
 
				-        width = self._rand_int(5, width+1)
			
 
				-        height = self._rand_int(5, height+1)
			
 
				+        width = self._rand_int(5, width + 1)
			
 
				+        height = self._rand_int(5, height + 1)
			
 
				 
			
 
				         # Generate the surrounding walls
			
 
				         self.grid.wall_rect(0, 0, width, height)
			
 
				 
			
 
				         # Generate the 4 doors at random positions
			
 
				         doorPos = []
			
 
				-        doorPos.append((self._rand_int(2, width-2), 0))
			
 
				-        doorPos.append((self._rand_int(2, width-2), height-1))
			
 
				-        doorPos.append((0, self._rand_int(2, height-2)))
			
 
				-        doorPos.append((width-1, self._rand_int(2, height-2)))
			
 
				+        doorPos.append((self._rand_int(2, width - 2), 0))
			
 
				+        doorPos.append((self._rand_int(2, width - 2), height - 1))
			
 
				+        doorPos.append((0, self._rand_int(2, height - 2)))
			
 
				+        doorPos.append((width - 1, self._rand_int(2, height - 2)))
			
 
				 
			
 
				         # Generate the door colors
			
 
				         doorColors = []
			
@@ -60,7 +59,7 @@ class GoToDoorEnv(MiniGridEnv):
 
				         self.target_color = doorColors[doorIdx]
			
 
				 
			
 
				         # Generate the mission string
			
 
				-        self.mission = 'go to the %s door' % self.target_color
			
 
				+        self.mission = "go to the %s door" % self.target_color
			
 
				 
			
 
				     def step(self, action):
			
 
				         obs, reward, done, info = super().step(action)
			
--- a/gym_minigrid/envs/gotoobject.py
+++ b/gym_minigrid/envs/gotoobject.py
@@ -1,24 +1,22 @@
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import COLOR_NAMES, Ball, Box, Grid, Key, MiniGridEnv
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				+
			
 
				 class GoToObjectEnv(MiniGridEnv):
			
 
				     """
			
 
				     Environment in which the agent is instructed to go to a given object
			
 
				     named using an English text string
			
 
				     """
			
 
				 
			
 
				-    def __init__(
			
 
				-        self,
			
 
				-        size=6,
			
 
				-        numObjs=2
			
 
				-    ):
			
 
				+    def __init__(self, size=6, numObjs=2, **kwargs):
			
 
				         self.numObjs = numObjs
			
 
				 
			
 
				         super().__init__(
			
 
				             grid_size=size,
			
 
				-            max_steps=5*size**2,
			
 
				+            max_steps=5 * size**2,
			
 
				             # Set this to True for maximum speed
			
 
				-            see_through_walls=True
			
 
				+            see_through_walls=True,
			
 
				+            **kwargs,
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -28,7 +26,7 @@ class GoToObjectEnv(MiniGridEnv):
 
				         self.grid.wall_rect(0, 0, width, height)
			
 
				 
			
 
				         # Types and colors of objects we can generate
			
 
				-        types = ['key', 'ball', 'box']
			
 
				+        types = ["key", "ball", "box"]
			
 
				 
			
 
				         objs = []
			
 
				         objPos = []
			
@@ -42,11 +40,11 @@ class GoToObjectEnv(MiniGridEnv):
 
				             if (objType, objColor) in objs:
			
 
				                 continue
			
 
				 
			
 
				-            if objType == 'key':
			
 
				+            if objType == "key":
			
 
				                 obj = Key(objColor)
			
 
				-            elif objType == 'ball':
			
 
				+            elif objType == "ball":
			
 
				                 obj = Ball(objColor)
			
 
				-            elif objType == 'box':
			
 
				+            elif objType == "box":
			
 
				                 obj = Box(objColor)
			
 
				 
			
 
				             pos = self.place_obj(obj)
			
@@ -61,12 +59,12 @@ class GoToObjectEnv(MiniGridEnv):
 
				         self.targetType, self.target_color = objs[objIdx]
			
 
				         self.target_pos = objPos[objIdx]
			
 
				 
			
 
				-        descStr = '%s %s' % (self.target_color, self.targetType)
			
 
				-        self.mission = 'go to the %s' % descStr
			
 
				-        #print(self.mission)
			
 
				+        descStr = f"{self.target_color} {self.targetType}"
			
 
				+        self.mission = "go to the %s" % descStr
			
 
				+        # print(self.mission)
			
 
				 
			
 
				     def step(self, action):
			
 
				-        obs, reward, done, info = MiniGridEnv.step(self, action)
			
 
				+        obs, reward, done, info = super().step(action)
			
 
				 
			
 
				         ax, ay = self.agent_pos
			
 
				         tx, ty = self.target_pos
			
--- a/gym_minigrid/envs/keycorridor.py
+++ b/gym_minigrid/envs/keycorridor.py
@@ -1,5 +1,6 @@
 
				-from gym_minigrid.roomgrid import RoomGrid
			
 
				 from gym_minigrid.register import register
			
 
				+from gym_minigrid.roomgrid import RoomGrid
			
 
				+
			
 
				 
			
 
				 class KeyCorridorEnv(RoomGrid):
			
 
				     """
			
@@ -7,20 +8,14 @@ class KeyCorridorEnv(RoomGrid):
 
				     random room.
			
 
				     """
			
 
				 
			
 
				-    def __init__(
			
 
				-        self,
			
 
				-        num_rows=3,
			
 
				-        obj_type="ball",
			
 
				-        room_size=6,
			
 
				-        seed=None
			
 
				-    ):
			
 
				+    def __init__(self, num_rows=3, obj_type="ball", room_size=6, **kwargs):
			
 
				         self.obj_type = obj_type
			
 
				 
			
 
				         super().__init__(
			
 
				             room_size=room_size,
			
 
				             num_rows=num_rows,
			
 
				-            max_steps=30*room_size**2,
			
 
				-            seed=seed,
			
 
				+            max_steps=30 * room_size**2,
			
 
				+            **kwargs,
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -37,7 +32,7 @@ class KeyCorridorEnv(RoomGrid):
 
				         obj, _ = self.add_object(2, room_idx, kind=self.obj_type)
			
 
				 
			
 
				         # Add a key in a random room on the left side
			
 
				-        self.add_object(0, self._rand_int(0, self.num_rows), 'key', door.color)
			
 
				+        self.add_object(0, self._rand_int(0, self.num_rows), "key", door.color)
			
 
				 
			
 
				         # Place the agent in the middle
			
 
				         self.place_agent(1, self.num_rows // 2)
			
@@ -46,7 +41,7 @@ class KeyCorridorEnv(RoomGrid):
 
				         self.connect_all()
			
 
				 
			
 
				         self.obj = obj
			
 
				-        self.mission = "pick up the %s %s" % (obj.color, obj.type)
			
 
				+        self.mission = f"pick up the {obj.color} {obj.type}"
			
 
				 
			
 
				     def step(self, action):
			
 
				         obs, reward, done, info = super().step(action)
			
--- a/gym_minigrid/envs/lavagap.py
+++ b/gym_minigrid/envs/lavagap.py
@@ -1,20 +1,23 @@
 
				-from gym_minigrid.minigrid import *
			
 
				+import numpy as np
			
 
				+
			
 
				+from gym_minigrid.minigrid import Goal, Grid, Lava, MiniGridEnv
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				+
			
 
				 class LavaGapEnv(MiniGridEnv):
			
 
				     """
			
 
				     Environment with one wall of lava with a small gap to cross through
			
 
				     This environment is similar to LavaCrossing but simpler in structure.
			
 
				     """
			
 
				 
			
 
				-    def __init__(self, size, obstacle_type=Lava):
			
 
				+    def __init__(self, size, obstacle_type=Lava, **kwargs):
			
 
				         self.obstacle_type = obstacle_type
			
 
				         super().__init__(
			
 
				             grid_size=size,
			
 
				-            max_steps=4*size*size,
			
 
				+            max_steps=4 * size * size,
			
 
				             # Set this to True for maximum speed
			
 
				             see_through_walls=False,
			
 
				-            seed=None
			
 
				+            **kwargs
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -35,10 +38,12 @@ class LavaGapEnv(MiniGridEnv):
 
				         self.put_obj(Goal(), *self.goal_pos)
			
 
				 
			
 
				         # Generate and store random gap position
			
 
				-        self.gap_pos = np.array((
			
 
				-            self._rand_int(2, width - 2),
			
 
				-            self._rand_int(1, height - 1),
			
 
				-        ))
			
 
				+        self.gap_pos = np.array(
			
 
				+            (
			
 
				+                self._rand_int(2, width - 2),
			
 
				+                self._rand_int(1, height - 1),
			
 
				+            )
			
 
				+        )
			
 
				 
			
 
				         # Place the obstacle wall
			
 
				         self.grid.vert_wall(self.gap_pos[0], 1, height - 2, self.obstacle_type)
			
--- a/gym_minigrid/envs/lockedroom.py
+++ b/gym_minigrid/envs/lockedroom.py
@@ -1,5 +1,4 @@
 
				-from gym import spaces
			
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import COLOR_NAMES, Door, Goal, Grid, Key, MiniGridEnv, Wall
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				 class LockedRoom:
			
@@ -17,10 +16,8 @@ class LockedRoom:
 
				     def rand_pos(self, env):
			
 
				         topX, topY = self.top
			
 
				         sizeX, sizeY = self.size
			
 
				-        return env._rand_pos(
			
 
				-            topX + 1, topX + sizeX - 1,
			
 
				-            topY + 1, topY + sizeY - 1
			
 
				-        )
			
 
				+        return env._rand_pos(topX + 1, topX + sizeX - 1, topY + 1, topY + sizeY - 1)
			
 
				+
			
 
				 
			
 
				 class LockedRoomEnv(MiniGridEnv):
			
 
				     """
			
@@ -28,11 +25,8 @@ class LockedRoomEnv(MiniGridEnv):
 
				     named using an English text string
			
 
				     """
			
 
				 
			
 
				-    def __init__(
			
 
				-        self,
			
 
				-        size=19
			
 
				-    ):
			
 
				-        super().__init__(grid_size=size, max_steps=10*size)
			
 
				+    def __init__(self, size=19, **kwargs):
			
 
				+        super().__init__(grid_size=size, max_steps=10 * size, **kwargs)
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
 
				         # Create the grid
			
@@ -41,10 +35,10 @@ class LockedRoomEnv(MiniGridEnv):
 
				         # Generate the surrounding walls
			
 
				         for i in range(0, width):
			
 
				             self.grid.set(i, 0, Wall())
			
 
				-            self.grid.set(i, height-1, Wall())
			
 
				+            self.grid.set(i, height - 1, Wall())
			
 
				         for j in range(0, height):
			
 
				             self.grid.set(0, j, Wall())
			
 
				-            self.grid.set(width-1, j, Wall())
			
 
				+            self.grid.set(width - 1, j, Wall())
			
 
				 
			
 
				         # Hallway walls
			
 
				         lWallIdx = width // 2 - 2
			
@@ -103,15 +97,14 @@ class LockedRoomEnv(MiniGridEnv):
 
				 
			
 
				         # Randomize the player start position and orientation
			
 
				         self.agent_pos = self.place_agent(
			
 
				-            top=(lWallIdx, 0),
			
 
				-            size=(rWallIdx-lWallIdx, height)
			
 
				+            top=(lWallIdx, 0), size=(rWallIdx - lWallIdx, height)
			
 
				         )
			
 
				 
			
 
				         # Generate the mission string
			
 
				         self.mission = (
			
 
				-            'get the %s key from the %s room, '
			
 
				-            'unlock the %s door and '
			
 
				-            'go to the goal'
			
 
				+            "get the %s key from the %s room, "
			
 
				+            "unlock the %s door and "
			
 
				+            "go to the goal"
			
 
				         ) % (lockedRoom.color, keyRoom.color, lockedRoom.color)
			
 
				 
			
 
				     def step(self, action):
			
--- a/gym_minigrid/envs/memory.py
+++ b/gym_minigrid/envs/memory.py
@@ -1,6 +1,7 @@
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import Ball, Grid, Key, MiniGridEnv, Wall
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				+
			
 
				 class MemoryEnv(MiniGridEnv):
			
 
				     """
			
 
				     This environment is a memory test. The agent starts in a small room
			
@@ -11,19 +12,14 @@ class MemoryEnv(MiniGridEnv):
 
				     object at split.
			
 
				     """
			
 
				 
			
 
				-    def __init__(
			
 
				-        self,
			
 
				-        seed=None,
			
 
				-        size=8,
			
 
				-        random_length=False,
			
 
				-    ):
			
 
				+    def __init__(self, size=8, random_length=False, **kwargs):
			
 
				         self.random_length = random_length
			
 
				         super().__init__(
			
 
				-            seed=seed,
			
 
				             grid_size=size,
			
 
				-            max_steps=5*size**2,
			
 
				+            max_steps=5 * size**2,
			
 
				             # Set this to True for maximum speed
			
 
				             see_through_walls=False,
			
 
				+            **kwargs
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -31,7 +27,7 @@ class MemoryEnv(MiniGridEnv):
 
				 
			
 
				         # Generate the surrounding walls
			
 
				         self.grid.horz_wall(0, 0)
			
 
				-        self.grid.horz_wall(0, height-1)
			
 
				+        self.grid.horz_wall(0, height - 1)
			
 
				         self.grid.vert_wall(0, 0)
			
 
				         self.grid.vert_wall(width - 1, 0)
			
 
				 
			
@@ -67,13 +63,13 @@ class MemoryEnv(MiniGridEnv):
 
				 
			
 
				         # Place objects
			
 
				         start_room_obj = self._rand_elem([Key, Ball])
			
 
				-        self.grid.set(1, height // 2 - 1, start_room_obj('green'))
			
 
				+        self.grid.set(1, height // 2 - 1, start_room_obj("green"))
			
 
				 
			
 
				         other_objs = self._rand_elem([[Ball, Key], [Key, Ball]])
			
 
				         pos0 = (hallway_end + 1, height // 2 - 2)
			
 
				         pos1 = (hallway_end + 1, height // 2 + 2)
			
 
				-        self.grid.set(*pos0, other_objs[0]('green'))
			
 
				-        self.grid.set(*pos1, other_objs[1]('green'))
			
 
				+        self.grid.set(*pos0, other_objs[0]("green"))
			
 
				+        self.grid.set(*pos1, other_objs[1]("green"))
			
 
				 
			
 
				         # Choose the target objects
			
 
				         if start_room_obj == other_objs[0]:
			
@@ -83,7 +79,7 @@ class MemoryEnv(MiniGridEnv):
 
				             self.success_pos = (pos1[0], pos1[1] - 1)
			
 
				             self.failure_pos = (pos0[0], pos0[1] + 1)
			
 
				 
			
 
				-        self.mission = 'go to the matching object at the end of the hallway'
			
 
				+        self.mission = "go to the matching object at the end of the hallway"
			
 
				 
			
 
				     def step(self, action):
			
 
				         if action == MiniGridEnv.Actions.pickup:
			
--- a/gym_minigrid/envs/multiroom.py
+++ b/gym_minigrid/envs/multiroom.py
@@ -1,4 +1,4 @@
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import COLOR_NAMES, Door, Goal, Grid, MiniGridEnv, Wall
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				 class MultiRoom:
			
@@ -13,16 +13,13 @@ class MultiRoom:
 
				         self.entryDoorPos = entryDoorPos
			
 
				         self.exitDoorPos = exitDoorPos
			
 
				 
			
 
				+
			
 
				 class MultiRoomEnv(MiniGridEnv):
			
 
				     """
			
 
				     Environment with multiple rooms (subgoals)
			
 
				     """
			
 
				 
			
 
				-    def __init__(self,
			
 
				-        minNumRooms,
			
 
				-        maxNumRooms,
			
 
				-        maxRoomSize=10
			
 
				-    ):
			
 
				+    def __init__(self, minNumRooms, maxNumRooms, maxRoomSize=10, **kwargs):
			
 
				         assert minNumRooms > 0
			
 
				         assert maxNumRooms >= minNumRooms
			
 
				         assert maxRoomSize >= 4
			
@@ -33,24 +30,18 @@ class MultiRoomEnv(MiniGridEnv):
 
				 
			
 
				         self.rooms = []
			
 
				 
			
 
				-        super(MultiRoomEnv, self).__init__(
			
 
				-            grid_size=25,
			
 
				-            max_steps=self.maxNumRooms * 20
			
 
				-        )
			
 
				+        super().__init__(grid_size=25, max_steps=self.maxNumRooms * 20, **kwargs)
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
 
				         roomList = []
			
 
				 
			
 
				         # Choose a random number of rooms to generate
			
 
				-        numRooms = self._rand_int(self.minNumRooms, self.maxNumRooms+1)
			
 
				+        numRooms = self._rand_int(self.minNumRooms, self.maxNumRooms + 1)
			
 
				 
			
 
				         while len(roomList) < numRooms:
			
 
				             curRoomList = []
			
 
				 
			
 
				-            entryDoorPos = (
			
 
				-                self._rand_int(0, width - 2),
			
 
				-                self._rand_int(0, width - 2)
			
 
				-            )
			
 
				+            entryDoorPos = (self._rand_int(0, width - 2), self._rand_int(0, width - 2))
			
 
				 
			
 
				             # Recursively place the rooms
			
 
				             self._placeRoom(
			
@@ -59,7 +50,7 @@ class MultiRoomEnv(MiniGridEnv):
 
				                 minSz=4,
			
 
				                 maxSz=self.maxRoomSize,
			
 
				                 entryDoorWall=2,
			
 
				-                entryDoorPos=entryDoorPos
			
 
				+                entryDoorPos=entryDoorPos,
			
 
				             )
			
 
				 
			
 
				             if len(curRoomList) > len(roomList):
			
@@ -105,7 +96,7 @@ class MultiRoomEnv(MiniGridEnv):
 
				                 self.grid.set(*room.entryDoorPos, entryDoor)
			
 
				                 prevDoorColor = doorColor
			
 
				 
			
 
				-                prevRoom = roomList[idx-1]
			
 
				+                prevRoom = roomList[idx - 1]
			
 
				                 prevRoom.exitDoorPos = room.entryDoorPos
			
 
				 
			
 
				         # Randomize the starting agent position and direction
			
@@ -114,20 +105,12 @@ class MultiRoomEnv(MiniGridEnv):
 
				         # Place the final goal in the last room
			
 
				         self.goal_pos = self.place_obj(Goal(), roomList[-1].top, roomList[-1].size)
			
 
				 
			
 
				-        self.mission = 'traverse the rooms to get to the goal'
			
 
				+        self.mission = "traverse the rooms to get to the goal"
			
 
				 
			
 
				-    def _placeRoom(
			
 
				-        self,
			
 
				-        numLeft,
			
 
				-        roomList,
			
 
				-        minSz,
			
 
				-        maxSz,
			
 
				-        entryDoorWall,
			
 
				-        entryDoorPos
			
 
				-    ):
			
 
				+    def _placeRoom(self, numLeft, roomList, minSz, maxSz, entryDoorWall, entryDoorPos):
			
 
				         # Choose the room size randomly
			
 
				-        sizeX = self._rand_int(minSz, maxSz+1)
			
 
				-        sizeY = self._rand_int(minSz, maxSz+1)
			
 
				+        sizeX = self._rand_int(minSz, maxSz + 1)
			
 
				+        sizeY = self._rand_int(minSz, maxSz + 1)
			
 
				 
			
 
				         # The first room will be at the door position
			
 
				         if len(roomList) == 0:
			
@@ -163,11 +146,12 @@ class MultiRoomEnv(MiniGridEnv):
 
				 
			
 
				         # If the room intersects with previous rooms, can't place it here
			
 
				         for room in roomList[:-1]:
			
 
				-            nonOverlap = \
			
 
				-                topX + sizeX < room.top[0] or \
			
 
				-                room.top[0] + room.size[0] <= topX or \
			
 
				-                topY + sizeY < room.top[1] or \
			
 
				-                room.top[1] + room.size[1] <= topY
			
 
				+            nonOverlap = (
			
 
				+                topX + sizeX < room.top[0]
			
 
				+                or room.top[0] + room.size[0] <= topX
			
 
				+                or topY + sizeY < room.top[1]
			
 
				+                or room.top[1] + room.size[1] <= topY
			
 
				+            )
			
 
				 
			
 
				             if not nonOverlap:
			
 
				                 return False
			
@@ -188,7 +172,7 @@ class MultiRoomEnv(MiniGridEnv):
 
				         for i in range(0, 8):
			
 
				 
			
 
				             # Pick which wall to place the out door on
			
 
				-            wallSet = set((0, 1, 2, 3))
			
 
				+            wallSet = {0, 1, 2, 3}
			
 
				             wallSet.remove(entryDoorWall)
			
 
				             exitDoorWall = self._rand_elem(sorted(wallSet))
			
 
				             nextEntryWall = (exitDoorWall + 2) % 4
			
@@ -196,28 +180,16 @@ class MultiRoomEnv(MiniGridEnv):
 
				             # Pick the exit door position
			
 
				             # Exit on right wall
			
 
				             if exitDoorWall == 0:
			
 
				-                exitDoorPos = (
			
 
				-                    topX + sizeX - 1,
			
 
				-                    topY + self._rand_int(1, sizeY - 1)
			
 
				-                )
			
 
				+                exitDoorPos = (topX + sizeX - 1, topY + self._rand_int(1, sizeY - 1))
			
 
				             # Exit on south wall
			
 
				             elif exitDoorWall == 1:
			
 
				-                exitDoorPos = (
			
 
				-                    topX + self._rand_int(1, sizeX - 1),
			
 
				-                    topY + sizeY - 1
			
 
				-                )
			
 
				+                exitDoorPos = (topX + self._rand_int(1, sizeX - 1), topY + sizeY - 1)
			
 
				             # Exit on left wall
			
 
				             elif exitDoorWall == 2:
			
 
				-                exitDoorPos = (
			
 
				-                    topX,
			
 
				-                    topY + self._rand_int(1, sizeY - 1)
			
 
				-                )
			
 
				+                exitDoorPos = (topX, topY + self._rand_int(1, sizeY - 1))
			
 
				             # Exit on north wall
			
 
				             elif exitDoorWall == 3:
			
 
				-                exitDoorPos = (
			
 
				-                    topX + self._rand_int(1, sizeX - 1),
			
 
				-                    topY
			
 
				-                )
			
 
				+                exitDoorPos = (topX + self._rand_int(1, sizeX - 1), topY)
			
 
				             else:
			
 
				                 assert False
			
 
				 
			
@@ -228,7 +200,7 @@ class MultiRoomEnv(MiniGridEnv):
 
				                 minSz=minSz,
			
 
				                 maxSz=maxSz,
			
 
				                 entryDoorWall=nextEntryWall,
			
 
				-                entryDoorPos=exitDoorPos
			
 
				+                entryDoorPos=exitDoorPos,
			
 
				             )
			
 
				 
			
 
				             if success:
			
--- a/gym_minigrid/envs/obstructedmaze.py
+++ b/gym_minigrid/envs/obstructedmaze.py
@@ -1,6 +1,7 @@
 
				-from gym_minigrid.minigrid import *
			
 
				-from gym_minigrid.roomgrid import RoomGrid
			
 
				+from gym_minigrid.minigrid import COLOR_NAMES, DIR_TO_VEC, Ball, Box, Key
			
 
				 from gym_minigrid.register import register
			
 
				+from gym_minigrid.roomgrid import RoomGrid
			
 
				+
			
 
				 
			
 
				 class ObstructedMazeEnv(RoomGrid):
			
 
				     """
			
@@ -8,21 +9,16 @@ class ObstructedMazeEnv(RoomGrid):
 
				     doors may be obstructed by a ball and keys may be hidden in boxes.
			
 
				     """
			
 
				 
			
 
				-    def __init__(self,
			
 
				-        num_rows,
			
 
				-        num_cols,
			
 
				-        num_rooms_visited,
			
 
				-        seed=None
			
 
				-    ):
			
 
				+    def __init__(self, num_rows, num_cols, num_rooms_visited, **kwargs):
			
 
				         room_size = 6
			
 
				-        max_steps = 4*num_rooms_visited*room_size**2
			
 
				+        max_steps = 4 * num_rooms_visited * room_size**2
			
 
				 
			
 
				         super().__init__(
			
 
				             room_size=room_size,
			
 
				             num_rows=num_rows,
			
 
				             num_cols=num_cols,
			
 
				             max_steps=max_steps,
			
 
				-            seed=seed
			
 
				+            **kwargs
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -49,7 +45,16 @@ class ObstructedMazeEnv(RoomGrid):
 
				 
			
 
				         return obs, reward, done, info
			
 
				 
			
 
				-    def add_door(self, i, j, door_idx=0, color=None, locked=False, key_in_box=False, blocked=False):
			
 
				+    def add_door(
			
 
				+        self,
			
 
				+        i,
			
 
				+        j,
			
 
				+        door_idx=0,
			
 
				+        color=None,
			
 
				+        locked=False,
			
 
				+        key_in_box=False,
			
 
				+        blocked=False,
			
 
				+    ):
			
 
				         """
			
 
				         Add a door. If the door must be locked, it also adds the key.
			
 
				         If the key must be hidden, it is put in a box. If the door must
			
@@ -61,8 +66,8 @@ class ObstructedMazeEnv(RoomGrid):
 
				         if blocked:
			
 
				             vec = DIR_TO_VEC[door_idx]
			
 
				             blocking_ball = Ball(self.blocking_ball_color) if blocked else None
			
 
				-            self.grid.set(door_pos[0]-vec[0], door_pos[1]-vec[1], blocking_ball)
			
 
				-            
			
 
				+            self.grid.set(door_pos[0] - vec[0], door_pos[1] - vec[1], blocking_ball)
			
 
				+
			
 
				         if locked:
			
 
				             obj = Key(door.color)
			
 
				             if key_in_box:
			
@@ -73,30 +78,31 @@ class ObstructedMazeEnv(RoomGrid):
 
				 
			
 
				         return door, door_pos
			
 
				 
			
 
				+
			
 
				 class ObstructedMaze_1Dlhb(ObstructedMazeEnv):
			
 
				     """
			
 
				     A blue ball is hidden in a 2x1 maze. A locked door separates
			
 
				     rooms. Doors are obstructed by a ball and keys are hidden in boxes.
			
 
				     """
			
 
				 
			
 
				-    def __init__(self, key_in_box=True, blocked=True, seed=None):
			
 
				+    def __init__(self, key_in_box=True, blocked=True, **kwargs):
			
 
				         self.key_in_box = key_in_box
			
 
				         self.blocked = blocked
			
 
				 
			
 
				-        super().__init__(
			
 
				-            num_rows=1,
			
 
				-            num_cols=2,
			
 
				-            num_rooms_visited=2,
			
 
				-            seed=seed
			
 
				-        )
			
 
				+        super().__init__(num_rows=1, num_cols=2, num_rooms_visited=2, **kwargs)
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
 
				         super()._gen_grid(width, height)
			
 
				 
			
 
				-        self.add_door(0, 0, door_idx=0, color=self.door_colors[0],
			
 
				-                      locked=True,
			
 
				-                      key_in_box=self.key_in_box,
			
 
				-                      blocked=self.blocked)
			
 
				+        self.add_door(
			
 
				+            0,
			
 
				+            0,
			
 
				+            door_idx=0,
			
 
				+            color=self.door_colors[0],
			
 
				+            locked=True,
			
 
				+            key_in_box=self.key_in_box,
			
 
				+            blocked=self.blocked,
			
 
				+        )
			
 
				 
			
 
				         self.obj, _ = self.add_object(1, 0, "ball", color=self.ball_to_find_color)
			
 
				         self.place_agent(0, 0)
			
@@ -109,18 +115,22 @@ class ObstructedMaze_Full(ObstructedMazeEnv):
 
				     boxes.
			
 
				     """
			
 
				 
			
 
				-    def __init__(self, agent_room=(1, 1), key_in_box=True, blocked=True,
			
 
				-                 num_quarters=4, num_rooms_visited=25, seed=None):
			
 
				+    def __init__(
			
 
				+        self,
			
 
				+        agent_room=(1, 1),
			
 
				+        key_in_box=True,
			
 
				+        blocked=True,
			
 
				+        num_quarters=4,
			
 
				+        num_rooms_visited=25,
			
 
				+        **kwargs
			
 
				+    ):
			
 
				         self.agent_room = agent_room
			
 
				         self.key_in_box = key_in_box
			
 
				         self.blocked = blocked
			
 
				         self.num_quarters = num_quarters
			
 
				 
			
 
				         super().__init__(
			
 
				-            num_rows=3,
			
 
				-            num_cols=3,
			
 
				-            num_rooms_visited=num_rooms_visited,
			
 
				-            seed=seed
			
 
				+            num_rows=3, num_cols=3, num_rooms_visited=num_rooms_visited, **kwargs
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -129,28 +139,48 @@ class ObstructedMaze_Full(ObstructedMazeEnv):
 
				         middle_room = (1, 1)
			
 
				         # Define positions of "side rooms" i.e. rooms that are neither
			
 
				         # corners nor the center.
			
 
				-        side_rooms = [(2, 1), (1, 2), (0, 1), (1, 0)][:self.num_quarters]
			
 
				+        side_rooms = [(2, 1), (1, 2), (0, 1), (1, 0)][: self.num_quarters]
			
 
				         for i in range(len(side_rooms)):
			
 
				             side_room = side_rooms[i]
			
 
				 
			
 
				             # Add a door between the center room and the side room
			
 
				-            self.add_door(*middle_room, door_idx=i, color=self.door_colors[i], locked=False)
			
 
				+            self.add_door(
			
 
				+                *middle_room, door_idx=i, color=self.door_colors[i], locked=False
			
 
				+            )
			
 
				 
			
 
				             for k in [-1, 1]:
			
 
				                 # Add a door to each side of the side room
			
 
				-                self.add_door(*side_room, locked=True,
			
 
				-                              door_idx=(i+k)%4,
			
 
				-                              color=self.door_colors[(i+k)%len(self.door_colors)],
			
 
				-                              key_in_box=self.key_in_box,
			
 
				-                              blocked=self.blocked)
			
 
				-
			
 
				-        corners = [(2, 0), (2, 2), (0, 2), (0, 0)][:self.num_quarters]
			
 
				+                self.add_door(
			
 
				+                    *side_room,
			
 
				+                    locked=True,
			
 
				+                    door_idx=(i + k) % 4,
			
 
				+                    color=self.door_colors[(i + k) % len(self.door_colors)],
			
 
				+                    key_in_box=self.key_in_box,
			
 
				+                    blocked=self.blocked
			
 
				+                )
			
 
				+
			
 
				+        corners = [(2, 0), (2, 2), (0, 2), (0, 0)][: self.num_quarters]
			
 
				         ball_room = self._rand_elem(corners)
			
 
				 
			
 
				         self.obj, _ = self.add_object(*ball_room, "ball", color=self.ball_to_find_color)
			
 
				         self.place_agent(*self.agent_room)
			
 
				 
			
 
				 
			
 
				+class ObstructedMaze_2Dl(ObstructedMaze_Full):
			
 
				+    def __init__(self, **kwargs):
			
 
				+        super().__init__((2, 1), False, False, 1, 4, **kwargs)
			
 
				+
			
 
				+
			
 
				+class ObstructedMaze_2Dlh(ObstructedMaze_Full):
			
 
				+    def __init__(self, **kwargs):
			
 
				+        super().__init__((2, 1), True, False, 1, 4, **kwargs)
			
 
				+
			
 
				+
			
 
				+class ObstructedMaze_2Dlhb(ObstructedMaze_Full):
			
 
				+    def __init__(self, **kwargs):
			
 
				+        super().__init__((2, 1), True, True, 1, 4, **kwargs)
			
 
				+
			
 
				+
			
 
				 
			
 
				 register(
			
 
				     id="MiniGrid-ObstructedMaze-1Dl-v0",
			
@@ -207,4 +237,4 @@ register(
 
				 register(
			
 
				     id="MiniGrid-ObstructedMaze-Full-v0",
			
 
				     entry_point="gym_minigrid.envs.obstructedmaze:ObstructedMaze_Full"
			
 
				-)
			
 
				+)
			
--- a/gym_minigrid/envs/playground.py
+++ b/gym_minigrid/envs/playground.py
@@ -1,4 +1,4 @@
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import COLOR_NAMES, Ball, Box, Door, Grid, Key, MiniGridEnv
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				 class PlaygroundEnv(MiniGridEnv):
			
@@ -7,8 +7,8 @@ class PlaygroundEnv(MiniGridEnv):
 
				     This environment has no specific goals or rewards.
			
 
				     """
			
 
				 
			
 
				-    def __init__(self):
			
 
				-        super().__init__(grid_size=19, max_steps=100)
			
 
				+    def __init__(self, **kwargs):
			
 
				+        super().__init__(grid_size=19, max_steps=100, **kwargs)
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
 
				         # Create the grid
			
@@ -16,9 +16,9 @@ class PlaygroundEnv(MiniGridEnv):
 
				 
			
 
				         # Generate the surrounding walls
			
 
				         self.grid.horz_wall(0, 0)
			
 
				-        self.grid.horz_wall(0, height-1)
			
 
				+        self.grid.horz_wall(0, height - 1)
			
 
				         self.grid.vert_wall(0, 0)
			
 
				-        self.grid.vert_wall(width-1, 0)
			
 
				+        self.grid.vert_wall(width - 1, 0)
			
 
				 
			
 
				         roomW = width // 3
			
 
				         roomH = height // 3
			
@@ -34,16 +34,16 @@ class PlaygroundEnv(MiniGridEnv):
 
				                 yB = yT + roomH
			
 
				 
			
 
				                 # Right wall and door
			
 
				-                if i+1 < 3:
			
 
				+                if i + 1 < 3:
			
 
				                     self.grid.vert_wall(xR, yT, roomH)
			
 
				-                    pos = (xR, self._rand_int(yT+1, yB-1))
			
 
				+                    pos = (xR, self._rand_int(yT + 1, yB - 1))
			
 
				                     color = self._rand_elem(COLOR_NAMES)
			
 
				                     self.grid.set(*pos, Door(color))
			
 
				 
			
 
				                 # Bottom wall and door
			
 
				-                if j+1 < 3:
			
 
				+                if j + 1 < 3:
			
 
				                     self.grid.horz_wall(xL, yB, roomW)
			
 
				-                    pos = (self._rand_int(xL+1, xR-1), yB)
			
 
				+                    pos = (self._rand_int(xL + 1, xR - 1), yB)
			
 
				                     color = self._rand_elem(COLOR_NAMES)
			
 
				                     self.grid.set(*pos, Door(color))
			
 
				 
			
@@ -51,23 +51,23 @@ class PlaygroundEnv(MiniGridEnv):
 
				         self.place_agent()
			
 
				 
			
 
				         # Place random objects in the world
			
 
				-        types = ['key', 'ball', 'box']
			
 
				+        types = ["key", "ball", "box"]
			
 
				         for i in range(0, 12):
			
 
				             objType = self._rand_elem(types)
			
 
				             objColor = self._rand_elem(COLOR_NAMES)
			
 
				-            if objType == 'key':
			
 
				+            if objType == "key":
			
 
				                 obj = Key(objColor)
			
 
				-            elif objType == 'ball':
			
 
				+            elif objType == "ball":
			
 
				                 obj = Ball(objColor)
			
 
				-            elif objType == 'box':
			
 
				+            elif objType == "box":
			
 
				                 obj = Box(objColor)
			
 
				             self.place_obj(obj)
			
 
				 
			
 
				         # No explicit mission in this environment
			
 
				-        self.mission = ''
			
 
				+        self.mission = ""
			
 
				 
			
 
				     def step(self, action):
			
 
				-        obs, reward, done, info = MiniGridEnv.step(self, action)
			
 
				+        obs, reward, done, info = super().step(action)
			
 
				         return obs, reward, done, info
			
 
				 
			
 
				 register(
			
--- a/gym_minigrid/envs/putnear.py
+++ b/gym_minigrid/envs/putnear.py
@@ -1,24 +1,22 @@
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import COLOR_NAMES, Ball, Box, Grid, Key, MiniGridEnv
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				+
			
 
				 class PutNearEnv(MiniGridEnv):
			
 
				     """
			
 
				     Environment in which the agent is instructed to place an object near
			
 
				     another object through a natural language string.
			
 
				     """
			
 
				 
			
 
				-    def __init__(
			
 
				-        self,
			
 
				-        size=6,
			
 
				-        numObjs=2
			
 
				-    ):
			
 
				+    def __init__(self, size=6, numObjs=2, **kwargs):
			
 
				         self.numObjs = numObjs
			
 
				 
			
 
				         super().__init__(
			
 
				             grid_size=size,
			
 
				-            max_steps=5*size,
			
 
				+            max_steps=5 * size,
			
 
				             # Set this to True for maximum speed
			
 
				-            see_through_walls=True
			
 
				+            see_through_walls=True,
			
 
				+            **kwargs
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -26,12 +24,12 @@ class PutNearEnv(MiniGridEnv):
 
				 
			
 
				         # Generate the surrounding walls
			
 
				         self.grid.horz_wall(0, 0)
			
 
				-        self.grid.horz_wall(0, height-1)
			
 
				+        self.grid.horz_wall(0, height - 1)
			
 
				         self.grid.vert_wall(0, 0)
			
 
				-        self.grid.vert_wall(width-1, 0)
			
 
				+        self.grid.vert_wall(width - 1, 0)
			
 
				 
			
 
				         # Types and colors of objects we can generate
			
 
				-        types = ['key', 'ball', 'box']
			
 
				+        types = ["key", "ball", "box"]
			
 
				 
			
 
				         objs = []
			
 
				         objPos = []
			
@@ -53,11 +51,11 @@ class PutNearEnv(MiniGridEnv):
 
				             if (objType, objColor) in objs:
			
 
				                 continue
			
 
				 
			
 
				-            if objType == 'key':
			
 
				+            if objType == "key":
			
 
				                 obj = Key(objColor)
			
 
				-            elif objType == 'ball':
			
 
				+            elif objType == "ball":
			
 
				                 obj = Ball(objColor)
			
 
				-            elif objType == 'box':
			
 
				+            elif objType == "box":
			
 
				                 obj = Box(objColor)
			
 
				 
			
 
				             pos = self.place_obj(obj, reject_fn=near_obj)
			
@@ -81,11 +79,11 @@ class PutNearEnv(MiniGridEnv):
 
				         self.target_type, self.target_color = objs[targetIdx]
			
 
				         self.target_pos = objPos[targetIdx]
			
 
				 
			
 
				-        self.mission = 'put the %s %s near the %s %s' % (
			
 
				+        self.mission = "put the {} {} near the {} {}".format(
			
 
				             self.moveColor,
			
 
				             self.move_type,
			
 
				             self.target_color,
			
 
				-            self.target_type
			
 
				+            self.target_type,
			
 
				         )
			
 
				 
			
 
				     def step(self, action):
			
@@ -99,7 +97,10 @@ class PutNearEnv(MiniGridEnv):
 
				 
			
 
				         # If we picked up the wrong object, terminate the episode
			
 
				         if action == self.actions.pickup and self.carrying:
			
 
				-            if self.carrying.type != self.move_type or self.carrying.color != self.moveColor:
			
 
				+            if (
			
 
				+                self.carrying.type != self.move_type
			
 
				+                or self.carrying.color != self.moveColor
			
 
				+            ):
			
 
				                 done = True
			
 
				 
			
 
				         # If successfully dropping an object near the target
			
--- a/gym_minigrid/envs/redbluedoors.py
+++ b/gym_minigrid/envs/redbluedoors.py
@@ -1,6 +1,7 @@
 
				-from gym_minigrid.minigrid import *
			
 
				+from gym_minigrid.minigrid import Door, Grid, MiniGridEnv
			
 
				 from gym_minigrid.register import register
			
 
				 
			
 
				+
			
 
				 class RedBlueDoorEnv(MiniGridEnv):
			
 
				     """
			
 
				     Single room with red and blue doors on opposite sides.
			
@@ -8,13 +9,11 @@ class RedBlueDoorEnv(MiniGridEnv):
 
				     obtain a reward.
			
 
				     """
			
 
				 
			
 
				-    def __init__(self, size=8):
			
 
				+    def __init__(self, size=8, **kwargs):
			
 
				         self.size = size
			
 
				 
			
 
				         super().__init__(
			
 
				-            width=2*size,
			
 
				-            height=size,
			
 
				-            max_steps=20*size*size
			
 
				+            width=2 * size, height=size, max_steps=20 * size * size, **kwargs
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -22,21 +21,21 @@ class RedBlueDoorEnv(MiniGridEnv):
 
				         self.grid = Grid(width, height)
			
 
				 
			
 
				         # Generate the grid walls
			
 
				-        self.grid.wall_rect(0, 0, 2*self.size, self.size)
			
 
				-        self.grid.wall_rect(self.size//2, 0, self.size, self.size)
			
 
				+        self.grid.wall_rect(0, 0, 2 * self.size, self.size)
			
 
				+        self.grid.wall_rect(self.size // 2, 0, self.size, self.size)
			
 
				 
			
 
				         # Place the agent in the top-left corner
			
 
				-        self.place_agent(top=(self.size//2, 0), size=(self.size, self.size))
			
 
				+        self.place_agent(top=(self.size // 2, 0), size=(self.size, self.size))
			
 
				 
			
 
				         # Add a red door at a random position in the left wall
			
 
				         pos = self._rand_int(1, self.size - 1)
			
 
				         self.red_door = Door("red")
			
 
				-        self.grid.set(self.size//2, pos, self.red_door)
			
 
				+        self.grid.set(self.size // 2, pos, self.red_door)
			
 
				 
			
 
				         # Add a blue door at a random position in the right wall
			
 
				         pos = self._rand_int(1, self.size - 1)
			
 
				         self.blue_door = Door("blue")
			
 
				-        self.grid.set(self.size//2 + self.size - 1, pos, self.blue_door)
			
 
				+        self.grid.set(self.size // 2 + self.size - 1, pos, self.blue_door)
			
 
				 
			
 
				         # Generate the mission string
			
 
				         self.mission = "open the red door then the blue door"
			
--- a/gym_minigrid/envs/unlock.py
+++ b/gym_minigrid/envs/unlock.py
@@ -1,20 +1,20 @@
 
				-from gym_minigrid.minigrid import Ball
			
 
				-from gym_minigrid.roomgrid import RoomGrid
			
 
				 from gym_minigrid.register import register
			
 
				+from gym_minigrid.roomgrid import RoomGrid
			
 
				+
			
 
				 
			
 
				 class UnlockEnv(RoomGrid):
			
 
				     """
			
 
				     Unlock a door
			
 
				     """
			
 
				 
			
 
				-    def __init__(self, seed=None):
			
 
				+    def __init__(self, **kwargs):
			
 
				         room_size = 6
			
 
				         super().__init__(
			
 
				             num_rows=1,
			
 
				             num_cols=2,
			
 
				             room_size=room_size,
			
 
				-            max_steps=8*room_size**2,
			
 
				-            seed=seed
			
 
				+            max_steps=8 * room_size**2,
			
 
				+            **kwargs
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -23,7 +23,7 @@ class UnlockEnv(RoomGrid):
 
				         # Make sure the two rooms are directly connected by a locked door
			
 
				         door, _ = self.add_door(0, 0, 0, locked=True)
			
 
				         # Add a key to unlock the door
			
 
				-        self.add_object(0, 0, 'key', door.color)
			
 
				+        self.add_object(0, 0, "key", door.color)
			
 
				 
			
 
				         self.place_agent(0, 0)
			
 
				 
			
--- a/gym_minigrid/envs/unlockpickup.py
+++ b/gym_minigrid/envs/unlockpickup.py
@@ -1,20 +1,20 @@
 
				-from gym_minigrid.minigrid import Ball
			
 
				-from gym_minigrid.roomgrid import RoomGrid
			
 
				 from gym_minigrid.register import register
			
 
				+from gym_minigrid.roomgrid import RoomGrid
			
 
				+
			
 
				 
			
 
				 class UnlockPickupEnv(RoomGrid):
			
 
				     """
			
 
				     Unlock a door, then pick up a box in another room
			
 
				     """
			
 
				 
			
 
				-    def __init__(self, seed=None):
			
 
				+    def __init__(self, **kwargs):
			
 
				         room_size = 6
			
 
				         super().__init__(
			
 
				             num_rows=1,
			
 
				             num_cols=2,
			
 
				             room_size=room_size,
			
 
				-            max_steps=8*room_size**2,
			
 
				-            seed=seed
			
 
				+            max_steps=8 * room_size**2,
			
 
				+            **kwargs,
			
 
				         )
			
 
				 
			
 
				     def _gen_grid(self, width, height):
			
@@ -25,12 +25,12 @@ class UnlockPickupEnv(RoomGrid):
 
				         # Make sure the two rooms are directly connected by a locked door
			
 
				         door, _ = self.add_door(0, 0, 0, locked=True)
			
 
				         # Add a key to unlock the door
			
 
				-        self.add_object(0, 0, 'key', door.color)
			
 
				+        self.add_object(0, 0, "key", door.color)
			
 
				 
			
 
				         self.place_agent(0, 0)
			
 
				 
			
 
				         self.obj = obj
			
 
				-        self.mission = "pick up the %s %s" % (obj.color, obj.type)
			
 
				+        self.mission = f"pick up the {obj.color} {obj.type}"
			
 
				 
			
 
				     def step(self, action):
			
 
				         obs, reward, done, info = super().step(action)
			
--- a/gym_minigrid/minigrid.py
+++ b/gym_minigrid/minigrid.py
@@ -1,61 +1,65 @@
 
				-import math
			
 
				 import hashlib
			
 
				-import gym
			
 
				+import math
			
 
				+import string
			
 
				 from enum import IntEnum
			
 
				+
			
 
				+import gym
			
 
				 import numpy as np
			
 
				-from gym import error, spaces, utils
			
 
				-from gym.utils import seeding
			
 
				-from .rendering import *
			
 
				+from gym import spaces
			
 
				 
			
 
				 # Size in pixels of a tile in the full-scale human view
			
 
				+from gym_minigrid.rendering import (
			
 
				+    downsample,
			
 
				+    fill_coords,
			
 
				+    highlight_img,
			
 
				+    point_in_circle,
			
 
				+    point_in_line,
			
 
				+    point_in_rect,
			
 
				+    point_in_triangle,
			
 
				+    rotate_fn,
			
 
				+)
			
 
				+
			
 
				 TILE_PIXELS = 32
			
 
				 
			
 
				 # Map of color names to RGB values
			
 
				 COLORS = {
			
 
				-    'red'   : np.array([255, 0, 0]),
			
 
				-    'green' : np.array([0, 255, 0]),
			
 
				-    'blue'  : np.array([0, 0, 255]),
			
 
				-    'purple': np.array([112, 39, 195]),
			
 
				-    'yellow': np.array([255, 255, 0]),
			
 
				-    'grey'  : np.array([100, 100, 100])
			
 
				+    "red": np.array([255, 0, 0]),
			
 
				+    "green": np.array([0, 255, 0]),
			
 
				+    "blue": np.array([0, 0, 255]),
			
 
				+    "purple": np.array([112, 39, 195]),
			
 
				+    "yellow": np.array([255, 255, 0]),
			
 
				+    "grey": np.array([100, 100, 100]),
			
 
				 }
			
 
				 
			
 
				 COLOR_NAMES = sorted(list(COLORS.keys()))
			
 
				 
			
 
				 # Used to map colors to integers
			
 
				-COLOR_TO_IDX = {
			
 
				-    'red'   : 0,
			
 
				-    'green' : 1,
			
 
				-    'blue'  : 2,
			
 
				-    'purple': 3,
			
 
				-    'yellow': 4,
			
 
				-    'grey'  : 5
			
 
				-}
			
 
				+COLOR_TO_IDX = {"red": 0, "green": 1, "blue": 2, "purple": 3, "yellow": 4, "grey": 5}
			
 
				 
			
 
				 IDX_TO_COLOR = dict(zip(COLOR_TO_IDX.values(), COLOR_TO_IDX.keys()))
			
 
				 
			
 
				 # Map of object type to integers
			
 
				 OBJECT_TO_IDX = {
			
 
				-    'unseen'        : 0,
			
 
				-    'empty'         : 1,
			
 
				-    'wall'          : 2,
			
 
				-    'floor'         : 3,
			
 
				-    'door'          : 4,
			
 
				-    'key'           : 5,
			
 
				-    'ball'          : 6,
			
 
				-    'box'           : 7,
			
 
				-    'goal'          : 8,
			
 
				-    'lava'          : 9,
			
 
				-    'agent'         : 10,
			
 
				+    "unseen": 0,
			
 
				+    "empty": 1,
			
 
				+    "wall": 2,
			
 
				+    "floor": 3,
			
 
				+    "door": 4,
			
 
				+    "key": 5,
			
 
				+    "ball": 6,
			
 
				+    "box": 7,
			
 
				+    "goal": 8,
			
 
				+    "lava": 9,
			
 
				+    "agent": 10,
			
 
				 }
			
 
				 
			
 
				 IDX_TO_OBJECT = dict(zip(OBJECT_TO_IDX.values(), OBJECT_TO_IDX.keys()))
			
 
				 
			
 
				 # Map of state names to integers
			
 
				 STATE_TO_IDX = {
			
 
				-    'open'  : 0,
			
 
				-    'closed': 1,
			
 
				-    'locked': 2,
			
 
				+    "open": 0,
			
 
				+    "closed": 1,
			
 
				+    "locked": 2,
			
 
				 }
			
 
				 
			
 
				 # Map of agent direction indices to vectors
			
@@ -70,6 +74,7 @@ DIR_TO_VEC = [
 
				     np.array((0, -1)),
			
 
				 ]
			
 
				 
			
 
				+
			
 
				 class WorldObj:
			
 
				     """
			
 
				     Base class for grid world objects
			
@@ -119,28 +124,28 @@ class WorldObj:
 
				         obj_type = IDX_TO_OBJECT[type_idx]
			
 
				         color = IDX_TO_COLOR[color_idx]
			
 
				 
			
 
				-        if obj_type == 'empty' or obj_type == 'unseen':
			
 
				+        if obj_type == "empty" or obj_type == "unseen":
			
 
				             return None
			
 
				 
			
 
				         # State, 0: open, 1: closed, 2: locked
			
 
				         is_open = state == 0
			
 
				         is_locked = state == 2
			
 
				 
			
 
				-        if obj_type == 'wall':
			
 
				+        if obj_type == "wall":
			
 
				             v = Wall(color)
			
 
				-        elif obj_type == 'floor':
			
 
				+        elif obj_type == "floor":
			
 
				             v = Floor(color)
			
 
				-        elif obj_type == 'ball':
			
 
				+        elif obj_type == "ball":
			
 
				             v = Ball(color)
			
 
				-        elif obj_type == 'key':
			
 
				+        elif obj_type == "key":
			
 
				             v = Key(color)
			
 
				-        elif obj_type == 'box':
			
 
				+        elif obj_type == "box":
			
 
				             v = Box(color)
			
 
				-        elif obj_type == 'door':
			
 
				+        elif obj_type == "door":
			
 
				             v = Door(color, is_open, is_locked)
			
 
				-        elif obj_type == 'goal':
			
 
				+        elif obj_type == "goal":
			
 
				             v = Goal()
			
 
				-        elif obj_type == 'lava':
			
 
				+        elif obj_type == "lava":
			
 
				             v = Lava()
			
 
				         else:
			
 
				             assert False, "unknown object type in decode '%s'" % obj_type
			
@@ -151,9 +156,10 @@ class WorldObj:
 
				         """Draw this object with the given renderer"""
			
 
				         raise NotImplementedError
			
 
				 
			
 
				+
			
 
				 class Goal(WorldObj):
			
 
				     def __init__(self):
			
 
				-        super().__init__('goal', 'green')
			
 
				+        super().__init__("goal", "green")
			
 
				 
			
 
				     def can_overlap(self):
			
 
				         return True
			
@@ -161,13 +167,14 @@ class Goal(WorldObj):
 
				     def render(self, img):
			
 
				         fill_coords(img, point_in_rect(0, 1, 0, 1), COLORS[self.color])
			
 
				 
			
 
				+
			
 
				 class Floor(WorldObj):
			
 
				     """
			
 
				     Colored floor tile the agent can walk over
			
 
				     """
			
 
				 
			
 
				-    def __init__(self, color='blue'):
			
 
				-        super().__init__('floor', color)
			
 
				+    def __init__(self, color="blue"):
			
 
				+        super().__init__("floor", color)
			
 
				 
			
 
				     def can_overlap(self):
			
 
				         return True
			
@@ -180,7 +187,7 @@ class Floor(WorldObj):
 
				 
			
 
				 class Lava(WorldObj):
			
 
				     def __init__(self):
			
 
				-        super().__init__('lava', 'red')
			
 
				+        super().__init__("lava", "red")
			
 
				 
			
 
				     def can_overlap(self):
			
 
				         return True
			
@@ -195,14 +202,15 @@ class Lava(WorldObj):
 
				         for i in range(3):
			
 
				             ylo = 0.3 + 0.2 * i
			
 
				             yhi = 0.4 + 0.2 * i
			
 
				-            fill_coords(img, point_in_line(0.1, ylo, 0.3, yhi, r=0.03), (0,0,0))
			
 
				-            fill_coords(img, point_in_line(0.3, yhi, 0.5, ylo, r=0.03), (0,0,0))
			
 
				-            fill_coords(img, point_in_line(0.5, ylo, 0.7, yhi, r=0.03), (0,0,0))
			
 
				-            fill_coords(img, point_in_line(0.7, yhi, 0.9, ylo, r=0.03), (0,0,0))
			
 
				+            fill_coords(img, point_in_line(0.1, ylo, 0.3, yhi, r=0.03), (0, 0, 0))
			
 
				+            fill_coords(img, point_in_line(0.3, yhi, 0.5, ylo, r=0.03), (0, 0, 0))
			
 
				+            fill_coords(img, point_in_line(0.5, ylo, 0.7, yhi, r=0.03), (0, 0, 0))
			
 
				+            fill_coords(img, point_in_line(0.7, yhi, 0.9, ylo, r=0.03), (0, 0, 0))
			
 
				+
			
 
				 
			
 
				 class Wall(WorldObj):
			
 
				-    def __init__(self, color='grey'):
			
 
				-        super().__init__('wall', color)
			
 
				+    def __init__(self, color="grey"):
			
 
				+        super().__init__("wall", color)
			
 
				 
			
 
				     def see_behind(self):
			
 
				         return False
			
@@ -210,9 +218,10 @@ class Wall(WorldObj):
 
				     def render(self, img):
			
 
				         fill_coords(img, point_in_rect(0, 1, 0, 1), COLORS[self.color])
			
 
				 
			
 
				+
			
 
				 class Door(WorldObj):
			
 
				     def __init__(self, color, is_open=False, is_locked=False):
			
 
				-        super().__init__('door', color)
			
 
				+        super().__init__("door", color)
			
 
				         self.is_open = is_open
			
 
				         self.is_locked = is_locked
			
 
				 
			
@@ -253,7 +262,7 @@ class Door(WorldObj):
 
				 
			
 
				         if self.is_open:
			
 
				             fill_coords(img, point_in_rect(0.88, 1.00, 0.00, 1.00), c)
			
 
				-            fill_coords(img, point_in_rect(0.92, 0.96, 0.04, 0.96), (0,0,0))
			
 
				+            fill_coords(img, point_in_rect(0.92, 0.96, 0.04, 0.96), (0, 0, 0))
			
 
				             return
			
 
				 
			
 
				         # Door frame and door
			
@@ -265,16 +274,17 @@ class Door(WorldObj):
 
				             fill_coords(img, point_in_rect(0.52, 0.75, 0.50, 0.56), c)
			
 
				         else:
			
 
				             fill_coords(img, point_in_rect(0.00, 1.00, 0.00, 1.00), c)
			
 
				-            fill_coords(img, point_in_rect(0.04, 0.96, 0.04, 0.96), (0,0,0))
			
 
				+            fill_coords(img, point_in_rect(0.04, 0.96, 0.04, 0.96), (0, 0, 0))
			
 
				             fill_coords(img, point_in_rect(0.08, 0.92, 0.08, 0.92), c)
			
 
				-            fill_coords(img, point_in_rect(0.12, 0.88, 0.12, 0.88), (0,0,0))
			
 
				+            fill_coords(img, point_in_rect(0.12, 0.88, 0.12, 0.88), (0, 0, 0))
			
 
				 
			
 
				             # Draw door handle
			
 
				             fill_coords(img, point_in_circle(cx=0.75, cy=0.50, r=0.08), c)
			
 
				 
			
 
				+
			
 
				 class Key(WorldObj):
			
 
				-    def __init__(self, color='blue'):
			
 
				-        super(Key, self).__init__('key', color)
			
 
				+    def __init__(self, color="blue"):
			
 
				+        super().__init__("key", color)
			
 
				 
			
 
				     def can_pickup(self):
			
 
				         return True
			
@@ -291,11 +301,12 @@ class Key(WorldObj):
 
				 
			
 
				         # Ring
			
 
				         fill_coords(img, point_in_circle(cx=0.56, cy=0.28, r=0.190), c)
			
 
				-        fill_coords(img, point_in_circle(cx=0.56, cy=0.28, r=0.064), (0,0,0))
			
 
				+        fill_coords(img, point_in_circle(cx=0.56, cy=0.28, r=0.064), (0, 0, 0))
			
 
				+
			
 
				 
			
 
				 class Ball(WorldObj):
			
 
				-    def __init__(self, color='blue'):
			
 
				-        super(Ball, self).__init__('ball', color)
			
 
				+    def __init__(self, color="blue"):
			
 
				+        super().__init__("ball", color)
			
 
				 
			
 
				     def can_pickup(self):
			
 
				         return True
			
@@ -303,9 +314,10 @@ class Ball(WorldObj):
 
				     def render(self, img):
			
 
				         fill_coords(img, point_in_circle(0.5, 0.5, 0.31), COLORS[self.color])
			
 
				 
			
 
				+
			
 
				 class Box(WorldObj):
			
 
				     def __init__(self, color, contains=None):
			
 
				-        super(Box, self).__init__('box', color)
			
 
				+        super().__init__("box", color)
			
 
				         self.contains = contains
			
 
				 
			
 
				     def can_pickup(self):
			
@@ -316,7 +328,7 @@ class Box(WorldObj):
 
				 
			
 
				         # Outline
			
 
				         fill_coords(img, point_in_rect(0.12, 0.88, 0.12, 0.88), c)
			
 
				-        fill_coords(img, point_in_rect(0.18, 0.82, 0.18, 0.82), (0,0,0))
			
 
				+        fill_coords(img, point_in_rect(0.18, 0.82, 0.18, 0.82), (0, 0, 0))
			
 
				 
			
 
				         # Horizontal slit
			
 
				         fill_coords(img, point_in_rect(0.16, 0.84, 0.47, 0.53), c)
			
@@ -326,6 +338,7 @@ class Box(WorldObj):
 
				         env.grid.set(*pos, self.contains)
			
 
				         return True
			
 
				 
			
 
				+
			
 
				 class Grid:
			
 
				     """
			
 
				     Represent a grid and operations on it
			
@@ -359,7 +372,7 @@ class Grid:
 
				         return False
			
 
				 
			
 
				     def __eq__(self, other):
			
 
				-        grid1  = self.encode()
			
 
				+        grid1 = self.encode()
			
 
				         grid2 = other.encode()
			
 
				         return np.array_equal(grid2, grid1)
			
 
				 
			
@@ -368,6 +381,7 @@ class Grid:
 
				 
			
 
				     def copy(self):
			
 
				         from copy import deepcopy
			
 
				+
			
 
				         return deepcopy(self)
			
 
				 
			
 
				     def set(self, i, j, v):
			
@@ -394,9 +408,9 @@ class Grid:
 
     def wall_rect(self, x, y, w, h):
         self.horz_wall(x, y, w)
-        self.horz_wall(x, y+h-1, w)
+        self.horz_wall(x, y + h - 1, w)
         self.vert_wall(x, y, h)
-        self.vert_wall(x+w-1, y, h)
+        self.vert_wall(x + w - 1, y, h)
 
     def rotate_left(self):
         """
@@ -424,8 +438,7 @@ class Grid:
                 x = topX + i
                 y = topY + j
 
-                if x >= 0 and x < self.width and \
-                   y >= 0 and y < self.height:
+                if x >= 0 and x < self.width and y >= 0 and y < self.height:
                     v = self.get(x, y)
                 else:
                     v = Wall()
@@ -436,12 +449,7 @@ class Grid:
 
     @classmethod
     def render_tile(
-        cls,
-        obj,
-        agent_dir=None,
-        highlight=False,
-        tile_size=TILE_PIXELS,
-        subdivs=3
+        cls, obj, agent_dir=None, highlight=False, tile_size=TILE_PIXELS, subdivs=3
     ):
         """
         Render a tile and cache the result
@@ -454,13 +462,15 @@ class Grid:
         if key in cls.tile_cache:
             return cls.tile_cache[key]
 
-        img = np.zeros(shape=(tile_size * subdivs, tile_size * subdivs, 3), dtype=np.uint8)
+        img = np.zeros(
+            shape=(tile_size * subdivs, tile_size * subdivs, 3), dtype=np.uint8
+        )
 
         # Draw the grid lines (top and left edges)
         fill_coords(img, point_in_rect(0, 0.031, 0, 1), (100, 100, 100))
         fill_coords(img, point_in_rect(0, 1, 0, 0.031), (100, 100, 100))
 
-        if obj != None:
+        if obj is not None:
             obj.render(img)
 
         # Overlay the agent on top
@@ -472,7 +482,7 @@ class Grid:
             )
 
             # Rotate the agent based on its direction
-            tri_fn = rotate_fn(tri_fn, cx=0.5, cy=0.5, theta=0.5*math.pi*agent_dir)
+            tri_fn = rotate_fn(tri_fn, cx=0.5, cy=0.5, theta=0.5 * math.pi * agent_dir)
             fill_coords(img, tri_fn, (255, 0, 0))
 
         # Highlight the cell if needed
@@ -487,13 +497,7 @@ class Grid:
 
         return img
 
-    def render(
-        self,
-        tile_size,
-        agent_pos=None,
-        agent_dir=None,
-        highlight_mask=None
-    ):
+    def render(self, tile_size, agent_pos=None, agent_dir=None, highlight_mask=None):
         """
         Render this grid at a given scale
         :param r: target renderer object
@@ -519,13 +523,13 @@ class Grid:
                     cell,
                     agent_dir=agent_dir if agent_here else None,
                     highlight=highlight_mask[i, j],
-                    tile_size=tile_size
+                    tile_size=tile_size,
                 )
 
                 ymin = j * tile_size
-                ymax = (j+1) * tile_size
+                ymax = (j + 1) * tile_size
                 xmin = i * tile_size
-                xmax = (i+1) * tile_size
+                xmax = (i + 1) * tile_size
                 img[ymin:ymax, xmin:xmax, :] = tile_img
 
         return img
@@ -538,7 +542,7 @@ class Grid:
         if vis_mask is None:
             vis_mask = np.ones((self.width, self.height), dtype=bool)
 
-        array = np.zeros((self.width, self.height, 3), dtype='uint8')
+        array = np.zeros((self.width, self.height, 3), dtype="uint8")
 
         for i in range(self.width):
             for j in range(self.height):
@@ -546,7 +550,7 @@ class Grid:
                     v = self.get(i, j)
 
                     if v is None:
-                        array[i, j, 0] = OBJECT_TO_IDX['empty']
+                        array[i, j, 0] = OBJECT_TO_IDX["empty"]
                         array[i, j, 1] = 0
                         array[i, j, 2] = 0
 
@@ -572,7 +576,7 @@ class Grid:
                 type_idx, color_idx, state = array[i, j]
                 v = WorldObj.decode(type_idx, color_idx, state)
                 grid.set(i, j, v)
-                vis_mask[i, j] = (type_idx != OBJECT_TO_IDX['unseen'])
+                vis_mask[i, j] = type_idx != OBJECT_TO_IDX["unseen"]
 
         return grid, vis_mask
 
@@ -582,7 +586,7 @@ class Grid:
         mask[agent_pos[0], agent_pos[1]] = True
 
         for j in reversed(range(0, grid.height)):
-            for i in range(0, grid.width-1):
+            for i in range(0, grid.width - 1):
                 if not mask[i, j]:
                     continue
 
@@ -590,10 +594,10 @@ class Grid:
                 if cell and not cell.see_behind():
                     continue
 
-                mask[i+1, j] = True
+                mask[i + 1, j] = True
                 if j > 0:
-                    mask[i+1, j-1] = True
-                    mask[i, j-1] = True
+                    mask[i + 1, j - 1] = True
+                    mask[i, j - 1] = True
 
             for i in reversed(range(1, grid.width)):
                 if not mask[i, j]:
@@ -603,10 +607,10 @@ class Grid:
                 if cell and not cell.see_behind():
                     continue
 
-                mask[i-1, j] = True
+                mask[i - 1, j] = True
                 if j > 0:
-                    mask[i-1, j-1] = True
-                    mask[i, j-1] = True
+                    mask[i - 1, j - 1] = True
+                    mask[i, j - 1] = True
 
         for j in range(0, grid.height):
             for i in range(0, grid.width):
@@ -615,14 +619,18 @@ class Grid:
 
         return mask
 
+
 class MiniGridEnv(gym.Env):
     """
     2D grid world game environment
     """
 
     metadata = {
-        'render.modes': ['human', 'rgb_array'],
-        'video.frames_per_second' : 10
+        # Deprecated: use 'render_modes' instead
+        "render.modes": ["human", "rgb_array"],
+        "video.frames_per_second": 10,  # Deprecated: use 'render_fps' instead
+        "render_modes": ["human", "rgb_array"],
+        "render_fps": 10,
     }
 
     # Enumeration of possible actions
@@ -649,12 +657,13 @@ class MiniGridEnv(gym.Env):
         height=None,
         max_steps=100,
         see_through_walls=False,
-        seed=1337,
-        agent_view_size=7
+        agent_view_size=7,
+        render_mode=None,
+        **kwargs
     ):
         # Can't set both grid_size and width/height
         if grid_size:
-            assert width == None and height == None
+            assert width is None and height is None
             width = grid_size
             height = grid_size
 
@@ -675,11 +684,21 @@ class MiniGridEnv(gym.Env):
             low=0,
             high=255,
             shape=(self.agent_view_size, self.agent_view_size, 3),
-            dtype='uint8'
+            dtype="uint8",
+        )
+        self.observation_space = spaces.Dict(
+            {
+                "image": self.observation_space,
+                "direction": spaces.Discrete(4),
+                "mission": spaces.Text(
+                    max_length=200,
+                    charset=string.ascii_letters + string.digits + " .,!-",
+                ),
+            }
         )
-        self.observation_space = spaces.Dict({
-            'image': self.observation_space
-        })
+
+        # render mode
+        self.render_mode = render_mode
 
         # Range of possible rewards
         self.reward_range = (0, 1)
@@ -697,20 +716,16 @@ class MiniGridEnv(gym.Env):
         self.agent_pos = None
         self.agent_dir = None
 
-        # Initialize the RNG
-        self.seed(seed=seed)
-
         # Initialize the state
         self.reset()
 
-    def reset(self):
+    def reset(self, *, seed=None, return_info=False, options=None):
+        super().reset(seed=seed)
         # Current position and direction of the agent
         self.agent_pos = None
         self.agent_dir = None
 
         # Generate a new random grid at the start of each episode
-        # To keep the same grid for each episode, call env.seed() with
-        # the same seed before calling env.reset()
         self._gen_grid(self.width, self.height)
 
         # These fields should be defined by _gen_grid
@@ -731,11 +746,6 @@ class MiniGridEnv(gym.Env):
         obs = self.gen_obs()
         return obs
 
-    def seed(self, seed=1337):
-        # Seed the random number generator
-        self.np_random, _ = seeding.np_random(seed)
-        return [seed]
-
     def hash(self, size=16):
         """Compute a hash that uniquely identifies the current state of the environment.
         :param size: Size of the hashing
@@ -744,7 +754,7 @@ class MiniGridEnv(gym.Env):
 
         to_encode = [self.grid.encode().tolist(), self.agent_pos, self.agent_dir]
         for item in to_encode:
-            sample_hash.update(str(item).encode('utf8'))
+            sample_hash.update(str(item).encode("utf8"))
 
         return sample_hash.hexdigest()[:size]
 
@@ -761,28 +771,20 @@ class MiniGridEnv(gym.Env):
 
         # Map of object types to short string
         OBJECT_TO_STR = {
-            'wall'          : 'W',
-            'floor'         : 'F',
-            'door'          : 'D',
-            'key'           : 'K',
-            'ball'          : 'A',
-            'box'           : 'B',
-            'goal'          : 'G',
-            'lava'          : 'V',
+            "wall": "W",
+            "floor": "F",
+            "door": "D",
+            "key": "K",
+            "ball": "A",
+            "box": "B",
+            "goal": "G",
+            "lava": "V",
         }
 
-        # Short string for opened door
-        OPENDED_DOOR_IDS = '_'
-
         # Map agent's direction to short string
-        AGENT_DIR_TO_STR = {
-            0: '>',
-            1: 'V',
-            2: '<',
-            3: '^'
-        }
+        AGENT_DIR_TO_STR = {0: ">", 1: "V", 2: "<", 3: "^"}
 
-        str = ''
+        str = ""
 
         for j in range(self.grid.height):
 
@@ -793,23 +795,23 @@ class MiniGridEnv(gym.Env):
 
                 c = self.grid.get(i, j)
 
-                if c == None:
-                    str += '  '
+                if c is None:
+                    str += "  "
                     continue
 
-                if c.type == 'door':
+                if c.type == "door":
                     if c.is_open:
-                        str += '__'
+                        str += "__"
                     elif c.is_locked:
-                        str += 'L' + c.color[0].upper()
+                        str += "L" + c.color[0].upper()
                     else:
-                        str += 'D' + c.color[0].upper()
+                        str += "D" + c.color[0].upper()
                     continue
 
                 str += OBJECT_TO_STR[c.type] + c.color[0].upper()
 
             if j < self.grid.height - 1:
-                str += '\n'
+                str += "\n"
 
         return str
 
@@ -828,7 +830,7 @@ class MiniGridEnv(gym.Env):
         Generate random integer in [low,high[
         """
 
-        return self.np_random.randint(low, high)
+        return self.np_random.integers(low, high)
 
     def _rand_float(self, low, high):
         """
@@ -842,7 +844,7 @@ class MiniGridEnv(gym.Env):
         Generate random boolean value
         """
 
-        return (self.np_random.randint(0, 2) == 0)
+        return self.np_random.integers(0, 2) == 0
 
     def _rand_elem(self, iterable):
         """
@@ -883,17 +885,11 @@ class MiniGridEnv(gym.Env):
         """
 
         return (
-            self.np_random.randint(xLow, xHigh),
-            self.np_random.randint(yLow, yHigh)
+            self.np_random.integers(xLow, xHigh),
+            self.np_random.integers(yLow, yHigh),
         )
 
-    def place_obj(self,
-        obj,
-        top=None,
-        size=None,
-        reject_fn=None,
-        max_tries=math.inf
-    ):
+    def place_obj(self, obj, top=None, size=None, reject_fn=None, max_tries=math.inf):
         """
         Place an object at an empty position in the grid
 
@@ -916,17 +912,19 @@ class MiniGridEnv(gym.Env):
             # This is to handle with rare cases where rejection sampling
             # gets stuck in an infinite loop
             if num_tries > max_tries:
-                raise RecursionError('rejection sampling failed in place_obj')
+                raise RecursionError("rejection sampling failed in place_obj")
 
             num_tries += 1
 
-            pos = np.array((
-                self._rand_int(top[0], min(top[0] + size[0], self.grid.width)),
-                self._rand_int(top[1], min(top[1] + size[1], self.grid.height))
-            ))
+            pos = np.array(
+                (
+                    self._rand_int(top[0], min(top[0] + size[0], self.grid.width)),
+                    self._rand_int(top[1], min(top[1] + size[1], self.grid.height)),
+                )
+            )
 
             # Don't place the object on top of another object
-            if self.grid.get(*pos) != None:
+            if self.grid.get(*pos) is not None:
                 continue
 
             # Don't place the object where the agent is
@@ -956,13 +954,7 @@ class MiniGridEnv(gym.Env):
         obj.init_pos = (i, j)
         obj.cur_pos = (i, j)
 
-    def place_agent(
-        self,
-        top=None,
-        size=None,
-        rand_dir=True,
-        max_tries=math.inf
-    ):
+    def place_agent(self, top=None, size=None, rand_dir=True, max_tries=math.inf):
         """
         Set the agent's starting point at an empty position in the grid
         """
@@ -1017,46 +1009,49 @@ class MiniGridEnv(gym.Env):
         # Compute the absolute coordinates of the top-left view corner
         sz = self.agent_view_size
         hs = self.agent_view_size // 2
-        tx = ax + (dx * (sz-1)) - (rx * hs)
-        ty = ay + (dy * (sz-1)) - (ry * hs)
+        tx = ax + (dx * (sz - 1)) - (rx * hs)
+        ty = ay + (dy * (sz - 1)) - (ry * hs)
 
         lx = i - tx
         ly = j - ty
 
         # Project the coordinates of the object relative to the top-left
         # corner onto the agent's own coordinate system
-        vx = (rx*lx + ry*ly)
-        vy = -(dx*lx + dy*ly)
+        vx = rx * lx + ry * ly
+        vy = -(dx * lx + dy * ly)
 
         return vx, vy
 
-    def get_view_exts(self):
+    def get_view_exts(self, agent_view_size=None):
         """
         Get the extents of the square set of tiles visible to the agent
         Note: the bottom extent indices are not included in the set
+        if agent_view_size is None, use self.agent_view_size
         """
 
+        agent_view_size = agent_view_size or self.agent_view_size
+
         # Facing right
         if self.agent_dir == 0:
             topX = self.agent_pos[0]
-            topY = self.agent_pos[1] - self.agent_view_size // 2
+            topY = self.agent_pos[1] - agent_view_size // 2
         # Facing down
         elif self.agent_dir == 1:
-            topX = self.agent_pos[0] - self.agent_view_size // 2
+            topX = self.agent_pos[0] - agent_view_size // 2
             topY = self.agent_pos[1]
         # Facing left
         elif self.agent_dir == 2:
-            topX = self.agent_pos[0] - self.agent_view_size + 1
-            topY = self.agent_pos[1] - self.agent_view_size // 2
+            topX = self.agent_pos[0] - agent_view_size + 1
+            topY = self.agent_pos[1] - agent_view_size // 2
         # Facing up
         elif self.agent_dir == 3:
-            topX = self.agent_pos[0] - self.agent_view_size // 2
-            topY = self.agent_pos[1] - self.agent_view_size + 1
+            topX = self.agent_pos[0] - agent_view_size // 2
+            topY = self.agent_pos[1] - agent_view_size + 1
         else:
             assert False, "invalid agent direction"
 
-        botX = topX + self.agent_view_size
-        botY = topY + self.agent_view_size
+        botX = topX + agent_view_size
+        botY = topY + agent_view_size
 
         return (topX, topY, botX, botY)
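
Reviewer note: the extent computation that `get_view_exts` performs after this change can be reproduced outside the environment. The following is a minimal standalone sketch, not part of the commit. `view_extents` is a hypothetical free function that takes the agent state as arguments instead of reading it from `self`; its branches mirror the hunk above.

```python
def view_extents(agent_pos, agent_dir, agent_view_size=7):
    """Return (topX, topY, botX, botY); the bottom indices are exclusive."""
    x, y = agent_pos
    half = agent_view_size // 2

    if agent_dir == 0:  # facing right
        top_x, top_y = x, y - half
    elif agent_dir == 1:  # facing down
        top_x, top_y = x - half, y
    elif agent_dir == 2:  # facing left
        top_x, top_y = x - agent_view_size + 1, y - half
    elif agent_dir == 3:  # facing up
        top_x, top_y = x - half, y - agent_view_size + 1
    else:
        raise ValueError("invalid agent direction")

    return (top_x, top_y, top_x + agent_view_size, top_y + agent_view_size)


if __name__ == "__main__":
    # Print the view window for each of the four directions.
    for direction in range(4):
        print(direction, view_extents((5, 5), direction))
```

Because the method falls back with `agent_view_size or self.agent_view_size`, passing `0` is treated the same as passing `None`.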
			
 
				 
			
@@ -1090,7 +1085,7 @@ class MiniGridEnv(gym.Env):
         vx, vy = coordinates
 
         obs = self.gen_obs()
-        obs_grid, _ = Grid.decode(obs['image'])
+        obs_grid, _ = Grid.decode(obs["image"])
         obs_cell = obs_grid.get(vx, vy)
         world_cell = self.grid.get(x, y)
 
@@ -1120,12 +1115,12 @@ class MiniGridEnv(gym.Env):
 
         # Move forward
         elif action == self.actions.forward:
-            if fwd_cell == None or fwd_cell.can_overlap():
+            if fwd_cell is None or fwd_cell.can_overlap():
                 self.agent_pos = fwd_pos
-            if fwd_cell != None and fwd_cell.type == 'goal':
+            if fwd_cell is not None and fwd_cell.type == "goal":
                 done = True
                 reward = self._reward()
-            if fwd_cell != None and fwd_cell.type == 'lava':
+            if fwd_cell is not None and fwd_cell.type == "lava":
                 done = True
 
         # Pick up an object
@@ -1162,16 +1157,19 @@ class MiniGridEnv(gym.Env):
 
         return obs, reward, done, {}
 
-    def gen_obs_grid(self):
+    def gen_obs_grid(self, agent_view_size=None):
         """
         Generate the sub-grid observed by the agent.
         This method also outputs a visibility mask telling us which grid
         cells the agent can actually see.
+        if agent_view_size is None, self.agent_view_size is used
         """
 
-        topX, topY, botX, botY = self.get_view_exts()
+        topX, topY, botX, botY = self.get_view_exts(agent_view_size)
+
+        agent_view_size = agent_view_size or self.agent_view_size
 
-        grid = self.grid.slice(topX, topY, self.agent_view_size, self.agent_view_size)
+        grid = self.grid.slice(topX, topY, agent_view_size, agent_view_size)
 
         for i in range(self.agent_dir + 1):
             grid = grid.rotate_left()
@@ -1179,7 +1177,9 @@ class MiniGridEnv(gym.Env):
         # Process occluders and visibility
         # Note that this incurs some performance cost
         if not self.see_through_walls:
-            vis_mask = grid.process_vis(agent_pos=(self.agent_view_size // 2 , self.agent_view_size - 1))
+            vis_mask = grid.process_vis(
+                agent_pos=(agent_view_size // 2, agent_view_size - 1)
+            )
         else:
             vis_mask = np.ones(shape=(grid.width, grid.height), dtype=bool)
 
@@ -1204,21 +1204,19 @@ class MiniGridEnv(gym.Env):
         # Encode the partially observable view into a numpy array
         image = grid.encode(vis_mask)
 
-        assert hasattr(self, 'mission'), "environments must define a textual mission string"
+        assert hasattr(
+            self, "mission"
+        ), "environments must define a textual mission string"
 
         # Observations are dictionaries containing:
         # - an image (partially observable view of the environment)
         # - the agent's direction/orientation (acting as a compass)
         # - a textual mission string (instructions for the agent)
-        obs = {
-            'image': image,
-            'direction': self.agent_dir,
-            'mission': self.mission
-        }
+        obs = {"image": image, "direction": self.agent_dir, "mission": self.mission}
 
         return obs
 
-    def get_obs_render(self, obs, tile_size=TILE_PIXELS//2):
+    def get_obs_render(self, obs, tile_size=TILE_PIXELS // 2):
         """
         Render an agent observation for visualization
         """
@@ -1230,24 +1228,26 @@ class MiniGridEnv(gym.Env):
             tile_size,
             agent_pos=(self.agent_view_size // 2, self.agent_view_size - 1),
             agent_dir=3,
-            highlight_mask=vis_mask
+            highlight_mask=vis_mask,
         )
 
         return img
 
-    def render(self, mode='human', close=False, highlight=True, tile_size=TILE_PIXELS):
+    def render(self, mode="human", close=False, highlight=True, tile_size=TILE_PIXELS):
         """
         Render the whole-grid human view
         """
-
+        if self.render_mode is not None:
+            mode = self.render_mode
         if close:
             if self.window:
                 self.window.close()
             return
 
-        if mode == 'human' and not self.window:
+        if mode == "human" and not self.window:
             import gym_minigrid.window
-            self.window = gym_minigrid.window.Window('gym_minigrid')
+
+            self.window = gym_minigrid.window.Window("gym_minigrid")
             self.window.show(block=False)
 
         # Compute which cells are visible to the agent
@@ -1257,7 +1257,11 @@ class MiniGridEnv(gym.Env):
         # of the agent's view area
         f_vec = self.dir_vec
         r_vec = self.right_vec
-        top_left = self.agent_pos + f_vec * (self.agent_view_size-1) - r_vec * (self.agent_view_size // 2)
+        top_left = (
+            self.agent_pos
+            + f_vec * (self.agent_view_size - 1)
+            - r_vec * (self.agent_view_size // 2)
+        )
 
         # Mask of which cells to highlight
         highlight_mask = np.zeros(shape=(self.width, self.height), dtype=bool)
@@ -1285,10 +1289,10 @@ class MiniGridEnv(gym.Env):
             tile_size,
             self.agent_pos,
             self.agent_dir,
-            highlight_mask=highlight_mask if highlight else None
+            highlight_mask=highlight_mask if highlight else None,
         )
 
-        if mode == 'human':
+        if mode == "human":
             self.window.set_caption(self.mission)
             self.window.show_img(img)
 
--- a/gym_minigrid/rendering.py
+++ b/gym_minigrid/rendering.py
@@ -1,6 +1,8 @@
 import math
+
 import numpy as np
 
+
 def downsample(img, factor):
     """
     Downsample an image along both dimensions by some factor
@@ -9,12 +11,15 @@ def downsample(img, factor):
     assert img.shape[0] % factor == 0
     assert img.shape[1] % factor == 0
 
-    img = img.reshape([img.shape[0]//factor, factor, img.shape[1]//factor, factor, 3])
+    img = img.reshape(
+        [img.shape[0] // factor, factor, img.shape[1] // factor, factor, 3]
+    )
     img = img.mean(axis=3)
     img = img.mean(axis=1)
 
     return img
 
+
 def fill_coords(img, fn, color):
     """
     Fill pixels of an image with coordinates matching a filter function
@@ -29,6 +34,7 @@ def fill_coords(img, fn, color):
 
     return img
 
+
 def rotate_fn(fin, cx, cy, theta):
     def fout(x, y):
         x = x - cx
@@ -41,6 +47,7 @@ def rotate_fn(fin, cx, cy, theta):
 
     return fout
 
+
 def point_in_line(x0, y0, x1, y1, r):
     p0 = np.array([x0, y0])
     p1 = np.array([x1, y1])
@@ -71,16 +78,21 @@ def point_in_line(x0, y0, x1, y1, r):
 
     return fn
 
+
 def point_in_circle(cx, cy, r):
     def fn(x, y):
-        return (x-cx)*(x-cx) + (y-cy)*(y-cy) <= r * r
+        return (x - cx) * (x - cx) + (y - cy) * (y - cy) <= r * r
+
     return fn
 
+
 def point_in_rect(xmin, xmax, ymin, ymax):
     def fn(x, y):
         return x >= xmin and x <= xmax and y >= ymin and y <= ymax
+
     return fn
 
+
 def point_in_triangle(a, b, c):
     a = np.array(a)
     b = np.array(b)
@@ -108,6 +120,7 @@ def point_in_triangle(a, b, c):
 
     return fn
 
+
 def highlight_img(img, color=(255, 255, 255), alpha=0.30):
     """
     Add highlighting to an image
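
Reviewer note: the shape predicates touched in this file (`point_in_circle`, `point_in_rect`) are closures over normalized tile coordinates in the range 0 to 1. The sketch below copies the two functions as they appear in the diff and exercises them without NumPy. `rasterize` is a hypothetical stand-in for `fill_coords`, whose body is not shown in this diff, and sampling at pixel centres is an assumption of this sketch.

```python
def point_in_circle(cx, cy, r):
    def fn(x, y):
        return (x - cx) * (x - cx) + (y - cy) * (y - cy) <= r * r

    return fn


def point_in_rect(xmin, xmax, ymin, ymax):
    def fn(x, y):
        return x >= xmin and x <= xmax and y >= ymin and y <= ymax

    return fn


def rasterize(width, height, fn):
    """Hypothetical helper: evaluate fn at each pixel centre of a width x height tile."""
    rows = []
    for j in range(height):
        y = (j + 0.5) / height
        # '#' marks pixels whose centre satisfies the predicate.
        rows.append(
            "".join("#" if fn((i + 0.5) / width, y) else "." for i in range(width))
        )
    return rows


if __name__ == "__main__":
    # Same circle parameters as the Ball rendering call in minigrid.py.
    print("\n".join(rasterize(8, 8, point_in_circle(0.5, 0.5, 0.31))))
```

Both predicates use inclusive comparisons, so points exactly on the boundary count as inside.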
			
--- a/gym_minigrid/roomgrid.py
+++ b/gym_minigrid/roomgrid.py
@@ -1,4 +1,5 @@
-from .minigrid import *
+from gym_minigrid.minigrid import COLOR_NAMES, Ball, Box, Door, Grid, Key, MiniGridEnv
+
 
 def reject_next_to(env, pos):
     """
@@ -11,12 +12,9 @@ def reject_next_to(env, pos):
     d = abs(sx - x) + abs(sy - y)
     return d < 2
 
+
 class Room:
-    def __init__(
-        self,
-        top,
-        size
-    ):
+    def __init__(self, top, size):
         # Top-left corner and size (tuples)
         self.top = top
         self.size = size
@@ -39,10 +37,7 @@ class Room:
     def rand_pos(self, env):
         topX, topY = self.top
         sizeX, sizeY = self.size
-        return env._randPos(
-            topX + 1, topX + sizeX - 1,
-            topY + 1, topY + sizeY - 1
-        )
+        return env._randPos(topX + 1, topX + sizeX - 1, topY + 1, topY + sizeY - 1)
 
     def pos_inside(self, x, y):
         """
@@ -60,6 +55,7 @@ class Room:
 
         return True
 
+
 class RoomGrid(MiniGridEnv):
     """
     Environment with multiple rooms and random objects.
@@ -72,8 +68,8 @@ class RoomGrid(MiniGridEnv):
         num_rows=3,
         num_cols=3,
         max_steps=100,
-        seed=0,
-        agent_view_size=7
+        agent_view_size=7,
+        **kwargs
     ):
         assert room_size > 0
         assert room_size >= 3
@@ -87,15 +83,15 @@ class RoomGrid(MiniGridEnv):
         width = (room_size - 1) * num_cols + 1
 
         # By default, this environment has no mission
-        self.mission = ''
+        self.mission = ""
 
         super().__init__(
             width=width,
             height=height,
             max_steps=max_steps,
             see_through_walls=False,
-            seed=seed,
-            agent_view_size=agent_view_size
+            agent_view_size=agent_view_size,
+            **kwargs
         )
 
     def room_from_pos(self, x, y):
@@ -104,8 +100,8 @@ class RoomGrid(MiniGridEnv):
         assert x >= 0
         assert y >= 0
 
-        i = x // (self.room_size-1)
-        j = y // (self.room_size-1)
+        i = x // (self.room_size - 1)
+        j = y // (self.room_size - 1)
 
         assert i < self.num_cols
         assert j < self.num_rows
			
@@ -130,8 +126,8 @@ class RoomGrid(MiniGridEnv):
 
				             # For each column of rooms
			
 
				             for i in range(0, self.num_cols):
			
 
				                 room = Room(
			
 
				-                    (i * (self.room_size-1), j * (self.room_size-1)),
			
 
				-                    (self.room_size, self.room_size)
			
 
				+                    (i * (self.room_size - 1), j * (self.room_size - 1)),
			
 
				+                    (self.room_size, self.room_size),
			
 
				                 )
			
 
				                 row.append(room)
			
 
				 
			
@@ -147,26 +143,29 @@ class RoomGrid(MiniGridEnv):
 
				                 room = self.room_grid[j][i]
			
 
				 
			
 
				                 x_l, y_l = (room.top[0] + 1, room.top[1] + 1)
			
 
				-                x_m, y_m = (room.top[0] + room.size[0] - 1, room.top[1] + room.size[1] - 1)
			
 
				+                x_m, y_m = (
			
 
				+                    room.top[0] + room.size[0] - 1,
			
 
				+                    room.top[1] + room.size[1] - 1,
			
 
				+                )
			
 
				 
			
 
				                 # Door positions, order is right, down, left, up
			
 
				                 if i < self.num_cols - 1:
			
 
				-                    room.neighbors[0] = self.room_grid[j][i+1]
			
 
				+                    room.neighbors[0] = self.room_grid[j][i + 1]
			
 
				                     room.door_pos[0] = (x_m, self._rand_int(y_l, y_m))
			
 
				                 if j < self.num_rows - 1:
			
 
				-                    room.neighbors[1] = self.room_grid[j+1][i]
			
 
				+                    room.neighbors[1] = self.room_grid[j + 1][i]
			
 
				                     room.door_pos[1] = (self._rand_int(x_l, x_m), y_m)
			
 
				                 if i > 0:
			
 
				-                    room.neighbors[2] = self.room_grid[j][i-1]
			
 
				+                    room.neighbors[2] = self.room_grid[j][i - 1]
			
 
				                     room.door_pos[2] = room.neighbors[2].door_pos[0]
			
 
				                 if j > 0:
			
 
				-                    room.neighbors[3] = self.room_grid[j-1][i]
			
 
				+                    room.neighbors[3] = self.room_grid[j - 1][i]
			
 
				                     room.door_pos[3] = room.neighbors[3].door_pos[1]
			
 
				 
			
 
				         # The agent starts in the middle, facing right
			
 
				         self.agent_pos = (
			
 
				-            (self.num_cols // 2) * (self.room_size-1) + (self.room_size // 2),
			
 
				-            (self.num_rows // 2) * (self.room_size-1) + (self.room_size // 2)
			
 
				+            (self.num_cols // 2) * (self.room_size - 1) + (self.room_size // 2),
			
 
				+            (self.num_rows // 2) * (self.room_size - 1) + (self.room_size // 2),
			
 
				         )
			
 
				         self.agent_dir = 0
			
 
				 
			
@@ -178,11 +177,7 @@ class RoomGrid(MiniGridEnv):
 
				         room = self.get_room(i, j)
			
 
				 
			
 
				         pos = self.place_obj(
			
 
				-            obj,
			
 
				-            room.top,
			
 
				-            room.size,
			
 
				-            reject_fn=reject_next_to,
			
 
				-            max_tries=1000
			
 
				+            obj, room.top, room.size, reject_fn=reject_next_to, max_tries=1000
			
 
				         )
			
 
				 
			
 
				         room.objs.append(obj)
			
@@ -194,19 +189,19 @@ class RoomGrid(MiniGridEnv):
 
				         Add a new object to room (i, j)
			
 
				         """
			
 
				 
			
 
				-        if kind == None:
			
 
				-            kind = self._rand_elem(['key', 'ball', 'box'])
			
 
				+        if kind is None:
			
 
				+            kind = self._rand_elem(["key", "ball", "box"])
			
 
				 
			
 
				-        if color == None:
			
 
				+        if color is None:
			
 
				             color = self._rand_color()
			
 
				 
			
 
				         # TODO: we probably want to add an Object.make helper function
			
 
				-        assert kind in ['key', 'ball', 'box']
			
 
				-        if kind == 'key':
			
 
				+        assert kind in ["key", "ball", "box"]
			
 
				+        if kind == "key":
			
 
				             obj = Key(color)
			
 
				-        elif kind == 'ball':
			
 
				+        elif kind == "ball":
			
 
				             obj = Ball(color)
			
 
				-        elif kind == 'box':
			
 
				+        elif kind == "box":
			
 
				             obj = Box(color)
			
 
				 
			
 
				         return self.place_in_room(i, j, obj)
			
@@ -218,7 +213,7 @@ class RoomGrid(MiniGridEnv):
 
				 
			
 
				         room = self.get_room(i, j)
			
 
				 
			
 
				-        if door_idx == None:
			
 
				+        if door_idx is None:
			
 
				             # Need to make sure that there is a neighbor along this wall
			
 
				             # and that there is not already a door
			
 
				             while True:
			
@@ -226,7 +221,7 @@ class RoomGrid(MiniGridEnv):
 
				                 if room.neighbors[door_idx] and room.doors[door_idx] is None:
			
 
				                     break
			
 
				 
			
 
				-        if color == None:
			
 
				+        if color is None:
			
 
				             color = self._rand_color()
			
 
				 
			
 
				         if locked is None:
			
@@ -243,7 +238,7 @@ class RoomGrid(MiniGridEnv):
 
				 
			
 
				         neighbor = room.neighbors[door_idx]
			
 
				         room.doors[door_idx] = door
			
 
				-        neighbor.doors[(door_idx+2) % 4] = door
			
 
				+        neighbor.doors[(door_idx + 2) % 4] = door
			
 
				 
			
 
				         return door, pos
			
 
				 
			
@@ -281,16 +276,16 @@ class RoomGrid(MiniGridEnv):
 
				 
			
 
				         # Mark the rooms as connected
			
 
				         room.doors[wall_idx] = True
			
 
				-        neighbor.doors[(wall_idx+2) % 4] = True
			
 
				+        neighbor.doors[(wall_idx + 2) % 4] = True
			
 
				 
			
 
				     def place_agent(self, i=None, j=None, rand_dir=True):
			
 
				         """
			
 
				         Place the agent in a room
			
 
				         """
			
 
				 
			
 
				-        if i == None:
			
 
				+        if i is None:
			
 
				             i = self._rand_int(0, self.num_cols)
			
 
				-        if j == None:
			
 
				+        if j is None:
			
 
				             j = self._rand_int(0, self.num_rows)
			
 
				 
			
 
				         room = self.room_grid[j][i]
			
@@ -299,7 +294,7 @@ class RoomGrid(MiniGridEnv):
 
				         while True:
			
 
				             super().place_agent(room.top, room.size, rand_dir, max_tries=1000)
			
 
				             front_cell = self.grid.get(*self.front_pos)
			
 
				-            if front_cell is None or front_cell.type == 'wall':
			
 
				+            if front_cell is None or front_cell.type == "wall":
			
 
				                 break
			
 
				 
			
 
				         return self.agent_pos
			
@@ -333,7 +328,7 @@ class RoomGrid(MiniGridEnv):
 
				             # This is to handle rare situations where random sampling produces
			
 
				             # a level that cannot be connected, resulting in an infinite loop
			
 
				             if num_itrs > max_itrs:
			
 
				-                raise RecursionError('connect_all failed')
			
 
				+                raise RecursionError("connect_all failed")
			
 
				             num_itrs += 1
			
 
				 
			
 
				             # If all rooms are reachable, stop
			
@@ -377,7 +372,7 @@ class RoomGrid(MiniGridEnv):
 
				 
			
 
				         while len(dists) < num_distractors:
			
 
				             color = self._rand_elem(COLOR_NAMES)
			
 
				-            type = self._rand_elem(['key', 'ball', 'box'])
			
 
				+            type = self._rand_elem(["key", "ball", "box"])
			
 
				             obj = (type, color)
			
 
				 
			
 
				             if all_unique and obj in objs:
			
@@ -386,9 +381,9 @@ class RoomGrid(MiniGridEnv):
 
				             # Add the object to a random room if no room specified
			
 
				             room_i = i
			
 
				             room_j = j
			
 
				-            if room_i == None:
			
 
				+            if room_i is None:
			
 
				                 room_i = self._rand_int(0, self.num_cols)
			
 
				-            if room_j == None:
			
 
				+            if room_j is None:
			
 
				                 room_j = self._rand_int(0, self.num_rows)
			
 
				 
			
 
				             dist, pos = self.add_object(room_i, room_j, *obj)
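Reviewer note on the `RoomGrid` hunks above: the index arithmetic depends on adjacent rooms sharing one wall, so each room contributes `room_size - 1` new cells along an axis. The following is a minimal standalone sketch of that geometry, not the class itself. The function names are hypothetical, and the `height` formula is assumed by symmetry with the `width` line shown in the diff.

```python
def grid_dimensions(room_size, num_rows, num_cols):
    """Total grid size when neighbouring rooms share a single wall."""
    assert room_size >= 3
    # Each room adds (room_size - 1) cells; the final +1 is the last wall.
    width = (room_size - 1) * num_cols + 1
    height = (room_size - 1) * num_rows + 1  # assumed, mirrors width
    return width, height


def room_index(x, y, room_size, num_rows, num_cols):
    """Map a grid cell (x, y) to the (column, row) index of its room."""
    assert x >= 0 and y >= 0
    i = x // (room_size - 1)
    j = y // (room_size - 1)
    assert i < num_cols and j < num_rows
    return i, j


def agent_start(room_size, num_rows, num_cols):
    """Centre cell of the middle room, as in the agent_pos tuple above."""
    return (
        (num_cols // 2) * (room_size - 1) + (room_size // 2),
        (num_rows // 2) * (room_size - 1) + (room_size // 2),
    )
```

With `room_size=6` and a 3x3 layout, the start cell falls inside room `(1, 1)`, which matches the "agent starts in the middle" comment.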
			
--- a/gym_minigrid/window.py
+++ b/gym_minigrid/window.py
@@ -1,13 +1,11 @@
 
				-import sys
			
 
				-import numpy as np
			
 
				-
			
 
				 # Only ask users to install matplotlib if they actually need it
			
 
				 try:
			
 
				     import matplotlib.pyplot as plt
			
 
				-except:
			
 
				-    print('To display the environment in a window, please install matplotlib, eg:')
			
 
				-    print('pip3 install --user matplotlib')
			
 
				-    sys.exit(-1)
			
 
				+except ImportError:
			
 
				+    raise ImportError(
			
 
				+        "To display the environment in a window, please install matplotlib, eg: `pip3 install --user matplotlib`"
			
 
				+    )
			
 
				+
			
 
				 
			
 
				 class Window:
			
 
				     """
			
@@ -23,11 +21,11 @@ class Window:
 
				         self.fig, self.ax = plt.subplots()
			
 
				 
			
 
				         # Show the env name in the window title
			
 
				-        self.fig.canvas.set_window_title(title)
			
 
				+        self.fig.canvas.manager.set_window_title(title)
			
 
				 
			
 
				         # Turn off x/y axis numbering/ticks
			
 
				-        self.ax.xaxis.set_ticks_position('none')
			
 
				-        self.ax.yaxis.set_ticks_position('none')
			
 
				+        self.ax.xaxis.set_ticks_position("none")
			
 
				+        self.ax.yaxis.set_ticks_position("none")
			
 
				         _ = self.ax.set_xticklabels([])
			
 
				         _ = self.ax.set_yticklabels([])
			
 
				 
			
@@ -37,7 +35,7 @@ class Window:
 
				         def close_handler(evt):
			
 
				             self.closed = True
			
 
				 
			
 
				-        self.fig.canvas.mpl_connect('close_event', close_handler)
			
 
				+        self.fig.canvas.mpl_connect("close_event", close_handler)
			
 
				 
			
 
				     def show_img(self, img):
			
 
				         """
			
@@ -47,7 +45,7 @@ class Window:
 
				         # If no image has been shown yet,
			
 
				         # show the first image of the environment
			
 
				         if self.imshow_obj is None:
			
 
				-            self.imshow_obj = self.ax.imshow(img, interpolation='bilinear')
			
 
				+            self.imshow_obj = self.ax.imshow(img, interpolation="bilinear")
			
 
				 
			
 
				         # Update the image data
			
 
				         self.imshow_obj.set_data(img)
			
@@ -72,7 +70,7 @@ class Window:
 
				         """
			
 
				 
			
 
				         # Keyboard handler
			
 
				-        self.fig.canvas.mpl_connect('key_press_event', key_handler)
			
 
				+        self.fig.canvas.mpl_connect("key_press_event", key_handler)
			
 
				 
			
 
				     def show(self, block=True):
			
 
				         """
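Reviewer note on the `window.py` import change above: replacing the bare `except:` plus `sys.exit(-1)` with a re-raised `ImportError` lets callers catch the failure instead of having the process terminated. A minimal sketch of the same pattern for an arbitrary optional dependency follows. `require_module` and its parameters are hypothetical names used for illustration and are not part of gym-minigrid.

```python
import importlib


def require_module(name, install_hint):
    """Import an optional dependency, or raise ImportError with a hint."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # Chain the original error so the real cause stays visible.
        raise ImportError(
            f"Module '{name}' is required for this feature. {install_hint}"
        ) from exc
```

A caller can then wrap the call in `try`/`except ImportError` and fall back to a headless mode, which was not possible while the module called `sys.exit`.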
			
--- a/gym_minigrid/wrappers.py
+++ b/gym_minigrid/wrappers.py
@@ -2,12 +2,14 @@ import math
 
				 import operator
			
 
				 from functools import reduce
			
 
				 
			
 
				-import numpy as np
			
 
				 import gym
			
 
				-from gym import error, spaces, utils
			
 
				-from .minigrid import OBJECT_TO_IDX, COLOR_TO_IDX, STATE_TO_IDX, Goal
			
 
				+import numpy as np
			
 
				+from gym import spaces
			
 
				+
			
 
				+from gym_minigrid.minigrid import COLOR_TO_IDX, OBJECT_TO_IDX, STATE_TO_IDX, Goal
			
 
				 
			
 
				-class ReseedWrapper(gym.core.Wrapper):
			
 
				+
			
 
				+class ReseedWrapper(gym.Wrapper):
			
 
				     """
			
 
				     Wrapper to always regenerate an environment with the same set of seeds.
			
 
				     This can be used to force an environment to always keep the same
			
@@ -22,14 +24,14 @@ class ReseedWrapper(gym.core.Wrapper):
 
				     def reset(self, **kwargs):
			
 
				         seed = self.seeds[self.seed_idx]
			
 
				         self.seed_idx = (self.seed_idx + 1) % len(self.seeds)
			
 
				-        self.env.seed(seed)
			
 
				-        return self.env.reset(**kwargs)
			
 
				+        return self.env.reset(seed=seed, **kwargs)
			
 
				 
			
 
				     def step(self, action):
			
 
				         obs, reward, done, info = self.env.step(action)
			
 
				         return obs, reward, done, info
			
 
				 
			
 
				-class ActionBonus(gym.core.Wrapper):
			
 
				+
			
 
				+class ActionBonus(gym.Wrapper):
			
 
				     """
			
 
				     Wrapper which adds an exploration bonus.
			
 
				     This is a reward to encourage exploration of less
			
@@ -63,7 +65,8 @@ class ActionBonus(gym.core.Wrapper):
 
				     def reset(self, **kwargs):
			
 
				         return self.env.reset(**kwargs)
			
 
				 
			
 
				-class StateBonus(gym.core.Wrapper):
			
 
				+
			
 
				+class StateBonus(gym.Wrapper):
			
 
				     """
			
 
				     Adds an exploration bonus based on which positions
			
 
				     are visited on the grid.
			
@@ -79,7 +82,7 @@ class StateBonus(gym.core.Wrapper):
 
				         # Tuple based on which we index the counts
			
 
				         # We use the position after an update
			
 
				         env = self.unwrapped
			
 
				-        tup = (tuple(env.agent_pos))
			
 
				+        tup = tuple(env.agent_pos)
			
 
				 
			
 
				         # Get the count for this key
			
 
				         pre_count = 0
			
@@ -98,19 +101,21 @@ class StateBonus(gym.core.Wrapper):
 
				     def reset(self, **kwargs):
			
 
				         return self.env.reset(**kwargs)
			
 
				 
			
 
				-class ImgObsWrapper(gym.core.ObservationWrapper):
			
 
				+
			
 
				+class ImgObsWrapper(gym.ObservationWrapper):
			
 
				     """
			
 
				     Use the image as the only observation output, no language/mission.
			
 
				     """
			
 
				 
			
 
				     def __init__(self, env):
			
 
				         super().__init__(env)
			
 
				-        self.observation_space = env.observation_space.spaces['image']
			
 
				+        self.observation_space = env.observation_space.spaces["image"]
			
 
				 
			
 
				     def observation(self, obs):
			
 
				-        return obs['image']
			
 
				+        return obs["image"]
			
 
				+
			
 
				 
			
 
				-class OneHotPartialObsWrapper(gym.core.ObservationWrapper):
			
 
				+class OneHotPartialObsWrapper(gym.ObservationWrapper):
			
 
				     """
			
 
				     Wrapper to get a one-hot encoding of a partially observable
			
 
				     agent view as observation.
			
@@ -121,21 +126,21 @@ class OneHotPartialObsWrapper(gym.core.ObservationWrapper):
 
				 
			
 
				         self.tile_size = tile_size
			
 
				 
			
 
				-        obs_shape = env.observation_space['image'].shape
			
 
				+        obs_shape = env.observation_space["image"].shape
			
 
				 
			
 
				         # Number of bits per cell
			
 
				         num_bits = len(OBJECT_TO_IDX) + len(COLOR_TO_IDX) + len(STATE_TO_IDX)
			
 
				 
			
 
				-        self.observation_space.spaces["image"] = spaces.Box(
			
 
				-            low=0,
			
 
				-            high=255,
			
 
				-            shape=(obs_shape[0], obs_shape[1], num_bits),
			
 
				-            dtype='uint8'
			
 
				+        new_image_space = spaces.Box(
			
 
				+            low=0, high=255, shape=(obs_shape[0], obs_shape[1], num_bits), dtype="uint8"
			
 
				+        )
			
 
				+        self.observation_space = spaces.Dict(
			
 
				+            {**self.observation_space.spaces, "image": new_image_space}
			
 
				         )
			
 
				 
			
 
				     def observation(self, obs):
			
 
				-        img = obs['image']
			
 
				-        out = np.zeros(self.observation_space.spaces['image'].shape, dtype='uint8')
			
 
				+        img = obs["image"]
			
 
				+        out = np.zeros(self.observation_space.spaces["image"].shape, dtype="uint8")
			
 
				 
			
 
				         for i in range(img.shape[0]):
			
 
				             for j in range(img.shape[1]):
			
@@ -147,15 +152,14 @@ class OneHotPartialObsWrapper(gym.core.ObservationWrapper):
 
				                 out[i, j, len(OBJECT_TO_IDX) + color] = 1
			
 
				                 out[i, j, len(OBJECT_TO_IDX) + len(COLOR_TO_IDX) + state] = 1
			
 
				 
			
 
				-        return {
			
 
				-            **obs,
			
 
				-            'image': out
			
 
				-        }
			
 
				+        return {**obs, "image": out}
			
 
				+
			
 
				 
			
 
				-class RGBImgObsWrapper(gym.core.ObservationWrapper):
			
 
				+class RGBImgObsWrapper(gym.ObservationWrapper):
			
 
				     """
			
 
				     Wrapper to use fully observable RGB image as observation,
			
 
				     This can be used to have the agent to solve the gridworld in pixel space.
			
 
				+    To use it, make the unwrapped environment with render_mode='rgb_array'.
			
 
				     """
			
 
				 
			
 
				     def __init__(self, env, tile_size=8):
			
@@ -163,29 +167,27 @@ class RGBImgObsWrapper(gym.core.ObservationWrapper):
 
				 
			
 
				         self.tile_size = tile_size
			
 
				 
			
 
				-        self.observation_space.spaces['image'] = spaces.Box(
			
 
				+        new_image_space = spaces.Box(
			
 
				             low=0,
			
 
				             high=255,
			
 
				             shape=(self.env.width * tile_size, self.env.height * tile_size, 3),
			
 
				-            dtype='uint8'
			
 
				+            dtype="uint8",
			
 
				+        )
			
 
				+
			
 
				+        self.observation_space = spaces.Dict(
			
 
				+            {**self.observation_space.spaces, "image": new_image_space}
			
 
				         )
			
 
				 
			
 
				     def observation(self, obs):
			
 
				         env = self.unwrapped
			
 
				+        assert env.render_mode == "rgb_array", env.render_mode
			
 
				 
			
 
				-        rgb_img = env.render(
			
 
				-            mode='rgb_array',
			
 
				-            highlight=False,
			
 
				-            tile_size=self.tile_size
			
 
				-        )
			
 
				+        rgb_img = env.render(highlight=False, tile_size=self.tile_size)
			
 
				 
			
 
				-        return {
			
 
				-            **obs,
			
 
				-            'image': rgb_img
			
 
				-        }
			
 
				+        return {**obs, "image": rgb_img}
			
 
				 
			
 
				 
			
 
				-class RGBImgPartialObsWrapper(gym.core.ObservationWrapper):
			
 
				+class RGBImgPartialObsWrapper(gym.ObservationWrapper):
			
 
				     """
			
 
				     Wrapper to use partially observable RGB image as observation.
			
 
				     This can be used to have the agent to solve the gridworld in pixel space.
			
@@ -196,28 +198,27 @@ class RGBImgPartialObsWrapper(gym.core.ObservationWrapper):
 
				 
			
 
				         self.tile_size = tile_size
			
 
				 
			
 
				-        obs_shape = env.observation_space.spaces['image'].shape
			
 
				-        self.observation_space.spaces['image'] = spaces.Box(
			
 
				+        obs_shape = env.observation_space.spaces["image"].shape
			
 
				+        new_image_space = spaces.Box(
			
 
				             low=0,
			
 
				             high=255,
			
 
				             shape=(obs_shape[0] * tile_size, obs_shape[1] * tile_size, 3),
			
 
				-            dtype='uint8'
			
 
				+            dtype="uint8",
			
 
				+        )
			
 
				+
			
 
				+        self.observation_space = spaces.Dict(
			
 
				+            {**self.observation_space.spaces, "image": new_image_space}
			
 
				         )
			
 
				 
			
 
				     def observation(self, obs):
			
 
				         env = self.unwrapped
			
 
				 
			
 
				-        rgb_img_partial = env.get_obs_render(
			
 
				-            obs['image'],
			
 
				-            tile_size=self.tile_size
			
 
				-        )
			
 
				+        rgb_img_partial = env.get_obs_render(obs["image"], tile_size=self.tile_size)
			
 
				 
			
 
				-        return {
			
 
				-            **obs,
			
 
				-            'image': rgb_img_partial
			
 
				-        }
			
 
				+        return {**obs, "image": rgb_img_partial}
			
 
				 
			
 
				-class FullyObsWrapper(gym.core.ObservationWrapper):
			
 
				+
			
 
				+class FullyObsWrapper(gym.ObservationWrapper):
			
 
				     """
			
 
				     Fully observable gridworld using a compact grid encoding
			
 
				     """
			
@@ -225,28 +226,146 @@ class FullyObsWrapper(gym.core.ObservationWrapper):
 
				     def __init__(self, env):
			
 
				         super().__init__(env)
			
 
				 
			
 
				-        self.observation_space.spaces["image"] = spaces.Box(
			
 
				+        new_image_space = spaces.Box(
			
 
				             low=0,
			
 
				             high=255,
			
 
				             shape=(self.env.width, self.env.height, 3),  # number of cells
			
 
				-            dtype='uint8'
			
 
				+            dtype="uint8",
			
 
				+        )
			
 
				+
			
 
				+        self.observation_space = spaces.Dict(
			
 
				+            {**self.observation_space.spaces, "image": new_image_space}
			
 
				         )
			
 
				 
			
 
				     def observation(self, obs):
			
 
				         env = self.unwrapped
			
 
				         full_grid = env.grid.encode()
			
 
				-        full_grid[env.agent_pos[0]][env.agent_pos[1]] = np.array([
			
 
				-            OBJECT_TO_IDX['agent'],
			
 
				-            COLOR_TO_IDX['red'],
			
 
				-            env.agent_dir
			
 
				-        ])
			
 
				-
			
 
				-        return {
			
 
				-            **obs,
			
 
				-            'image': full_grid
			
 
				-        }
			
 
				-
			
 
				-class FlatObsWrapper(gym.core.ObservationWrapper):
			
 
				+        full_grid[env.agent_pos[0]][env.agent_pos[1]] = np.array(
			
 
				+            [OBJECT_TO_IDX["agent"], COLOR_TO_IDX["red"], env.agent_dir]
			
 
				+        )
			
 
				+
			
 
				+        return {**obs, "image": full_grid}
			
 
				+
			
 
				+
			
 
				+class DictObservationSpaceWrapper(gym.ObservationWrapper):
			
 
				+    """
			
 
				+    Transforms the observation space (that has a textual component) to a fully numerical observation space,
			
 
				+    where the textual instructions are replaced by arrays representing the indices of each word in a fixed vocabulary.
			
 
				+    """
			
 
				+
			
 
				+    def __init__(self, env, max_words_in_mission=50, word_dict=None):
			
 
				+        """
			
 
				+        max_words_in_mission is the length of the array to represent a mission, value 0 for missing words
			
 
				+        word_dict is a dictionary of words to use (keys=words, values=indices from 1 to < max_words_in_mission),
			
 
				+                  if None, use the Minigrid language
			
 
				+        """
			
 
				+        super().__init__(env)
			
 
				+
			
 
				+        if word_dict is None:
			
 
				+            word_dict = self.get_minigrid_words()
			
 
				+
			
 
				+        self.max_words_in_mission = max_words_in_mission
			
 
				+        self.word_dict = word_dict
			
 
				+
			
 
				+        image_observation_space = spaces.Box(
			
 
				+            low=0,
			
 
				+            high=255,
			
 
				+            shape=(self.agent_view_size, self.agent_view_size, 3),
			
 
				+            dtype="uint8",
			
 
				+        )
			
 
				+        self.observation_space = spaces.Dict(
			
 
				+            {
			
 
				+                "image": image_observation_space,
			
 
				+                "direction": spaces.Discrete(4),
			
 
				+                "mission": spaces.MultiDiscrete(
			
 
				+                    [len(self.word_dict.keys())] * max_words_in_mission
			
 
				+                ),
			
 
				+            }
			
 
				+        )
			
 
				+
			
 
				+    @staticmethod
			
 
				+    def get_minigrid_words():
			
 
				+        colors = ["red", "green", "blue", "yellow", "purple", "grey"]
			
 
				+        objects = [
			
 
				+            "unseen",
			
 
				+            "empty",
			
 
				+            "wall",
			
 
				+            "floor",
			
 
				+            "box",
			
 
				+            "key",
			
 
				+            "ball",
			
 
				+            "door",
			
 
				+            "goal",
			
 
				+            "agent",
			
 
				+            "lava",
			
 
				+        ]
			
 
				+
			
 
				+        verbs = [
			
 
				+            "pick",
			
 
				+            "avoid",
			
 
				+            "get",
			
 
				+            "find",
			
 
				+            "put",
			
 
				+            "use",
			
 
				+            "open",
			
 
				+            "go",
			
 
				+            "fetch",
			
 
				+            "reach",
			
 
				+            "unlock",
			
 
				+            "traverse",
			
 
				+        ]
			
 
				+
			
 
				+        extra_words = [
			
 
				+            "up",
			
 
				+            "the",
			
 
				+            "a",
			
 
				+            "at",
			
 
				+            ",",
			
 
				+            "square",
			
 
				+            "and",
			
 
				+            "then",
			
 
				+            "to",
			
 
				+            "of",
			
 
				+            "rooms",
			
 
				+            "near",
			
 
				+            "opening",
			
 
				+            "must",
			
 
				+            "you",
			
 
				+            "matching",
			
 
				+            "end",
			
 
				+            "hallway",
			
 
				+            "object",
			
 
				+            "from",
			
 
				+            "room",
			
 
				+        ]
			
 
				+
			
 
				+        all_words = colors + objects + verbs + extra_words
			
 
				+        assert len(all_words) == len(set(all_words))
			
 
				+        return {word: i for i, word in enumerate(all_words)}
			
 
				+
			
 
				+    def string_to_indices(self, string, offset=1):
			
 
				+        """
			
 
				+        Convert a string to a list of indices.
			
 
				+        """
			
 
				+        indices = []
			
 
				+        # adding space before and after commas
			
 
				+        string = string.replace(",", " , ")
			
 
				+        for word in string.split():
			
 
				+            if word in self.word_dict.keys():
			
 
				+                indices.append(self.word_dict[word] + offset)
			
 
				+            else:
			
 
				+                raise ValueError(f"Unknown word: {word}")
			
 
				+        return indices
			
 
				+
			
 
				+    def observation(self, obs):
			
 
				+        obs["mission"] = self.string_to_indices(obs["mission"])
			
 
				+        assert len(obs["mission"]) < self.max_words_in_mission
			
 
				+        obs["mission"] += [0] * (self.max_words_in_mission - len(obs["mission"]))
			
 
				+
			
 
				+        return obs
			
 
				+
			
 
				+
			
 
				+class FlatObsWrapper(gym.ObservationWrapper):
			
 
				     """
			
 
				     Encode mission strings using a one-hot scheme,
			
 
				     and combine these with observed images into one flat array
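Reviewer note on the `DictObservationSpaceWrapper` added in the hunk above: the mission string is split into words (commas become separate tokens), each word is mapped to its vocabulary index plus an offset of 1, and the result is padded with zeros up to a fixed length. The standalone sketch below mirrors that logic under a few assumptions: the helper names are hypothetical, the vocabulary is a toy one, and the function returns a new list rather than mutating the observation dict as the wrapper does.

```python
def build_vocab(words):
    """Map each distinct word to a 0-based index."""
    assert len(words) == len(set(words)), "vocabulary must not repeat words"
    return {word: i for i, word in enumerate(words)}


def encode_mission(mission, vocab, max_words, offset=1):
    """Encode a mission string as a fixed-length list of word indices."""
    # Put spaces around commas so they become separate tokens.
    tokens = mission.replace(",", " , ").split()
    indices = []
    for word in tokens:
        if word not in vocab:
            raise ValueError(f"Unknown word: {word}")
        # Shift by offset so that 0 stays reserved for padding.
        indices.append(vocab[word] + offset)
    assert len(indices) < max_words
    return indices + [0] * (max_words - len(indices))
```

For example, with the toy vocabulary `["go", "to", "the", "red", "ball", ","]`, the mission `"red, ball"` tokenizes to three words, and the remaining slots are filled with the padding value 0.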
			
@@ -258,36 +377,40 @@ class FlatObsWrapper(gym.core.ObservationWrapper):
 
				         self.maxStrLen = maxStrLen
			
 
				         self.numCharCodes = 27
			
 
				 
			
 
				-        imgSpace = env.observation_space.spaces['image']
			
 
				+        imgSpace = env.observation_space.spaces["image"]
			
 
				         imgSize = reduce(operator.mul, imgSpace.shape, 1)
			
 
				 
			
 
				         self.observation_space = spaces.Box(
			
 
				             low=0,
			
 
				             high=255,
			
 
				             shape=(imgSize + self.numCharCodes * self.maxStrLen,),
			
 
				-            dtype='uint8'
			
 
				+            dtype="uint8",
			
 
				         )
			
 
				 
			
 
				         self.cachedStr = None
			
 
				         self.cachedArray = None
			
 
				 
			
 
				     def observation(self, obs):
			
 
				-        image = obs['image']
			
 
				-        mission = obs['mission']
			
 
				+        image = obs["image"]
			
 
				+        mission = obs["mission"]
			
 
				 
			
 
				         # Cache the last-encoded mission string
			
 
				         if mission != self.cachedStr:
			
 
				-            assert len(mission) <= self.maxStrLen, 'mission string too long ({} chars)'.format(len(mission))
			
 
				+            assert (
			
 
				+                len(mission) <= self.maxStrLen
			
 
				+            ), f"mission string too long ({len(mission)} chars)"
			
 
				             mission = mission.lower()
			
 
				 
			
 
				-            strArray = np.zeros(shape=(self.maxStrLen, self.numCharCodes), dtype='float32')
			
 
				+            strArray = np.zeros(
			
 
				+                shape=(self.maxStrLen, self.numCharCodes), dtype="float32"
			
 
				+            )
			
 
				 
			
 
				             for idx, ch in enumerate(mission):
			
 
				-                if ch >= 'a' and ch <= 'z':
			
 
				-                    chNo = ord(ch) - ord('a')
			
 
				-                elif ch == ' ':
			
 
				-                    chNo = ord('z') - ord('a') + 1
			
 
				-                assert chNo < self.numCharCodes, '%s : %d' % (ch, chNo)
			
 
				+                if ch >= "a" and ch <= "z":
			
 
				+                    chNo = ord(ch) - ord("a")
			
 
				+                elif ch == " ":
			
 
				+                    chNo = ord("z") - ord("a") + 1
			
 
				+                assert chNo < self.numCharCodes, "%s : %d" % (ch, chNo)
			
 
				                 strArray[idx, chNo] = 1
			
 
				 
			
 
				             self.cachedStr = mission
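Reviewer note on the character loop in `FlatObsWrapper.observation` above: each character of the lowercased mission is turned into a 27-way one-hot row (26 letters plus space). The sketch below reproduces that encoding with plain lists instead of a NumPy array so it runs without dependencies. The function name is hypothetical. One deliberate difference, stated as an assumption about desired behaviour: in the diff, a character outside `a`-`z` and space leaves `chNo` unassigned or stale, whereas this sketch raises `ValueError`.

```python
NUM_CHAR_CODES = 27  # 26 lowercase letters + space


def encode_chars(mission, max_len):
    """One-hot encode a mission string as max_len rows of 27 entries."""
    assert len(mission) <= max_len, f"mission string too long ({len(mission)} chars)"
    rows = [[0.0] * NUM_CHAR_CODES for _ in range(max_len)]
    for idx, ch in enumerate(mission.lower()):
        if "a" <= ch <= "z":
            code = ord(ch) - ord("a")
        elif ch == " ":
            code = ord("z") - ord("a") + 1  # index 26
        else:
            # Differs from the diff: fail loudly on unsupported characters.
            raise ValueError(f"Unsupported character: {ch!r}")
        rows[idx][code] = 1.0
    return rows
```

Flattening the rows yields `max_len * 27` values, which corresponds to the `self.numCharCodes * self.maxStrLen` term in the observation space shape.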
			
@@ -297,7 +420,8 @@ class FlatObsWrapper(gym.core.ObservationWrapper):
 
				 
			
 
				         return obs
			
 
				 
			
 
				-class ViewSizeWrapper(gym.core.Wrapper):
			
 
				+
			
 
				+class ViewSizeWrapper(gym.Wrapper):
			
 
				     """
			
 
				     Wrapper to customize the agent field of view size.
			
 
				     This cannot be used with fully observable wrappers.
			
@@ -309,34 +433,36 @@ class ViewSizeWrapper(gym.core.Wrapper):
 
				         assert agent_view_size % 2 == 1
			
 
				         assert agent_view_size >= 3
			
 
				 
			
 
				-        # Override default view size
			
 
				-        env.unwrapped.agent_view_size = agent_view_size
			
 
				+        self.agent_view_size = agent_view_size
			
 
				 
			
 
				         # Compute observation space with specified view size
			
 
				-        observation_space = gym.spaces.Box(
			
 
				-            low=0,
			
 
				-            high=255,
			
 
				-            shape=(agent_view_size, agent_view_size, 3),
			
 
				-            dtype='uint8'
			
 
				+        new_image_space = gym.spaces.Box(
			
 
				+            low=0, high=255, shape=(agent_view_size, agent_view_size, 3), dtype="uint8"
			
 
				         )
			
 
				 
			
 
				-        # Override the environment's observation space
			
 
				-        self.observation_space = spaces.Dict({
			
 
				-            'image': observation_space
			
 
				-        })
			
 
				+        # Override the environment's observation space
			
 
				+        self.observation_space = spaces.Dict(
			
 
				+            {**self.observation_space.spaces, "image": new_image_space}
			
 
				+        )
			
 
				 
			
 
				-    def reset(self, **kwargs):
			
 
				-        return self.env.reset(**kwargs)
			
 
				+    def observation(self, obs):
			
 
				+        env = self.unwrapped
			
 
				+
			
 
				+        grid, vis_mask = env.gen_obs_grid(self.agent_view_size)
			
 
				+
			
 
				+        # Encode the partially observable view into a numpy array
			
 
				+        image = grid.encode(vis_mask)
			
 
				+
			
 
				+        return {**obs, "image": image}
			
 
				 
			
 
				-    def step(self, action):
			
 
				-        return self.env.step(action)
			
 
				 
			
 
				-class DirectionObsWrapper(gym.core.ObservationWrapper):
			
 
				+class DirectionObsWrapper(gym.ObservationWrapper):
			
 
				     """
			
 
				     Provides the slope/angular direction to the goal with the observations as modeled by (y2 - y1) / (x2 - x1)
			
 
				     type = {slope , angle}
			
 
				     """
			
 
				-    def __init__(self, env,type='slope'):
			
 
				+
			
 
				+    def __init__(self, env, type="slope"):
			
 
				         super().__init__(env)
			
 
				         self.goal_position = None
			
 
				         self.type = type
			
@@ -344,17 +470,27 @@ class DirectionObsWrapper(gym.core.ObservationWrapper):
 
				     def reset(self):
			
 
				         obs = self.env.reset()
			
 
				         if not self.goal_position:
			
 
				-            self.goal_position = [x for x,y in enumerate(self.grid.grid) if isinstance(y,(Goal) ) ]
			
 
				-            if len(self.goal_position) >= 1: # in case there are multiple goals , needs to be handled for other env types
			
 
				-                self.goal_position = (int(self.goal_position[0]/self.height) , self.goal_position[0]%self.width)
			
 
				+            self.goal_position = [
			
 
				+                x for x, y in enumerate(self.grid.grid) if isinstance(y, Goal)
			
 
				+            ]
			
 
				+            # in case there are multiple goals , needs to be handled for other env types
			
 
				+            if len(self.goal_position) >= 1:
			
 
				+                self.goal_position = (
			
 
				+                    int(self.goal_position[0] / self.height),
			
 
				+                    self.goal_position[0] % self.width,
			
 
				+                )
			
 
				         return obs
			
 
				 
			
 
				     def observation(self, obs):
			
 
				-        slope = np.divide( self.goal_position[1] - self.agent_pos[1] ,  self.goal_position[0] - self.agent_pos[0])
			
 
				-        obs['goal_direction'] = np.arctan( slope ) if self.type == 'angle' else slope
			
 
				+        slope = np.divide(
			
 
				+            self.goal_position[1] - self.agent_pos[1],
			
 
				+            self.goal_position[0] - self.agent_pos[0],
			
 
				+        )
			
 
				+        obs["goal_direction"] = np.arctan(slope) if self.type == "angle" else slope
			
 
				         return obs
			
 
				 
			
 
				-class SymbolicObsWrapper(gym.core.ObservationWrapper):
			
 
				+
			
 
				+class SymbolicObsWrapper(gym.ObservationWrapper):
			
 
				     """
			
 
				     Fully observable grid with a symbolic state representation.
			
 
				     The symbol is a triple of (X, Y, IDX), where X and Y are
			
@@ -364,12 +500,15 @@ class SymbolicObsWrapper(gym.core.ObservationWrapper):
 
				     def __init__(self, env):
			
 
				         super().__init__(env)
			
 
				 
			
 
				-        self.observation_space.spaces["image"] = spaces.Box(
			
 
				+        new_image_space = spaces.Box(
			
 
				             low=0,
			
 
				             high=max(OBJECT_TO_IDX.values()),
			
 
				             shape=(self.env.width, self.env.height, 3),  # number of cells
			
 
				             dtype="uint8",
			
 
				         )
			
 
				+        self.observation_space = spaces.Dict(
			
 
				+            {**self.observation_space.spaces, "image": new_image_space}
			
 
				+        )
			
 
				 
			
 
				     def observation(self, obs):
			
 
				         objects = np.array(
			
@@ -379,5 +518,5 @@ class SymbolicObsWrapper(gym.core.ObservationWrapper):
 
				         grid = np.mgrid[:w, :h]
			
 
				         grid = np.concatenate([grid, objects.reshape(1, w, h)])
			
 
				         grid = np.transpose(grid, (1, 2, 0))
			
 
				-        obs['image'] = grid
			
 
				+        obs["image"] = grid
			
 
				         return obs
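
Note on the wrappers.py hunks above: ViewSizeWrapper and SymbolicObsWrapper now build a new mapping with `{**self.observation_space.spaces, "image": new_image_space}` instead of assigning into the existing one. A minimal sketch of that merge pattern follows, using plain dicts in place of `gym.spaces.Dict`. The helper name `override_key` and the sample values are hypothetical and appear nowhere in the repository.

```python
def override_key(base, key, value):
    """Return a new dict equal to `base` with `key` set to `value`.

    `base` itself is left untouched, which is the point of the
    `{**old, "image": new}` idiom used in the wrappers.
    """
    return {**base, key: value}


# Hypothetical stand-ins for the entries of an observation space.
original = {"image": "7x7 grid", "direction": 2, "mission": "reach the goal"}
updated = override_key(original, "image", "3x3 grid")

print(updated["image"])   # the replaced entry
print(original["image"])  # the original mapping is unchanged
```

Because the merge produces a fresh dict, a wrapper that replaces "image" this way does not alter the mapping it copied from, and every other key is carried over unchanged.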
			
--- a/manual_control.py
+++ b/manual_control.py
@@ -1,111 +1,105 @@
 
				 #!/usr/bin/env python3
			
 
				 
			
 
				-import time
			
 
				 import argparse
			
 
				-import numpy as np
			
 
				+
			
 
				 import gym
			
 
				-import gym_minigrid
			
 
				-from gym_minigrid.wrappers import *
			
 
				+
			
 
				 from gym_minigrid.window import Window
			
 
				+from gym_minigrid.wrappers import ImgObsWrapper, RGBImgPartialObsWrapper
			
 
				+
			
 
				 
			
 
				 def redraw(img):
			
 
				     if not args.agent_view:
			
 
				-        img = env.render('rgb_array', tile_size=args.tile_size)
			
 
				+        img = env.render(tile_size=args.tile_size)
			
 
				 
			
 
				     window.show_img(img)
			
 
				 
			
 
				-def reset():
			
 
				-    if args.seed != -1:
			
 
				-        env.seed(args.seed)
			
 
				 
			
 
				-    obs = env.reset()
			
 
				+def reset():
			
 
				+    seed = None if args.seed == -1 else args.seed
			
 
				+    obs = env.reset(seed=seed)
			
 
				 
			
 
				-    if hasattr(env, 'mission'):
			
 
				-        print('Mission: %s' % env.mission)
			
 
				+    if hasattr(env, "mission"):
			
 
				+        print("Mission: %s" % env.mission)
			
 
				         window.set_caption(env.mission)
			
 
				 
			
 
				     redraw(obs)
			
 
				 
			
 
				+
			
 
				 def step(action):
			
 
				     obs, reward, done, info = env.step(action)
			
 
				-    print('step=%s, reward=%.2f' % (env.step_count, reward))
			
 
				+    print(f"step={env.step_count}, reward={reward:.2f}")
			
 
				 
			
 
				     if done:
			
 
				-        print('done!')
			
 
				+        print("done!")
			
 
				         reset()
			
 
				     else:
			
 
				         redraw(obs)
			
 
				 
			
 
				+
			
 
				 def key_handler(event):
			
 
				-    print('pressed', event.key)
			
 
				+    print("pressed", event.key)
			
 
				 
			
 
				-    if event.key == 'escape':
			
 
				+    if event.key == "escape":
			
 
				         window.close()
			
 
				         return
			
 
				 
			
 
				-    if event.key == 'backspace':
			
 
				+    if event.key == "backspace":
			
 
				         reset()
			
 
				         return
			
 
				 
			
 
				-    if event.key == 'left':
			
 
				+    if event.key == "left":
			
 
				         step(env.actions.left)
			
 
				         return
			
 
				-    if event.key == 'right':
			
 
				+    if event.key == "right":
			
 
				         step(env.actions.right)
			
 
				         return
			
 
				-    if event.key == 'up':
			
 
				+    if event.key == "up":
			
 
				         step(env.actions.forward)
			
 
				         return
			
 
				 
			
 
				     # Spacebar
			
 
				-    if event.key == ' ':
			
 
				+    if event.key == " ":
			
 
				         step(env.actions.toggle)
			
 
				         return
			
 
				-    if event.key == 'pageup':
			
 
				+    if event.key == "pageup":
			
 
				         step(env.actions.pickup)
			
 
				         return
			
 
				-    if event.key == 'pagedown':
			
 
				+    if event.key == "pagedown":
			
 
				         step(env.actions.drop)
			
 
				         return
			
 
				 
			
 
				-    if event.key == 'enter':
			
 
				+    if event.key == "enter":
			
 
				         step(env.actions.done)
			
 
				         return
			
 
				 
			
 
				+
			
 
				 parser = argparse.ArgumentParser()
			
 
				 parser.add_argument(
			
 
				-    "--env",
			
 
				-    help="gym environment to load",
			
 
				-    default='MiniGrid-MultiRoom-N6-v0'
			
 
				+    "--env", help="gym environment to load", default="MiniGrid-MultiRoom-N6-v0"
			
 
				 )
			
 
				 parser.add_argument(
			
 
				-    "--seed",
			
 
				-    type=int,
			
 
				-    help="random seed to generate the environment with",
			
 
				-    default=-1
			
 
				+    "--seed", type=int, help="random seed to generate the environment with", default=-1
			
 
				 )
			
 
				 parser.add_argument(
			
 
				-    "--tile_size",
			
 
				-    type=int,
			
 
				-    help="size at which to render tiles",
			
 
				-    default=32
			
 
				+    "--tile_size", type=int, help="size at which to render tiles", default=32
			
 
				 )
			
 
				 parser.add_argument(
			
 
				-    '--agent_view',
			
 
				+    "--agent_view",
			
 
				     default=False,
			
 
				     help="draw what the agent sees (partially observable view)",
			
 
				-    action='store_true'
			
 
				+    action="store_true",
			
 
				 )
			
 
				 
			
 
				 args = parser.parse_args()
			
 
				 
			
 
				-env = gym.make(args.env)
			
 
				+env = gym.make(args.env, render_mode="rgb_array")
			
 
				 
			
 
				 if args.agent_view:
			
 
				     env = RGBImgPartialObsWrapper(env)
			
 
				     env = ImgObsWrapper(env)
			
 
				 
			
 
				-window = Window('gym_minigrid - ' + args.env)
			
 
				+window = Window("gym_minigrid - " + args.env)
			
 
				 window.reg_key_handler(key_handler)
			
 
				 
			
 
				 reset()
			
--- a/py.Dockerfile
+++ b/py.Dockerfile
@@ -0,0 +1,12 @@
 
				+# A Dockerfile that sets up a full Gym install with test dependencies
			
 
				+ARG PYTHON_VERSION
			
 
				+FROM python:$PYTHON_VERSION
			
 
				+
			
 
				+SHELL ["/bin/bash", "-o", "pipefail", "-c"]
			
 
				+
			
 
				+RUN apt-get -y update
			
 
				+
			
 
				+COPY . /usr/local/gym_minigrid/
			
 
				+WORKDIR /usr/local/gym_minigrid/
			
 
				+
			
 
				+RUN pip install .[testing] --no-cache-dir
			
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1,3 @@
 
				+numpy>=1.18.0
			
 
				+gym>=0.25
			
 
				+matplotlib>=3.0
			
--- a/run_tests.py
+++ b/run_tests.py
@@ -1,37 +1,46 @@
 
				 #!/usr/bin/env python3
			
 
				 
			
 
				 import random
			
 
				-import numpy as np
			
 
				+
			
 
				 import gym
			
 
				-import gym_minigrid
			
 
				-from gym_minigrid.register import env_list
			
 
				-from gym_minigrid.minigrid import Grid, OBJECT_TO_IDX
			
 
				+import numpy as np
			
 
				+from gym import spaces
			
 
				 
			
 
				-# Test specifically importing a specific environment
			
 
				-from gym_minigrid.envs import DoorKeyEnv
			
 
				+from gym_minigrid.envs.empty import EmptyEnv5x5
			
 
				+from gym_minigrid.minigrid import Grid
			
 
				+from gym_minigrid.register import env_list
			
 
				+from gym_minigrid.wrappers import (
			
 
				+    DictObservationSpaceWrapper,
			
 
				+    FlatObsWrapper,
			
 
				+    FullyObsWrapper,
			
 
				+    ImgObsWrapper,
			
 
				+    OneHotPartialObsWrapper,
			
 
				+    ReseedWrapper,
			
 
				+    RGBImgObsWrapper,
			
 
				+    RGBImgPartialObsWrapper,
			
 
				+    ViewSizeWrapper,
			
 
				+)
			
 
				 
			
 
				 # Test importing wrappers
			
 
				-from gym_minigrid.wrappers import *
			
 
				 
			
 
				-##############################################################################
			
 
				 
			
 
				-print('%d environments registered' % len(env_list))
			
 
				+print("%d environments registered" % len(env_list))
			
 
				 
			
 
				 for env_idx, env_name in enumerate(env_list):
			
 
				-    print('testing {} ({}/{})'.format(env_name, env_idx+1, len(env_list)))
			
 
				+    print(f"testing {env_name} ({env_idx + 1}/{len(env_list)})")
			
 
				 
			
 
				     # Load the gym environment
			
 
				-    env = gym.make(env_name)
			
 
				+    env = gym.make(env_name, render_mode="rgb_array")
			
 
				     env.max_steps = min(env.max_steps, 200)
			
 
				     env.reset()
			
 
				-    env.render('rgb_array')
			
 
				+    env.render()
			
 
				 
			
 
				     # Verify that the same seed always produces the same environment
			
 
				     for i in range(0, 5):
			
 
				         seed = 1337 + i
			
 
				-        env.seed(seed)
			
 
				+        _ = env.reset(seed=seed)
			
 
				         grid1 = env.grid
			
 
				-        env.seed(seed)
			
 
				+        _ = env.reset(seed=seed)
			
 
				         grid2 = env.grid
			
 
				         assert grid1 == grid2
			
 
				 
			
@@ -50,7 +59,7 @@ for env_idx, env_name in enumerate(env_list):
 
				         assert env.agent_pos[1] < env.height
			
 
				 
			
 
				         # Test observation encode/decode roundtrip
			
 
				-        img = obs['image']
			
 
				+        img = obs["image"]
			
 
				         grid, vis_mask = Grid.decode(img)
			
 
				         img2 = grid.encode(vis_mask=vis_mask)
			
 
				         assert np.array_equal(img, img2)
			
@@ -66,7 +75,7 @@ for env_idx, env_name in enumerate(env_list):
 
				             num_episodes += 1
			
 
				             env.reset()
			
 
				 
			
 
				-        env.render('rgb_array')
			
 
				+        env.render()
			
 
				 
			
 
				     # Test the close method
			
 
				     env.close()
			
@@ -89,7 +98,7 @@ for env_idx, env_name in enumerate(env_list):
 
				     env = FullyObsWrapper(env)
			
 
				     env.reset()
			
 
				     obs, _, _, _ = env.step(0)
			
 
				-    assert obs['image'].shape == env.observation_space.spaces['image'].shape
			
 
				+    assert obs["image"].shape == env.observation_space.spaces["image"].shape
			
 
				     env.close()
			
 
				 
			
 
				     # RGB image observation wrapper
			
@@ -97,7 +106,7 @@ for env_idx, env_name in enumerate(env_list):
 
				     env = RGBImgPartialObsWrapper(env)
			
 
				     env.reset()
			
 
				     obs, _, _, _ = env.step(0)
			
 
				-    assert obs['image'].mean() > 0
			
 
				+    assert obs["image"].mean() > 0
			
 
				     env.close()
			
 
				 
			
 
				     env = gym.make(env_name)
			
@@ -112,20 +121,25 @@ for env_idx, env_name in enumerate(env_list):
 
				     env.step(0)
			
 
				     env.close()
			
 
				 
			
 
				-    # Test the wrappers return proper observation spaces.
			
 
				-    wrappers = [
			
 
				-        RGBImgObsWrapper,
			
 
				-        RGBImgPartialObsWrapper,
			
 
				-        OneHotPartialObsWrapper
			
 
				+    # Test the DictObservationSpaceWrapper
			
 
				+    env = gym.make(env_name)
			
 
				+    env = DictObservationSpaceWrapper(env)
			
 
				+    env.reset()
			
 
				+    mission = env.mission
			
 
				+    obs, _, _, _ = env.step(0)
			
 
				+    assert env.string_to_indices(mission) == [
			
 
				+        value for value in obs["mission"] if value != 0
			
 
				     ]
			
 
				+    env.close()
			
 
				+
			
 
				+    # Test the wrappers return proper observation spaces.
			
 
				+    wrappers = [RGBImgObsWrapper, RGBImgPartialObsWrapper, OneHotPartialObsWrapper]
			
 
				     for wrapper in wrappers:
			
 
				-        env = wrapper(gym.make(env_name))
			
 
				+        env = wrapper(gym.make(env_name, render_mode="rgb_array"))
			
 
				         obs_space, wrapper_name = env.observation_space, wrapper.__name__
			
 
				         assert isinstance(
			
 
				             obs_space, spaces.Dict
			
 
				-        ), "Observation space for {0} is not a Dict: {1}.".format(
			
 
				-            wrapper_name, obs_space
			
 
				-        )
			
 
				+        ), f"Observation space for {wrapper_name} is not a Dict: {obs_space}."
			
 
				         # This should not fail either
			
 
				         ImgObsWrapper(env)
			
 
				         env.reset()
			
@@ -134,30 +148,34 @@ for env_idx, env_name in enumerate(env_list):
 
				 
			
 
				 ##############################################################################
			
 
				 
			
 
				-print('testing extra observations')
			
 
				-class EmptyEnvWithExtraObs(gym_minigrid.envs.EmptyEnv5x5):
			
 
				+print("testing extra observations")
			
 
				+
			
 
				+
			
 
				+class EmptyEnvWithExtraObs(EmptyEnv5x5):
			
 
				     """
			
 
				     Custom environment with an extra observation
			
 
				     """
			
 
				-    def __init__(self) -> None:
			
 
				-        super().__init__()
			
 
				-        self.observation_space['size'] = spaces.Box(
			
 
				+
			
 
				+    def __init__(self, **kwargs) -> None:
			
 
				+        super().__init__(**kwargs)
			
 
				+        self.observation_space["size"] = spaces.Box(
			
 
				             low=0,
			
 
				-            high=np.iinfo(np.uint).max,
			
 
				+            high=1000,  # gym does not like np.iinfo(np.uint).max,
			
 
				             shape=(2,),
			
 
				-            dtype=np.uint
			
 
				+            dtype=np.uint,
			
 
				         )
			
 
				 
			
 
				-    def reset(self):
			
 
				-        obs = super().reset()
			
 
				-        obs['size'] = np.array([self.width, self.height])
			
 
				+    def reset(self, **kwargs):
			
 
				+        obs = super().reset(**kwargs)
			
 
				+        obs["size"] = np.array([self.width, self.height], dtype=np.uint)
			
 
				         return obs
			
 
				 
			
 
				     def step(self, action):
			
 
				         obs, reward, done, info = super().step(action)
			
 
				-        obs['size'] = np.array([self.width, self.height])
			
 
				+        obs["size"] = np.array([self.width, self.height], dtype=np.uint)
			
 
				         return obs, reward, done, info
			
 
				 
			
 
				+
			
 
				 wrappers = [
			
 
				     OneHotPartialObsWrapper,
			
 
				     RGBImgObsWrapper,
			
@@ -165,37 +183,34 @@ wrappers = [
 
				     FullyObsWrapper,
			
 
				 ]
			
 
				 for wrapper in wrappers:
			
 
				-    env1 = wrapper(EmptyEnvWithExtraObs())
			
 
				-    env2 = wrapper(gym.make('MiniGrid-Empty-5x5-v0'))
			
 
				-
			
 
				-    env1.seed(0)
			
 
				-    env2.seed(0)
			
 
				-
			
 
				-    obs1 = env1.reset()
			
 
				-    obs2 = env2.reset()
			
 
				-    assert 'size' in obs1
			
 
				-    assert obs1['size'].shape == (2,)
			
 
				-    assert (obs1['size'] == [5,5]).all()
			
 
				+    env1 = wrapper(EmptyEnvWithExtraObs(render_mode="rgb_array"))
			
 
				+    env2 = wrapper(gym.make("MiniGrid-Empty-5x5-v0", render_mode="rgb_array"))
			
 
				+
			
 
				+    obs1 = env1.reset(seed=0)
			
 
				+    obs2 = env2.reset(seed=0)
			
 
				+    assert "size" in obs1
			
 
				+    assert obs1["size"].shape == (2,)
			
 
				+    assert (obs1["size"] == [5, 5]).all()
			
 
				     for key in obs2:
			
 
				         assert np.array_equal(obs1[key], obs2[key])
			
 
				 
			
 
				     obs1, reward1, done1, _ = env1.step(0)
			
 
				     obs2, reward2, done2, _ = env2.step(0)
			
 
				-    assert 'size' in obs1
			
 
				-    assert obs1['size'].shape == (2,)
			
 
				-    assert (obs1['size'] == [5,5]).all()
			
 
				+    assert "size" in obs1
			
 
				+    assert obs1["size"].shape == (2,)
			
 
				+    assert (obs1["size"] == [5, 5]).all()
			
 
				     for key in obs2:
			
 
				         assert np.array_equal(obs1[key], obs2[key])
			
 
				 
			
 
				 ##############################################################################
			
 
				 
			
 
				-print('testing agent_sees method')
			
 
				-env = gym.make('MiniGrid-DoorKey-6x6-v0')
			
 
				+print("testing agent_sees method")
			
 
				+env = gym.make("MiniGrid-DoorKey-6x6-v0")
			
 
				 goal_pos = (env.grid.width - 2, env.grid.height - 2)
			
 
				 
			
 
				 # Test the "in" operator on grid objects
			
 
				-assert ('green', 'goal') in env.grid
			
 
				-assert ('blue', 'key') not in env.grid
			
 
				+assert ("green", "goal") in env.grid
			
 
				+assert ("blue", "key") not in env.grid
			
 
				 
			
 
				 # Test the env.agent_sees() function
			
 
				 env.reset()
			
@@ -203,8 +218,8 @@ for i in range(0, 500):
 
				     action = random.randint(0, env.action_space.n - 1)
			
 
				     obs, reward, done, info = env.step(action)
			
 
				 
			
 
				-    grid, _ = Grid.decode(obs['image'])
			
 
				-    goal_visible = ('green', 'goal') in grid
			
 
				+    grid, _ = Grid.decode(obs["image"])
			
 
				+    goal_visible = ("green", "goal") in grid
			
 
				 
			
 
				     agent_sees_goal = env.agent_sees(*goal_pos)
			
 
				     assert agent_sees_goal == goal_visible
			
--- a/setup.py
+++ b/setup.py
@@ -11,28 +11,36 @@ with open("README.md") as fh:
 
				         else:
			
 
				             break
			
 
				 
			
 
				+# pytest is pinned to 7.0.1 as this is the last version that supports Python 3.6
			
 
				+extras = {"testing": ["pytest==7.0.1"]}
			
 
				+
			
 
				 setup(
			
 
				-    name='gym_minigrid',
			
 
				+    name="gym_minigrid",
			
 
				     author="Farama Foundation",
			
 
				     author_email="jkterry@farama.org",
			
 
				-    version='1.0.2',
			
 
				-    keywords='memory, environment, agent, rl, gym',
			
 
				-    url='https://github.com/Farama-Foundation/gym-minigrid',
			
 
				-    description='Minimalistic gridworld reinforcement learning environments',
			
 
				-    packages=['gym_minigrid', 'gym_minigrid.envs'],
			
 
				+    classifiers=[
			
 
				+        "Development Status :: 5 - Production/Stable",
			
 
				+        "Programming Language :: Python :: 3",
			
 
				+        "Programming Language :: Python :: 3.6",
			
 
				+        "Programming Language :: Python :: 3.7",
			
 
				+        "Programming Language :: Python :: 3.8",
			
 
				+        "Programming Language :: Python :: 3.9",
			
 
				+        "Programming Language :: Python :: 3.10",
			
 
				+    ],
			
 
				+    version="1.1.0",
			
 
				+    keywords="memory, environment, agent, rl, gym",
			
 
				+    url="https://github.com/Farama-Foundation/gym-minigrid",
			
 
				+    description="Minimalistic gridworld reinforcement learning environments",
			
 
				+    extras_require=extras,
			
 
				+    packages=["gym_minigrid", "gym_minigrid.envs"],
			
 
				+    license="Apache",
			
 
				     long_description=long_description,
			
 
				-    python_requires=">=3.7, <3.11",
			
 
				     long_description_content_type="text/markdown",
			
 
				     install_requires=[
			
 
				-        'gym>=0.24.0',
			
 
				-        "numpy>=1.18.0"
			
 
				+        "gym>=0.25.0",
			
 
				+        "numpy>=1.18.0",
			
 
				+        "matplotlib>=3.0",
			
 
				     ],
			
 
				-    classifiers=[
			
 
				-    "Development Status :: 5 - Production/Stable",
			
 
				-    "Programming Language :: Python :: 3",
			
 
				-    "Programming Language :: Python :: 3.7",
			
 
				-    "Programming Language :: Python :: 3.8",
			
 
				-    "Programming Language :: Python :: 3.9",
			
 
				-    "Programming Language :: Python :: 3.10",
			
 
				-],
			
 
				+    python_requires=">=3.6",
			
 
				+    tests_require=extras["testing"],
			
 
				 )
			
--- a/test_interactive_mode.py
+++ b/test_interactive_mode.py
@@ -1,16 +1,16 @@
 
				 #!/usr/bin/env python3
			
 
				 
			
 
				-import time
			
 
				 import random
			
 
				+import time
			
 
				+
			
 
				 import gym
			
 
				-import gym_minigrid
			
 
				 
			
 
				 # Load the gym environment
			
 
				-env = gym.make('MiniGrid-Empty-8x8-v0')
			
 
				+env = gym.make("MiniGrid-Empty-8x8-v0")
			
 
				 env.reset()
			
 
				 
			
 
				 for i in range(0, 100):
			
 
				-    print("step {}".format(i))
			
 
				+    print(f"step {i}")
			
 
				 
			
 
				     # Pick a random action
			
 
				     action = random.randint(0, env.action_space.n - 1)
			
--- a/test_requirements.txt
+++ b/test_requirements.txt
@@ -0,0 +1 @@
 
				+pytest==7.0.1