
grass.jupyter: Return session from init (#1834)

Calling grass.script.setup.init needs to be paired with a call to finish. However, given the iterative nature of working in a notebook, an explicit call to finish would have to be commented out in most notebooks, with a note that nothing works once it is called and an explanation that it still needs to be called at some point (and that would be confusing).

Several common solutions don't work well with notebooks. A context manager would have to be limited to one call, which is exactly what we don't want for a session which should be available in the whole notebook. atexit does not or may not work in Jupyter (IPython kernel). It didn't work for me locally, and the Internet is full of various notes about it not working in certain cases or versions.

Therefore, the new session object returned from grass.jupyter.setup.init uses the weakref.finalize technique (aka finalizer) to call the finish procedure. This technique is used, e.g., in the Python standard library temporary directory object. The finalize technique guarantees that the "cleanup" method will be called (unlike the `__del__` method, which is not guaranteed to be called). This works together with keeping a reference to one instance of the global session in the module (keeping the session reference by the user is optional). The finish function is called at kernel end or restart unless called manually before that (multiple finish (finalizer) calls are okay and ignored).
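As an illustration of the finalizer technique described above, here is a minimal sketch that does not touch GRASS at all. The `Session` class, the `finish` stand-in function, and the `calls` list are hypothetical names made up for this example; only `weakref.finalize` and its `alive` attribute come from the Python standard library.

```python
import weakref

calls = []  # records every time the stand-in finish procedure runs


def finish():
    """Stand-in for the real finish procedure (hypothetical)."""
    calls.append("finish")


class Session:
    """Hypothetical session handle whose cleanup is driven by a finalizer."""

    def __init__(self):
        # The callback must not hold a reference to self, otherwise the
        # object could never become unreachable.
        self._finalizer = weakref.finalize(self, finish)

    def finish(self):
        # Calling the finalizer object runs the callback at most once.
        self._finalizer()

    @property
    def active(self):
        # alive is True until the finalizer has been called.
        return self._finalizer.alive


session = Session()
assert session.active
session.finish()
session.finish()  # the second call is ignored
```

After the two `finish` calls, `calls` contains a single entry and `session.active` is false, which matches the "multiple calls are okay and ignored" behavior described in the message.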

The documentation does not discuss the finalizer and garbage collection issues in detail, because it works in an expected and straightforward way, although the reasons for it are complex.

The new class represents a global GRASS session and the init function keeps a module-level private global reference to one session only. The session class is used as a singleton, although not strictly (new objects are sometimes created) and without any enforcement (nothing in the code prevents the creation of more objects). However, the init function, the documentation, and the class name starting with an underscore support the right use of the class.

The session object is always created only by the init function. If the session object does not exist yet, it is created. If the session was finished (not active), a new session object is created. If the session exists and is active, the init function switches the mapset.
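The three cases above can be sketched with plain Python objects. This is only a simplified model under stated assumptions: `_Session`, `_global_session`, and the one-argument `init` are hypothetical stand-ins, and the real function also starts the GRASS runtime, which is left out here.

```python
class _Session:
    """Hypothetical stand-in for the session class (no GRASS involved)."""

    def __init__(self, mapset):
        self.active = True
        self.mapset = mapset

    def switch_mapset(self, mapset):
        self.mapset = mapset

    def finish(self):
        self.active = False


_global_session = None  # module-level reference to the one session


def init(mapset):
    """Create a session if none is active, otherwise switch the mapset."""
    global _global_session
    if _global_session is None or not _global_session.active:
        # No session yet, or the previous one was finished: create a new one.
        _global_session = _Session(mapset)
    else:
        # An active session exists: reuse it and only switch the mapset.
        _global_session.switch_mapset(mapset)
    return _global_session


first = init("user1")
second = init("PERMANENT")  # active session: same object, mapset switched
same_object = first is second
first.finish()
third = init("user1")  # finished session: a new object is created
```

Here `first` and `second` are the same object, while `third` is a fresh one because the earlier session was finished before the last call.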

In other words, the init function can be called more than once and the user doesn't need to keep the returned reference. However, the returned reference to the session object can be used to handle the session. The same could be done with standalone functions, but this may be good for consistency with (future version of) the grass.script.setup.init function which needs a similar session object to provide context manager functionality.

The session object takes care of switching the mapset, although there is nothing in the object needed to do that since the state is global anyway. The session object can also be used to finish (terminate) the session explicitly (mapset part only, runtime stays the same). Users can hold the result of init, e.g., `session = init(...)`, but do not have to. Notebooks were updated to use `session = init(...)`, but the session is mostly not used further.

Not assigning the reference to a variable results in printing of `<grass.jupyter.setup._JupyterGlobalSession at ...>` after the cell because the call to init is usually the last item in the cell. This is not pretty, but it is common enough not to create confusion. Notebooks can always assign the result to a variable to avoid that (although the variable is then unused).

The `switch_mapset` method of the session object takes the same parameters as init, i.e., it can do the mapset path versus database path, location name, mapset name resolution. Additionally, it also accepts just a mapset name to change the mapset within the current location. It uses g.gisenv (and not g.mapset) to switch the mapset because grass.script.setup.init doesn't lock the mapset (while g.mapset does, so that would create inconsistency).
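The parameter resolution described above can be approximated without GRASS. The sketch below is a simplified model, not the code from this commit: `resolve_target` is a hypothetical helper, a mapset is assumed to exist when its directory exists, and a single argument is only treated as a mapset name when it contains no path separators.

```python
import tempfile
from pathlib import Path


def resolve_target(current, path, location=None, mapset=None):
    """Return (database, location, mapset) for a switch request.

    current is a dict with GISDBASE and LOCATION_NAME keys describing
    the current session (hypothetical, mirrors the g.gisenv variables).
    """
    if location and mapset:
        # All three parts given: path is the database directory.
        return path, location, mapset
    if location or mapset:
        raise ValueError("Provide both location and mapset, or neither")
    # One argument only: first try it as a mapset name in the current location.
    is_plain_name = Path(path).name == path
    candidate = Path(current["GISDBASE"]) / current["LOCATION_NAME"] / path
    if is_plain_name and candidate.is_dir():
        return current["GISDBASE"], current["LOCATION_NAME"], path
    # Otherwise treat it as a full path: database/location/mapset.
    full = Path(path)
    return str(full.parent.parent), full.parent.name, full.name


# Build a throwaway directory tree that looks like database/location/mapset.
db = Path(tempfile.mkdtemp())
(db / "loc" / "PERMANENT").mkdir(parents=True)
(db / "loc" / "user1").mkdir()
current = {"GISDBASE": str(db), "LOCATION_NAME": "loc"}

by_name = resolve_target(current, "PERMANENT")
by_path = resolve_target(current, str(db / "loc" / "user1"))
by_parts = resolve_target(current, str(db), "loc", "user1")
```

All three call styles end up with the same kind of (database, location, mapset) triple, which is what allows the switch itself to be a handful of g.gisenv calls.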
Vaclav Petras, 3 years ago
Parent
Current commit
45c1e7f47a

+ 1 - 1
doc/notebooks/basic_example_grass_jupyter.ipynb

@@ -46,7 +46,7 @@
     "import grass.jupyter as gj\n",
     "\n",
     "# Start GRASS Session\n",
-    "gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")"
+    "session = gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")"
    ]
   },
   {

+ 55 - 20
doc/notebooks/grass_jupyter.ipynb

@@ -40,7 +40,7 @@
     "import grass.jupyter as gj\n",
     "\n",
     "# Start GRASS Session\n",
-    "gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")\n",
+    "session = gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")\n",
     "\n",
     "# Set computational region to the elevation raster.\n",
     "gs.run_command(\"g.region\", raster=\"elevation\")"
@@ -293,66 +293,101 @@
    "metadata": {},
    "source": [
     "Now, render a 3D visualization of an elevation raster as a surface colored using, again, the elevation raster:"
-   ],
-   "metadata": {}
+   ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "metadata": {},
+   "outputs": [],
    "source": [
     "img.render(elevation_map=\"elevation\", color_map=\"elevation\", perspective=20)"
-   ],
-   "outputs": [],
-   "metadata": {}
+   ]
   },
   {
    "cell_type": "markdown",
+   "metadata": {},
    "source": [
     "To add a raster legend on the image as an overlay using the 2D rendering capabilities accessible with `overlay.d_legend`:"
-   ],
-   "metadata": {}
+   ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "metadata": {},
+   "outputs": [],
    "source": [
     "img.overlay.d_legend(raster=\"elevation\", at=(60, 97, 87, 92))"
-   ],
-   "outputs": [],
-   "metadata": {}
+   ]
   },
   {
    "cell_type": "markdown",
+   "metadata": {},
    "source": [
     "Finally, we show "
-   ],
-   "metadata": {}
+   ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "metadata": {},
+   "outputs": [],
    "source": [
     "img.show()"
-   ],
-   "outputs": [],
-   "metadata": {}
+   ]
   },
   {
    "cell_type": "markdown",
+   "metadata": {},
    "source": [
     "Now, let's color the elevation surface using a landuse raster (note that the call to `render` removes the result of the previous `render` as well as the current overlays):"
-   ],
-   "metadata": {}
+   ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "metadata": {},
+   "outputs": [],
    "source": [
     "img.render(elevation_map=\"elevation\", color_map=\"landuse\", perspective=20)\n",
     "img.show()"
-   ],
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Switching Mapsets and Session Management\n",
+    "\n",
+    "The `init` function returns a reference to a session object which can be used to manipulate the current session. The session is global, i.e., the global state of the environment is changed. The session object is a handle for accessing this global session. When the kernel for the notebooks shuts down or is restarted, the session ends automatically. The session can be explicitly ended using `session.finish()`, but that's usually not needed in notebooks.\n",
+    "\n",
+    "Additionally, the session object can be used to change the current mapset. Here, we will switch to mapset called *PERMANENT*:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
    "outputs": [],
-   "metadata": {}
+   "source": [
+    "session.switch_mapset(\"PERMANENT\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we could add more data to the PERMANENT mapset or modify the existing data there. We don't need to do anything there, so we switch back to the mapset we were in before:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "session.switch_mapset(\"user1\")"
+   ]
   }
  ],
  "metadata": {

+ 1 - 1
doc/notebooks/hydrology.ipynb

@@ -43,7 +43,7 @@
     "import grass.jupyter as gj\n",
     "\n",
     "# Start GRASS Session\n",
-    "gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")\n",
+    "session = gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")\n",
     "\n",
     "# Set computational region to elevation raster\n",
     "gs.run_command(\"g.region\", raster=\"elevation\", flags=\"pg\")"

+ 1 - 1
doc/notebooks/solar_potential.ipynb

@@ -42,7 +42,7 @@
     "import grass.jupyter as gj\n",
     "\n",
     "# Start GRASS Session\n",
-    "gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")\n",
+    "session = gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")\n",
     "\n",
     "# Set computational region to elevation raster\n",
     "gs.run_command(\"g.region\", raster=\"elevation@PERMANENT\", flags=\"pg\")"

+ 1 - 1
doc/notebooks/viewshed_analysis.ipynb

@@ -44,7 +44,7 @@
     "import grass.jupyter as gj\n",
     "\n",
     "# Start GRASS Session\n",
-    "gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")\n",
+    "session = gj.init(\"../../data/grassdata\", \"nc_basic_spm_grass7\", \"user1\")\n",
     "\n",
     "# Set computational region to elevation raster\n",
     "gs.run_command(\"g.region\", raster=\"elevation@PERMANENT\", flags=\"pg\")"

+ 137 - 15
python/grass/jupyter/setup.py

@@ -1,24 +1,29 @@
 # MODULE:    grass.jupyter.setup
 #
 # AUTHOR(S): Caitlin Haedrich <caitlin DOT haedrich AT gmail>
+#            Vaclav Petras <wenzeslaus gmail com>
 #
 # PURPOSE:   This module contains functions for launching a GRASS session
-#           in Jupyter Notebooks
+#            in Jupyter Notebooks
 #
-# COPYRIGHT: (C) 2021 Caitlin Haedrich, and by the GRASS Development Team
+# COPYRIGHT: (C) 2021-2022 Caitlin Haedrich, and by the GRASS Development Team
 #
-#           This program is free software under the GNU General Public
-#           License (>=v2). Read the file COPYING that comes with GRASS
-#           for details.
+#            This program is free software under the GNU General Public
+#            License (>=v2). Read the file COPYING that comes with GRASS
+#            for details.
+
+"""Initialization GRASS GIS session and its finalization"""
 
 import os
+import weakref
 
 import grass.script as gs
 import grass.script.setup as gsetup
 
 
 def _set_notebook_defaults():
-    """
+    """Set defaults appropriate for Jupyter Notebooks.
+
     This function sets several GRASS environment variables that are
     important for GRASS to run smoothly in Jupyter.
 
@@ -33,16 +38,133 @@ def _set_notebook_defaults():
     os.environ["GRASS_OVERWRITE"] = "1"
 
 
-def init(path, location=None, mapset=None, grass_path=None):
+class _JupyterGlobalSession:
+    """Represents a global GRASS session for Jupyter Notebooks.
+
+    Do not create objects of this class directly. Use the standalone *init* function
+    and an object will be returned to you, e.g.:
+
+    >>> import grass.jupyter as gj
+    >>> session = gj.init(...)
+
+    An object ends the session when it is destroyed or when the *finish* method is
+    called explicitly.
+
+    Notably, only the mapset is closed, but the libraries and GRASS modules
+    remain on path.
     """
-    This function initiates a GRASS session and sets GRASS
-    environment variables.
 
-    :param str path: path to grass databases
-    :param str location: name of GRASS location
+    def __init__(self):
+        self._finalizer = weakref.finalize(self, gsetup.finish)
+
+    def switch_mapset(self, path, location=None, mapset=None):
+        """Switch to a mapset provided as a name or path.
+
+        The mapset can be provided as a name, as a path,
+        or as database, location, and mapset.
+        Specifically, the *path* positional-only parameter can be either
+        name of a mapset in the current location or a full path to a mapset.
+        When location and mapset are provided using the additional parameters,
+        the *path* parameter is path to a database.
+
+        Raises ValueError if the mapset does not exist (e.g., when the name is
+        misspelled or the mapset is invalid).
+        """
+        # The method could be a function, but this is more general (would work even for
+        # a non-global session).
+        # pylint: disable=no-self-use
+        # Functions needed only here.
+        # pylint: disable=import-outside-toplevel
+        from grass.grassdb.checks import (
+            get_mapset_invalid_reason,
+            is_mapset_valid,
+            mapset_exists,
+        )
+        from grass.grassdb.manage import resolve_mapset_path
+
+        # For only one parameter, try if it is a mapset in the current location to
+        # support switching only by its name.
+        gisenv = gs.gisenv()
+        if (
+            not location
+            and not mapset
+            and mapset_exists(
+                path=gisenv["GISDBASE"], location=gisenv["LOCATION_NAME"], mapset=path
+            )
+        ):
+            gs.run_command("g.gisenv", set=f"MAPSET={path}")
+            return
+
+        mapset_path = resolve_mapset_path(path=path, location=location, mapset=mapset)
+        if not is_mapset_valid(mapset_path):
+            raise ValueError(
+                _("Mapset {path} is not valid: {reason}").format(
+                    path=mapset_path.path,
+                    reason=get_mapset_invalid_reason(
+                        mapset_path.directory, mapset_path.location, mapset_path.mapset
+                    ),
+                )
+            )
+        # This requires direct session file modification using g.gisenv because
+        # g.mapset locks the mapset which is not how init and finish behave.
+        # For code simplicity, we just change all even when only mapset is changed.
+        gs.run_command("g.gisenv", set=f"GISDBASE={mapset_path.directory}")
+        gs.run_command("g.gisenv", set=f"LOCATION_NAME={mapset_path.location}")
+        gs.run_command("g.gisenv", set=f"MAPSET={mapset_path.mapset}")
+
+    def finish(self):
+        """Close the session, i.e., close the open mapset.
+
+        Subsequent calls to GRASS GIS modules will fail because there will be
+        no current (open) mapset anymore.
+
+        The finish procedure is done automatically when process finishes or the object
+        is destroyed.
+        """
+        self._finalizer()
+
+    @property
+    def active(self):
+        """True unless the session was finalized (e.g., with the *finish* function)"""
+        return self._finalizer.alive
+
+
+_global_session_handle = None
+
+
+def init(path, location=None, mapset=None, grass_path=None):
+    """Initiates a GRASS session and sets GRASS environment variables.
+
+    Calling this function returns an object which represents the session.
+
+    >>> import grass.jupyter as gj
+    >>> session = gj.init(...)
+
+    The session is ended when `session.finish` is called or when the object is
+    destroyed when kernel ends or restarts. This function returns a copy of an
+    internally kept reference, so the return value can be safely ignored when not
+    needed.
+
+    The returned object can be used to switch to another mapset:
+
+    >>> session.switch_mapset("mapset_name")
+
+    Subsequent calls to the *init* function result in switching the mapset if
+    a session is active and result in creation of new session if it is not active.
+    On the other hand, if you see ``GISRC - variable not set`` after calling
+    a GRASS module, you know you don't have an active GRASS session.
+
+    :param str path: path to GRASS mapset or database
+    :param str location: name of GRASS location within the database
     :param str mapset: name of mapset within location
     """
-    # Create a GRASS GIS session.
-    gsetup.init(path, location=location, mapset=mapset, grass_path=grass_path)
-    # Set GRASS env. variables
-    _set_notebook_defaults()
+    global _global_session_handle  # pylint: disable=global-statement
+    if not _global_session_handle or not _global_session_handle.active:
+        # Create a GRASS session.
+        gsetup.init(path, location=location, mapset=mapset, grass_path=grass_path)
+        # Set defaults for environmental variables and library.
+        _set_notebook_defaults()
+        _global_session_handle = _JupyterGlobalSession()
+    else:
+        _global_session_handle.switch_mapset(path, location=location, mapset=mapset)
+    return _global_session_handle