浏览代码

r.surf.random: double min max for DCELL and r.mapcalc-style random numbers (#322)

The original code did not allow for floating point numbers
as min and max parameters. The decimal part, if provided, was ignored without
any warning. Now the code accepts doubles for DCELL output and ints for
CELL output. Numbers not fully parseable as ints are now rejected by an
explicit check. (The min and max values travel through the code as doubles
in both cases.)

With the original implementation, the max was could have been exceeded for CELL
maps due to the equation used. On the other hand, the min might not have been reached
with the similar probability as max could have been exceeded for ranges
which contained zero.

For example, in 7.6.1 using 10,000,0000 cells and
-i (CELL output), parameters min=2 max=15 produce histogram (r.report u=p,c)
with bins 2-15 approximatelly 7.14-15% and bin 16 with 2-4 cells.
Same setup with min=-2 max=13 produces similar bin 14 with 0-5 cells,
bin -2 is 0 or 1 cell, bins -1,1-15 6.25% and bin 0 12.50%.

Now using the method r.mapcalc is using for rand()
implemented in f_rand() in xrand.c which does not suffer from the same issues.
For consistency, using the same implementation also for DCELL,
so now both CELL and DCELL options are now using functions from gis.h:
G_mrand48() and G_drand48(). Consequently, the gmath.h dependency
was removed (header file includes and in the Makefile) together with unused math.h.

Additionally to enforcing the int, a parameter check also enforces that min <= max
since min > max seems like a input error. (This is different from r.mapcalc which
flips the values, but there we assume less control over the input and it is too
late to report a message, so it is appropriate and more practical to be more
permissive.) min == max is allowed and not treated in a special way.

Raster history is now being written as well as a map title which contains
the range with interval written differently for CELL and DCELL as
the CELL contains max while DCELL does not (based on [0,1) from G_drand48()).

Test is provided for the min-max comparison and int-double check.
A larger test provided for the generated values as well, although
even with a larger number of cells the desired result might not achieved.
A fixed GRASS_RANDOM_SEED is provided for stability (but it may not be the
one making a future problem to show, although it does for 7.6.2).
Also the exceeded max is much easier to catch than the not reached min,
but both range and closeness are checked for both CELL and DCELL.
Vaclav Petras 3 年之前
父节点
当前提交
866c7f057a

+ 2 - 2
raster/r.surf.random/Makefile

@@ -2,8 +2,8 @@ MODULE_TOPDIR = ../..
 
 
 PGM = r.surf.random
 PGM = r.surf.random
 
 
-LIBES = $(GMATHLIB) $(RASTERLIB) $(GISLIB)
-DEPENDENCIES = $(GMATHDEP) $(RASTERDEP) $(GISDEP)
+LIBES = $(RASTERLIB) $(GISLIB)
+DEPENDENCIES = $(RASTERDEP) $(GISDEP)
 
 
 include $(MODULE_TOPDIR)/include/Make/Module.make
 include $(MODULE_TOPDIR)/include/Make/Module.make
 
 

+ 1 - 1
raster/r.surf.random/local_proto.h

@@ -1,2 +1,2 @@
 /* randsurf.c */
 /* randsurf.c */
-int randsurf(char *, int, int, int);
+int randsurf(char *, double, double, int);

+ 88 - 6
raster/r.surf.random/main.c

@@ -5,8 +5,9 @@
  * AUTHOR(S):    Jo Wood, 19th October, 23rd October 1991 (original contributor)
  * AUTHOR(S):    Jo Wood, 19th October, 23rd October 1991 (original contributor)
  *               Midlands Regional Research Laboratory (ASSIST)
  *               Midlands Regional Research Laboratory (ASSIST)
  * AUTHOR(S):    Markus Neteler <neteler itc.it> (original contributor)
  * AUTHOR(S):    Markus Neteler <neteler itc.it> (original contributor)
+ *               Vaclav Petras <wenzeslaus gmail com>
  * PURPOSE:      produces a raster map layer of uniform random deviates
  * PURPOSE:      produces a raster map layer of uniform random deviates
- * COPYRIGHT:    (C) 1999-2006, 2010 by the GRASS Development Team
+ * COPYRIGHT:    (C) 1999-2020 by the GRASS Development Team
  *
  *
  *               This program is free software under the GNU General Public
  *               This program is free software under the GNU General Public
  *               License (>=v2). Read the file COPYING that comes with GRASS
  *               License (>=v2). Read the file COPYING that comes with GRASS
@@ -15,10 +16,50 @@
  *****************************************************************************/
  *****************************************************************************/
 
 
 #include <stdlib.h>
 #include <stdlib.h>
+#include <string.h>
+
 #include <grass/gis.h>
 #include <grass/gis.h>
 #include <grass/glocale.h>
 #include <grass/glocale.h>
+#include <grass/raster.h>
 #include "local_proto.h"
 #include "local_proto.h"
 
 
+
+/** Return TRUE if text contains only an integer number
+ *
+ * @param buffer text with the number or other content
+ * @returns TRUE if all chars are read as an int, FALSE otherwise
+ */
+int is_int_only(const char *buffer)
+{
+    int unused_value;
+    int num_items;
+    int num_characters;
+
+    /* %n is the number of characters read so far */
+    num_items = sscanf(buffer, "%d%n", &unused_value, &num_characters);
+    /* strlen returns size_t, but %n is int */
+    if (num_items == 1 && num_characters == (int)strlen(buffer)) {
+        return TRUE;
+    }
+    return FALSE;
+}
+
+/** Issue a fatal error if the option value is not integer
+ *
+ * This catches the cases when option is readble as integer,
+ * but there would be additional characters left.
+ * For example, when a number with a decimal point is read by C
+ * functions, the decimal part is simply truncated and an integer is
+ * read without an error. Although this might be fine in some cases,
+ * here it can lead to different results as well as to unclear metadata.
+ */
+void option_must_be_int(struct Option *option)
+{
+    if (!is_int_only(option->answer))
+        G_fatal_error(_("Option %s must be an integer, <%s> provided"),
+            option->key, option->answer);
+}
+
 int main(int argc, char *argv[])
 int main(int argc, char *argv[])
 {
 {
 
 
@@ -29,6 +70,11 @@ int main(int argc, char *argv[])
     struct Option *min;
     struct Option *min;
     struct Option *max;
     struct Option *max;
     struct Flag *i_flag;
     struct Flag *i_flag;
+    double min_value;
+    double max_value;
+
+    struct History history;
+    char title[64];
 
 
     G_gisinit(argv[0]);
     G_gisinit(argv[0]);
 
 
@@ -44,13 +90,13 @@ int main(int argc, char *argv[])
     min = G_define_option();
     min = G_define_option();
     min->key = "min";
     min->key = "min";
     min->description = _("Minimum random value");
     min->description = _("Minimum random value");
-    min->type = TYPE_INTEGER;
+    min->type = TYPE_DOUBLE;
     min->answer = "0";
     min->answer = "0";
 
 
     max = G_define_option();
     max = G_define_option();
     max->key = "max";
     max->key = "max";
     max->description = _("Maximum random value");
     max->description = _("Maximum random value");
-    max->type = TYPE_INTEGER;
+    max->type = TYPE_DOUBLE;
     max->answer = "100";
     max->answer = "100";
 
 
     i_flag = G_define_flag();
     i_flag = G_define_flag();
@@ -60,10 +106,46 @@ int main(int argc, char *argv[])
     if (G_parser(argc, argv))
     if (G_parser(argc, argv))
 	exit(EXIT_FAILURE);
 	exit(EXIT_FAILURE);
 
 
-    randsurf(out->answer, atoi(min->answer), atoi(max->answer),
-	     i_flag->answer);
+    min_value = atof(min->answer);
+    max_value = atof(max->answer);
+
+    /* We disallow max=5.5 for integer output since there are unclear
+     * expectations on what it should do. */
+    if (i_flag->answer) {
+        option_must_be_int(min);
+        option_must_be_int(max);
+    }
+
+    /* We disallow min > max as a likely mistake, but we allow
+     * min == max as a possible extreme case. */
+    if (min_value > max_value) {
+        /* showing the not parsed numbers to show exactly what user
+         * provided and to avoid any issues with formating %f vs %d */
+        G_fatal_error(_("Minimum %s should be higher than maximum %s,"
+                        " but %s > %s"),
+                        min->key, max->key, min->answer, max->answer);
+    }
+
+    randsurf(out->answer, min_value, max_value, i_flag->answer);
+
+    /* Using user-provided strings instead of attempting to guess the
+     * right formatting. */
+    if (i_flag->answer) {
+        sprintf(title,
+            _("Uniform random integer values in range [%s, %s]"),
+            min->answer, max->answer);
+    }
+    else {
+        sprintf(title,
+            _("Uniform random float values in range [%s, %s)"),
+            min->answer, max->answer);
+    }
+    Rast_put_cell_title(out->answer, title);
+    Rast_short_history(out->answer, "raster", &history);
+    Rast_command_history(&history);
+    Rast_write_history(out->answer, &history);
 
 
     G_done_msg(_("Raster map <%s> created."), out->answer);
     G_done_msg(_("Raster map <%s> created."), out->answer);
-    
+
     exit(EXIT_SUCCESS);
     exit(EXIT_SUCCESS);
 }
 }

+ 19 - 13
raster/r.surf.random/randsurf.c

@@ -1,14 +1,19 @@
-#include <math.h>
 #include <grass/gis.h>
 #include <grass/gis.h>
 #include <grass/raster.h>
 #include <grass/raster.h>
-#include <grass/gmath.h>
 #include <grass/glocale.h>
 #include <grass/glocale.h>
 
 
 
 
-int randsurf(char *out,		/* Name of raster maps to be opened.    */
-	     int min, int max,	/* Minimum and maximum cell values.     */
+/** Generate random values in a raster map
+ *
+ * @param out Name of raster maps to be opened
+ * @param min Minimum cell value
+ * @param min Maximum cell value
+ * @param int_map TRUE for a CELL map, FALSE for DCELL
+ */
+int randsurf(char *out,
+	     double min, double max,
 	     int int_map)
 	     int int_map)
-{				/* if map is to be written as a CELL map */
+{
     int nrows, ncols;		/* Number of cell rows and columns      */
     int nrows, ncols;		/* Number of cell rows and columns      */
 
 
     DCELL *row_out_D;		/* Buffer just large enough to hold one */
     DCELL *row_out_D;		/* Buffer just large enough to hold one */
@@ -21,7 +26,7 @@ int randsurf(char *out,		/* Name of raster maps to be opened.    */
 
 
 	/****** INITIALISE RANDOM NUMBER GENERATOR ******/
 	/****** INITIALISE RANDOM NUMBER GENERATOR ******/
     /* You can set GRASS_RANDOM_SEED for repeatability */
     /* You can set GRASS_RANDOM_SEED for repeatability */
-    G_math_srand_auto();
+    G_srand48_auto();
 
 
 	/****** OPEN CELL FILES AND GET CELL DETAILS ******/
 	/****** OPEN CELL FILES AND GET CELL DETAILS ******/
     fd_out = Rast_open_new(out, int_map ? CELL_TYPE : DCELL_TYPE);
     fd_out = Rast_open_new(out, int_map ? CELL_TYPE : DCELL_TYPE);
@@ -38,14 +43,15 @@ int randsurf(char *out,		/* Name of raster maps to be opened.    */
     for (row_count = 0; row_count < nrows; row_count++) {
     for (row_count = 0; row_count < nrows; row_count++) {
 	G_percent(row_count, nrows, 2);
 	G_percent(row_count, nrows, 2);
 	for (col_count = 0; col_count < ncols; col_count++) {
 	for (col_count = 0; col_count < ncols; col_count++) {
-	    if (int_map)
-		*(row_out_C + col_count) =
-		    (CELL) (G_math_rand() * (max + 1 - min) + min);
-	    /* under represents first and last bin */
-	    /*                  *(row_out_C + col_count) = (CELL) floor(rand1(2742)*(max-min)+min +0.5); */
-	    else
+	    if (int_map) {
+		unsigned int x = (unsigned int)G_mrand48();
+                *(row_out_C + col_count) =
+		    (CELL) (min + x % (unsigned int)(max + 1 - min));
+	    }
+	    else {
 		*(row_out_D + col_count) =
 		*(row_out_D + col_count) =
-		    (DCELL) (G_math_rand() * (max - min) + min);
+		    (DCELL) (min + G_drand48() * (max - min));
+	    }
 	}
 	}
 
 
 	/* Write contents row by row */
 	/* Write contents row by row */

+ 121 - 0
raster/r.surf.random/testsuite/test_min_max.py

@@ -0,0 +1,121 @@
+#!/usr/bin/env python3
+
+"""
+MODULE:    Test of r.surf.random
+
+AUTHOR(S): Vaclav Petras <wenzeslaus gmail com>
+
+PURPOSE:   Test of min and max paramters
+
+COPYRIGHT: (C) 2020 by Vaclav Petras and the GRASS Development Team
+
+This program is free software under the GNU General Public
+License (>=v2). Read the file COPYING that comes with GRASS
+for details.
+"""
+
+import os
+
+import grass.script as gs
+
+from grass.gunittest.case import TestCase
+from grass.gunittest.main import test
+
+
+def get_raster_min_max(raster_map):
+    """Get minimum and maximum value from raster metadata as a tuple"""
+    info = gs.raster_info(raster_map)
+    return info["min"], info["max"]
+
+
+class MinMaxTestCase(TestCase):
+    """Test min and max of r.surf.random module"""
+
+    # Raster map name be used as output
+    output = "random_result"
+
+    @classmethod
+    def setUpClass(cls):
+        """Ensures expected computational region"""
+        os.environ["GRASS_RANDOM_SEED"] = "42"
+        # modfying region just for this script
+        cls.use_temp_region()
+        # Only 100,000,000 seem to resonably (not 100%) ensure that all values
+        # are generated, so exceeding of ranges actually shows up.
+        cls.runModule("g.region", rows=10000, cols=10000)
+
+    @classmethod
+    def tearDownClass(cls):
+        """Remove the temporary region"""
+        cls.del_temp_region()
+
+    def tearDown(self):
+        """Remove the output created from the module"""
+        self.runModule("g.remove", flags="f", type="raster", name=[self.output])
+
+    def test_min_max_double(self):
+        """Check to see if double output has the expected range"""
+        min_value = -3.3
+        max_value = 5.8
+        # arbitrary, but with more cells, we expect higher precision
+        precision = 0.00001
+        self.assertModule(
+            "r.surf.random", min=min_value, max=max_value, output=self.output
+        )
+        self.assertRasterExists(self.output, msg="Output was not created")
+        self.assertRasterMinMax(
+            map=self.output,
+            refmin=min_value,
+            refmax=max_value,
+            msg="Output exceeds the min and max values from parameters",
+        )
+        self.assertRasterFitsInfo(
+            raster=self.output,
+            reference=dict(min=min_value, max=max_value),
+            precision=precision,
+            msg="Output min and max too far from parameters",
+        )
+
+    def test_min_max_int(self):
+        """Check to see if integer output has the expected range"""
+        # GRASS_RANDOM_SEED=42 r.surf.random output=test min=-2 max=13 -i
+        # in 7.6.2 causes all no 2 bin and extra 14 bin (also doubles 0).
+        min_value = -2
+        max_value = 13
+        precision = 0
+        self.assertModule(
+            "r.surf.random", min=min_value, max=max_value, output=self.output, flags="i"
+        )
+        self.assertRasterExists(self.output, msg="Output was not created")
+        self.assertRasterMinMax(
+            map=self.output,
+            refmin=min_value,
+            refmax=max_value,
+            msg="Output exceeds the min and max values from parameters",
+        )
+        self.assertRasterFitsInfo(
+            raster=self.output,
+            reference=dict(min=min_value, max=max_value),
+            precision=precision,
+            msg="Output min and max too far from parameters",
+        )
+
+    def test_double_params_with_int(self):
+        """Check if doubles instead of ints are refused"""
+        min_value = -3.3
+        max_value = 5.8
+        self.assertModuleFail(
+            "r.surf.random", min=min_value, max=max_value, output=self.output, flags="i"
+        )
+
+    def test_min_greater_than_max(self):
+        """Check if minimum greater than maximum is refused"""
+        min_value = 10
+        max_value = 5.8
+        self.assertModuleFail(
+            "r.surf.random", min=min_value, max=max_value, output=self.output
+        )
+
+
+if __name__ == "__main__":
+    test()