- <h2>DESCRIPTION</h2>
- <em>i.cluster</em>
- performs the first pass in the GRASS two-pass unsupervised
- classification of imagery, while the GRASS program <em>
- <a href="i.maxlik.html">i.maxlik</a></em> executes
- the second pass. Both programs must be run to complete the unsupervised
- classification.
- <p>
- <em>i.cluster</em> is a clustering algorithm that reads
- through the (raster) imagery data and builds pixel clusters
- based on the spectral reflectances of the pixels (see Figure).
- The pixel clusters are imagery categories that can be related
- to land cover types on the ground. The spectral
- distributions of the clusters (which will be the land cover
- spectral signatures) are influenced by six parameters set
- by the user. The first parameter set by the user is the
- initial number of clusters to be discriminated.
- <p>
- <center>
- <img src="landsat_cluster.png" border=1><br>
- <table border=0 width=590>
- <tr><td><center>
- <i>Fig.: Land use/land cover clustering of LANDSAT scene (simplified)</i>
- </center></td></tr>
- </table>
- </center>
- <p>
- <em>i.cluster</em> starts by generating spectral signatures
- for this number of clusters and "attempts" to end up with
- this number of clusters during the clustering process. The
- resulting number of clusters and their spectral
- distributions, however, are also influenced by the range of
- the spectral values (category values) in the image files
- and the other parameters set by the user. These parameters
- are: the minimum cluster size, minimum cluster separation,
- the percent convergence, the maximum number of iterations,
- and the row and column sampling intervals.
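- <p>
- As an illustration of how the row and column sampling intervals relate
- to image size, the following minimal Python sketch picks intervals so
- that roughly 10,000 pixels are sampled. The helper names
- <tt>default_sample_intervals</tt> and <tt>sampled_pixel_count</tt> are
- hypothetical, and the integer-division rule is an assumption made for
- illustration; it is not taken from the <em>i.cluster</em> source code.

```python
import math


def default_sample_intervals(rows, cols, target_pixels=10_000):
    """Hypothetical helper: choose (row_interval, col_interval).

    Assumption: aim for about sqrt(target_pixels) samples along each
    axis and never use an interval smaller than 1.
    """
    per_axis = math.isqrt(target_pixels)  # 100 when target_pixels is 10,000
    row_interval = max(1, rows // per_axis)
    col_interval = max(1, cols // per_axis)
    return row_interval, col_interval


def sampled_pixel_count(rows, cols, row_interval, col_interval):
    """Count the pixels visited when every n-th row and column is read."""
    n_rows = len(range(0, rows, row_interval))
    n_cols = len(range(0, cols, col_interval))
    return n_rows * n_cols


if __name__ == "__main__":
    intervals = default_sample_intervals(1000, 2000)
    print(intervals, sampled_pixel_count(1000, 2000, *intervals))
```

- <p>
- Under these assumptions, an image smaller than 100 by 100 cells is
- sampled completely, because both intervals fall back to 1.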
- <p>
- The cluster spectral signatures that result are composed of
- cluster means and covariance matrices. These cluster means
- and covariance matrices are used in the second pass
- (<em><a href="i.maxlik.html">i.maxlik</a></em>) to
- classify the image. The resulting clusters, or spectral classes,
- can be related to land cover types on the ground.
- <p>
- The user has to specify the name of the group file, the name of the
- subgroup file, the name of a file to contain the resulting signatures,
- the initial number of clusters to be discriminated, and
- optionally other parameters (see below).
- The <em>group</em> should contain the imagery files
- that the user wishes to classify. The <em>subgroup</em> is
- a subset of this group. The user must create a group and
- subgroup by running the GRASS program
- <em><a href="i.group.html">i.group</a></em>
- before running <em>i.cluster</em>. The subgroup should
- contain only the imagery band files that the user wishes to
- classify. Note that this subgroup must contain more than
- one band file. The purpose of the group and subgroup is to
- collect map layers for classification or analysis. The
- <em>sigfile</em> is the file to contain the resulting signatures,
- which can be used as input for
- <em><a href="i.maxlik.html">i.maxlik</a></em>.
- The <em>classes</em> value is the initial number of clusters to be
- discriminated; any parameter values left unspecified are
- set to their default values.
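- <p>
- To make the notion of a spectral signature (cluster means and
- covariance matrix) more concrete, the following pure Python sketch
- computes both for the pixels of a single cluster. The function name
- <tt>cluster_signature</tt> is hypothetical, and the use of the sample
- covariance (division by n - 1) is an assumption; <em>i.cluster</em> may
- normalize differently.

```python
def cluster_signature(pixels):
    """Return (means, covariance) for one cluster.

    pixels: list of equal-length tuples, one value per band.
    Assumption: sample covariance, i.e. division by (n - 1).
    """
    n = len(pixels)
    if n < 2:
        raise ValueError("at least two pixels are needed for a covariance")
    nbands = len(pixels[0])

    # Per-band mean of the cluster members.
    means = [sum(p[b] for p in pixels) / n for b in range(nbands)]

    # Band-by-band covariance matrix of the cluster members.
    covariance = [
        [
            sum((p[i] - means[i]) * (p[j] - means[j]) for p in pixels) / (n - 1)
            for j in range(nbands)
        ]
        for i in range(nbands)
    ]
    return means, covariance


if __name__ == "__main__":
    means, cov = cluster_signature([(1, 2), (3, 6), (5, 10)])
    print(means)
    print(cov)
```

- <p>
- Each band needs a partner to form a meaningful covariance matrix, which
- is consistent with the requirement that the subgroup contain more than
- one band file.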
- <h3>Parameters:</h3>
- <dl>
- <dt><b>group=</b><em>name</em>
- <dd>The name of the group file which contains the imagery
- files that the user wishes to classify.
- <dt><b>subgroup=</b><em>name</em>
- <dd>The name of a subset of the group specified with the
- <b>group</b> option. It must contain only imagery band files,
- and more than one of them. The user must create a group and a
- subgroup by running the GRASS program
- <em><a href="i.group.html">i.group</a></em>
- before
- running <em>i.cluster</em>.
- <dt><b>sigfile=</b><em>name</em>
- <dd>The name assigned to the output signature file, which
- contains the class signatures and can be used as the input
- file for the GRASS program
- <em><a href="i.maxlik.html">i.maxlik</a></em>
- for an unsupervised classification.
- <dt><b>classes=</b><em>value</em>
- <dd>The number of clusters that will initially be
- identified in the clustering process before the iterations
- begin.
- <dt><b>seed=</b><em>name</em>
- <dd>The name of an optional seed signature file. Seed
- signatures contain cluster means and
- covariance matrices that were calculated prior to the
- current run of <em>i.cluster</em>. They may be taken
- from a previous run of <em>i.cluster</em> or from the
- training sites of a supervised classification
- (e.g., using the signature file output by
- <em><a href="g.gui.iclass.html">g.gui.iclass</a></em>).
- The purpose of seed signatures is to optimize the cluster
- decision boundaries (means) for the number of clusters
- specified.
- <dt><b>sample=</b><em>row_interval,col_interval</em>
- <dd>These row and column sampling intervals are optional. Their
- default values are based on the size of the data set and are chosen so
- that the total number of pixels processed is approximately 10,000
- (allowing for rounding up).
- <dt><b>iterations=</b><em>value</em>
- <dd>This parameter sets the maximum number of
- iterations, which should be greater than the number of iterations
- expected to be needed to reach the desired percent convergence. The
- default value is 30. If the number of iterations reaches
- the maximum designated by the user, the user may want to
- rerun <em>i.cluster</em> with a higher number of iterations
- (see <a href="#reportfile"><em>reportfile</em></a>).
- <br>
- Default: 30
- <a name="convergence"></a>
- <dt><b>convergence=</b><em>value</em>
- <dd>A high percent convergence is the point at which
- cluster means become stable during the iteration process.
- The default value is 98.0 percent. When clusters are being
- created, their means constantly change as pixels are
- assigned to them and the means are recalculated to include
- the new pixel. After all clusters have been created,
- <em>i.cluster</em> begins iterations that change cluster
- means by maximizing the distances between them. As these
- means shift, a higher and higher convergence is
- approached. Because means will never become totally
- static, a percent convergence and a maximum number of
- iterations are supplied to stop the iterative process. The
- percent convergence should be reached before the maximum
- number of iterations. If the maximum number of iterations
- is reached, it is probable that the desired percent
- convergence was not reached. The number of iterations is
- reported in the cluster statistics in the report file
- (see <a href="#reportfile"><em>reportfile</em></a>).
- <br>
- Default: 98.0
- <dt><b>separation=</b><em>value</em>
- <dd>This is the minimum separation below which clusters
- will be merged in the iteration process. The default value
- is 0.0. This is an image-specific number (a "magic" number)
- that depends on the image data being classified and the
- number of final clusters that are acceptable. Its
- determination requires experimentation. Note that as the
- minimum class (or cluster) separation is increased, the
- maximum number of iterations should also be increased to
- achieve this separation with a high percentage of
- convergence
- (see <a href="#convergence"><em>convergence</em></a>).
- <br>
- Default: 0.0
- <dt><b>min_size=</b><em>value</em>
- <dd>This is the minimum number of pixels that will be used
- to define a cluster, and is therefore the minimum number of
- pixels for which means and covariance matrices will be
- calculated.
- <br>
- Default: 17
- <a name="reportfile"></a>
- <dt><b>reportfile=</b><em>name</em>
- <dd>The name of an optional report file, which contains
- the result, i.e., the statistics for each cluster. Also
- included are the resulting percent convergence for the
- clusters, the number of iterations that were required to
- achieve the convergence, and the separability matrix.
- </dl>
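- <p>
- The interplay of <b>convergence</b> and <b>iterations</b> described
- above can be sketched in Python as follows. This is a simplified
- illustration, not the <em>i.cluster</em> implementation: the names
- <tt>percent_stable</tt> and <tt>run_until_converged</tt> are
- hypothetical, and convergence is assumed here to be the percentage of
- sampled pixels whose cluster assignment did not change in an iteration.

```python
def percent_stable(previous, current):
    """Percentage of pixels whose cluster label did not change."""
    if not current or len(previous) != len(current):
        raise ValueError("label lists must be non-empty and of equal length")
    unchanged = sum(1 for a, b in zip(previous, current) if a == b)
    return 100.0 * unchanged / len(current)


def run_until_converged(assign_step, labels, convergence=98.0, iterations=30):
    """Iterate until the convergence threshold or the iteration cap is hit.

    assign_step: caller-supplied function mapping the current labels to
    new labels (a stand-in for one reassignment pass over the pixels).
    Returns (labels, iterations_used, percent_stable, converged).
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    stable = 0.0
    for iteration in range(1, iterations + 1):
        new_labels = assign_step(labels)
        stable = percent_stable(labels, new_labels)
        labels = new_labels
        if stable >= convergence:
            return labels, iteration, stable, True
    # Cap reached first: the desired convergence was probably not achieved.
    return labels, iterations, stable, False


if __name__ == "__main__":
    target = [0, 0, 1, 1]

    def toy_step(current):
        # Toy pass: correct the first label that differs from the target.
        updated = list(current)
        for i, (a, b) in enumerate(zip(updated, target)):
            if a != b:
                updated[i] = b
                break
        return updated

    print(run_until_converged(toy_step, [1, 1, 0, 0]))
    print(run_until_converged(toy_step, [1, 1, 0, 0], iterations=3))
```

- <p>
- When the final flag is false, the iteration cap was reached first,
- which corresponds to the case where rerunning with a higher
- <b>iterations</b> value may be worthwhile.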
- <h2>NOTES</h2>
- When run in command line mode, <em>i.cluster</em>
- overwrites the output signature file and the report file (if
- requested by the user) without prompting if the files
- already exist.
- <h2>EXAMPLE</h2>
- Preparing the statistics for unsupervised classification of
- a LANDSAT subscene in North Carolina:
- <div class="code"><pre>
- g.region rast=lsat7_2002_10 -p
- # store VIS, NIR, MIR into group/subgroup
- i.group group=my_lsat7_2002 subgroup=my_lsat7_2002 \
- input=lsat7_2002_10,lsat7_2002_20,lsat7_2002_30,lsat7_2002_40,lsat7_2002_50,lsat7_2002_70
- i.cluster group=my_lsat7_2002 subgroup=my_lsat7_2002 sigfile=sig_clust_lsat2002 \
- classes=10 reportfile=rep_clust_lsat2002.txt
- </pre></div>
- To complete the unsupervised classification, <em>i.maxlik</em> is subsequently used.
- <h2>SEE ALSO</h2>
- The GRASS 4 <em>
- <a href="http://grass.osgeo.org/gdp/imagery/grass4_image_processing.pdf">Image
- Processing manual</a></em>
- <p>
- <em>
- <a href="g.gui.iclass.html">g.gui.iclass</a>,
- <a href="i.group.html">i.group</a>,
- <a href="i.gensig.html">i.gensig</a>,
- <a href="i.maxlik.html">i.maxlik</a>,
- <a href="i.segment.html">i.segment</a>,
- <a href="i.smap.html">i.smap</a>,
- <a href="r.kappa.html">r.kappa</a>
- </em>
- <h2>AUTHORS</h2>
- Michael Shapiro,
- U.S. Army Construction Engineering
- Research Laboratory
- <br>
- Tao Wen,
- University of Illinois at
- Urbana-Champaign,
- Illinois
- <p><i>Last changed: $Date$</i>