<h2>DESCRIPTION</h2>
<em>i.cluster</em>
performs the first pass in the GRASS two-pass unsupervised
classification of imagery, while the GRASS program
<em><a href="i.maxlik.html">i.maxlik</a></em> executes
the second pass. Both programs must be run to complete the unsupervised
classification.
<p>
<em>i.cluster</em> implements a clustering algorithm that reads
through the (raster) imagery data and builds pixel clusters
based on the spectral reflectances of the pixels (see Figure).
The pixel clusters are imagery categories that can be related
to land cover types on the ground. The spectral
distributions of the clusters (which will be the land cover
spectral signatures) are influenced by six parameters set
by the user. The first of these is the
initial number of clusters to be discriminated.
<p>
<center>
<img src="landsat_cluster.png" border=1><br>
<table border=0 width=590>
<tr><td><center>
<i>Fig.: Land use/land cover clustering of LANDSAT scene (simplified)</i>
</center></td></tr>
</table>
</center>
<p>
<em>i.cluster</em> starts by generating spectral signatures
for this number of clusters and "attempts" to end up with
this number of clusters during the clustering process. The
resulting number of clusters and their spectral
distributions, however, are also influenced by the range of
the spectral values (category values) in the image files
and by the other parameters set by the user. These parameters
are: the minimum cluster size, the minimum cluster separation,
the percent convergence, the maximum number of iterations,
and the row and column sampling intervals.
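<p>
To make the effect of the row and column sampling intervals concrete,
the following Python sketch picks intervals so that roughly 10,000
pixels are read. It assumes that each interval is the image dimension
divided by 100 (so about 100 x 100 pixels are sampled); this rule and
the helper names <em>default_sample_intervals</em> and
<em>sampled_pixel_count</em> are illustrative assumptions, not taken
from the <em>i.cluster</em> source code.

```python
import math


def default_sample_intervals(rows, cols, per_axis=100):
    """Hypothetical rule: choose intervals so that about
    per_axis * per_axis (roughly 10,000) pixels are sampled."""
    row_interval = max(1, rows // per_axis)
    col_interval = max(1, cols // per_axis)
    return row_interval, col_interval


def sampled_pixel_count(rows, cols, row_interval, col_interval):
    """Pixels visited when every row_interval-th row and every
    col_interval-th column is read, starting at the first one."""
    return math.ceil(rows / row_interval) * math.ceil(cols / col_interval)


# A 1000 x 2000 image is thinned to every 10th row and every 20th column.
row_step, col_step = default_sample_intervals(1000, 2000)
total = sampled_pixel_count(1000, 2000, row_step, col_step)
```

Under this assumption, images smaller than 100 rows or columns are not
thinned along that axis, so fewer than 10,000 pixels are processed.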
<p>
The resulting cluster spectral signatures are composed of
cluster means and covariance matrices. These cluster means
and covariance matrices are used in the second pass
(<em><a href="i.maxlik.html">i.maxlik</a></em>) to
classify the image. The resulting clusters, or spectral classes,
can be related to land cover types on the ground.
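<p>
As a rough illustration of what a signature made of cluster means and
a covariance matrix looks like, the Python sketch below computes both
for the pixels of a single cluster. The function name
<em>signature</em>, the sample data, and the choice of the sample
covariance (dividing by n - 1) are assumptions made for this example;
the signature file format written by <em>i.cluster</em> is not shown.

```python
def signature(pixels):
    """Return (means, covariance) for a list of pixels, where each
    pixel is a tuple of band values. Assumes the sample covariance
    (division by n - 1) and at least two pixels."""
    n = len(pixels)
    bands = len(pixels[0])
    means = [sum(p[b] for p in pixels) / n for b in range(bands)]
    cov = [
        [
            sum((p[i] - means[i]) * (p[j] - means[j]) for p in pixels) / (n - 1)
            for j in range(bands)
        ]
        for i in range(bands)
    ]
    return means, cov


# Three pixels of one hypothetical cluster, two bands each.
cluster_pixels = [(10, 20), (12, 24), (14, 28)]
means, cov = signature(cluster_pixels)
```

The covariance matrix has one row and one column per band, which is
one way to see why a single-band subgroup gives the classifier little
to work with.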
<p>
The user has to specify the name of the group, the name of the
subgroup, the name of a file to contain the resulting signatures, and
the initial number of clusters to be discriminated; the other
parameters are optional (see below).
The <em>group</em> should contain the imagery files
that the user wishes to classify. The <em>subgroup</em> is
a subset of this group. The user must create a group and
subgroup by running the GRASS program
<em><a href="i.group.html">i.group</a></em>
before running <em>i.cluster</em>. The subgroup should
contain only the imagery band files that the user wishes to
classify. Note that this subgroup must contain more than
one band file. The purpose of the group and subgroup is to
collect map layers for classification or analysis. The
<em>sigfile</em> is the file to contain the resulting signatures,
which can be used as input for
<em><a href="i.maxlik.html">i.maxlik</a></em>.
The <em>classes</em> value is the initial number of clusters to be
discriminated; any parameter values left unspecified are
set to their default values.
<h3>Parameters:</h3>
<dl>
<dt><b>group=</b><em>name</em>
<dd>The name of the group file which contains the imagery
files that the user wishes to classify.
<dt><b>subgroup=</b><em>name</em>
<dd>The name of the subset of the group specified in the <b>group</b>
option, which must contain only imagery band files and more
than one band file. The user must create a group and a
subgroup by running the GRASS program
<em><a href="i.group.html">i.group</a></em>
before
running <em>i.cluster</em>.
<dt><b>sigfile=</b><em>name</em>
<dd>The name assigned to the output signature file, which
contains the signatures of the classes and can be used as the input
file for the GRASS program
<em><a href="i.maxlik.html">i.maxlik</a></em>
for an unsupervised classification.
<dt><b>classes=</b><em>value</em>
<dd>The number of clusters that will initially be
identified in the clustering process before the iterations
begin.
<dt><b>seed=</b><em>name</em>
<dd>The name of a seed signature file is optional. The seed
signatures are signatures that contain cluster means and
covariance matrices which were calculated prior to the
current run of <em>i.cluster</em>. They may be acquired
from a previous run of <em>i.cluster</em> or from the
signature training step of a supervised classification
(e.g., using the signature file output by
<em><a href="g.gui.iclass.html">g.gui.iclass</a></em>).
The purpose of seed signatures is to optimize the cluster
decision boundaries (means) for the number of clusters
specified.
<dt><b>sample=</b><em>row_interval,col_interval</em>
<dd>These numbers are optional. Their default values are based on
the size of the data set, such that the total number of pixels to be
processed is approximately 10,000 (allowing for rounding up).
<dt><b>iterations=</b><em>value</em>
<dd>This parameter sets the maximum number of
iterations, which should be greater than the number of iterations
expected to be needed to reach the desired percent convergence.
If the number of iterations reaches
the maximum designated by the user, the user may want to
rerun <em>i.cluster</em> with a higher number of iterations
(see <a href="#reportfile"><em>reportfile</em></a>).
<br>
Default: 30
<a name="convergence"></a>
<dt><b>convergence=</b><em>value</em>
<dd>A high percent convergence is the point at which
cluster means become stable during the iteration process.
When clusters are being
created, their means constantly change as pixels are
assigned to them and the means are recalculated to include
the new pixel. After all clusters have been created,
<em>i.cluster</em> begins iterations that change cluster
means by maximizing the distances between them. As these
means shift, a higher and higher convergence is
approached. Because means will never become totally
static, a percent convergence and a maximum number of
iterations are supplied to stop the iterative process. The
percent convergence should be reached before the maximum
number of iterations. If the maximum number of iterations
is reached, it is probable that the desired percent
convergence was not reached. The number of iterations is
reported in the cluster statistics in the report file
(see <a href="#reportfile"><em>reportfile</em></a>).
<br>
Default: 98.0
<dt><b>separation=</b><em>value</em>
<dd>This is the minimum separation below which clusters
will be merged in the iteration process.
This is an image-specific number (a "magic" number)
that depends on the image data being classified and the
number of final clusters that are acceptable. Its
determination requires experimentation. Note that as the
minimum class (or cluster) separation is increased, the
maximum number of iterations should also be increased to
achieve this separation with a high percentage of
convergence
(see <a href="#convergence"><em>convergence</em></a>).
<br>
Default: 0.0
<dt><b>min_size=</b><em>value</em>
<dd>This is the minimum number of pixels that will be used
to define a cluster, and is therefore the minimum number of
pixels for which means and covariance matrices will be
calculated.
<br>
Default: 17
<a name="reportfile"></a>
<dt><b>reportfile=</b><em>name</em>
<dd>The name of an optional report file that contains
the result, i.e., the statistics for each cluster. Also
included are the resulting percent convergence for the
clusters, the number of iterations that were required to
achieve the convergence, and the separability matrix.
</dl>
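<p>
The interplay of the <b>convergence</b> and <b>iterations</b>
parameters described above can be sketched in Python as a simplified
nearest-mean loop. This is not the algorithm implemented in
<em>i.cluster</em>: it ignores <b>separation</b> and <b>min_size</b>,
and it assumes that percent convergence is measured as the share of
pixels that keep their cluster from one iteration to the next. The
names <em>nearest</em> and <em>iterate_clusters</em> are hypothetical.

```python
def nearest(pixel, means):
    """Index of the mean closest to pixel (squared Euclidean distance)."""
    return min(
        range(len(means)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(pixel, means[k])),
    )


def iterate_clusters(pixels, means, convergence=98.0, iterations=30):
    """Simplified loop showing the two stopping rules.
    Returns (means, labels, iterations_used, percent_stable)."""
    labels = [nearest(p, means) for p in pixels]
    percent_stable = 0.0
    used = 0
    for used in range(1, iterations + 1):
        # Recompute each mean from the pixels currently assigned to it.
        new_means = []
        for k, old in enumerate(means):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                new_means.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_means.append(old)  # an empty cluster keeps its old mean
        means = new_means
        # Reassign pixels and measure how many kept their cluster.
        new_labels = [nearest(p, means) for p in pixels]
        unchanged = sum(a == b for a, b in zip(labels, new_labels))
        percent_stable = 100.0 * unchanged / len(pixels)
        labels = new_labels
        if percent_stable >= convergence:
            break  # converged before the iteration limit
    return means, labels, used, percent_stable


# Six two-band pixels and two deliberately poor starting means.
pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_means, labels, used, percent = iterate_clusters(pixels, [(0, 0), (1, 0)])
```

If the loop ends with <em>used</em> equal to the iteration limit and
<em>percent</em> still below the requested convergence, that
corresponds to the case in which a rerun with a higher
<b>iterations</b> value is suggested.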
<h2>NOTES</h2>
When run in command line mode, <em>i.cluster</em>
overwrites the output signature file and the report file (if
one is requested by the user) without prompting if the files
already exist.
<h2>EXAMPLE</h2>
Preparing the statistics for unsupervised classification of
a LANDSAT subscene in North Carolina:
<div class="code"><pre>
# set the region to match the imagery and print it
g.region rast=lsat7_2002_10 -p
# store VIS, NIR, MIR into group/subgroup
i.group group=my_lsat7_2002 subgroup=my_lsat7_2002 \
    input=lsat7_2002_10,lsat7_2002_20,lsat7_2002_30,lsat7_2002_40,lsat7_2002_50,lsat7_2002_70
# generate signatures for 10 initial clusters and write a report
i.cluster group=my_lsat7_2002 subgroup=my_lsat7_2002 sigfile=sig_clust_lsat2002 \
    classes=10 reportfile=rep_clust_lsat2002.txt
</pre></div>
To complete the unsupervised classification, <em>i.maxlik</em> is subsequently used.
<h2>SEE ALSO</h2>
The GRASS 4 <em>
<a href="http://grass.osgeo.org/gdp/imagery/grass4_image_processing.pdf">Image
Processing manual</a></em>
<p>
<em>
<a href="g.gui.iclass.html">g.gui.iclass</a>,
<a href="i.group.html">i.group</a>,
<a href="i.gensig.html">i.gensig</a>,
<a href="i.maxlik.html">i.maxlik</a>,
<a href="i.segment.html">i.segment</a>,
<a href="i.smap.html">i.smap</a>,
<a href="r.kappa.html">r.kappa</a>
</em>
<h2>AUTHORS</h2>
Michael Shapiro,
U.S. Army Construction Engineering
Research Laboratory
<br>
Tao Wen,
University of Illinois at
Urbana-Champaign,
Illinois
<p><i>Last changed: $Date$</i>
  203. Urbana-Champaign,
  204. Illinois
  205. <p><i>Last changed: $Date$</i>