/*! \page segmentlib GRASS Segment Library
<!-- doxygenized from "GRASS 5 Programmer's Manual"
by M. Neteler 8/2005
-->
\section segmentintro Segment Library
<P>
Authors: CERL
<P>
Large data files which contain data in a matrix format often need to be
accessed in a nonsequential or random manner. This requirement complicates
the programming.
<P>
Methods for accessing the data are to:
<P>
(1) read the entire data file into memory and process the data as a
two-dimensional matrix,
<P>
(2) perform direct access i/o to the data file for every data value to be
accessed, or
<P>
(3) read only portions of the data file into memory as needed.
<P>
Method (1) greatly simplifies the programming effort since i/o is done once
and data access is simple array referencing. However, it has the
disadvantage that large amounts of memory may be required to hold the data.
The memory may not be available, or if it is, system paging of the module
may severely degrade performance. Method (2) is not much more complicated to
code and requires no significant amount of memory to hold the data, but the
i/o involved will certainly degrade performance. Method (3) is a mixture of
(1) and (2). Memory requirements are fixed and data is read from the data
file only when not already in memory. However, the programming is more
complex.
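<P>
To make the cost of method (2) concrete, the sketch below reads a single
integer from a plain row-major matrix file with one seek and one read per
value. The helper name <I>read_cell_direct</I> and its return convention are
assumptions made for illustration; it is not part of the
<I>Segment Library</I>.

```c
#include <stdio.h>

// Hypothetical helper, not a library routine: fetch one int from a
// row-major matrix file using one seek and one read per value.
// Returns 1 on success, -1 if the seek or the read fails.
int read_cell_direct(FILE *fp, long ncols, long row, long col, int *value)
{
    // Byte offset of the cell in a nonsegmented, row-major file.
    long offset = (row * ncols + col) * (long) sizeof(int);

    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    if (fread(value, sizeof(int), 1, fp) != 1)
        return -1;
    return 1;
}
```

Every call goes to the file, which is why this approach needs almost no
memory but pays for i/o on each access.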
<P>
The routines provided in this library are an implementation of method (3).
They are based on the idea that if the original matrix were segmented or
partitioned into smaller matrices, these segments could be managed to reduce
both the memory required and the i/o. Data access along connected paths
through the matrix (i.e., moving up or down one row and left or right one
column) should benefit.
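<P>
The partitioning idea can be sketched as a mapping from a cell of the original
matrix to a segment and a position inside that segment. The structure, the
function name, and the row-major numbering of segments are assumptions for
illustration only; the layout the library actually uses on disk is not
described here.

```c
// Where a cell lands once the matrix is cut into srows x scols segments.
struct cell_location {
    int segment;   // segment number, counted left to right, top to bottom
    int offset;    // cell index inside that segment, row-major
};

// Hypothetical helper: map (row, col) of the original matrix to a segment
// number and an offset within that segment.
struct cell_location locate_cell(int row, int col, int ncols,
                                 int srows, int scols)
{
    // Number of segments needed to span one row of the matrix.
    int segs_per_row = (ncols + scols - 1) / scols;
    struct cell_location loc;

    loc.segment = (row / srows) * segs_per_row + col / scols;
    loc.offset = (row % srows) * scols + col % scols;
    return loc;
}
```

Under this mapping, neighbouring cells usually fall in the same segment, so a
walk along a connected path touches few segments and triggers little i/o.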
<P>
In most applications, the original data is not in the segmented format. The
data must be transformed from the nonsegmented format to the segmented
format. This means reading the original data matrix row by row and writing
each row to a new file with the segmentation organization. This step
corresponds to the i/o step of method (1).
<P>
Then data can be retrieved from the segment file through routines by
specifying the row and column of the original matrix. Behind the scenes, the
data is paged into memory as needed and the requested data is returned to
the caller.
<P>
<B>Note:</B> All routines and global variables in this library, documented
or undocumented, start with the prefix <B>segment_</B>. To avoid name
conflicts, programmers should not create variables or routines in their own
modules which use this prefix.
\section Segment_Routines Segment Routines
<P>
The routines in the <I>Segment Library</I> are described below, more or
less in the order they would logically be used in a module. They use a data
structure called SEGMENT which is defined in the header file
<grass/segment.h> that must be included in any code using these
routines:
\verbatim
#include <grass/segment.h>
\endverbatim
<P>
The first step is to create a file which is properly formatted for use by
the <I>Segment Library</I> routines:
<P>
int segment_format (int fd, int nrows, int ncols, int srows, int scols,
int len)
<P>
Format a segment file. The segmentation routines require a disk file
to be used for paging segments in and out of memory. This routine formats the
file open for write on file descriptor <B>fd</B> for use as a segment file.
A segment file must be formatted before it can be processed by other segment
routines. The configuration parameters <B>nrows, ncols, srows, scols</B>,
and <B>len</B> are written to the beginning of the segment file which is
then filled with zeros.
<P>
The corresponding nonsegmented data matrix, which is to be transferred to the
segment file, is <B>nrows</B> by <B>ncols</B>. The segment file is to be
formed of segments which are <B>srows</B> by <B>scols</B>. The data items
have length <B>len</B> bytes. For example, if the data type is <I>int</I>,
<B>len</B> is <I>sizeof(int)</I>.
<P>
Return codes are: 1 if ok; else -1 could not seek or write <I>fd</I>, or -3
illegal configuration parameter(s).
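<P>
Since the return codes listed above distinguish an i/o failure from a bad
parameter, a caller may want to report them separately. The following is a
minimal sketch; the helper name <I>format_status_text</I> is hypothetical and
the message strings simply restate the codes documented above.

```c
// Hypothetical helper, not part of the library: translate a return code
// from segment_format() into a short message for error reporting.
const char *format_status_text(int code)
{
    switch (code) {
    case 1:
        return "ok";
    case -1:
        return "could not seek or write fd";
    case -3:
        return "illegal configuration parameter(s)";
    default:
        // Any other value is not documented for this routine.
        return "unexpected return code";
    }
}
```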
<P>
The next step is to initialize a SEGMENT structure to be associated with a
segment file formatted by <I>segment_format</I>.
<P>
int segment_init (SEGMENT *seg, int fd, int nsegs)
<P>
Initialize a segment structure. Initializes the <B>seg</B> structure. The
file on <B>fd</B> is a segment file created by <I>segment_format</I> and
must be open for reading and writing. The segment file configuration
parameters <I>nrows, ncols, srows, scols</I>, and <I>len</I>, as written to
the file by <I>segment_format</I>, are read from the file and stored in the
<B>seg</B> structure. <B>nsegs</B> specifies the number of segments that
will be retained in memory. The minimum value allowed is 1.
<P>
<B>Note:</B> The size of a segment is <I>scols*srows*len</I> plus a few
bytes for managing each segment.
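<P>
The note above gives a quick way to estimate how much memory a choice of
<B>nsegs</B> implies. The sketch below computes only the data portion and
ignores the few management bytes per segment; the function name is an
assumption for illustration.

```c
#include <stddef.h>

// Hypothetical helper: approximate memory held by nsegs in-memory segments,
// counting only the srows * scols * len data bytes of each segment.
size_t segment_memory_bytes(int nsegs, int srows, int scols, int len)
{
    return (size_t) nsegs * (size_t) srows * (size_t) scols * (size_t) len;
}
```

For example, 4 segments of 64 by 64 four-byte values come to
4 * 64 * 64 * 4 = 65536 bytes of data.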
<P>
Return codes are: 1 if ok; else -1 could not seek or read segment file, or -2
out of memory.
<P>
Then data can be written from another file to the segment file row by row:
<P>
int segment_put_row (SEGMENT *seg, char *buf, int row)
<P>
Write a row to the segment file. Transfers nonsegmented matrix data, row by
row, into a segment file. <B>seg</B> is the segment structure that was
configured from a call to <I>segment_init</I>. <B>buf</B> should contain
<I>ncols*len</I> bytes of data to be transferred to the segment file.
<B>row</B> specifies the row from the data matrix being transferred.
<P>
Return codes are: 1 if ok; else -1 could not seek or write segment file.
<P>
Then data can be read from or written to the segment file randomly:
<P>
int segment_get (SEGMENT *seg, char *value, int row, int col)
<P>
Get a value from the segment file. Provides random read access to the
segmented data. It gets <I>len</I> bytes of data into <B>value</B> from the
segment file <B>seg</B> for the corresponding <B>row</B> and <B>col</B> in
the original data matrix.
<P>
Return codes are: 1 if ok; else -1 could not seek or read segment file.
<P>
int segment_put (SEGMENT *seg, char *value, int row, int col)
<P>
Put a value to the segment file. Provides random write access to the
segmented data. It copies <I>len</I> bytes of data from <B>value</B> into the
segment structure <B>seg</B> for the corresponding <B>row</B> and <B>col</B>
in the original data matrix.
<P>
The data is not written to disk immediately. It is stored in a memory segment
until the segment routines decide to page the segment to disk.
<P>
Return codes are: 1 if ok; else -1 could not seek or write segment file.
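<P>
The deferred write described for <I>segment_put()</I> can be pictured with a
toy write-back buffer. This is an illustrative sketch only: the structure and
function names are invented, the "disk" is a plain array, and nothing here
reflects how the SEGMENT structure is actually implemented.

```c
#include <string.h>

#define TOY_CELLS 4

// Toy stand-in for one in-memory segment; not the library's SEGMENT type.
struct toy_segment {
    int cells[TOY_CELLS];   // in-memory copy of the data
    int dirty;              // nonzero when memory is newer than the disk copy
};

// Change a value in memory only and remember that a write is pending.
void toy_put(struct toy_segment *s, int index, int value)
{
    s->cells[index] = value;
    s->dirty = 1;
}

// Copy the in-memory cells to the backing store if an update is pending.
void toy_flush(struct toy_segment *s, int *disk)
{
    if (s->dirty) {
        memcpy(disk, s->cells, sizeof(s->cells));
        s->dirty = 0;
    }
}
```

Until the flush runs, the backing store still holds the old value, which is
why pending updates can be lost if they are never flushed.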
<P>
After random reading and writing is finished, the pending updates must be
flushed to disk:
<P>
int segment_flush (SEGMENT *seg)
<P>
Flush pending updates to disk. Forces all pending updates generated by
<I>segment_put()</I> to be written to the segment file <B>seg</B>. Must be
called after the final <I>segment_put()</I> to force all pending updates to
disk. Must also be called before the first call to <I>segment_get_row</I>.
<P>
Now the data in the segment file can be read row by row and transferred to a
normal sequential data file:
<P>
int segment_get_row (SEGMENT *seg, char *buf, int row)
<P>
Read a row from the segment file. Transfers data from a segment file, row by
row, into memory (which can then be written to a regular matrix file).
<B>seg</B> is the segment structure that was configured from a call to
<I>segment_init</I>. <B>buf</B> will be filled with <I>ncols*len</I> bytes of
data corresponding to the <B>row</B> in the data matrix.
<P>
Return codes are: 1 if ok; else -1 could not seek or read segment file.
<P>
Finally, memory allocated in the SEGMENT structure is freed:
<P>
int segment_release (SEGMENT *seg)
<P>
Free allocated memory. Releases the allocated memory associated with the
segment file <B>seg</B>. Does not close the file. Does not flush the data
which may be pending from previous <I>segment_put()</I> calls.
<P>
\section How_to_Use_the_Library_Routines How to Use the Library Routines
The following should provide the programmer with a good idea of how to use the
<I>Segment Library</I> routines. The examples assume that the data is integer.
The first step is the creation and formatting of a segment file. A file is
created, formatted and then closed:
\verbatim
fd = creat (file, 0666);
segment_format (fd, nrows, ncols, srows, scols, sizeof(int));
close(fd);
\endverbatim
<P>
The next step is the conversion of the nonsegmented matrix data into segment
file format. The segment file is reopened for read and write, initialized, and
then data is read row by row from the original data file and put into the
segment file:
\verbatim
#include <fcntl.h>
int buf[NCOLS];
SEGMENT seg;
fd = open (file, O_RDWR);
segment_init (&seg, fd, nseg);
for (row = 0; row < nrows; row++)
{
    <code to get original matrix data for row into buf>
    segment_put_row (&seg, buf, row);
}
\endverbatim
<P>
Of course, if the intention is only to add new values rather than update
existing values, the step which transfers data from the original matrix to the
segment file, using <I>segment_put_row()</I>, could be omitted, since
<I>segment_format</I> will fill the segment file with zeros.
<P>
The data can now be accessed directly using <I>segment_get</I>. For example,
to get the value at a given row and column:
\verbatim
int value;
SEGMENT seg;
segment_get (&seg, &value, row, col);
\endverbatim
<P>
Similarly, <I>segment_put()</I> can be used to change data values in the
segment file:
\verbatim
int value;
SEGMENT seg;
value = 10;
segment_put (&seg, &value, row, col);
\endverbatim
<P>
<B>WARNING:</B> It is an easy mistake to pass a value directly to
<I>segment_put()</I>. The following should be avoided:
\verbatim
segment_put (&seg, 10, row, col); /* this will not work */
\endverbatim
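<P>
The reason the direct form fails is that the routine copies <I>len</I> bytes
starting at the address it receives, so it must be given a pointer to the
value and not the value itself. One way to guard against the mistake is a
small wrapper that takes the integer by value and passes its address on. In
the sketch below, <I>copy_value</I> is a hypothetical stand-in for that copy
step and <I>put_int</I> is a hypothetical wrapper; neither is a library
routine.

```c
#include <string.h>

// Hypothetical stand-in for the copy a segment_put()-style routine performs:
// it reads len bytes starting at the address it is given.
void copy_value(char *cell, const char *value, int len)
{
    memcpy(cell, value, (size_t) len);
}

// Hypothetical wrapper: accepts the integer by value and passes its address
// on, so a literal such as 10 can be supplied safely by the caller.
void put_int(char *cell, int value)
{
    copy_value(cell, (const char *) &value, (int) sizeof(int));
}
```

With such a wrapper the caller can write a literal, and the address-of step
happens in exactly one place.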
<P>
Once the random access processing is complete, the data would be extracted
from the segment file and written to a nonsegmented matrix data file as
follows:
\verbatim
segment_flush (&seg);
for (row = 0; row < nrows; row++)
{
    segment_get_row (&seg, buf, row);
    <code to put buf into a matrix data file for row>
}
\endverbatim
<P>
Finally, the memory allocated for use by the segment routines would be
released and the file closed:
\verbatim
segment_release (&seg);
close (fd);
\endverbatim
<P>
<B>Note:</B> The <I>Segment Library</I> does not know the name of the
segment file. It does not attempt to remove the file. If the file is only
temporary, the programmer should remove the file after closing it.
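<P>
Cleaning up a temporary segment file is therefore the caller's job. A minimal
sketch using the POSIX calls <I>open</I>, <I>close</I> and <I>unlink</I>
follows; the helper names, the file mode and the return convention are
assumptions for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: create (or truncate) a scratch file open for
// reading and writing. Returns the file descriptor, or -1 on failure.
int open_scratch(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
}

// Hypothetical helper: close the descriptor, then delete the file by name.
// Returns 0 if both steps succeed, -1 otherwise.
int close_and_remove(int fd, const char *path)
{
    int status = 0;

    if (close(fd) != 0)
        status = -1;
    if (unlink(path) != 0)
        status = -1;
    return status;
}
```

The caller has to keep the file name around for this, since only the file
descriptor is ever handed to the segment routines.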
<P>
\section Loading_the_Segment_Library Loading the Segment Library
<P>
The library is loaded by specifying $(SEGMENTLIB) in the Makefile.
<P>
See \ref Compiling_and_Installing_GRASS_Modules for a complete
discussion of Makefiles.
*/