<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<sect1 id="Working_with_BLOBs">
<title><emphasis role="bold">Working with BLOBs</emphasis></title>
<para>BLOB (Binary Large OBject) support in ECL begins with the DATA value
type. A DATA field may contain any kind of data, which makes it well suited
to housing BLOB data.</para>
<para>There are essentially three issues around working with BLOB
data:</para>
<para>1) How to get the data into the HPCC (spraying).</para>
<para>2) How to work with the data, once it is in the HPCC.</para>
<para>3) How to get the data back out of the HPCC (despraying).</para>
<sect2 id="Spraying_BLOB_Data">
<title>Spraying BLOB Data</title>
<para>The HPCCClientTools.PDF has a chapter devoted to the DFUplus.exe
program. This command line tool has specific options that allow you to
spray files into BLOBs in the HPCC and to despray them back out again. All
the examples below assume that a DFUPLUS.INI file, containing the standard
content described in that chapter of the PDF, is in the same folder as the
executable.</para>
<para>The key to making a spray operation write to BLOBs is the use of the
<emphasis>prefix=Filename,Filesize</emphasis> option. For example, the
following command line sprays all the .JPG and .BMP files from the
c:\import directory of the 10.150.51.26 machine into a single logical file
named LE::imagedb:</para>
<programlisting>C:\&gt;dfuplus action=spray srcip=10.150.51.26 srcfile=c:\import\*.jpg,c:\import\*.bmp
dstcluster=le_thor dstname=LE::imagedb overwrite=1
PREFIX=FILENAME,FILESIZE nosplit=1</programlisting>
<para>When using the wildcard characters (* and ?) to spray multiple
source files (<emphasis>srcfile</emphasis>) to a single
<emphasis>dstname</emphasis>, you MUST specify both the
<emphasis>filename</emphasis> and <emphasis>filesize</emphasis> values
(FILENAME,FILESIZE) in the <emphasis>prefix</emphasis> option if you need
to despray each record in the <emphasis>dstname</emphasis> back to the
separate source file it originally came from.</para>
</sect2>
<sect2 id="Working_with_BLOB_Data">
<title>Working with BLOB Data</title>
<para>Once you've sprayed the data into the HPCC you must define the
RECORD structure and DATASET. The following RECORD structure defines the
result of the spray above:</para>
<programlisting>imageRecord := RECORD
  STRING filename;
  DATA   image;
  //first 4 bytes contain the length of the image data
  UNSIGNED8 RecPos{virtual(fileposition)};
END;
imageData := DATASET('LE::imagedb',imageRecord,FLAT);
</programlisting>
<para>The key to this structure is the use of variable-length STRING and
DATA value types. The filename field receives the complete name of the
original .JPG or .BMP file that is now contained within the image field.
The first four bytes of the image field contain an integer value
specifying the number of bytes in the original file that are now in the
image field.</para>
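<para>As a quick check on that layout, the following minimal sketch lists
each original file name alongside the number of bytes held in its image
field. It assumes the imageRecord and imageData definitions above, and it
assumes that LENGTH accepts a DATA value in the same way it accepts a
STRING. The imageSummary and imageBytes names are illustrative
only:</para>
<programlisting>// One row per sprayed file: its name and the size of its BLOB
imageSummary := TABLE(imageData,
                      {filename,
                       UNSIGNED4 imageBytes := LENGTH(image)});
OUTPUT(imageSummary);
</programlisting>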
<para>The DATA value type is used here for the BLOB field because the JPG
and BMP formats are essentially binary data. However, if the BLOB were to
contain XML data from multiple files, then it could be defined as a STRING
value type. In that case, the first four bytes of the field would still
contain an integer value specifying the number of bytes in the original
file, followed by the XML data from the file.</para>
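<para>A sketch of that XML variant might look like the following. The
xmlRecord, xmlText, and xmlData names and the LE::xmldb logical file are
hypothetical, chosen only to mirror the image example; the file is assumed
to have been sprayed with the same FILENAME,FILESIZE prefix:</para>
<programlisting>xmlRecord := RECORD
  STRING filename;
  STRING xmlText;  // XML text of one source file, as STRING rather than DATA
  UNSIGNED8 RecPos{virtual(fileposition)};
END;
xmlData := DATASET('LE::xmldb',xmlRecord,FLAT);  // hypothetical logical file
</programlisting>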
<para>The upper size limit for any STRING or DATA value is 4GB.</para>
<para>The addition of the RecPos field (a standard ECL “record pointer”
field) allows us to create an INDEX, like this:</para>
<programlisting>imageKey := INDEX(imageData,{filename,RecPos},'LE::imageKey');
BUILDINDEX(imageKey);</programlisting>
<para>Having an INDEX allows you to work with the imageData file in keyed
JOIN or FETCH operations. Of course, you can also perform any operation on
the BLOB data files that you would do with any other file in ECL.</para>
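<para>For instance, a FETCH that retrieves a single image by name could be
sketched as follows. This assumes the index carries the RecPos record
pointer field from imageRecord. The wantedKeys, copyImage, and oneImage
names and the 'logo.jpg' value are hypothetical; the filter value must
match whatever the filename field actually holds after the spray:</para>
<programlisting>// Keyed filter on the index; 'logo.jpg' is a hypothetical value
wantedKeys := imageKey(filename = 'logo.jpg');

// Copy each fetched base record through unchanged
imageRecord copyImage(imageData L) := TRANSFORM
  SELF := L;
END;

// RIGHT.RecPos is the record pointer carried in the index
oneImage := FETCH(imageData, wantedKeys, RIGHT.RecPos, copyImage(LEFT));
OUTPUT(COUNT(oneImage));
</programlisting>
<para>Filtering the INDEX first means that only the matching records are
read from the base file, which matters when each record may hold a very
large BLOB.</para>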
</sect2>
<sect2 id="Despraying_BLOB_Data">
<title>Despraying BLOB Data</title>
<para>The DFUplus.exe program also allows you to despray BLOB files from
the HPCC, splitting them back into the separate files they originated
from. The key to making a despray operation write BLOBs to separate files
is the use of the <emphasis>splitprefix=Filename,Filesize</emphasis>
option. For example, the following command line desprays all the BLOB data
to the c:\import\despray directory of the 10.150.51.26 machine from the
single logical file named LE::imagedb:</para>
<programlisting>C:\&gt;dfuplus action=despray dstip=10.150.51.26 dstfile=c:\import\despray\*.*
srcname=LE::imagedb splitprefix=FILENAME,FILESIZE nosplit=1</programlisting>
<para>Once this command has executed, you should have the same set of
files that were originally sprayed, recreated in a separate
directory.</para>
</sect2>
</sect1>