DataHandling.xml 54 KB

  1. <?xml version="1.0" encoding="utf-8"?>
  2. <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  3. "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
  4. <book lang="en_US" xml:base="../">
  5. <bookinfo>
  6. <title>HPCC Data Handling</title>
  7. <mediaobject>
  8. <imageobject>
  9. <imagedata fileref="images/redswooshWithLogo3.jpg" />
  10. </imageobject>
  11. </mediaobject>
  12. <author>
  13. <surname>Boca Raton Documentation Team</surname>
  14. </author>
  15. <legalnotice>
  16. <para>We welcome your comments and feedback about this document via
  17. email to <email>docfeedback@hpccsystems.com</email>. Please include
  18. <emphasis role="bold">Documentation Feedback</emphasis> in the subject
  19. line and reference the document name, page numbers, and current Version
  20. Number in the text of the message.</para>
  21. <para>LexisNexis and the Knowledge Burst logo are registered trademarks
  22. of Reed Elsevier Properties Inc., used under license. HPCC Systems is a
  23. registered trademark of LexisNexis Risk Data Management Inc.</para>
  24. <para>Other products, logos, and services may be trademarks or
  25. registered trademarks of their respective companies. All names and
  26. example data used in this manual are fictitious. Any similarity to
  27. actual persons, living or dead, is purely coincidental.</para>
  28. <para> </para>
  29. </legalnotice>
  30. <xi:include href="common/Version.xml" xpointer="FooterInfo"
  31. xmlns:xi="http://www.w3.org/2001/XInclude" />
  32. <xi:include href="common/Version.xml" xpointer="DateVer"
  33. xmlns:xi="http://www.w3.org/2001/XInclude" />
  34. <corpname>HPCC Systems</corpname>
  35. <xi:include href="common/Version.xml" xpointer="Copyright"
  36. xmlns:xi="http://www.w3.org/2001/XInclude" />
  37. <mediaobject role="logo">
  38. <imageobject>
  39. <imagedata fileref="images/LN_Rightjustified.jpg" />
  40. </imageobject>
  41. </mediaobject>
  42. </bookinfo>
  43. <chapter id="Data_Handling">
  44. <title><emphasis>HPCC Data Handling</emphasis></title>
  45. <sect1 id="Introduction" role="nobrk">
  46. <title>Introduction</title>
  47. <para>There are a number of different ways in which data may be
  48. transferred to, from, or within an HPCC system. For each of these data
  49. transfers, there are a few key parameters that must be known.</para>
  50. <sect2 id="Prerequisites-for-most-file-movements">
  51. <title><emphasis role="bold">Prerequisites for most file
  52. movements:</emphasis></title>
  53. <itemizedlist>
  54. <listitem>
  55. <para>Logical filename</para>
  56. </listitem>
  57. <listitem>
  58. <para>Physical filename</para>
  59. </listitem>
  60. <listitem>
  61. <para>Record size (fixed)</para>
  62. </listitem>
  63. <listitem>
  64. <para>Source directory</para>
  65. </listitem>
  66. <listitem>
  67. <para>Destination directory</para>
  68. </listitem>
  69. <listitem>
  70. <para>Dali IP address (source and/or destination)</para>
  71. </listitem>
  72. <listitem>
  73. <para>Landing Zone IP address</para>
  74. </listitem>
  75. </itemizedlist>
  76. <para>The above parameters are used for these major data handling
  77. methods:</para>
  78. <itemizedlist>
  79. <listitem>
  80. <para>Import - Spraying Data from the Landing Zone to Thor</para>
  81. </listitem>
  82. <listitem>
  83. <para>Export - Despraying Data from Thor to Landing Zone</para>
  84. </listitem>
  85. <listitem>
  86. <para>Copy - Replicating Data from Thor to Thor (within same Dali
  87. File System)</para>
  88. </listitem>
  89. <listitem>
  90. <para>Copying Data from Thor to Thor (between different Dali File
  91. Systems)</para>
  92. </listitem>
  93. </itemizedlist>
  94. <para></para>
  95. </sect2>
  96. </sect1>
  97. <sect1 id="Data_Handling_Terms">
  98. <title>Data Handling Terms</title>
  99. <para>A <emphasis>spray</emphasis> or <emphasis>import</emphasis> is the
  100. relocation of a data file from one location (such as a Landing Zone) to
  101. a Data Refinery cluster. The term spray was adopted due to the nature of
  102. the file movement – the file is partitioned across all nodes within a
  103. cluster.</para>
  104. <para>A <emphasis>despray</emphasis> or <emphasis>export</emphasis> is
  105. the relocation of a data file from a Data Refinery cluster to a single
  106. machine location (such as a Landing Zone). The term despray was adopted
  107. due to the nature of the file movement – the file is reassembled from
  108. its parts on all nodes in the cluster and placed in a single file on the
  109. destination.</para>
  110. <para>A <emphasis>copy </emphasis>is the replication of a data file from
  111. one Data Refinery cluster to another cluster within the same
  112. environment.</para>
  113. <para>A <emphasis>remote copy </emphasis>is the replication of a data
  114. file from one Data Refinery cluster to another cluster in a different
  115. environment.</para>
  116. <para>A <emphasis>Landing Zone</emphasis> (or Drop Zone) is a physical
  117. storage location defined in your system's environment. There can be one
  118. or more of these locations defined. A daemon (DaFileSrv) must be running
  119. on that server to enable file sprays and desprays.</para>
  120. </sect1>
  121. <sect1 id="Working_with_a_data_file">
  122. <title>Working with data files</title>
  123. <para>Once you start working with your HPCC system, you will want to
  124. process some real data. This section shows you how to load data onto
  125. your HPCC system.</para>
  126. <sect2 id="Cautions_and_Warnings">
  127. <title>Before you begin</title>
  128. <para>First, you should consider the size of the data and the capacity
  129. of your system. A typical production HPCC system would have much more
  130. data capacity than a development system. The size of the files you
  131. wish to work with is limited by the size of your system.</para>
  132. </sect2>
  133. <sect2 id="Uploading_a_file">
  134. <title>Uploading a file</title>
  135. <para>For smaller data files (up to 2 GB), you can use the
  136. upload/download file utility in ECL Watch.</para>
  137. <orderedlist>
  138. <listitem>
  139. <para>In your browser, go to the <emphasis role="bold">ECL
  140. Watch</emphasis> URL. For example,
  141. http://nnn.nnn.nnn.nnn:8010, where nnn.nnn.nnn.nnn is your ESP
  142. Server's IP address.</para>
  143. <para><informaltable colsep="1" frame="all" rowsep="1">
  144. <?dbfo keep-together="always"?>
  145. <tgroup cols="2">
  146. <colspec colwidth="49.50pt" />
  147. <colspec />
  148. <tbody>
  149. <row>
  150. <entry><inlinegraphic
  151. fileref="images/caution.png" /></entry>
  152. <entry>Your IP address could be different from the ones
  153. provided in the example images. Please use the IP
  154. address provided by <emphasis
  155. role="bold">your</emphasis> installation.</entry>
  156. </row>
  157. </tbody>
  158. </tgroup>
  159. </informaltable></para>
  160. </listitem>
  161. <listitem>
  162. <para>From the ECL Watch page, click the <emphasis
  163. role="bold">Upload/download File</emphasis> link in the menu on
  164. the left side.</para>
  165. <para><figure>
  166. <title>Upload/download</title>
  167. <mediaobject>
  168. <imageobject>
  169. <imagedata fileref="images/LZimg03-1.jpg"
  170. vendor="eclwatchSS" />
  171. </imageobject>
  172. </mediaobject>
  173. </figure></para>
  174. <para><phrase> </phrase>Clicking the Upload/download File link
  175. opens the dropzones and files page, where you
  176. can choose to <emphasis role="bold">Browse</emphasis> your machine
  177. for a file to upload:</para>
  178. <para><figure>
  179. <title>Dropzones</title>
  180. <mediaobject>
  181. <imageobject>
  182. <imagedata fileref="images/LZimg04.jpg"
  183. vendor="eclwatchSS" />
  184. </imageobject>
  185. </mediaobject>
  186. </figure></para>
  187. </listitem>
  188. <listitem>
  189. <para>Press the <emphasis role="bold">Browse</emphasis> button to
  190. browse the files on your local machine, select the file to upload,
  191. and then click the <emphasis role="bold">Open</emphasis>
  192. button.</para>
  193. <para>The file you selected should appear in the <emphasis
  194. role="bold">Select a file to upload</emphasis> field.</para>
  195. </listitem>
  196. <listitem>
  197. <para>Press <emphasis role="bold">Upload Now</emphasis> to
  198. complete the file upload.</para>
  199. </listitem>
  200. </orderedlist>
  201. </sect2>
  202. <sect2 id="Uploading_files_w_secure_client">
  203. <title>Uploading files with a Secure Copy Client</title>
  204. <para>To upload a large file for processing to your virtual machine,
  205. you will need a tool that supports the secure copy protocol. In this
  206. section, we discuss using WinSCP. There are other tools available, but
  207. the steps are similar.</para>
  208. <para><orderedlist>
  209. <listitem>
  210. <para>Open the WinSCP tool and log in to your Landing Zone node
  211. using the username and password given.</para>
  212. <para><informaltable colsep="1" rowsep="1">
  213. <tgroup cols="2">
  214. <colspec colwidth="80pt" />
  215. <colspec colwidth="100pt" />
  216. <tbody>
  217. <row>
  218. <entry>Login ID:</entry>
  219. <entry>hpccdemo</entry>
  220. </row>
  221. <row>
  222. <entry>Password:</entry>
  223. <entry>hpccdemo</entry>
  224. </row>
  225. </tbody>
  226. </tgroup>
  227. </informaltable></para>
  228. </listitem>
  229. <listitem>
  230. <para>Once you are logged in, WinSCP should navigate automatically to
  231. the landing zone folder (/var/lib/LexisNexis/mydropzone).</para>
  232. </listitem>
  233. <listitem>
  234. <?dbfo keep-together="always"?>
  235. <para>Navigate to where your local file is in the left part of
  236. the window.</para>
  237. <para><figure>
  238. <title>WinSCP</title>
  239. <mediaobject>
  240. <imageobject>
  241. <imagedata fileref="images/LZimg05.jpg" />
  242. </imageobject>
  243. </mediaobject>
  244. </figure></para>
  245. </listitem>
  246. <listitem>
  247. <para>Select the data file to send and copy it to the landing
  248. zone, using drag-and-drop.</para>
  249. </listitem>
  250. </orderedlist></para>
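<para>The same transfer can also be scripted with a command line secure
copy client. The following is a minimal sketch in Python, assuming an
OpenSSH-style <command>scp</command> program is installed on your local
machine. The helper name, file name, and IP address are hypothetical. The
login ID and landing zone folder are the ones shown in the steps
above.</para>
<programlisting>import shlex


def build_scp_command(local_file, host, user="hpccdemo",
                      dropzone="/var/lib/LexisNexis/mydropzone"):
    """Return the scp argument list that copies local_file to the landing zone."""
    # scp expects the remote target in the form user@host:directory/
    remote_target = f"{user}@{host}:{dropzone}/"
    return ["scp", local_file, remote_target]


# Hypothetical file name and address; substitute your own values.
command = build_scp_command("mydata.csv", "192.168.56.101")
print(shlex.join(command))
</programlisting>
<para>The sketch only prints the command. To perform the copy, pass the
list to <literal>subprocess.run(command, check=True)</literal>.</para>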
  251. </sect2>
  252. </sect1>
  253. <sect1 id="Data_Handling_Methods">
  254. <title>Data Handling Methods</title>
  255. <para>There are several ways to spray, despray, or copy data
  256. files:</para>
  257. <itemizedlist>
  258. <listitem>
  259. <para>The DFU interface in ECL Watch</para>
  260. </listitem>
  261. <listitem>
  262. <para>The DFU Plus command line utility</para>
  263. <para>See the <emphasis>Client Tools</emphasis> manual for
  264. details.</para>
  265. </listitem>
  266. <listitem>
  267. <para>ECL code using FileServices library functions</para>
  268. <para>See the <emphasis>ECL Language Reference</emphasis> for
  269. details.</para>
  270. </listitem>
  271. </itemizedlist>
  272. <sect2 id="Data_Handling_Using_ECL-Watch">
  273. <title>Data Handling Using ECL Watch</title>
  274. <itemizedlist>
  275. <listitem>
  276. <para>Log in to ECL Watch for the environment.</para>
  277. <para>The URL is the IP address where the ESP Server is installed
  278. plus the port to which the wssmc service is bound. The default
  279. port is 8010. For example:<programlisting>http://&lt;ESPserverIP&gt;:8010/</programlisting></para>
  280. </listitem>
  281. <listitem>
  282. <?dbfo keep-together="always"?>
  283. <para>Click on the <emphasis role="bold">Browse Logical
  284. Files</emphasis> hyperlink below <emphasis role="bold">DFU Files
  285. </emphasis>in the menu on the left.</para>
  286. <para>The Logical Files page displays, showing all files with
  287. logical entries in the Dali Server’s Distributed File
  288. System.</para>
  289. <para><graphic fileref="images/DHMan-3.jpg"
  290. vendor="eclwatchSS" /></para>
  291. <para>From this page, you can despray or copy any file.</para>
  292. </listitem>
  293. </itemizedlist>
  294. <sect3 id="Desprays">
  295. <title>Desprays</title>
  296. <itemizedlist>
  297. <listitem>
  298. <para>Locate the file to despray in the list of files, then
  299. select the arrow graphic on the left-hand side, then select
  300. <emphasis role="bold">Despray</emphasis> from the pop-up
  301. menu.</para>
  302. </listitem>
  303. <listitem>
  304. <para>Check the <emphasis role="bold">Source</emphasis>
  305. information that is already filled in.</para>
  306. </listitem>
  307. <listitem>
  308. <para>Provide <emphasis role="bold">Destination</emphasis>
  309. information.</para>
  310. <para><informaltable colsep="0" frame="none" rowsep="0">
  311. <tgroup cols="2">
  312. <colspec align="left" colwidth="122.40pt" />
  313. <colspec colwidth="333.00pt" />
  314. <tbody>
  315. <row>
  316. <entry><emphasis
  317. role="bold">Destination-Machine</emphasis></entry>
  318. <entry>Use the drop list to select the machine to
  319. despray to. The items in the list are landing zones
  320. defined in the system’s configuration.</entry>
  321. </row>
  322. <row>
  323. <entry><emphasis role="bold">Destination-IP
  324. Address</emphasis></entry>
  325. <entry>This is prefilled based upon the selected
  326. machine.</entry>
  327. </row>
  328. <row>
  329. <entry><emphasis role="bold">Destination-Local
  330. Path</emphasis></entry>
  331. <entry>Provide the complete file path of the
  332. destination including file name and extension.</entry>
  333. </row>
  334. <row>
  335. <entry><emphasis role="bold">Destination-Network
  336. Path</emphasis></entry>
  337. <entry>The complete network path of the destination
  338. including file name and extension. (read only)</entry>
  339. </row>
  340. <row>
  341. <entry><emphasis
  342. role="bold">Overwrite</emphasis></entry>
  343. <entry>Check this box to overwrite a file with the
  344. same name if it exists.</entry>
  345. </row>
  346. </tbody>
  347. </tgroup>
  348. </informaltable></para>
  349. </listitem>
  350. <listitem>
  351. <?dbfo keep-together="always"?>
  352. <para>Press the <emphasis role="bold">Submit</emphasis>
  353. button.</para>
  354. <para>The <emphasis role="bold">DFU Workunit</emphasis>
  355. displays.<graphic fileref="images/DHMan-4.jpg"
  356. vendor="eclwatchSS" /></para>
  357. </listitem>
  358. <listitem>
  359. <?dbfo keep-together="always"?>
  360. <para>Press the <emphasis role="bold">Refresh</emphasis> button
  361. periodically until the status of your request indicates it is
  362. <emphasis role="bold">Finished</emphasis> or click on the
  363. <emphasis role="bold">View Progress</emphasis> hyperlink to see
  364. a progress indicator.</para>
  365. <para><graphic fileref="images/DHMan-5.jpg"
  366. vendor="eclwatchSS" />The Progress window shows a green progress
  367. bar indicating the percentage of completion, as well as other
  368. information related to the operation.</para>
  369. <para>If a job fails, information related to the cause of the
  370. failure also displays.</para>
  371. </listitem>
  372. </itemizedlist>
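<para>A despray can also be requested from a script through the DFU Plus
command line utility. The following is a minimal sketch in Python, not a
definitive implementation. It assumes the DFU Plus option names
(<literal>action</literal>, <literal>server</literal>,
<literal>srcname</literal>, <literal>dstip</literal>,
<literal>dstfile</literal>, <literal>overwrite</literal>) as described in
the <emphasis>Client Tools</emphasis> manual. The helper name and every
value shown are hypothetical.</para>
<programlisting>def build_despray_args(server, logical_name, dest_ip, dest_path,
                       overwrite=False):
    """Return a DFU Plus argument list that desprays one logical file."""
    return [
        "dfuplus",
        "action=despray",
        f"server={server}",          # ECL Watch URL of the environment
        f"srcname={logical_name}",   # Source: logical file to despray
        f"dstip={dest_ip}",          # Destination-IP Address (landing zone)
        f"dstfile={dest_path}",      # Destination-Local Path with file name
        f"overwrite={1 if overwrite else 0}",
    ]


# Hypothetical values for illustration only.
args = build_despray_args(
    server="http://192.168.56.101:8010",
    logical_name="tutorial::yn::originalperson",
    dest_ip="192.168.56.101",
    dest_path="/var/lib/LexisNexis/mydropzone/originalperson.out",
    overwrite=True,
)
print(" ".join(args))
</programlisting>
<para>Each argument mirrors one field of the Despray page, so the list can
be checked against the form before it is passed to
<literal>subprocess.run(args, check=True)</literal>.</para>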
  373. </sect3>
  374. <sect3 id="Spray_Fixed">
  375. <title><emphasis role="bold">Spray Fixed</emphasis></title>
  376. <itemizedlist>
  377. <listitem>
  378. <para>Click on the <emphasis role="bold">Spray Fixed</emphasis>
  379. hyperlink below <emphasis role="bold">DFU</emphasis> in the menu
  380. on the left.</para>
  381. <para>The <emphasis role="bold">Spray Fixed</emphasis> page
  382. displays.</para>
  383. </listitem>
  384. <listitem>
  385. <para>Fill in the <emphasis role="bold">Source</emphasis>
  386. information (<emphasis role="bold">Machine, IP, File
  387. Path, and record length</emphasis>) and the <emphasis
  388. role="bold">Destination</emphasis> information (<emphasis
  389. role="bold">Group</emphasis> and <emphasis
  390. role="bold">Label</emphasis>).</para>
  391. <para><informaltable colsep="0" frame="none" rowsep="0">
  392. <tgroup cols="2">
  393. <colspec colwidth="122.40pt" />
  394. <colspec colwidth="333.00pt" />
  395. <tbody>
  396. <row>
  397. <entry align="right"><emphasis
  398. role="bold">Source:</emphasis></entry>
  399. </row>
  400. <row>
  401. <entry><emphasis
  402. role="bold">Machine</emphasis></entry>
  403. <entry>Use the drop list to select the machine where
  404. the source file is located.</entry>
  405. </row>
  406. <row>
  407. <entry><emphasis role="bold">IP</emphasis></entry>
  408. <entry>The IP address of the machine from which to spray. This
  409. is automatically completed when you select the
  410. <emphasis role="bold">Source
  411. Machine.</emphasis></entry>
  412. </row>
  413. <row>
  414. <entry><emphasis role="bold">Local
  415. Path</emphasis></entry>
  416. <entry>The file path of the source file to spray.</entry>
  417. </row>
  418. <row>
  419. <entry><emphasis role="bold">Record
  420. Length</emphasis></entry>
  421. <entry>The size of each record.</entry>
  422. </row>
  423. <row>
  424. <entry align="right"><emphasis
  425. role="bold">Destination:</emphasis></entry>
  426. </row>
  427. <row>
  428. <entry><emphasis role="bold">Group</emphasis></entry>
  429. <entry>Select the name of the Thor cluster to spray
  430. to.</entry>
  431. </row>
  432. <row>
  433. <entry><emphasis role="bold">Label</emphasis></entry>
  434. <entry>The logical name that you choose for the
  435. file.</entry>
  436. </row>
  437. <row>
  438. <entry align="right"><emphasis
  439. role="bold">Options:</emphasis></entry>
  440. </row>
  441. <row>
  442. <entry><emphasis
  443. role="bold">Overwrite</emphasis></entry>
  444. <entry>Check this box to overwrite files of the same
  445. name.</entry>
  446. </row>
  447. <row>
  448. <entry><emphasis
  449. role="bold">Replicate</emphasis></entry>
  450. <entry><para>Check this box to create backup copies of
  451. all file parts in the backup directory (by convention
  452. on the secondary drive of the node following in the
  453. cluster).</para><para><emphasis role="bold">This
  454. option is only available on systems where replication
  455. has been enabled.</emphasis></para></entry>
  456. </row>
  457. <row>
  458. <entry><emphasis
  459. role="bold">Compress</emphasis></entry>
  460. <entry>Check this box to compress the files.</entry>
  461. </row>
  462. </tbody>
  463. </tgroup>
  464. </informaltable></para>
  465. </listitem>
  466. <listitem>
  467. <para>Press the <emphasis role="bold">Submit</emphasis>
  468. button.</para>
  469. <para>The <emphasis role="bold">DFU Workunit</emphasis>
  470. displays.</para>
  471. </listitem>
  472. <listitem>
  473. <?dbfo keep-together="always"?>
  474. <para>Press the <emphasis role="bold">Refresh</emphasis> button
  475. periodically until the status of your request indicates it is
  476. <emphasis role="bold">Finished </emphasis>or click on the
  477. <emphasis role="bold">View Progress</emphasis> hyperlink to see
  478. a progress indicator.</para>
  479. <para><informaltable colsep="1" frame="all" rowsep="0">
  480. <tgroup cols="2">
  481. <colspec colwidth="52.60pt" />
  482. <colspec colwidth="384.80pt" />
  483. <tbody>
  484. <row>
  485. <entry><graphic fileref="images/tip.jpg" /></entry>
  486. <entry>You can use the Choose File button to look up
  487. files on the selected source machine.</entry>
  488. </row>
  489. </tbody>
  490. </tgroup>
  491. </informaltable></para>
  492. </listitem>
  493. </itemizedlist>
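<para>The fields of the Spray Fixed page map onto a DFU Plus command line.
The following is a minimal sketch in Python under stated assumptions: the
DFU Plus option names (<literal>format</literal>,
<literal>srcip</literal>, <literal>srcfile</literal>,
<literal>recordsize</literal>, <literal>dstcluster</literal>,
<literal>dstname</literal>, <literal>overwrite</literal>,
<literal>replicate</literal>, <literal>compress</literal>) are taken from
the <emphasis>Client Tools</emphasis> manual, and the helper names and all
values are hypothetical. It also checks that the file holds a whole number
of records, which follows from every record having the same
length.</para>
<programlisting>def is_whole_number_of_records(file_size, record_length):
    """True when file_size (in bytes) holds only complete fixed-length records."""
    return record_length > 0 and file_size % record_length == 0


def build_spray_fixed_args(server, source_ip, source_path, record_length,
                           group, label, overwrite=False, replicate=False,
                           compress=False):
    """Return a DFU Plus argument list for a fixed-length spray."""
    return [
        "dfuplus",
        "action=spray",
        "format=fixed",
        f"server={server}",              # ECL Watch URL of the environment
        f"srcip={source_ip}",            # Source: IP
        f"srcfile={source_path}",        # Source: Local Path
        f"recordsize={record_length}",   # Source: Record Length
        f"dstcluster={group}",           # Destination: Group
        f"dstname={label}",              # Destination: Label
        f"overwrite={int(overwrite)}",
        f"replicate={int(replicate)}",
        f"compress={int(compress)}",
    ]


# Hypothetical values: a 124,000 byte file of 124 byte records.
print(is_whole_number_of_records(124000, 124))
args = build_spray_fixed_args(
    server="http://192.168.56.101:8010",
    source_ip="192.168.56.101",
    source_path="/var/lib/LexisNexis/mydropzone/OriginalPerson",
    record_length=124,
    group="mythor",
    label="tutorial::yn::originalperson",
    overwrite=True,
)
print(" ".join(args))
</programlisting>
<para>Running the size check before submitting the spray is a cheap way to
catch a wrong record length early.</para>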
  494. </sect3>
  495. <sect3 id="Spray_CSV">
  496. <title><emphasis role="bold">Spray CSV</emphasis></title>
  497. <itemizedlist>
  498. <listitem>
  499. <para>Click on the <emphasis role="bold">Spray CSV</emphasis>
  500. hyperlink below <emphasis role="bold">DFU </emphasis>in the menu
  501. on the left.</para>
  502. <para>The <emphasis role="bold">Spray CSV</emphasis> page
  503. displays.</para>
  504. </listitem>
  505. <listitem>
  506. <para>Fill in the <emphasis role="bold">Source</emphasis>
  507. information (<emphasis role="bold">Machine, IP, File
  508. Path, and record information</emphasis>) and the <emphasis
  509. role="bold">Destination</emphasis> information (<emphasis
  510. role="bold">Group</emphasis> and <emphasis
  511. role="bold">Label</emphasis>).</para>
  512. <para><informaltable colsep="0" frame="none" rowsep="0">
  513. <tgroup cols="2">
  514. <colspec colwidth="122.40pt" />
  515. <colspec colwidth="333.00pt" />
  516. <tbody>
  517. <row>
  518. <entry align="right"><emphasis
  519. role="bold">Source:</emphasis></entry>
  520. </row>
  521. <row>
  522. <entry><emphasis
  523. role="bold">Machine</emphasis></entry>
  524. <entry>Use the drop list to select the machine where
  525. the source file is located.</entry>
  526. </row>
  527. <row>
  528. <entry><emphasis role="bold">IP</emphasis></entry>
  529. <entry>The IP address of the machine from which to spray. This
  530. is automatically completed when you select the
  531. <emphasis role="bold">Source
  532. Machine.</emphasis></entry>
  533. </row>
  534. <row>
  535. <entry><emphasis role="bold">Local
  536. Path</emphasis></entry>
  537. <entry>The file path of the source file to spray.</entry>
  538. </row>
  539. <row>
  540. <entry><emphasis role="bold">Max Record
  541. Length</emphasis></entry>
  542. <entry>The length of the longest record in the
  543. file.</entry>
  544. </row>
  545. <row>
  546. <entry><emphasis
  547. role="bold">Separator</emphasis></entry>
  548. <entry>The character used as a separator in the source
  549. file.</entry>
  550. </row>
  551. <row>
  552. <entry><emphasis role="bold">Line
  553. Terminator</emphasis></entry>
  554. <entry>The character used as a line terminator in the
  555. source file.</entry>
  556. </row>
  557. <row>
  558. <entry><emphasis role="bold">Quote</emphasis></entry>
  559. <entry>The character used as a quote in the source
  560. file.</entry>
  561. </row>
  562. <row>
  563. <entry align="right"><emphasis
  564. role="bold">Destination:</emphasis></entry>
  565. </row>
  566. <row>
  567. <entry><emphasis role="bold">Group</emphasis></entry>
  568. <entry>Select the name of the Thor cluster to spray
  569. to.</entry>
  570. </row>
  571. <row>
  572. <entry><emphasis role="bold">Label</emphasis></entry>
  573. <entry>The logical name that you choose for the
  574. file.</entry>
  575. </row>
  576. <row>
  577. <entry align="right"><emphasis
  578. role="bold">Options:</emphasis></entry>
  579. </row>
  580. <row>
  581. <entry><emphasis
  582. role="bold">Overwrite</emphasis></entry>
  583. <entry>Check this box to overwrite files of the same
  584. name.</entry>
  585. </row>
  586. <row>
  587. <entry><emphasis
  588. role="bold">Replicate</emphasis></entry>
  589. <entry><para>Check this box to create backup copies of
  590. all file parts in the backup directory (by convention
  591. on the secondary drive of the node following in the
  592. cluster).</para><para><emphasis role="bold">This
  593. option is only available on systems where replication
  594. has been enabled.</emphasis></para></entry>
  595. </row>
  596. <row>
  597. <entry><emphasis
  598. role="bold">Compress</emphasis></entry>
  599. <entry>Check this box to compress the files.</entry>
  600. </row>
  601. </tbody>
  602. </tgroup>
  603. </informaltable></para>
  604. </listitem>
  605. <listitem>
  606. <para>Press the <emphasis role="bold">Submit</emphasis>
  607. button.</para>
  608. <para>The <emphasis role="bold">DFU Workunit</emphasis>
  609. displays.</para>
  610. </listitem>
  611. <listitem>
  612. <?dbfo keep-together="always"?>
  613. <para>Press the <emphasis role="bold">Refresh</emphasis> button
  614. periodically until the status of your request indicates it is
  615. <emphasis role="bold">Finished</emphasis> or click on the
  616. <emphasis role="bold">View Progress</emphasis> hyperlink to see
  617. a progress indicator.<informaltable colsep="1" frame="all"
  618. rowsep="0">
  619. <tgroup cols="2">
  620. <colspec colwidth="52.60pt" />
  621. <colspec colwidth="384.80pt" />
  622. <tbody>
  623. <row>
  624. <entry><graphic fileref="images/tip.jpg" /></entry>
  625. <entry>You can use the Choose File button to look up
  626. files on the selected source machine.</entry>
  627. </row>
  628. </tbody>
  629. </tgroup>
  630. </informaltable></para>
  631. </listitem>
  632. </itemizedlist>
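<para>A CSV spray can be scripted in the same way. The following is a
minimal sketch in Python, assuming the DFU Plus option names
(<literal>format</literal>, <literal>maxrecordsize</literal>,
<literal>separator</literal>, <literal>terminator</literal>,
<literal>quote</literal>) described in the <emphasis>Client
Tools</emphasis> manual. The helper name and all values are hypothetical,
and the way special characters must be written for DFU Plus should be
confirmed in that manual.</para>
<programlisting>def build_spray_csv_args(server, source_ip, source_path, max_record_length,
                         separator, terminator, quote, group, label,
                         overwrite=False, replicate=False, compress=False):
    """Return a DFU Plus argument list for a CSV (delimited) spray."""
    return [
        "dfuplus",
        "action=spray",
        "format=csv",
        f"server={server}",                     # ECL Watch URL
        f"srcip={source_ip}",                   # Source: IP
        f"srcfile={source_path}",               # Source: Local Path
        f"maxrecordsize={max_record_length}",   # Source: Max Record Length
        f"separator={separator}",               # Source: Separator
        f"terminator={terminator}",             # Source: Line Terminator
        f"quote={quote}",                       # Source: Quote
        f"dstcluster={group}",                  # Destination: Group
        f"dstname={label}",                     # Destination: Label
        f"overwrite={int(overwrite)}",
        f"replicate={int(replicate)}",
        f"compress={int(compress)}",
    ]


# Hypothetical values. The terminator is passed here as escape-style text
# (a backslash followed by n); this is an assumption to verify.
args = build_spray_csv_args(
    server="http://192.168.56.101:8010",
    source_ip="192.168.56.101",
    source_path="/var/lib/LexisNexis/mydropzone/people.csv",
    max_record_length=8192,
    separator="|",
    terminator=r"\n",
    quote="'",
    group="mythor",
    label="tutorial::yn::people_csv",
)
print(args)
</programlisting>
<para>The function returns a list rather than a single string so that
characters such as <literal>|</literal> reach DFU Plus unchanged when the
list is passed to <literal>subprocess.run(args, check=True)</literal>
without a shell.</para>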
  633. <beginpage />
  634. </sect3>
  635. <sect3 id="Spray_XML">
  636. <title><emphasis role="bold">Spray XML</emphasis></title>
  637. <itemizedlist>
  638. <listitem>
  639. <para>Click on the <emphasis role="bold">Spray
  640. XML</emphasis> hyperlink below <emphasis
  641. role="bold">DFU</emphasis> in the menu on the left.</para>
  642. <para>The <emphasis role="bold">Spray XML</emphasis> page
  643. displays.</para>
  644. </listitem>
  645. <listitem>
  646. <para>Fill in the <emphasis role="bold">Source</emphasis>
  647. information (<emphasis role="bold">Machine, IP, File
  648. Path, and record information</emphasis>) and the <emphasis
  649. role="bold">Destination</emphasis> information (<emphasis
  650. role="bold">Group</emphasis> and <emphasis
  651. role="bold">Label</emphasis>).</para>
  652. <para><informaltable colsep="0" frame="none" rowsep="0">
  653. <tgroup cols="2">
  654. <colspec colwidth="122.40pt" />
  655. <colspec colwidth="333.00pt" />
  656. <tbody>
  657. <row>
  658. <entry align="right"><emphasis
  659. role="bold">Source:</emphasis></entry>
  660. </row>
  661. <row>
  662. <entry><emphasis
  663. role="bold">Machine</emphasis></entry>
  664. <entry>Use the drop list to select the machine where
  665. the source file is located.</entry>
  666. </row>
  667. <row>
  668. <entry><emphasis role="bold">IP</emphasis></entry>
  669. <entry>The IP address of the machine from which to spray. This
  670. is automatically completed when you select the
  671. <emphasis role="bold">Source
  672. Machine.</emphasis></entry>
  673. </row>
  674. <row>
  675. <entry><emphasis role="bold">Local
  676. Path</emphasis></entry>
  677. <entry>The file path of the source file to spray.</entry>
  678. </row>
  679. <row>
  680. <entry><emphasis role="bold">Format</emphasis></entry>
  681. <entry>Select the file format from the drop
  682. list.</entry>
  683. </row>
  684. <row>
  685. <entry><emphasis role="bold">Max Record
  686. Length</emphasis></entry>
  687. <entry>The length of the longest record in the
  688. file.</entry>
  689. </row>
  690. <row>
  691. <entry><emphasis role="bold">Row
  692. Tag</emphasis></entry>
  693. <entry>The record separator tag in the XML
  694. file.</entry>
  695. </row>
  696. <row>
  697. <entry align="right"><emphasis
  698. role="bold">Destination:</emphasis></entry>
  699. </row>
  700. <row>
  701. <entry><emphasis role="bold">Group</emphasis></entry>
  702. <entry>Select the name of the Thor cluster to spray
  703. to.</entry>
  704. </row>
  705. <row>
  706. <entry><emphasis role="bold">Label</emphasis></entry>
  707. <entry>The logical name that you choose for the
  708. file.</entry>
  709. </row>
  710. <row>
  711. <entry align="right"><emphasis
  712. role="bold">Options:</emphasis></entry>
  713. </row>
  714. <row>
  715. <entry><emphasis
  716. role="bold">Overwrite</emphasis></entry>
  717. <entry>Check this box to overwrite files of the same
  718. name.</entry>
  719. </row>
  720. <row>
  721. <entry><emphasis
  722. role="bold">Replicate</emphasis></entry>
  723. <entry><para>Check this box to create backup copies of
  724. all file parts in the backup directory (by convention
  725. on the secondary drive of the node following in the
  726. cluster).</para><para><emphasis role="bold">This
  727. option is only available on systems where replication
  728. has been enabled.</emphasis></para></entry>
  729. </row>
  730. <row>
  731. <entry><emphasis
  732. role="bold">Compress</emphasis></entry>
  733. <entry>Check this box to compress the files.</entry>
  734. </row>
  735. </tbody>
  736. </tgroup>
  737. </informaltable></para>
  738. </listitem>
  739. <listitem>
  740. <para>Press the <emphasis role="bold">Submit
  741. </emphasis>button.</para>
  742. <para>The <emphasis role="bold">DFU Workunit</emphasis>
  743. displays.</para>
  744. </listitem>
  745. <listitem>
  746. <para>Press the <emphasis role="bold">Refresh </emphasis>button
  747. periodically until the status of your request indicates it is
  748. <emphasis role="bold">Finished </emphasis>or click on the
  749. <emphasis role="bold">View Progress</emphasis> hyperlink to see
  750. a progress indicator.</para>
  751. <para><informaltable colsep="1" frame="all" rowsep="0">
  752. <tgroup cols="2">
  753. <colspec colwidth="52.60pt" />
  754. <colspec colwidth="384.80pt" />
  755. <tbody>
  756. <row>
  757. <entry><graphic fileref="images/tip.jpg" /></entry>
  758. <entry>You can use the Choose File button to look up
  759. files on the selected source machine.</entry>
  760. </row>
  761. </tbody>
  762. </tgroup>
  763. </informaltable></para>
  764. </listitem>
  765. </itemizedlist>
  766. <beginpage />
  767. </sect3>
  768. <sect3 id="Copy">
  769. <title><emphasis role="bold">Copy</emphasis></title>
  770. <itemizedlist>
  771. <listitem>
<para>Locate the file to copy in the list of files, click on the
arrow icon, then select <emphasis role="bold">Copy
</emphasis>from the pop-up menu.</para>
  775. </listitem>
  776. <listitem>
  777. <para>Fill in <emphasis role="bold">Destination</emphasis> and
  778. <emphasis role="bold">Options </emphasis>information.</para>
  779. <informaltable colsep="0" frame="none" rowsep="0">
  780. <tgroup cols="2">
  781. <colspec colwidth="122.40pt" />
  782. <colspec colwidth="333.00pt" />
  783. <tbody>
  784. <row>
  785. <entry align="right"><emphasis
  786. role="bold">Destination:</emphasis></entry>
  787. </row>
  788. <row>
  789. <entry><emphasis role="bold">Group</emphasis></entry>
<entry>Select the name of the THOR cluster to copy
to.</entry>
  792. </row>
  793. <row>
  794. <entry align="right"><emphasis
  795. role="bold">Note</emphasis></entry>
  796. <entry>You can only choose from THOR clusters within the
  797. current environment.</entry>
  798. </row>
  799. <row>
  800. <entry><emphasis role="bold">Logical
  801. File</emphasis></entry>
  802. <entry>The logical name for the copied file.</entry>
  803. </row>
  804. <row>
  805. <entry><emphasis role="bold">File
  806. Mask</emphasis></entry>
  807. <entry>Automatically updated based on logical file name
  808. entered.</entry>
  809. </row>
  810. <row>
  811. <entry align="right"><emphasis
  812. role="bold">Options:</emphasis></entry>
  813. </row>
  814. <row>
  815. <entry><emphasis
  816. role="bold">Replicate</emphasis></entry>
  817. <entry><para>Check this box to create backup copies of
  818. all file parts in the backup directory (by convention on
  819. the secondary drive of the node following in the
  820. cluster).</para><para><emphasis role="bold">This option
  821. is only available on systems where replication has been
  822. enabled.</emphasis></para></entry>
  823. </row>
  824. <row>
  825. <entry><emphasis role="bold">Wrap</emphasis></entry>
<entry>Check this box to keep the number of parts the
same and wrap if the target cluster is smaller than the
original.</entry>
  829. </row>
  830. <row>
  831. <entry><emphasis
  832. role="bold">Overwrite</emphasis></entry>
  833. <entry>Check this box to overwrite files of the same
  834. name.</entry>
  835. </row>
  836. <row>
  837. <entry><emphasis role="bold">Compress</emphasis></entry>
  838. <entry>Check this box to compress the files.</entry>
  839. </row>
  840. <row>
  841. <entry><emphasis role="bold">Retain Superfile
  842. Structure</emphasis></entry>
  843. <entry>Check this box to retain the superfile
  844. structure.</entry>
  845. </row>
  846. </tbody>
  847. </tgroup>
  848. </informaltable>
  849. </listitem>
  850. <listitem>
  851. <para>Press the <emphasis role="bold">Submit
  852. </emphasis>button.</para>
  853. <para>The <emphasis role="bold">DFU Workunit
  854. </emphasis>displays.</para>
  855. </listitem>
  856. <listitem>
  857. <para>Press the <emphasis role="bold">Refresh </emphasis>button
  858. periodically until the status of your request indicates it is
  859. <emphasis role="bold">Finished </emphasis>or click on the
  860. <emphasis role="bold">View Progress</emphasis> hyperlink to see
  861. a progress indicator.</para>
  862. </listitem>
  863. </itemizedlist>
  864. <beginpage />
  865. </sect3>
  866. <sect3 id="Remote_Copy">
  867. <title><emphasis role="bold">Remote Copy</emphasis></title>
  868. <para>Remote Copy allows you to copy data from a Thor cluster
  869. outside your environment to the one in your environment.</para>
  870. <itemizedlist>
  871. <listitem>
  872. <para>Click on the <emphasis role="bold">Remote Copy
  873. </emphasis>hyperlink below <emphasis role="bold">DFU
  874. </emphasis>in the menu on the left.</para>
  875. <para>The <emphasis role="bold">Copy File </emphasis>page
  876. displays.</para>
  877. </listitem>
  878. <listitem>
  879. <para>Fill in <emphasis role="bold">Source,
  880. Destination,</emphasis> and <emphasis role="bold">Options
  881. </emphasis>information.</para>
  882. <informaltable colsep="0" frame="none" rowsep="0">
  883. <tgroup cols="2">
  884. <colspec colwidth="122.40pt" />
  885. <colspec colwidth="333.00pt" />
  886. <tbody>
  887. <row>
  888. <entry align="right"><emphasis
  889. role="bold">Source:</emphasis></entry>
  890. </row>
  891. <row>
  892. <entry><emphasis role="bold">Logical
  893. File</emphasis></entry>
  894. <entry>The logical file name in the remote
  895. environment.</entry>
  896. </row>
  897. <row>
  898. <entry><emphasis role="bold">Source
  899. Dali</emphasis></entry>
<entry>The Dali Server in the remote environment.</entry>
  901. </row>
  902. <row>
  903. <entry><emphasis role="bold">Source
  904. Username</emphasis></entry>
<entry>A valid user in the remote environment.</entry>
  906. </row>
  907. <row>
  908. <entry><emphasis role="bold">Source
  909. Password</emphasis></entry>
<entry>The password for the user in the remote
environment.</entry>
  912. </row>
  913. <row>
  914. <entry align="right"><emphasis
  915. role="bold">Destination:</emphasis></entry>
  916. </row>
  917. <row>
  918. <entry><emphasis role="bold">Group</emphasis></entry>
<entry>Select the name of the THOR cluster to copy
to.</entry>
  921. </row>
  922. <row>
  923. <entry align="right"><emphasis
  924. role="bold">Note</emphasis></entry>
  925. <entry>You can only choose from THOR clusters within the
  926. current environment.</entry>
  927. </row>
  928. <row>
  929. <entry><emphasis role="bold">Logical
  930. Name</emphasis></entry>
  931. <entry>The logical name for the copied file.</entry>
  932. </row>
  933. <row>
  934. <entry align="right"><emphasis
  935. role="bold">Options:</emphasis></entry>
  936. </row>
  937. <row>
  938. <entry><emphasis
  939. role="bold">Replicate</emphasis></entry>
  940. <entry><para>Check this box to create backup copies of
  941. all file parts in the backup directory (by convention on
  942. the secondary drive of the node following in the
  943. cluster).</para><para><emphasis role="bold">This option
  944. is only available on systems where replication has been
  945. enabled.</emphasis></para></entry>
  946. </row>
  947. <row>
  948. <entry><emphasis role="bold">Wrap</emphasis></entry>
<entry>Check this box to keep the number of parts the
same and wrap if the target cluster is smaller than the
original.</entry>
  952. </row>
  953. <row>
  954. <entry><emphasis
  955. role="bold">Overwrite</emphasis></entry>
  956. <entry>Check this box to overwrite files of the same
  957. name.</entry>
  958. </row>
  959. <row>
  960. <entry><emphasis role="bold">Compress</emphasis></entry>
  961. <entry>Check this box to compress the files.</entry>
  962. </row>
  963. <row>
  964. <entry><emphasis role="bold">Retain Superfile
  965. Structure</emphasis></entry>
  966. <entry>Check this box to retain the superfile
  967. structure.</entry>
  968. </row>
  969. </tbody>
  970. </tgroup>
  971. </informaltable>
  972. </listitem>
  973. <listitem>
  974. <para>Press the <emphasis role="bold">Submit
  975. </emphasis>button.</para>
  976. <para>The <emphasis role="bold">DFU Workunit
  977. </emphasis>displays.</para>
  978. </listitem>
  979. <listitem>
  980. <para>Press the <emphasis role="bold">Refresh </emphasis>button
  981. periodically until the status of your request indicates it is
  982. <emphasis role="bold">Finished </emphasis>or click on the
  983. <emphasis role="bold">View Progress</emphasis> hyperlink to see
  984. a progress indicator.</para>
  985. </listitem>
  986. </itemizedlist>
  987. <beginpage />
  988. </sect3>
  989. <sect3 id="Modifying_a-DFU-Workunit">
  990. <title><emphasis role="bold">Modifying a DFU
  991. Workunit</emphasis></title>
  992. <para>From the DFU Workunit page, you can modify and run any DFU
  993. Spray, Despray, Copy, or Remote Copy action using the modified
  994. settings. This allows you to run similar jobs without filling in
  995. similar details again. It also allows you to correct any errors that
  996. might have caused a DFU workunit to fail.</para>
  997. <para>From any DFU Workunit page:</para>
  998. <itemizedlist>
  999. <listitem>
<para>Press the <emphasis role="bold">Modify</emphasis> button.</para>
  1001. <para>The page for the original action displays.</para>
  1002. </listitem>
  1003. <listitem>
  1004. <para>Modify the details, as needed.</para>
  1005. </listitem>
  1006. <listitem>
  1007. <para>Press the <emphasis role="bold">Submit
  1008. </emphasis>button.</para>
  1009. <para>The <emphasis role="bold">DFU Workunit
  1010. </emphasis>displays.</para>
  1011. </listitem>
  1012. <listitem>
  1013. <para>Press the <emphasis role="bold">Refresh </emphasis>button
  1014. periodically until the status of your request indicates it is
  1015. <emphasis role="bold">Finished </emphasis>or click on the
  1016. <emphasis role="bold">View Progress</emphasis> hyperlink to see
  1017. a progress indicator.</para>
  1018. </listitem>
  1019. </itemizedlist>
  1020. <para></para>
  1021. <para></para>
  1022. </sect3>
  1023. </sect2>
  1024. </sect1>
  1025. </chapter>
  1026. <chapter>
  1027. <title><emphasis>HPCC Data Backups</emphasis></title>
  1028. <sect1 id="Introduction2" role="nobrk">
  1029. <title>Introduction</title>
  1030. <para>This section covers critical system data that requires regular
  1031. backup procedures to prevent data loss.</para>
<para>There are six categories of data to consider:</para>
  1033. <itemizedlist>
  1034. <listitem>
  1035. <para>The System Data Store (Dali data)</para>
  1036. </listitem>
  1037. <listitem>
  1038. <para>Environment Configuration files</para>
  1039. </listitem>
  1040. <listitem>
  1041. <para>Data Refinery (Thor) data files</para>
  1042. </listitem>
  1043. <listitem>
  1044. <para>Rapid Data Delivery Engine (Roxie) data files</para>
  1045. </listitem>
  1046. <listitem>
  1047. <para>Attribute Repositories</para>
  1048. </listitem>
  1049. <listitem>
  1050. <para>Landing Zone files</para>
  1051. </listitem>
  1052. </itemizedlist>
  1053. </sect1>
  1054. <sect1>
  1055. <title>Dali data</title>
  1056. <para>The Dali Server data is typically mirrored to its backup node.
  1057. This location is specified in the environment configuration file using
  1058. the Configuration Manager.</para>
  1059. <para>Since the data is written simultaneously to both nodes, there is
  1060. no need for a manual backup procedure.</para>
  1061. </sect1>
  1062. <sect1>
  1063. <title>Environment Configuration files</title>
  1064. <para>There is only one active environment file, but you may have many
  1065. alternative configurations.</para>
<para>Configuration Manager only works on files in the
/etc/HPCCSystems/source/ folder. To make a configuration active, copy
it to /etc/HPCCSystems/environment.xml on all nodes.</para>
  1069. <para>Configuration Manager automatically creates backup copies in the
  1070. /etc/HPCCSystems/source/backup/ folder.</para>
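<para>As a minimal sketch of the activation step on a single node, the
copy could look like the following. The file name NewEnvironment.xml is a
hypothetical example, and running the copy as the hpcc user is an
assumption. The same file still has to be copied or otherwise distributed
to every other node in the environment.</para>
<programlisting># Assumption: NewEnvironment.xml is a hypothetical configuration file
# saved by Configuration Manager in the source folder.
sudo -u hpcc cp /etc/HPCCSystems/source/NewEnvironment.xml /etc/HPCCSystems/environment.xml</programlisting>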
  1071. </sect1>
  1072. <sect1>
  1073. <title>Thor data files</title>
  1074. <para>Thor clusters are normally configured to automatically replicate
  1075. data to a secondary location known as the mirror location. Usually, this
  1076. is on the second drive of the subsequent node.</para>
<para>If the data is not found at the primary location (for example, due
to drive failure or because a node has been swapped out), Thor looks in
the mirror directory to read the data. Any writes go to the primary and
then to the mirror. This provides continual redundancy and a quick means
to restore a system after a node swap.</para>
  1082. <para>A Thor data backup should be performed on a regularly scheduled
  1083. basis and on-demand after a node swap.</para>
  1084. <sect2>
  1085. <title>Manual backup</title>
  1086. <para>To run a backup manually, follow these steps:</para>
  1087. <orderedlist>
  1088. <listitem>
<para>Log in to the Thor Master node.</para>
  1090. <para>If you don't know which node is your Thor Master node, you
  1091. can look it up using ECL Watch.</para>
  1092. </listitem>
  1093. <listitem>
  1094. <para>Run this command:</para>
  1095. <programlisting>sudo su hpcc
  1096. /opt/HPCCSystems/bin/start_backupnode &lt;thor_cluster_name&gt; </programlisting>
  1097. <para>This starts the backup process.</para>
  1098. <para></para>
  1099. <graphic fileref="images/backupnode.jpg" />
  1100. <para>Wait until completion. It will say "backupnode finished" as
  1101. shown above.</para>
  1102. </listitem>
  1103. <listitem>
  1104. <para>Run the XREF utility in ECL Watch to verify that there are
  1105. no orphan files or lost files.</para>
  1106. </listitem>
  1107. </orderedlist>
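<para>For non-interactive use, the two commands in step 2 can be combined
into a single line. This is only a sketch: it assumes that sudo is
configured to let you run commands as the hpcc user, and mythor is a
placeholder cluster name.</para>
<programlisting># Run the backup as the hpcc user without opening an interactive shell.
# "mythor" is a placeholder; substitute the name of your Thor cluster.
sudo -u hpcc /opt/HPCCSystems/bin/start_backupnode mythor</programlisting>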
  1108. </sect2>
  1109. <sect2 role="brk">
  1110. <title>Scheduled backup</title>
  1111. <para>The easiest way to schedule the backup process is to create a
  1112. cron job. Cron is a daemon that serves as a task scheduler.</para>
<para>Cron tab (short for CRON TABle) is a text file that contains
the task list. To edit it with the default editor, use the
command:</para>
  1116. <programlisting>sudo crontab -e</programlisting>
  1117. <para>Here is a sample cron tab entry:</para>
<programlisting>30 23 * * * /opt/HPCCSystems/bin/start_backupnode mythor</programlisting>
<para>30 represents the minute of the hour.</para>
<para>23 represents the hour of the day.</para>
<para>The asterisks (*) represent every day, month, and
weekday.</para>
<para>mythor is the cluster name.</para>
<para>Together, this entry runs the backup at 11:30 PM every day.</para>
  1124. <para>To list the tasks scheduled, use the command:</para>
  1125. <programlisting>sudo crontab -l</programlisting>
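<para>As a variation on the sample entry above, the following sketch runs
the backup weekly instead of nightly and appends the command's output to a
log file. The log file path is a hypothetical example, not an HPCC
convention.</para>
<programlisting># minute hour day-of-month month day-of-week command
# Runs at 23:30 every Sunday (day-of-week 0).
# /tmp/backupnode_mythor.log is a hypothetical log location.
30 23 * * 0 /opt/HPCCSystems/bin/start_backupnode mythor &gt;&gt; /tmp/backupnode_mythor.log 2&gt;&amp;1</programlisting>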
  1126. <para></para>
  1127. </sect2>
  1128. </sect1>
  1129. <sect1 id="Roxie-Data-Backup">
  1130. <title>Roxie data files</title>
  1131. <para>Roxie data is protected by three forms of redundancy:</para>
  1132. <itemizedlist mark="bullet">
  1133. <listitem>
  1134. <para>Original Source Data File Retention: When a query is deployed,
  1135. the data is typically copied from a Thor cluster's hard drives.
  1136. Therefore, the Thor data can serve as backup, provided it is not
  1137. removed or altered on Thor. Thor data is typically retained for a
  1138. period of time sufficient to serve as a backup copy.</para>
  1139. </listitem>
  1140. <listitem>
  1141. <para>Peer-Node Redundancy: Each Slave node typically has one or
  1142. more peer nodes within its cluster. Each peer stores a copy of data
  1143. files it will read.</para>
  1144. </listitem>
  1145. <listitem>
  1146. <para>Sibling Cluster Redundancy: Although not required, Roxie
  1147. deployments may run multiple identically-configured Roxie clusters.
When two clusters are deployed for Production, each node has an
identical twin, in terms of the data and queries it stores, in
the other cluster.</para>
  1151. </listitem>
  1152. </itemizedlist>
  1153. <para>This provides multiple redundant copies of data files.</para>
  1154. </sect1>
  1155. <sect1>
  1156. <title>Attribute Repositories</title>
<para>Attribute repositories are stored on ECL developers' local hard
drives. They can represent a significant number of hours of work and
therefore should be backed up regularly. In addition, we suggest using
some form of source version control.</para>
  1161. </sect1>
  1162. <sect1>
  1163. <title>Landing Zone files</title>
  1164. <para>Landing Zones contain raw data for input. They can also contain
  1165. output files. Depending on the size or complexity of these files, you
  1166. may want to retain copies for redundancy.</para>
  1167. </sect1>
  1168. </chapter>
  1169. </book>