<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<sect1 id="Query_Libraries">
<title><emphasis role="bold">Query Libraries</emphasis></title>
<para>A Query Library is a set of attributes, packaged together in a
self-contained unit, which allows the code to be shared between different
workunits. This reduces the time required to deploy a set of attributes, and
can reduce the memory footprint for the set of queries within Roxie that use
the library. It is also possible to update a query library without having to
re-deploy all the queries that use it.</para>
<para>Query libraries are not supported in Thor, but may be in the
future.</para>
<para>A Query Library is defined by two structures: an INTERFACE to define
the parameters to pass, and a MODULE that implements the INTERFACE.</para>
<sect2 id="Library_INTERFACE_Definition">
<title>Library INTERFACE Definition</title>
<para>To create a Query Library, the first requirement is a definition of
its input parameters with an INTERFACE structure that receives the
parameters:</para>
<programlisting>NamesRec := RECORD
  INTEGER1 NameID;
  STRING20 FName;
  STRING20 LName;
END;
FilterLibIface1(DATASET(namesRec) ds, STRING search) := INTERFACE
  EXPORT DATASET(namesRec) matches;
  EXPORT DATASET(namesRec) others;
END;</programlisting>
<para>This example defines the INTERFACE for a library that takes two
inputs, a DATASET (with the specified layout format) and a STRING, and
returns two DATASET results.</para>
<para>For most library queries it may be preferable to also use a separate
INTERFACE to define the input parameters. Using an INTERFACE makes the
passed parameters clearer and makes it easier to keep the interface and
implementation in sync. This example defines the same library interface as
above:</para>
<programlisting>NamesRec := RECORD
  INTEGER1 NameID;
  STRING20 FName;
  STRING20 LName;
END;
IFilterArgs := INTERFACE //defines passed parameters
  EXPORT DATASET(namesRec) ds;
  EXPORT STRING search;
END;
FilterLibIface2(IFilterArgs args) := INTERFACE
  EXPORT DATASET(namesRec) matches;
  EXPORT DATASET(namesRec) others;
END;</programlisting>
</sect2>
<sect2 id="Library_MODULE_Definitions">
<title>Library MODULE Definitions</title>
<para>A query library is a MODULE structure definition that implements a
particular library INTERFACE definition. The parameters passed to the
MODULE must exactly match the parameters for the library INTERFACE
definition, and the MODULE must contain compatible EXPORT attribute
definitions for each of the results specified in the library INTERFACE.
The LIBRARY option on the MODULE definition specifies the library
INTERFACE being implemented. This example defines an implementation for
the first INTERFACE above (FilterLibIface1):</para>
<programlisting>FilterDsLib1(DATASET(namesRec) ds,
             STRING search) := MODULE,LIBRARY(FilterLibIface1)
  EXPORT matches := ds(Lname = search);
  EXPORT others := ds(Lname != search);
END;</programlisting>
<para>And for the variant that takes an INTERFACE as its single
parameter (FilterLibIface2):</para>
<programlisting>FilterDsLib2(IFilterArgs args) := MODULE,LIBRARY(FilterLibIface2)
  EXPORT matches := args.ds(Lname = args.search);
  EXPORT others := args.ds(Lname != args.search);
END;</programlisting>
</sect2>
<sect2 id="Building_an_External_library">
<title>Building an External Library</title>
<para>A query library may be either internal or external. An internal
library is primarily used in hthor queries for testing and debugging
before deploying to Roxie. Although you can use internal query libraries
in Roxie queries, the advantages come from using the external
version.</para>
<para>An external query library is created by the BUILD action, which
compiles the query library into its own workunit. The name of the library
is the job name associated with the workunit. Therefore, the #WORKUNIT
statement would normally be used to give the workunit a meaningful job
name, as in this example:</para>
<programlisting>#WORKUNIT('name','Ppass.FilterDsLib');
BUILD(FilterDsLib1);</programlisting>
<para>This code builds the library for the INTERFACE parameter version of
the code above:</para>
<programlisting>#WORKUNIT('name','Ipass.FilterDsLib');
BUILD(FilterDsLib2);</programlisting>
<para>The system maintains a catalog of the latest version of each query
library, which is updated whenever a library is built. Hthor uses this
catalog to resolve query libraries when running a query (as will Thor, when
it eventually supports query libraries). Roxie uses the query aliasing
mechanism in the same way.</para>
</sect2>
<sect2 id="Using_a_Query_Library">
<title>Using a Query Library</title>
<para>The syntax for using a query library is slightly different depending
on whether the library is internal or external. However, both methods use
the LIBRARY function.</para>
<para>The LIBRARY function returns a MODULE implementation with the
parameters passed for that particular use, through which you access the
EXPORT attributes from the library.</para>
<sect3>
<title>Internal Libraries</title>
<para>An internal library generates the library code as a separate unit,
but then includes that unit within the query workunit. It doesn't have
the advantage of reducing compile time or memory usage in Roxie, but it
does retain the library structure, which means that changes to the code
cannot affect anyone else using the system. That makes internal
libraries well suited to testing.</para>
<para>The syntax for using an internal query library simply passes the
library MODULE attribute's name inside an INTERNAL function call in the
first parameter to the LIBRARY function, as in this example (for the
version that does not take an INTERFACE as its parameter):</para>
<programlisting>NamesTable := DATASET([ {1,'Doc','Holliday'},
                        {2,'Liz','Taylor'},
                        {3,'Mr','Nobody'},
                        {4,'Anywhere','but here'}],
                      NamesRec);
lib1 := LIBRARY(INTERNAL(FilterDsLib1),FilterLibIface1(NamesTable, 'Holliday'));</programlisting>
<para>In this case, the result (lib1) is a MODULE with two EXPORTed
attributes, matches and others, that can be used just like those of any
other MODULE, as in this example:</para>
<programlisting>OUTPUT(lib1.matches);
OUTPUT(lib1.others);</programlisting>
<para>The code changes to this for the variant that takes an
INTERFACE:</para>
<programlisting>NamesTable := DATASET([ {1,'Doc','Holliday'},
                        {2,'Liz','Taylor'},
                        {3,'Mr','Nobody'},
                        {4,'Anywhere','but here'}],
                      NamesRec);
SearchArgs := MODULE(IFilterArgs)
  EXPORT DATASET(namesRec) ds := NamesTable;
  EXPORT STRING search := 'Holliday';
END;
lib3 := LIBRARY(INTERNAL(FilterDsLib2),FilterLibIface2(SearchArgs));
OUTPUT(lib3.matches);
OUTPUT(lib3.others);</programlisting>
</sect3>
<sect3>
<title><emphasis role="bold">External Libraries</emphasis></title>
<para>Once the library is implemented as an external library (by using
the BUILD action in a separate workunit to create it), the LIBRARY
function no longer requires the INTERNAL function to specify the library.
Instead, it takes as its first parameter a string constant containing the
name of the workunit created by BUILD, like this:</para>
<programlisting>NamesTable := DATASET([ {1,'Doc','Holliday'},
                        {2,'Liz','Taylor'},
                        {3,'Mr','Nobody'},
                        {4,'Anywhere','but here'}],
                      NamesRec);
lib2 := LIBRARY('Ppass.FilterDsLib',FilterLibIface1(NamesTable, 'Holliday'));
OUTPUT(lib2.matches);
OUTPUT(lib2.others);</programlisting>
<para>Or, for the INTERFACE version:</para>
<programlisting>NamesTable := DATASET([ {1,'Doc','Holliday'},
                        {2,'Liz','Taylor'},
                        {3,'Mr','Nobody'},
                        {4,'Anywhere','but here'}],
                      NamesRec);
SearchArgs := MODULE(IFilterArgs)
  EXPORT DATASET(namesRec) ds := NamesTable;
  EXPORT STRING search := 'Holliday';
END;
lib4 := LIBRARY('Ipass.FilterDsLib',FilterLibIface2(SearchArgs));
OUTPUT(lib4.matches);
OUTPUT(lib4.others);</programlisting>
<para>A couple of words of warning about using external libraries: If
you are developing an attribute inside a library that is shared by other
people, you need to make sure that your development changes don't
invalidate other queries. This means you need to use a different library
name while developing. The simplest method is probably to use a
different attribute for the library implementation while you are
developing. Another way to avoid the problem is to develop and test with
internal libraries, and only build and deploy the external library when
you are ready to put the query into production.</para>
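<para>As a minimal sketch of the first approach, the following builds a
private development copy of the library under its own job name. The
attribute name FilterDsLib1Dev and the job name Dev.Ppass.FilterDsLib are
hypothetical choices, not required conventions, and the code assumes the
NamesRec, FilterLibIface1, and NamesTable definitions from the earlier
examples:</para>
<programlisting>//Hypothetical development copy of the library implementation
FilterDsLib1Dev(DATASET(NamesRec) ds,
                STRING search) := MODULE,LIBRARY(FilterLibIface1)
  EXPORT matches := ds(LName = search);
  EXPORT others := ds(LName != search);
END;
//Build it under a development job name, so that shared queries
//resolving 'Ppass.FilterDsLib' are unaffected
#WORKUNIT('name','Dev.Ppass.FilterDsLib');
BUILD(FilterDsLib1Dev);</programlisting>
<para>A query under test would then name the development copy in its
LIBRARY call:</para>
<programlisting>//Resolve the development copy by its (hypothetical) job name
libDev := LIBRARY('Dev.Ppass.FilterDsLib',FilterLibIface1(NamesTable, 'Holliday'));
OUTPUT(libDev.matches);
OUTPUT(libDev.others);</programlisting>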
<para>If libraries are nested, it gets more complicated. If you are
working on libraryC, which is called from libraryA, which is in turn
called from a query, then you will need to use different library names
for both libraryC and libraryA. Otherwise you will either not call your
modified code in libraryC, or everyone using libraryA will call your
modified code. You may prefer to make libraryA and libraryC internal
instead, but then you lose the reduced compile time associated with
external libraries.</para>
<para>Also remember that your changes are occurring in the library, not
in the query. It's not uncommon to wonder why changes to the ECL aren't
having any effect, and then realize that you've been rebuilding or
deploying the wrong item.</para>
</sect3>
</sect2>
<sect2 id="Query_Library_Tips">
<title>Query Library Tips</title>
<para>You can make your code cleaner by making the LIBRARY call a function
attribute, like this:</para>
<programlisting>FilterDataset(DATASET(namesRec) ds,
              STRING search) := LIBRARY('Ppass.FilterDsLib',FilterLibIface1(ds, search));</programlisting>
<para>The use of the library then becomes:</para>
<programlisting>OUTPUT(FilterDataset(NamesTable, 'Holliday').matches);</programlisting>
<para>The library name (specified as the first parameter to the LIBRARY
function) does not have to be a constant value, but it must not change
while the query is running. This means you can conditionally select
between different versions of a library.</para>
<para>For example, it is likely that you will want separate libraries for
handling FCRA and non-FCRA data, since combining the two could generate
inefficient or un-processable code. The code for selecting between the two
implementations would look like this:</para>
<programlisting>LibToUse := IF(isFCRA,'special.lookupFCRA','special.lookupNoFCRA');
MyResults := LIBRARY(LibToUse, InterfaceCommonToBoth(args));</programlisting>
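<para>As a hedged sketch, the same technique applied to the name-filter
example might look like the following. It assumes the NamesRec,
FilterLibIface1, and NamesTable definitions from the earlier examples. The
UseTestLib stored flag and the Test.FilterDsLib library name are
hypothetical. The flag is read by the calling query, not by the library, so
the selected name stays fixed while the query runs:</para>
<programlisting>//Hypothetical switch supplied to the calling query
BOOLEAN useTestLib := FALSE : STORED('UseTestLib');
//Both libraries are assumed to implement FilterLibIface1
LibName := IF(useTestLib,'Test.FilterDsLib','Ppass.FilterDsLib');
lib5 := LIBRARY(LibName,FilterLibIface1(NamesTable, 'Holliday'));
OUTPUT(lib5.matches);
OUTPUT(lib5.others);</programlisting>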
</sect2>
<sect2 id="Query_Library_Restrictions">
<title>Restrictions</title>
<para>The system will report an error if you attempt to use an
implementation of a query library that has a different INTERFACE from the
one specified in the LIBRARY function.</para>
<para>There is one particularly notable restriction on the ECL that can be
included within a library: it cannot include workflow services such as
INDEPENDENT, PERSIST, SUCCESS, and especially, STORED. STORED attributes
don't make sense inside a query library because a query library should be
independent of both the queries that use it and other query libraries.
Instead of using STORED attributes (as SOAP-enabled Roxie queries do)
to pass parameters to the library, the parameters must be explicitly
passed into the query library, either as individual parameters or as part
of an INTERFACE definition that provides the arguments. The query that
uses the query library can use stored variables, and then map those stored
variables to the parameters expected by the query library.</para>
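<para>A minimal sketch of that mapping, assuming the NamesRec, IFilterArgs,
FilterLibIface2, and NamesTable definitions from the earlier examples and
the Ipass.FilterDsLib library built earlier. The stored name SearchName and
the attribute names StoredArgs and lib6 are hypothetical:</para>
<programlisting>//STORED is used in the calling query, never inside the library
STRING searchName := '' : STORED('SearchName');
//Map the stored value onto the arguments the library expects
StoredArgs := MODULE(IFilterArgs)
  EXPORT DATASET(NamesRec) ds := NamesTable;
  EXPORT STRING search := searchName;
END;
lib6 := LIBRARY('Ipass.FilterDsLib',FilterLibIface2(StoredArgs));
OUTPUT(lib6.matches);
OUTPUT(lib6.others);</programlisting>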
<para>Query libraries can currently only EXPORT datasets, datarows, and
single-valued expressions. In particular, they cannot EXPORT actions (like
OUTPUT), TRANSFORM structures, or other MODULE structures.</para>
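<para>To illustrate the permitted kinds of EXPORT, here is a sketch of a
library that returns a dataset together with a single-valued expression. The
names FilterLibIface3, FilterDsLib3, and matchCount are hypothetical, and
the code assumes the NamesRec and NamesTable definitions from the earlier
examples:</para>
<programlisting>FilterLibIface3(DATASET(NamesRec) ds, STRING search) := INTERFACE
  EXPORT DATASET(NamesRec) matches;  //a dataset result
  EXPORT UNSIGNED4 matchCount;       //a single-valued result
END;
FilterDsLib3(DATASET(NamesRec) ds,
             STRING search) := MODULE,LIBRARY(FilterLibIface3)
  EXPORT matches := ds(LName = search);
  EXPORT UNSIGNED4 matchCount := COUNT(matches);
END;
//The action (OUTPUT) stays in the calling query, not in the library
lib7 := LIBRARY(INTERNAL(FilterDsLib3),FilterLibIface3(NamesTable, 'Holliday'));
OUTPUT(lib7.matches);
OUTPUT(lib7.matchCount);</programlisting>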
</sect2>
<sect2 id="Notes_on_the_implementation">
<title>Notes on the implementation</title>
<para>There are a couple of items that may be worth noting about the
implementation. In Roxie, before executing the query, all library graphs
are expanded into the query graph. Any datasets that are supplied as
parameters to the library (or a dataset inside an interface parameter) are
directly connected to the place they are used in the query library, and
only results that are used are evaluated. This means that using a query
library should have very little overhead compared with including the ECL
code directly in the query. NOTE: Datasets inside row parameters aren't
streamed, so passing a ROW containing a dataset as a parameter to the
library is not as efficient as using an INTERFACE.</para>
<para>The implementation in hthor is not as efficient. Dataset parameters
are fully evaluated and passed to the library as a complete block, and all
results are evaluated. Thor does not yet support query libraries.</para>
<para>The other item of note is that if libraryA uses libraryC, and
libraryB also uses libraryC with the same parameters, the calls from the
different libraries will not be commoned up. However, if an attribute
exported from an instance of libraryC is passed to libraryA and libraryB,
then the calls to libraryC will be commoned up. Given the way attributes
currently tend to be structured in the repository (for example,
calculating get_Dids() and passing the result into other attributes), this
is unlikely to cause any issues in practice.</para>
</sect2>
<sect2 id="Suggested_Structure">
<title>Suggested Structure</title>
<para>Before writing a lot of libraries, it is worth spending some time
working out how the attributes for a library are structured, so that all
the libraries in the system are consistent. Here are some guidelines to use
during your query library design phase:</para>
<sect3>
<title>Naming Conventions</title>
<para>Settle on a consistent naming convention before developing lots of
libraries. In particular, you need a convention for the names of the
library arguments, library definition, implementing module, library
implementation, and the attribute that wraps the use of the library (for
example, something like IXArgs, Xinterface, DoX, Xlibrary, and
X()).</para>
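<para>As one possible reading of that pattern, the sketch below applies it
with X taken to be NameFilter. All five attribute names are illustrative
only:</para>
<programlisting>NamesRec := RECORD
  INTEGER1 NameID;
  STRING20 FName;
  STRING20 LName;
END;
//IXArgs: the library arguments
INameFilterArgs := INTERFACE
  EXPORT DATASET(NamesRec) ds;
  EXPORT STRING search;
END;
//Xinterface: the library definition
NameFilterInterface(INameFilterArgs args) := INTERFACE
  EXPORT DATASET(NamesRec) matches;
  EXPORT DATASET(NamesRec) others;
END;
//DoX: the implementing module
DoNameFilter(INameFilterArgs args) := MODULE
  EXPORT matches := args.ds(LName = args.search);
  EXPORT others := args.ds(LName != args.search);
END;
//Xlibrary: the library implementation
NameFilterLibrary(INameFilterArgs args) := MODULE(DoNameFilter(args)),LIBRARY(NameFilterInterface)
END;
//X(): the attribute that wraps the use of the library
NameFilter(INameFilterArgs args) := LIBRARY(INTERNAL(NameFilterLibrary),NameFilterInterface(args));</programlisting>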
</sect3>
<sect3>
<title>Use an INTERFACE to define parameters</title>
<para>This mechanism (example shown below) provides documentation for
the parameters required by a service. It means the code inside the
implementation will access them as args.xxx or options.xxx, so it will
be clear when parameters are being accessed. It also makes some of the
following suggestions simpler.</para>
</sect3>
<sect3>
<title>Hide the LIBRARY</title>
<para>Making the LIBRARY function call a functional attribute (example
also shown below) means you can easily modify all uses of a library if
you are developing a new version. Similarly, you can easily switch to
an internal library instead by changing just the one line of
code.</para>
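<para>A short sketch of that one-line switch, assuming the IFilterArgs,
FilterLibIface2, and FilterDsLib2 definitions and the Ipass.FilterDsLib
library name used earlier in this section. The wrapper name FilterDs is
illustrative:</para>
<programlisting>//Production: resolve the external library by its job name
FilterDs(IFilterArgs args) := LIBRARY('Ipass.FilterDsLib',FilterLibIface2(args));
//Testing: swap in this one line to use the internal library instead
//FilterDs(IFilterArgs args) := LIBRARY(INTERNAL(FilterDsLib2),FilterLibIface2(args));</programlisting>
<para>Every query that calls FilterDs picks up the change without any
edits of its own.</para>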
</sect3>
<sect3>
<title>Use MODULE Inheritance</title>
<para>Use a MODULE structure (without the LIBRARY option) that
implements the library's INTERFACE, and a separate MODULE derived from
the first to implement the LIBRARY using that service module. By hiding
the LIBRARY and using a separate MODULE implementation you can easily
remove the library altogether. Also, using a separate implementation
from the library definitions means you can easily generate multiple
variants of the same library from the same definition.</para>
<programlisting>NamesRec := RECORD
  INTEGER1 NameID;
  STRING20 FName;
  STRING20 LName;
END;
NamesTable := DATASET([ {1,'Doc','Holliday'},
                        {2,'Liz','Taylor'},
                        {3,'Mr','Nobody'},
                        {4,'Anywhere','but here'}],
                      NamesRec);
//define an INTERFACE for the passed parameters
IFilterArgs := INTERFACE
  EXPORT DATASET(namesRec) ds;
  EXPORT STRING search;
END;
//then define an INTERFACE for the query library
FilterLibIface2(IFilterArgs args) := INTERFACE
  EXPORT DATASET(namesRec) matches;
  EXPORT DATASET(namesRec) others;
END;
//implement the INTERFACE
FilterDsLib(IFilterArgs args) := MODULE
  EXPORT matches := args.ds(Lname = args.search);
  EXPORT others := args.ds(Lname != args.search);
END;
//then derive that MODULE to implement the LIBRARY
FilterDsLib2(IFilterArgs args) := MODULE(FilterDsLib(args)),LIBRARY(FilterLibIface2)
END;
//make the LIBRARY call a function
FilterDs(IFilterArgs args) := LIBRARY(INTERNAL(FilterDsLib2),FilterLibIface2(args));
//easily modified to eliminate the LIBRARY, if desired,
//by calling the non-library implementation MODULE directly
// FilterDs(IFilterArgs args) := FilterDsLib(args);
//define the parameters to pass as the interface
SearchArgs := MODULE(IFilterArgs)
  EXPORT DATASET(namesRec) ds := NamesTable;
  EXPORT STRING search := 'Holliday';
END;
//use the LIBRARY, passing the parameters
OUTPUT(FilterDs(SearchArgs).matches);
OUTPUT(FilterDs(SearchArgs).others);</programlisting>
</sect3>
</sect2>
</sect1>