<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<sect1 id="Complex_Roxie_Query_Techniques">
  <title><emphasis role="bold">Complex Roxie Query
  Techniques</emphasis></title>

  <para>The ECL coding techniques used in Roxie queries can be quite complex,
  making use of multiple keys, payload keys, half-keyed JOINs, the KEYDIFF
  function, and various other ECL language features. All these techniques
  share a single focus, though: to maximize the performance of the query so
  its result is delivered as efficiently as possible, thereby maximizing the
  total transaction throughput rate possible for the Roxie that services the
  query.</para>

  <sect2 id="Key_Selection_Based_on_Input">
    <title>Key Selection Based on Input</title>

    <para>It all starts with the architecture of your data and the keys you
    build from it. Typically, a single dataset has multiple indexes into it,
    providing multiple access paths into the data. Therefore, one of the key
    techniques used in Roxie queries is to detect which of the set of
    possible values have been passed to the query, and based on those
    values, choose the correct INDEX to use.</para>

    <para>Detecting which values have been passed to the query relies on the
    STORED attributes defined to receive them. The SOAP Interface
    automatically populates these attributes with whatever values have been
    passed to the query. That means the query code need only check those
    attributes for the presence of values other than their defaults.</para>

    <para>This example demonstrates the technique:</para>

    <programlisting>IMPORT $;
EXPORT PeopleSearchService() := FUNCTION
  // STORED attributes receive the values passed to the query
  STRING30 lname_value := '' : STORED('LastName');
  STRING30 fname_value := '' : STORED('FirstName');
  IDX  := $.IDX__Person_LastName_FirstName;
  Base := $.Person.FilePlus;
  // Filter on LastName alone unless a FirstName was also passed
  Fetched := IF(fname_value = '',
                FETCH(Base,IDX(LastName=lname_value),RIGHT.RecPos),
                FETCH(Base,IDX(LastName=lname_value,FirstName=fname_value),RIGHT.RecPos));
  RETURN OUTPUT(CHOOSEN(Fetched,2000));
END;</programlisting>

    <para>This query is written assuming that the LastName parameter will
    always be passed, so the IF needs only to detect whether a FirstName was
    also entered by the user. If so, the filter on the index parameter to
    the FETCH needs to include that value; otherwise, the FETCH just needs
    to filter the index with the LastName value.</para>

    <para>There are several ways this code could have been written. Here's an
    alternative:</para>

    <programlisting>IMPORT $;
EXPORT PeopleSearchService() := FUNCTION
  STRING30 lname_value := '' : STORED('LastName');
  STRING30 fname_value := '' : STORED('FirstName');
  IDX  := $.IDX__Person_LastName_FirstName;
  Base := $.Person.FilePlus;
  // Build the index filter once, then use it in a single FETCH
  IndxFilter := IF(fname_value = '',
                   IDX.LastName=lname_value,
                   IDX.LastName=lname_value AND IDX.FirstName=fname_value);
  Fetched := FETCH(Base,IDX(IndxFilter),RIGHT.RecPos);
  RETURN OUTPUT(CHOOSEN(Fetched,2000));
END;</programlisting>

    <para>In this example, the IF simply builds the correct filter expression
    for the FETCH to use. Using this form makes the code easier to read and
    maintain by separating out the multiple possible forms of the filter logic
    from the function that uses it.</para>
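
    <para>Both examples above vary the filter on a single INDEX. Where a
    dataset has more than one INDEX, the same test on the STORED values can
    pick between them. The following is a minimal sketch of that idea, not
    code from the example files that accompany this guide: the Zip
    parameter, both INDEX declarations, and the logical file names are
    hypothetical, and each key is assumed to carry the output fields as
    payload so that both branches of the IF can be projected into one record
    format.</para>

    <programlisting>EXPORT PeopleSearchByNameOrZip() := FUNCTION
  STRING30 lname_value := '' : STORED('LastName');
  STRING5  zip_value   := '' : STORED('Zip');   // hypothetical parameter

  // Hypothetical payload keys; substitute your own INDEX definitions
  NameKey := INDEX({STRING30 LastName, STRING30 FirstName},
                   {UNSIGNED4 PersonID},
                   '~hypothetical::key::person_name');
  ZipKey  := INDEX({STRING5 Zip},
                   {UNSIGNED4 PersonID, STRING30 LastName, STRING30 FirstName},
                   '~hypothetical::key::person_zip');

  OutRec := RECORD
    UNSIGNED4 PersonID;
    STRING30  LastName;
    STRING30  FirstName;
  END;

  // Both branches of the IF must return the same record format
  ByName := PROJECT(NameKey(KEYED(LastName = lname_value)),
                    TRANSFORM(OutRec, SELF := LEFT));
  ByZip  := PROJECT(ZipKey(KEYED(Zip = zip_value)),
                    TRANSFORM(OutRec, SELF := LEFT));

  // Use the name key when a LastName was passed, otherwise the zip key
  Found := IF(lname_value != '', ByName, ByZip);
  RETURN OUTPUT(CHOOSEN(Found,2000));
END;</programlisting>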
  </sect2>

  <sect2 id="PG_Keyed_Joins">
    <title>Keyed Joins</title>

    <para>Although the FETCH function was specifically designed for indexed
    access to data, in practice the half-keyed JOIN operation is more commonly
    used in Roxie queries. A major reason for this is the flexibility that is
    possible with JOIN.</para>

    <para>The advantages of using keyed JOIN operations in any query are
    fully discussed in the <emphasis>Using ECL Keys (INDEX Files)</emphasis>
    article. These advantages are especially valuable in Roxie queries.
    Because of the nature of Roxie, the greatest advantage from keyed JOINs
    comes from the use of half-keyed JOINs that utilize payload keys
    (eliminating the need for additional FETCH operations).</para>
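
    <para>As a rough illustration of the point about payload keys, the sketch
    below joins a small set of person records to a payload key on account
    data, so the account fields come straight from the index without a
    FETCH. The record layouts, the INDEX declaration, the logical file name,
    and the inline test records are all assumptions made for this sketch,
    not definitions from this guide.</para>

    <programlisting>PersonRec := RECORD
  UNSIGNED4 PersonID;
  STRING30  LastName;
END;

// Stand-in for the records an earlier index read would supply
People := DATASET([{1,'JONES'},{2,'SMITH'}],PersonRec);

// Hypothetical payload key: PersonID is the search field,
// Account and Balance are carried in the index as payload
AcctKey := INDEX({UNSIGNED4 PersonID},
                 {STRING20 Account, UNSIGNED4 Balance},
                 '~hypothetical::key::accounts_personid');

OutRec := RECORD
  PersonRec;
  STRING20  Account;
  UNSIGNED4 Balance;
END;

// Half-keyed JOIN: the right-hand recordset is the INDEX itself,
// so the payload fields are available directly from the key
Joined := JOIN(People, AcctKey,
               LEFT.PersonID = RIGHT.PersonID,
               TRANSFORM(OutRec, SELF := LEFT; SELF := RIGHT));

OUTPUT(Joined);</programlisting>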
  </sect2>

  <sect2 id="Limiting_Output">
    <title>Limiting Output</title>

    <para>One major consideration when developing a Roxie query is the amount
    of data that the query may return. Since JOIN operations can produce huge
    datasets, care should be taken to limit the number of records any given
    query may return to a number that is “reasonable” for that specific type
    of query. Here are some techniques to help accomplish that goal:</para>

    <para><informaltable colsep="0" frame="none" rowsep="0">
        <tgroup cols="2">
          <colspec colwidth="61.60pt" />

          <colspec />

          <tbody>
            <row>
              <entry>*</entry>

              <entry>The CHOOSEN and LIMIT functions should be used to limit
              index reads to some maximum number.</entry>
            </row>

            <row>
              <entry>*</entry>

              <entry>Keyed JOINs should use the ATMOST, KEEP, or LIMIT
              option.</entry>
            </row>

            <row>
              <entry>*</entry>

              <entry>When a nested child dataset is defined, it should have a
              MAXCOUNT option defined on the child DATASET field in the RECORD
              structure, and the code that builds the nested child dataset
              should use CHOOSEN with a value that exactly matches the
              MAXCOUNT.</entry>
            </row>
          </tbody>
        </tgroup>
      </informaltable></para>
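
    <para>The three techniques listed above might be combined along the
    following lines. This is a sketch under stated assumptions rather than a
    recommended implementation: both INDEX declarations, their logical file
    names, the record layouts, and the particular limits (2000 and 10) are
    hypothetical choices made for illustration.</para>

    <programlisting>STRING30 lname_value := '' : STORED('LastName');

MaxPeople := 2000;   // illustrative limits only
MaxAccts  := 10;

// Hypothetical payload keys
NameKey := INDEX({STRING30 LastName, STRING30 FirstName},
                 {UNSIGNED4 PersonID},
                 '~hypothetical::key::person_name');
AcctKey := INDEX({UNSIGNED4 PersonID},
                 {STRING20 Account, UNSIGNED4 Balance},
                 '~hypothetical::key::accounts_personid');

// 1. LIMIT the index read: the query fails rather than
//    returning more than MaxPeople matching key records
People := LIMIT(NameKey(KEYED(LastName = lname_value)),MaxPeople);

// 2. Keyed JOIN with KEEP: at most MaxAccts matches per person
FlatRec := RECORD
  UNSIGNED4 PersonID;
  STRING30  LastName;
  STRING20  Account;
  UNSIGNED4 Balance;
END;
Flat := JOIN(People, AcctKey,
             LEFT.PersonID = RIGHT.PersonID,
             TRANSFORM(FlatRec, SELF := LEFT; SELF := RIGHT),
             KEEP(MaxAccts));

// 3. Nested child dataset: MAXCOUNT on the child DATASET field,
//    and CHOOSEN with the same value where the child is built
AcctRec := RECORD
  STRING20  Account;
  UNSIGNED4 Balance;
END;
ParentRec := RECORD
  UNSIGNED4 PersonID;
  STRING30  LastName;
  STRING30  FirstName;
  DATASET(AcctRec) Accounts{MAXCOUNT(MaxAccts)};
END;
ParentRec AddAccts(RECORDOF(People) L) := TRANSFORM
  SELF.Accounts := CHOOSEN(PROJECT(AcctKey(PersonID = L.PersonID),
                                   TRANSFORM(AcctRec, SELF := LEFT)),
                           MaxAccts);
  SELF := L;
END;
Nested := PROJECT(People,AddAccts(LEFT));

OUTPUT(CHOOSEN(Flat,MaxPeople));   // CHOOSEN caps the final result
OUTPUT(Nested);</programlisting>

    <para>Defining the limit once (MaxAccts here) and using it for both the
    MAXCOUNT and the CHOOSEN is one way to keep the two values in
    step.</para>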

    <para>All of these techniques help to ensure that, when the end-user
    expects to get around ten results, they don't end up with ten
    million.</para>
  </sect2>
</sect1>