<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<sect1 id="SOAPCALL_from_Thor_to_Roxie">
<title><emphasis role="bold">SOAPCALL from Thor to Roxie</emphasis></title>
<para>Once you have your SOAP-enabled queries tested and deployed to Roxie,
you need to be able to use them. Many Roxie queries can be launched through
a specially designed user interface that allows end users to enter search
criteria and get results, one at a time. However, sometimes you need to
retrieve data in batch mode, where the same query is run once against each
record from a dataset. That makes Thor a candidate to be the requesting
platform, by using SOAPCALL.</para>
<sect2 id="One_Record_Input_Record_Set_Return">
<title>One Record Input, Record Set Return</title>
<para>This example code (contained in Soapcall1.ECL) calls the service
previously deployed in the <emphasis role="bold">Roxie Overview</emphasis>
article (you will need to change the RoxieIP attribute in this code to the
appropriate IP and port for the Roxie to which the service has been
deployed):</para>
<programlisting>IMPORT $;

OutRec1 := $.DeclareData.Layout_Person;
RoxieIP := 'http://127.0.0.1:8002/WsEcl/soap/query/roxie/roxieoverview1.1';
svc     := 'RoxieOverview1.1';

InputRec := RECORD
  STRING30 LastName  := 'KLYDE';
  STRING30 FirstName := '';
END;

//1 rec in, recordset out
ManyRec1 := SOAPCALL(RoxieIP,
                     svc,
                     InputRec,
                     DATASET(OutRec1));
OUTPUT(ManyRec1);</programlisting>
<para>This example shows how to make a SOAPCALL to the service, passing it
a single set of parameters to retrieve only the records that relate to
those parameters. The service receives a single set of input data and
returns only the records that meet those criteria. The expected result
from this query is a returned set of the 1000 records whose LastName field
contains “KLYDE”.</para>
</sect2>
<sect2 id="Record_Set_Input_Record_Set_Return">
<title>Record Set Input, Record Set Return</title>
<para>This next example code (contained in Soapcall2.ECL) calls the same
service as the previous example (remember, you will need to change the
RoxieIP attribute in this code to the appropriate IP and port for the Roxie
to which the service has been deployed):</para>
<programlisting>IMPORT $;

OutRec1 := $.DeclareData.Layout_Person;
RoxieIP := 'http://127.0.0.1:8002/WsEcl/soap/query/roxie/roxieoverview1.1';
svc     := 'RoxieOverview1.1';

//recordset in, recordset out
InRec := RECORD
  STRING30 LastName {XPATH('LastName')};
  STRING30 FirstName{XPATH('FirstName')};
END;
InputDataset := DATASET([{'TRAYLOR','CISSY'},
                         {'KLYDE','CLYDE'},
                         {'SMITH','DAR'},
                         {'BOWEN','PERCIVAL'},
                         {'ROMNEY','GEORGE'}],InRec);

ManyRec2 := SOAPCALL(InputDataset,
                     RoxieIP,
                     svc,
                     InRec,
                     TRANSFORM(LEFT),
                     DATASET(OutRec1),
                     ONFAIL(SKIP));
OUTPUT(ManyRec2);</programlisting>
<para>This example passes a dataset containing multiple sets of parameters
on which the service will operate, returning a single recordset of all the
records returned for each set of parameters. In this form, the TRANSFORM
function allows the SOAPCALL to operate like a PROJECT, producing the
input records that provide the input parameters for the service.</para>
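<para>As a minimal sketch of that PROJECT-like behavior, the shorthand
TRANSFORM(LEFT) can be replaced by a named TRANSFORM function that builds
each request record from the current input row. The SrcRec layout, the
SrcDS dataset, and the MakeRequest function are hypothetical names
introduced only for illustration; the URL, service name, and result layout
are the ones used in the example code above, and the block assumes the same
DeclareData module is available:</para>
<programlisting>IMPORT $;

OutRec1 := $.DeclareData.Layout_Person;
RoxieIP := 'http://127.0.0.1:8002/WsEcl/soap/query/roxie/roxieoverview1.1';
svc     := 'RoxieOverview1.1';

// Hypothetical source layout whose field names differ from the service's parameters
SrcRec := RECORD
  STRING30 Surname;
  STRING30 GivenName;
END;
SrcDS := DATASET([{'TRAYLOR','CISSY'},
                  {'KLYDE','CLYDE'}],SrcRec);

// Request layout: the XPATH names must match the service's parameter names
InRec := RECORD
  STRING30 LastName {XPATH('LastName')};
  STRING30 FirstName{XPATH('FirstName')};
END;

// Builds one request record per input row, as a PROJECT would
InRec MakeRequest(SrcRec L) := TRANSFORM
  SELF.LastName  := L.Surname;
  SELF.FirstName := L.GivenName;
END;

ManyRec2x := SOAPCALL(SrcDS,
                      RoxieIP,
                      svc,
                      InRec,
                      MakeRequest(LEFT),
                      DATASET(OutRec1),
                      ONFAIL(SKIP));
OUTPUT(ManyRec2x);</programlisting>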
<para>The service operates on each record in the input dataset in turn,
combining the results from each into a single return result set. The
ONFAIL(SKIP) option indicates that if any type of error occurs, the
record should simply be skipped. The expected result from this query is a
returned set of three records, the only three that match the input
criteria (CISSY TRAYLOR, CLYDE KLYDE, and PERCIVAL BOWEN).</para>
</sect2>
<sect2 id="Performance_Considerations_PARALLEL">
<title>Performance Considerations: PARALLEL</title>
<para>The form of the first example takes a single row as its input. When
a single URL is specified, SOAPCALL sends the request to that one URL and
waits for a response. If multiple URLs are specified, SOAPCALL sends a
request to the first URL in the list, waits for a response, sends a
request to the second URL, and so on down the list. The PARALLEL option
controls concurrency: if PARALLEL(<emphasis>n</emphasis>) is specified,
requests are sent concurrently from each Thor node, with up to
<emphasis>n</emphasis> requests in flight at once from each node.</para>
<para>The form of the second example takes a dataset as its input. When a
single URL is specified, the default behavior is to send two requests
concurrently, containing the first and second rows, wait for a response,
send the third row, and so on down the dataset, with up to two requests in
flight at once. If PARALLEL(<emphasis>n</emphasis>) is specified, it sends
<emphasis>n</emphasis> requests containing the first <emphasis>n</emphasis>
rows concurrently from each Thor node, and so on, with up to
<emphasis>n</emphasis> requests in flight at once from each node.</para>
<para>Ideally, you would specify a PARALLEL value that multiplies out to
at least the number of Roxie URLs, so that every available host can work
simultaneously. If you're using a dataset as input, you might also want to
try a value several times the number of URLs. However, setting the value
too high could cause network contention (timeouts and dropped
connections).</para>
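<para>The following is a minimal sketch of a call that targets more than
one URL, assuming the same query is also published on a second Roxie. The
URL parameter is a single string in which the URLs are separated by a pipe
(|) character. The second address (127.0.0.2) is a hypothetical
placeholder, and PARALLEL(4) is only an arbitrary starting value (twice the
number of URLs here), not a recommendation:</para>
<programlisting>IMPORT $;

OutRec1 := $.DeclareData.Layout_Person;

// Pipe-delimited URL list; the second host is a hypothetical placeholder
RoxieURLs := 'http://127.0.0.1:8002/WsEcl/soap/query/roxie/roxieoverview1.1|' +
             'http://127.0.0.2:8002/WsEcl/soap/query/roxie/roxieoverview1.1';
svc       := 'RoxieOverview1.1';

InRec := RECORD
  STRING30 LastName {XPATH('LastName')};
  STRING30 FirstName{XPATH('FirstName')};
END;
InputDataset := DATASET([{'TRAYLOR','CISSY'},
                         {'KLYDE','CLYDE'},
                         {'SMITH','DAR'},
                         {'BOWEN','PERCIVAL'},
                         {'ROMNEY','GEORGE'}],InRec);

// Up to four requests in flight at once from each Thor node
ManyRecMulti := SOAPCALL(InputDataset,
                         RoxieURLs,
                         svc,
                         InRec,
                         TRANSFORM(LEFT),
                         DATASET(OutRec1),
                         ONFAIL(SKIP),
                         PARALLEL(4));
OUTPUT(ManyRecMulti);</programlisting>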
<para>You should add the PARALLEL option to the code from both previous
examples to see what effect differing values may have in your
environment.</para>
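<para>As a starting point for that experiment, here is a sketch of the
first example (the single-row form) with the PARALLEL option added. The
value 2 and the ManyRec1P attribute name are arbitrary choices for
illustration; the dataset form accepts the option in the same way, after
its DATASET parameter:</para>
<programlisting>IMPORT $;

OutRec1 := $.DeclareData.Layout_Person;
RoxieIP := 'http://127.0.0.1:8002/WsEcl/soap/query/roxie/roxieoverview1.1';
svc     := 'RoxieOverview1.1';

InputRec := RECORD
  STRING30 LastName  := 'KLYDE';
  STRING30 FirstName := '';
END;

// Same call as the first example, with an arbitrary PARALLEL value to vary
ManyRec1P := SOAPCALL(RoxieIP,
                      svc,
                      InputRec,
                      DATASET(OutRec1),
                      PARALLEL(2));
OUTPUT(ManyRec1P);</programlisting>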
</sect2>
<sect2 id="Performance_Considerations_MERGE">
<title>Performance Considerations: MERGE</title>
<para>The MERGE option controls the number of rows per request for the
form that takes a dataset (MERGE does not apply to the forms of SOAPCALL
that take a single row as input). If MERGE(<emphasis>m</emphasis>) is
specified, each request contains up to <emphasis>m</emphasis> rows, rather
than a single row.</para>
<para>If the concurrency (PARALLEL option setting) is less than or equal
to the number of URLs, then each URL will normally see only one request at
a time (assuming all hosts operate at about the same speed). In that case,
you might choose a MERGE value as high as the host and the network can
take: too high a value and a massive request might kill or swamp the
service, but too low a value needlessly increases overhead by sending many
small requests in place of fewer, larger ones. If the concurrency is
greater than the number of URLs, then each URL will see several requests at
a time, and these considerations still apply.</para>
<para>Assuming that the host processes a single request serially, there is
one additional consideration. You should ensure that the MERGE value is
smaller than the number of rows in the dataset, so that you make use of the
parallelization on the hosts. If the value of MERGE is greater than or
equal to the number of input rows, then the entire input dataset is sent in
one request and the host processes the rows serially.</para>
<para>You should add the MERGE option to the code from the second example
to see what effect differing values may have in your environment.</para>
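<para>A minimal sketch of the second example with MERGE added follows. The
value 2 and the ManyRec2M attribute name are arbitrary choices for
illustration. With the five input rows shown, MERGE(2) means each request
carries up to two rows, so the MERGE value stays below the number of input
rows as suggested above:</para>
<programlisting>IMPORT $;

OutRec1 := $.DeclareData.Layout_Person;
RoxieIP := 'http://127.0.0.1:8002/WsEcl/soap/query/roxie/roxieoverview1.1';
svc     := 'RoxieOverview1.1';

InRec := RECORD
  STRING30 LastName {XPATH('LastName')};
  STRING30 FirstName{XPATH('FirstName')};
END;
InputDataset := DATASET([{'TRAYLOR','CISSY'},
                         {'KLYDE','CLYDE'},
                         {'SMITH','DAR'},
                         {'BOWEN','PERCIVAL'},
                         {'ROMNEY','GEORGE'}],InRec);

// Each request contains up to two input rows instead of one
ManyRec2M := SOAPCALL(InputDataset,
                      RoxieIP,
                      svc,
                      InRec,
                      TRANSFORM(LEFT),
                      DATASET(OutRec1),
                      ONFAIL(SKIP),
                      MERGE(2));
OUTPUT(ManyRec2M);</programlisting>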
</sect2>
<sect2 id="A_Real_World_Example">
<title>A Real World Example</title>
<para>A customer asked for help with a problem: how to compare two strings
and determine whether the first contains every word in the second, in any
order, when there is an indeterminate number of words in each string.
This is a fairly straightforward problem in ECL. Using JOIN and ROLLUP
would be one approach; nested child dataset queries would be another (they
were not supported in Thor at the time of the request for help, though they
may be by the time you read this). All the following code is contained in
the Soapcall3.ECL file.</para>
<para>The first need was to create a function that would extract all the
discrete words from a string. This is the kind of job that the PARSE
function excels at, so that's exactly what this code does:</para>
<programlisting>ParseWords(STRING LineIn) := FUNCTION
  PATTERN Ltrs := PATTERN('[A-Za-z]');
  PATTERN Char := Ltrs | '-' | '\'';
  TOKEN   Word := Char+;
  ds := DATASET([{LineIn}],{STRING line});
  RETURN PARSE(ds,line,Word,{STRING Pword := MATCHTEXT(Word)});
END;</programlisting>
<para>This FUNCTION (contained in Soapcall3.ECL) receives an input string
and produces a record set of all the words contained in that string. It
defines a PATTERN attribute (Char) of the allowable characters in a word:
all upper and lower case letters (defined by the Ltrs PATTERN), the
hyphen, and the apostrophe. Any other character is ignored.</para>
<para>Next, it defines a Word as one or more allowable Char pattern
characters. This pattern is defined as a TOKEN so that only the full word
match is returned and not all the possible alternative matches (that is,
returning just SOAP instead of SOAP, SOA, SO, and S, which are all the
possible alternative matches that a PATTERN would generate).</para>
<para>The one-record inline DATASET attribute (ds) creates the input
“file” for the PARSE function to work on, producing the result record set
of all the discrete words from the input string.</para>
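<para>A quick way to check the function on its own is to call it with a
literal string and look at the rows it returns. This is only a usage
sketch: the ParseWords definition is repeated so the block stands alone,
and the sample string and the Words attribute name are arbitrary. The
result should contain one row per word, with the word in the Pword
field:</para>
<programlisting>ParseWords(STRING LineIn) := FUNCTION
  PATTERN Ltrs := PATTERN('[A-Za-z]');
  PATTERN Char := Ltrs | '-' | '\'';
  TOKEN   Word := Char+;
  ds := DATASET([{LineIn}],{STRING line});
  RETURN PARSE(ds,line,Word,{STRING Pword := MATCHTEXT(Word)});
END;

// Arbitrary sample input, for illustration only
Words := ParseWords('the quick brown fox');
OUTPUT(Words);</programlisting>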
<para>Next, we need a Roxie query to compare the two strings (also
contained in Soapcall3.ECL):</para>
<programlisting>EXPORT Soapcall3() := FUNCTION
  STRING UID     := '' : STORED('UIDstr');
  STRING LeftIn  := '' : STORED('LeftInStr');
  STRING RightIn := '' : STORED('RightInStr');
  BOOLEAN TokenMatch := FUNCTION
    P1 := ParseWords(LeftIn);
    P2 := ParseWords(RightIn);
    SetSrch := SET(P1,Pword);
    ProjRes := PROJECT(P2,
                       TRANSFORM({BOOLEAN Fnd},
                                 SELF.Fnd := LEFT.Pword IN SetSrch));
    AllRes := DEDUP(SORT(ProjRes,Fnd));
    RETURN COUNT(AllRes) = 1 AND AllRes[1].Fnd = TRUE;
  END;
  RETURN OUTPUT(DATASET([{UID,TokenMatch}],{STRING UID,BOOLEAN res}));
END;</programlisting>
<para>This query expects to receive three pieces of data: a string
containing an identifier for the comparison (for context purposes in the
result), and the two strings whose words are to be compared.</para>
<para>The FUNCTION passes the input strings to the ParseWords function to
create two recordsets of words from those strings. The SET function then
re-defines the first recordset as a SET so that the IN operator may be
used.</para>
<para>The PROJECT operation does all the real work. It passes each word in
turn from the second input string to its inline TRANSFORM function, which
produces a Boolean result for that word: TRUE if the word is present in the
set of words from the first input string, FALSE if it is not.</para>
<para>To determine whether all the words in the second string were
contained in the first, the SORT/DEDUP sorts all the resulting Boolean
values and then removes all the duplicate entries. Only one or two records
will be left: either a TRUE and a FALSE, or a single TRUE or FALSE
record.</para>
<para>The RETURN expression detects which of the three scenarios has
occurred. Two records left indicate that some, but not all, of the words
were present. One record indicates that either all or none of the words
were present; if the value of that record is TRUE, then all words were
present and the FUNCTION returns TRUE. All other cases return
FALSE.</para>
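<para>To see the logic in isolation, the sketch below applies the same
SORT/DEDUP test to a small inline dataset standing in for ProjRes, and
compares it with an alternative formulation using EXISTS. The sample rows
and the ViaDedup and ViaExists names are hypothetical, and the EXISTS
version is offered only as an illustration of the same test, not as the
method used in Soapcall3.ECL. For these sample rows both expressions
evaluate to FALSE, because one of the words was not found:</para>
<programlisting>// Hypothetical stand-in for ProjRes: one Boolean row per word of the second string
ProjRes := DATASET([{TRUE},{TRUE},{FALSE}],{BOOLEAN Fnd});

// The test as written in the query: exactly one distinct value, and it is TRUE
AllRes   := DEDUP(SORT(ProjRes,Fnd));
ViaDedup := COUNT(AllRes) = 1 AND AllRes[1].Fnd = TRUE;

// Alternative formulation: at least one word, and no word that was not found
ViaExists := EXISTS(ProjRes) AND NOT EXISTS(ProjRes(NOT Fnd));

OUTPUT(ViaDedup);
OUTPUT(ViaExists);</programlisting>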
<para>The OUTPUT uses a one-record inline DATASET to format the result.
The identifier that was passed in is passed back along with the Boolean
result of the comparison. The identifier becomes important when the query
is called multiple times in Roxie to process a dataset of strings in batch
mode, because the results may not be returned in the same order as the
input records. If the query were only ever used interactively, this
identifier would not be necessary.</para>
<para>Once you've saved the query to the Repository, you can test it with
hThor and/or deploy it to Roxie (hThor will work for testing, but Roxie is
much faster for production). Either way, you can use SOAPCALL to access it
as shown in this code (contained in Soapcall4.ECL); the only difference
would be the IP and port you target for the query:</para>
<programlisting>RoxieIP := 'http://127.0.0.1:8002/WsEcl/soap/query/roxie/soapcall3.1'; //Roxie
svc     := 'soapcall3.1';

InRec := RECORD
  STRING UIDstr{XPATH('UIDstr')};
  STRING LeftInStr{XPATH('LeftInStr')};
  STRING RightInStr{XPATH('RightInStr')};
END;
InDS := DATASET([
   {'1','the quick brown fox jumped over the lazy red dog','quick fox red dog'},
   {'2','the quick brown fox jumped over the lazy red dog','quick fox black dog'},
   {'3','george of the jungle lives here','fox black dog'},
   {'4','fred and wilma flintstone','fred flintstone'},
   {'5','yomama comeonah','brake chill'} ],InRec);

RS := SOAPCALL(InDS,
               RoxieIP,
               svc,
               InRec,
               TRANSFORM(LEFT),
               DATASET({STRING UIDval{XPATH('uid')},BOOLEAN CompareResult{XPATH('res')}}));
OUTPUT(RS);</programlisting>
<para>Of course, <emphasis role="bold">you must first change the IP and
port in this code to the correct values for your environment</emphasis>.
You can find the proper IP and port to use by looking at the System
Servers page of your ECL Watch. To target Doxie (aka ECL Agent or hThor),
use the IP of your Thor's ESP Server and the port for its wsecl service.
To target Roxie, use the IP of your Roxie's ESP Server and the port for
its wsecl service. Both ESP servers could be on the same box. If so, the
only difference will be the port assigned to each.</para>
<para>The key to this SOAPCALL query is the InRec RECORD structure with
its XPATH definitions. These must exactly match the part names and the
STORED names of the query's parameter-receiving attributes (note that these
are case sensitive, since XPATH addresses XML and XML is always case
sensitive). This is what maps the input data fields through the SOAP
interface to the query's attributes.</para>
<para>This SOAPCALL receives a recordset as input and produces a recordset
as its result, making it very similar to the second example above. As in
that example, it uses the shorthand TRANSFORM(LEFT) instead of an inline
TRANSFORM function. Also, note that the XPATH for the first field in the
DATASET parameter's inline RECORD structure contains lower case “uid” even
though it references the query's OUTPUT field named “UID”, because the XML
returned from the SOAP service uses lower case tag names for the returned
data fields.</para>
<para>When you run this you'll get a TRUE result for records one and four,
and FALSE for all others.</para>
</sect2>
</sect1>