
HPCC-20089 Rationalize and consolidate the developer documentation

Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Gavin Halliday 6 years ago
parent
commit
88e1bd2280
8 changed files with 627 additions and 894 deletions
  1. 20 19
      ecl/eclcc/DOCUMENTATION.rst
  2. 198 0
      devdoc/Development.rst
  3. 0 0
      devdoc/MemoryManager.rst
  4. 36 0
      devdoc/README.rst
  5. 373 0
      devdoc/StyleGuide.rst
  6. 0 1
      ecl/eclcc/WORKUNITS.rst
  7. 0 8
      ecl/eclcc/README.rst
  8. 0 866
      sourcedoc.xml

+ 20 - 19
ecl/eclcc/DOCUMENTATION.rst

@@ -7,7 +7,7 @@ Introduction
 ************
 
 Purpose
-=======
+========
 The primary purpose of the code generator is to take an ECL query and convert it into a work unit
 that is suitable for running by one of the engines.
 
@@ -229,14 +229,14 @@ The key data structure within eclcc is the graph representation.  The design has
 
 * The expression classes use interfaces and a type field rather than polymorphism.
   This could be argued to be bad object design...but.
-  
+
   There are more than 500 different possible operators.  If a class was created for each of them the
   system would quickly become unwieldy.  Instead there are several different classes which model the
   different types of expression (dataset/expression/scope).
-  
+
   The interfaces contain everything needed to create and interrogate an expression tree, but they do
   not contain functionality for directly processing the graph.
-  
+
   To avoid some of the shortcomings of type fields there are various mechanisms for accessing derived attributes which avoid interrogating the type field.
 
 * Memory consumption is critical.
@@ -273,7 +273,7 @@ must be added to the end).
 
 IHqlSimpleScope
 ---------------
-This interface is implemented by records, and is used to map names to the fields within the records. 
+This interface is implemented by records, and is used to map names to the fields within the records.
 If a record contains IFBLOCKs then each of the fields in the ifblock is defined in the
 IHqlSimpleScope for the containing record.
 
@@ -308,7 +308,7 @@ Properties and attributes
 -------------------------
 There are two related but slightly different concepts.  An attribute refers to the explicit flags that
 are added to operators (e.g., LOCAL, KEEP(n) etc. specified in the ECL or some internal attributes
-added by the code generator).  There are a couple of different functions for creating attributes. 
+added by the code generator).  There are a couple of different functions for creating attributes.
 createExtraAttribute() should be used by default.  createAttribute() is reserved for an attribute
 that never has any arguments, or in unusual situations where it is important that the arguments are
 never transformed.  They are tested using queryAttribute()/hasAttribute() and represented by nodes of
@@ -323,10 +323,10 @@ Fields can be selected from active rows of a dataset in three main ways:
 
 * Some operators define LEFT/RIGHT to represent an input or processed dataset.  Fields from these
   active rows are referenced with LEFT.<field-name>.  Here LEFT or RIGHT is the "selector".
-  
+
 * Other operators use the input dataset as the selector.  E.g., myFile(myFile.id != 0).  Here the
   input dataset is the "selector".
-  
+
 * Often when the input dataset is used as the selector it can be omitted.  E.g., myFile(id != 0).
   This is implicitly expanded by the PARSER to the second form.
   A reference to a field is always represented in the expression graph as a node of kind no_select
@@ -457,7 +457,7 @@ mechanisms for caching derived information so it is available efficiently.
 * Active datasets - gatherTablesUsed().
 
   It is very common to want to know which datasets an expression references.  This information is
-  calculated and cached on demand and accessed via the IHqlExpression::gatherTablesUsed() functions. 
+  calculated and cached on demand and accessed via the IHqlExpression::gatherTablesUsed() functions.
   There are a couple of other functions IHqlExpression::isIndependentOfScope() and
   IHqlExpression::usesSelector() which provide efficient functions for common uses.
 
@@ -495,7 +495,7 @@ shouldn't repeat the work - otherwise the execution time may be exponential with
 Other things to bear in mind
 
 * If a node isn't modified don't create a new one - return a link to the old one.
-* You generally need to walk the graph and gather some information before creating a modified graph. 
+* You generally need to walk the graph and gather some information before creating a modified graph.
   Sometimes creating a new graph can be short-circuited if no changes will be required.
 * Sometimes you can be tempted to try and short-circuit transforming part of a graph (e.g., the
   arguments to a dataset activity), but because of the way references to fields within dataset work
@@ -537,7 +537,8 @@ Some more details on the individual transforms are given below..
 Key Stages
 **********
 Parsing
-=======
+========
+
 The first job of eclcc is to parse the ECL into an expression graph.  The source for the ECL can come
 from various different sources (archive, source files, remote repository).  The details are hidden
 behind the IEclSource/IEclSourceCollection interfaces.  The createRepository() function is then used
@@ -551,11 +552,11 @@ Several things occur while the ECL is being parsed:
   which is better suited to processing and optimizing.
 
 * Some limited constant folding occurs.
-  
+
   When a function is expanded, often it means that some of the
   test conditions are always true/false.  To reduce the transformations the condition may be folded
-  early on.  
-  
+  early on.
+
 * When a symbol is referenced from another module this will recursively cause the ECL for that module
   (or definition within that module) to be parsed.
 
@@ -572,7 +573,7 @@ There are various problems with the expression graph that comes out of the parse
 * Records can have values as children (e.g., { myField := infield.value} ), but it causes chaos if
   record definitions can change while other transformations are going on.  So the normalization
   removes values from fields.
-* Some activities use records to define the values that output records should contain (e.g., TABLE). 
+* Some activities use records to define the values that output records should contain (e.g., TABLE).
   These are now converted to another form (e.g., no_newusertable).
 * Sometimes expressions have multiple definition names.  Symbols and annotations are rationalized and
   commoned up to aid commoning up other expressions.
@@ -580,7 +581,7 @@ There are various problems with the expression graph that comes out of the parse
   symbols are removed.
 * The CASE/MAP representation for a dataset/action is awkward for the transforms to process.  They
   are converted to nested Ifs.
-  
+
   (At some point a different representation might be a good idea.)
 * EVALUATE is a weird syntax.  Instances are replaced with equivalent code which is much easier to
   subsequently process.
@@ -667,7 +668,7 @@ Workflow
 ========
 
 The actions in a workunit are divided up into individual workflow items.  Details of when each
-workflow item is executed, what its dependencies are stored in the <Workflow> section of the xml. 
+workflow item is executed, what its dependencies are stored in the <Workflow> section of the xml.
 The generated code also contains a class definition, with a method perform() which is used to execute
 the actions associated with a particular workflow item. (The class instances are created by calling
 the exported createProcess() factory function).
@@ -678,7 +679,7 @@ point to execute a graph.
 Graph
 =====
 The activity graphs are stored in the xml.  The graph contains details of which activities are
-required, how those activities link together, what dependencies there are between the activities. 
+required, how those activities link together, what dependencies there are between the activities.
 For each activity it might contain the following information:
 
 * A unique id.
@@ -900,7 +901,7 @@ Most dataset operations are only implemented as activities (e.g., PARSE, DEDUP).
 within a transform/filter then eclcc will generate a call to a child query.  An activity helper for the
 appropriate operation will then be generated.
 
-However a subset of the dataset operations can also be evaluated inline without calling a child query. 
+However a subset of the dataset operations can also be evaluated inline without calling a child query.
 Some examples are filters, projects, and simple aggregation.  It removes the overhead of the child query
 call in the simple cases, and often generates more concise code.
 

+ 198 - 0
devdoc/Development.rst

@@ -0,0 +1,198 @@
+..  ################################################################################
+    #    HPCC SYSTEMS software Copyright (C) 2012-2018 HPCC Systems®.
+    #
+    #    Licensed under the Apache License, Version 2.0 (the "License");
+    #    you may not use this file except in compliance with the License.
+    #    You may obtain a copy of the License at
+    #
+    #       http://www.apache.org/licenses/LICENSE-2.0
+    #
+    #    Unless required by applicable law or agreed to in writing, software
+    #    distributed under the License is distributed on an "AS IS" BASIS,
+    #    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    #    See the License for the specific language governing permissions and
+    #    limitations under the License.
+    ################################################################################
+
+===========
+HPCC Source
+===========
+
+The most up-to-date details of building the system are found on the HPCC Wiki at
+https://github.com/hpcc-systems/HPCC-Platform/wiki/Building-HPCC.
+
+*******************
+Getting the sources
+*******************
+
+The HPCC Platform sources are hosted on GitHub at https://github.com/hpcc-systems/HPCC-Platform. You can download a
+snapshot of any branch using the download button there, or you can set up a git clone of the repository. If you are
+planning to contribute changes to the system, see the file CONTRIBUTORS at
+https://github.com/hpcc-systems/HPCC-Platform/blob/master/CONTRIBUTORS for information about how to set up a GitHub
+fork of the project through which pull-requests can be made.
+
+********************************
+Building the system from sources
+********************************
+
+Requirements
+============
+The HPCC platform requires a number of third party tools and libraries in order to build.  The `HPCC Wiki`_ contains the
+details of the dependencies that are required for different distributions.
+
+For building any documentation, the following are also required::
+
+    sudo apt-get install docbook
+    sudo apt-get install xsltproc
+    sudo apt-get install fop
+
+**NOTE:** Installing the above by other means (e.g. building from source) may place the tools outside of the
+paths that are searched.
+
+Building the system
+===================
+
+The HPCC system is built using the cross-platform build tool cmake, which is available for Windows, virtually all
+flavors of Linux, FreeBSD, and other platforms. You should install cmake version 2.8.3 or later before building the
+sources.
+
+On some distros you will need to build cmake from sources if the version of cmake in the standard repositories for
+that distro is not modern enough.  It is good practice in cmake to separate the build directory where objects and
+executables are made from the source directory, and the HPCC cmake scripts will enforce this.
+
+To build the sources, create a directory where the built files should
+be located, and from that directory, run::
+
+    cmake <source directory>
+
+Depending on your operating system and the compilers installed on it,
+this will create a makefile, Visual Studio .sln file, or other build
+script for building the system. If cmake was configured to create a
+makefile, then you can build simply by typing::
+
+    make
+
+If a Visual Studio solution file was created, you can load it simply by typing the name::
+
+    hpccsystems-platform.sln
+
+This will load the solution in Visual Studio where you can build in the usual way.
+
+*********
+Packaging
+*********
+
+To make an installation package on a supported linux system, use the command::
+
+    make package
+
+This will first do a make to ensure everything is up to date, then will
+create the appropriate package for your operating system. Currently supported
+package formats are rpm (for RedHat/Centos) and .deb (for Debian and
+Ubuntu). If the operating system is not one of the above, or is not recognized,
+make package will create a tarball.
+
+The package installation does not start the service on the machine, so if you
+want to give it a go or test it (see below), make sure to start the service manually
+and wait until all services are up (mainly wait for EclWatch to come up on port 8010).
+
+
+******************
+Testing the system
+******************
+
+
+After compiling, installing the package and starting the services, you can test
+the HPCC platform on a single-node setup.
+
+
+Unit Tests
+==========
+Some components have their own unit-tests. Once you have compiled (no need to
+start the services), you can already run them. Supposing you build a Debug
+version, from the build directory you can run::
+
+    ./Debug/bin/roxie -selftest
+
+and::
+
+    ./Debug/bin/eclagent -selftest
+
+You can also run the Dali regression self-tests::
+
+    ./Debug/bin/daregress localhost
+
+Regression Tests
+================
+
+**MORE** Completely out of date - needs rewriting.
+
+Compiler Tests
+==============
+
+The ECLCC compiler tests rely on two distinct runs: a known good one and your
+test build. For normal development, you can safely assume that the OSS/master
+branch in github is good. For overnight testing, golden directories need to
+be maintained according to the test infrastructure. There are Bash (Linux)
+and Batch (Windows) scripts to run the regressions:
+
+The basic idea behind these tests is to compare the output files (logs and
+XML files) between runs. The log files will change slightly (the comparison
+should be good enough to filter most irrelevant differences), but the XML
+files should be identical if nothing has changed. You should only see
+differences in the XML where you have changed the code, or where new tests
+were added as part of your development.
+
+On Linux, there are three steps:
+
+Step 1: Check-out OSS/master, compile and run the regressions to populate
+the 'golden' directory::
+
+    ./regress.sh -t golden -e buildDir/Debug/bin/eclcc
+
+This will run the regressions in parallel, using as many CPUs as you have,
+and using your just-compiled ECLCC, assuming you compiled for Debug version.
+
+Step 2: Make your changes (or check-out your branch), compile and run again,
+this time output to a new directory and compare to the 'golden' directory::
+
+    ./regress.sh -t my_branch -c golden -e buildDir/Debug/bin/eclcc
+
+This will run the regressions in the same way, output to 'my_branch' dir
+and compare it to the golden version, highlighting the differences.
+
+NOTE: If you changed the headers that the compiled binaries will use, you
+must re-install the package (or use the -i option to point the script at the
+new headers).
+
+Step 3: Step 2 only listed the differences; now you need to see what they are.
+For that, re-run the regression script omitting the compiler, since the only
+thing left to do is to compare verbosely::
+
+    ./regress.sh -t my_branch -c golden
+
+This will show you all differences, using the same ignore filters as before,
+between your two branches. Once you're happy with the differences, commit and
+issue a pull-request.
+
+TODO: Describe compiler tests on Windows.
+
+********************
+Debugging the system
+********************
+
+On linux systems, the makefile generated by cmake will build a specific
+version (debug or release) of the system depending on the options selected
+when cmake is first run in that directory. The default is to build a release
+system. In order to build a debug system instead, use the
+command::
+
+    cmake -DCMAKE_BUILD_TYPE=Debug <source directory>
+
+You can then run make or make package in the usual way to build the system.
+
+On a Windows system, cmake always generates a solution file with both debug and
+release target platforms in it, so you can select which one to build within
+Visual Studio.
+
+.. _HPCC Wiki: https://github.com/hpcc-systems/HPCC-Platform/wiki/Building-HPCC

roxie/roxiemem/DOCUMENTATION.rst → devdoc/MemoryManager.rst


+ 36 - 0
devdoc/README.rst

@@ -0,0 +1,36 @@
+=======================
+Developer Documentation
+=======================
+
+This directory contains the documentation specifically targeted at developers of the HPCC system.  Information
+is also included in the wiki at https://github.com/hpcc-systems/HPCC-Platform/wiki.
+
+General documentation
+=====================
+
+* `Development guide`_: Building the system and development guide.
+
+* `C++ style guide`_: Style guide for C++ code.
+
+* `ECL style guide`_: Style guide for ECL code.
+
+Implementation details for different parts of the system
+========================================================
+
+* `Workunit Workflow`_: An explanation of workunits, and a walk-through of the steps in executing a query.
+
+* `Code Generator Documentation`_: Details of the internals of eclcc.
+
+* `Memory Manager`_: Details of the memory manager (roxiemem) used by the query engines.
+
+
+Other documentation
+===================
+The ECL language is documented in the ecl language reference manual (generated as ECLLanguageReference-<version>.pdf).
+
+.. _Development guide: Development.rst
+.. _Code Generator Documentation: CodeGenerator.rst
+.. _Workunit Workflow: WorkUnits.rst
+.. _Memory Manager: MemoryManager.rst
+.. _C++ style guide: StyleGuide.rst
+.. _ECL style guide: ../ecllibrary/StyleGuide.html

+ 373 - 0
devdoc/StyleGuide.rst

@@ -0,0 +1,373 @@
+..  ################################################################################
+    #    HPCC SYSTEMS software Copyright (C) 2012-2018 HPCC Systems®.
+    #
+    #    Licensed under the Apache License, Version 2.0 (the "License");
+    #    you may not use this file except in compliance with the License.
+    #    You may obtain a copy of the License at
+    #
+    #       http://www.apache.org/licenses/LICENSE-2.0
+    #
+    #    Unless required by applicable law or agreed to in writing, software
+    #    distributed under the License is distributed on an "AS IS" BASIS,
+    #    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    #    See the License for the specific language governing permissions and
+    #    limitations under the License.
+    ################################################################################
+
+==================
+Coding conventions
+==================
+
+***********************
+Why coding conventions?
+***********************
+
+Everyone has their own ideas of what the best code formatting style is, but most
+would agree that code in a mixture of styles is the worst of all worlds. A
+consistent coding style makes unfamiliar code easier to understand and navigate.
+
+In an ideal world, the HPCC sources would adhere to the coding standards described
+perfectly. In reality, there are many places that do not. These are being cleaned up
+as and when we find time.
+
+**********************
+C++ coding conventions
+**********************
+Unlike most software projects, HPCC has some very specific
+constraints that make many basic design decisions difficult, and
+the results often look odd to developers getting acquainted with its code base.
+For example, when HPCC was initially developed, most of the common-place
+libraries we have today (like STL and Boost) weren't available or stable
+enough.
+
+Also, at the beginning, both C++ and Java were being considered as
+the language of choice, but development started with C++. So a C++
+library that copied most of the behaviour of the Java standard library (at the
+time, Java 1.4) was created (see jlib below) to make the transition, if
+ever taken, easier. The transition never happened, but the decisions
+were taken and the whole platform is designed on those terms.
+
+Most importantly, the performance constraints in HPCC can make
+no-brainer decisions look impossible. One example is the use of
+traditional smart pointer implementations (such as boost::shared_ptr or
+C++'s auto_ptr), which can lead to a performance hit of up to 20% if used
+instead of our internal shared pointer implementation.
+
+The last important point to consider is that some
+libraries/systems were designed to replace older ones but have not yet
+done so. There is a slow movement to deprecate old systems and
+consolidate on a few official ways to use
+HPCC (Thor, Roxie), but old systems could still be in use for years in
+tests or legacy sub-systems.
+
+In a nutshell, expect re-implementation of well-known containers
+and algorithms, expect duplicated functionality of sub-systems and
+expect to be required to use less-friendly libraries for the sake of
+performance, stability and longevity.
+
+For the most part our coding style conventions match those
+described at http://geosoft.no/development/cppstyle.html, with a few
+exceptions or extensions as noted below.
+
+Source files
+============
+
+We use the extension .cpp for C++ source files, and .h or .hpp for header files.
+Header files with the .hpp extension should be used for headers that are internal
+to a single library, while header files with the .h extension should be used for
+the interface that the library exposes. There will typically be one .h file per
+library, and one .hpp file per cpp file.
+
+Source file names within a single shared library should share a common prefix to aid
+in identifying where they belong.
+
+Header files with extension .ipp (i for internal) and .tpp (t for template) will
+be phased out in favour of the scheme described above.
+
+Java-style
+==========
+We adopted a Java-like inheritance model, with macro
+substitution for the basic Java keywords. This changes nothing in the
+code, but makes it clearer to the reader what the recipient of
+the inheritance is doing with its base.
+
+* **interface** (struct): declares an interface (pure virtual class)
+
+* **extends** (public): One interface extending another, both are pure virtual
+
+* **implements** (public): Concrete class implementing an interface
+
+There is no semantic check, which makes it difficult to enforce
+such a scheme, and this has led to code that does not use it being intermixed
+with code that does. You should use it when possible, most importantly in code
+that already uses it.
+
+We also tend to write methods inline, which matches well with
+C++ Templates requirements. We, however, do not enforce the
+one-class-per-file rule.
+
+See the `Interfaces`_ section for more information on our implementation of
+interfaces.
+
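A minimal sketch of how the three keywords above could read in practice. The macro definitions are assumed from the mapping given in the list (interface to struct, extends and implements to public); IShape, INamedShape, CTriangle and demoInterfaces are hypothetical names used only for illustration and are not part of the HPCC sources.

```cpp
#include <cassert>
#include <cstring>

// Assumed definitions, following the keyword mapping in the list above.
#define interface struct
#define extends public
#define implements public

// A pure virtual "interface": no data members, every method pure virtual.
interface IShape
{
    virtual int sides() const = 0;
};

// One interface extending another; both remain pure virtual.
interface INamedShape : extends IShape
{
    virtual const char * name() const = 0;
};

// A concrete class implementing the interface.
class CTriangle : implements INamedShape
{
public:
    virtual int sides() const override { return 3; }
    virtual const char * name() const override { return "triangle"; }
};

// Uses the concrete class through its most basic interface.
inline int demoInterfaces()
{
    CTriangle triangle;
    const IShape & shape = triangle;
    assert(std::strcmp(triangle.name(), "triangle") == 0);
    return shape.sides();
}
```

Because the macros expand to ordinary C++ keywords, the compiler sees plain structs and public inheritance; the keywords only document intent, which is why nothing checks that they are used consistently.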
+Identifiers
+===========
+Class and interface names are in CamelCase with a leading
+capital letter. Interface names should be prefixed with a capital I followed
+by another capital. Class names may be prefixed with a C if there is a
+corresponding I-prefixed interface name, e.g. when the interface is primarily used to create an opaque type, but
+need not be otherwise.
+
+Variables, function and method names, and parameters use camelCase starting with a lower case letter. Parameters may
+be prefixed with underscore when the parameter is used to initialize a member variable of the same name.  Common cases
+are constructors and setter methods.
+
+Example::
+
+    class MySQLSuperClass
+    {
+        bool hasLocalCopy = false;
+        void mySQLFunctionIsCool(bool _hasLocalCopy, bool enableWrite)
+        {
+            if (enableWrite)
+                hasLocalCopy = _hasLocalCopy;
+        }
+    };
+
+Pointers
+========
+Use real pointers when you can, and smart pointers when you have
+to. Take extra care on understanding the needs of your pointers and
+their scope. Most programs can afford a few dangling pointers, but a
+high-performance clustering platform cannot.
+
+Most importantly, use common sense and a lot of thought. Here are a few guidelines:
+
+* Use real pointers for return values and parameter passing.
+
+* For local variables use real pointers if their lifetime is
+  guaranteed to be longer than the function (and no exception
+  is thrown from functions you call), shared pointers otherwise.
+
+* Use Shared pointers for member variables - unless there is
+  a strong guarantee the object has a longer lifetime.
+
+* Create Shared<X> with either:
+
+  - Owned<X>: if your new pointer will take ownership of the pointer
+
+  - Linked<X>: if you are sharing the ownership (shared)
+
+Warning: Direct manipulation of the ownership might
+cause Shared<> pointers to lose the pointers, so subsequent
+calls to it (like o2->doIt() after o3 gets ownership) **will** cause
+segmentation faults.
+
+Refer to `Reference counted objects`_ for more information on our smart pointer
+implementation, Shared<>.
+
+Methods that return pointers to link counted objects, or that use them,
+should use a common naming standard:
+
+* Foo * queryFoo()
+  Does not return a linked pointer since lifetime is guaranteed for a set period. Caller should link if it
+  needs to retain it for longer.
+
+* Foo * getFoo()
+  Returned value is linked and should be assigned to an owned, or returned directly.
+
+* void setFoo(Foo * x)
+  Generally parameters to functions are assumed to be owned by the caller, the callee needs to link them if they
+  are retained.
+
+* void setFoo(Foo * ownedX)
+  Some calls do transfer ownership of parameters - the parameter should be named to indicate this.  If the function
+  only has a single significant parameter then sometimes the name of the function indicates the ownership.
+
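The naming convention above can be sketched with a hand-rolled link count. This is an illustration under stated assumptions, not the jlib implementation: CCounted, Holder, linkCount() and setOwnedFoo() are hypothetical names (the guide gives the owning variant as setFoo(Foo * ownedX); a separate name is used here only so both forms fit in one class), and the count is deliberately not thread-safe to keep the sketch short.

```cpp
#include <cassert>

// Hypothetical link-counted base class (single-threaded for brevity).
class CCounted
{
    mutable int links = 1;              // the creator holds the first link
public:
    virtual ~CCounted() = default;
    void Link() const { ++links; }
    bool Release() const
    {
        if (--links == 0)
        {
            delete this;
            return true;
        }
        return false;
    }
    int linkCount() const { return links; }
};

class Foo : public CCounted {};

class Holder
{
    Foo * foo = nullptr;                // Holder owns one link on foo
public:
    ~Holder() { if (foo) foo->Release(); }

    // queryFoo(): result is NOT linked; valid only while Holder keeps its link.
    Foo * queryFoo() const { return foo; }

    // getFoo(): result IS linked; the caller must release it or hand it to an owner.
    Foo * getFoo() const { if (foo) foo->Link(); return foo; }

    // setFoo(): the caller keeps its own link, so the callee links what it retains.
    void setFoo(Foo * x)
    {
        if (x) x->Link();
        if (foo) foo->Release();
        foo = x;
    }

    // Owning variant: the caller's link is transferred, so no extra Link().
    void setOwnedFoo(Foo * ownedX)
    {
        if (foo) foo->Release();
        foo = ownedX;
    }
};

// Walks through the conventions and returns the highest link count observed.
inline int demoNaming()
{
    Holder holder;
    Foo * mine = new Foo;               // 1 link (ours)
    holder.setFoo(mine);                // 2 links (ours + holder's)
    assert(holder.queryFoo()->linkCount() == 2);   // query adds no link
    Foo * extra = holder.getFoo();      // 3 links
    int seen = extra->linkCount();
    extra->Release();                   // back to 2
    mine->Release();                    // 1 link left, owned by holder
    return seen;
}
```

The object is destroyed when holder goes out of scope and drops the last link, which is the behaviour the query/get distinction is meant to make predictable.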
+Indentation
+===========
+We use 4 spaces to indent each level. TAB characters should not be used.
+
+The { that starts a new scope and the corresponding } to close it are placed on a
+new line by themselves, and are not indented. This is sometimes known as the Allman
+or ANSI style.
+
+Comments
+========
+We generally believe in the philosophy that well written code is self-documenting.  Comments are also
+encouraged to describe *why* something is done, rather than how - which should be clear from the code.
+
+javadoc-formatted comments for classes and interfaces are being added.
+
+Classes
+========
+The virtual keyword should be included on the declaration of all virtual functions - including those in derived
+classes, and the override keyword should be used on all virtual functions in derived classes.
+
+Namespaces
+==========
+MORE: Update!!!
+
+We do not use namespaces. We probably should, following the Google style guide's
+guidelines - see http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Namespaces
+
+Other
+=====
+We often pretend we are coding in Java and write all our class members inline.
+
+C++11
+=====
+
+
+************************
+Other coding conventions
+************************
+
+ECL code
+========
+The ECL style guide is published separately.
+
+Javascript, XML, XSL etc
+========================
+We use the commonly accepted conventions for formatting these files.
+
+===============
+Design Patterns
+===============
+
+********************
+Why Design Patterns?
+********************
+Consistent use of design patterns helps make the code easy to understand.
+
+Interfaces
+==========
+While C++ does not have explicit support for interfaces (in the java sense), an
+abstract class with no data members and all functions pure virtual can be used
+in the same way.
+
+Interfaces are pure virtual classes. They are similar concepts to
+Java's interfaces and should be used on public APIs. If you need common
+code, use policies (see below).
+
+An interface's name must start with an 'I' and the base class for
+its concrete implementations should start with a 'C' and have the same
+name, ex::
+
+    CFoo : implements IFoo { };
+
+When an interface has multiple implementations, try to stay as
+close as possible to this rule. Ex::
+
+    CFooCool : implements IFoo { };
+    CFooWarm : implements IFoo { };
+    CFooALot : implements IFoo { };
+
+Or, for partial implementation, use something like this::
+
+    CFoo : implements IFoo { };
+    CFooCool : public CFoo { };
+    CFooWarm : public CFoo { };
+
+Extend current interfaces only on a 'is-a' approach, not to
+aggregate functionality. Avoid pollution of public interfaces by having
+only the public methods on the most-base interface in the header, and
+internal implementation in the source file. Prefer pImpl idiom
+(pointer-to-implementation) for functionality-only requirements and
+policy based design for interface requirements.
+
+Example 1: You want to decouple part of the implementation from
+your class, and this part does not implement the interface your
+contract requires::
+
+    interface IFoo
+    {
+        virtual void foo()=0;
+    };
+    // Following is implemented in a separate private file...
+    class CFoo : implements IFoo
+    {
+        MyImpl *pImpl;
+    public:
+        virtual void foo() override { pImpl->doSomething(); }
+    };
+
+Example 2: You want to implement the common part of one (or more)
+interface(s) in a range of sub-classes::
+
+    interface ICommon
+    {
+        virtual void common()=0;
+    };
+    interface IFoo : extends ICommon
+    {
+        virtual void foo()=0;
+    };
+    interface IBar : extends ICommon
+    {
+        virtual void bar()=0;
+    };
+
+    template <class IFACE>
+    class Base : implements IFACE
+    {
+        virtual void common() override { ... };
+    }; // Still virtual
+
+    class CFoo : public Base<IFoo>
+    {
+        void foo() override { 1+1; };
+    };
+    class CBar : public Base<IBar>
+    {
+        void bar() override { 2+2; };
+    };
+
+NOTE: Interfaces deliberately do not contain virtual destructors.  This is to help ensure that they are never
+destroyed by calling delete directly.
+
+Reference counted objects
+=========================
+Shared<> is an in-house intrusive smart pointer implementation. It is
+close to boost's intrusive_ptr. It has two derived implementations:
+Linked and Owned, which are used to control whether the pointer is
+linked when a shared pointer is created from a real pointer or not,
+respectively. Ex::
+
+    Owned<Foo> myFoo = new Foo; // Take ownership of the pointer
+    Linked<Foo> anotherFoo = myFoo; // Shared ownership
+
+Shared<> is thread-safe and uses an atomic reference count
+held by each object (rather than by the smart pointer itself, like
+boost's shared_ptr).
+
+This means that, to use Shared<>, your class must implement the Link() and Release() methods - most commonly by
+extending the CInterfaceOf<> class, or the CInterface class (and using the IMPLEMENT_IINTERFACE macro in the public
+section of your class declaration).
+
+This interface controls how you Link() and Release() the pointer.
+This is necessary because in some inner parts of HPCC, the use of a
+"really smart" smart pointer would add too many links and releases (on
+temporaries, local variables, members, etc) that could add to a
+significant performance hit.
+
+The CInterface implementation also includes a virtual function beforeDispose() which is called before the object is
+deleted.  This allows resources to be cleanly freed up, with the full class hierarchy (including virtual functions)
+available even when freeing items in base classes.  It is often used for caches that do not cause the objects to be
+retained.
+
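A rough sketch of the intrusive scheme described in this section, assuming nothing about the real jlib classes beyond what the text states. CountedBase, SketchShared, SketchOwned and SketchLinked are hypothetical stand-ins for CInterface, Shared<>, Owned<> and Linked<>; only the Link()/Release()/beforeDispose() names come from the guide.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical base class: the atomic count lives inside the object itself.
class CountedBase
{
    mutable std::atomic<unsigned> links{1};     // the creator holds the first link
public:
    virtual ~CountedBase() = default;
    virtual void beforeDispose() {}             // called just before deletion
    void Link() const { links.fetch_add(1, std::memory_order_relaxed); }
    bool Release() const
    {
        if (links.fetch_sub(1, std::memory_order_acq_rel) == 1)
        {
            const_cast<CountedBase *>(this)->beforeDispose();
            delete this;
            return true;
        }
        return false;
    }
    unsigned linkCount() const { return links.load(); }
};

// Hypothetical shared pointer: gives up its link when destroyed.
template <class T>
class SketchShared
{
protected:
    T * ptr;
    explicit SketchShared(T * p) : ptr(p) {}
public:
    SketchShared(const SketchShared & other) : ptr(other.ptr) { if (ptr) ptr->Link(); }
    SketchShared & operator=(const SketchShared &) = delete;    // kept minimal
    ~SketchShared() { if (ptr) ptr->Release(); }
    T * get() const { return ptr; }
    T * operator->() const { return ptr; }
};

// Takes over the caller's link: no Link() on construction.
template <class T>
struct SketchOwned : SketchShared<T>
{
    SketchOwned(T * p) : SketchShared<T>(p) {}
};

// Shares ownership: adds its own link on construction.
template <class T>
struct SketchLinked : SketchShared<T>
{
    SketchLinked(T * p) : SketchShared<T>(p) { if (p) p->Link(); }
};

struct Foo : CountedBase
{
    bool * disposed;
    explicit Foo(bool * flag) : disposed(flag) {}
    void beforeDispose() override { *disposed = true; }
};

// Returns the link count while both pointers are alive.
inline unsigned demoSharing(bool * disposed)
{
    SketchOwned<Foo> myFoo(new Foo(disposed));      // count stays at 1
    assert(myFoo->linkCount() == 1);
    SketchLinked<Foo> anotherFoo(myFoo.get());      // count becomes 2
    return anotherFoo->linkCount();
}
```

Keeping the count in the object means a raw pointer can be passed around freely and turned back into a shared pointer without a separate control block, which is the property the text relies on when it recommends real pointers for parameters and return values.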
+STL
+===
+MORE: This needs documenting
+
+=================================
+Structure of the HPCC source tree
+=================================
+
+MORE!
+
+Requiring more work:
+
+* namespaces
+* STL
+* C++11
+* Review all documentation
+* Better examples for shared

+ 0 - 1
ecl/eclcc/WORKUNITS.rst

@@ -951,4 +951,3 @@ Full contents of the generated C++ (as a single file)
 
         return new MyEclProcess;
     }
-

+ 0 - 8
ecl/eclcc/README.rst

@@ -1,8 +0,0 @@
-This directory contains the source of the ecl compiler executable (eclcc).
-
-The ECL language is documented in the ecl language reference manual (generated as ECLLanguageReference-<version>.pdf).
-
-Details of the internals of eclcc are found in the `Code Generator Documentation`_.
-
-
-.. _Code Generator Documentation: DOCUMENTATION.rst

+ 0 - 866
sourcedoc.xml

@@ -1,866 +0,0 @@
-<?xml version="1.0" encoding="utf-8"?>
-<!--
-################################################################################
-#    HPCC SYSTEMS software Copyright (C) 2012 HPCC Systems®.
-#
-#    Licensed under the Apache License, Version 2.0 (the "License");
-#    you may not use this file except in compliance with the License.
-#    You may obtain a copy of the License at
-#
-#       http://www.apache.org/licenses/LICENSE-2.0
-#
-#    Unless required by applicable law or agreed to in writing, software
-#    distributed under the License is distributed on an "AS IS" BASIS,
-#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-#    See the License for the specific language governing permissions and
-#    limitations under the License.
-################################################################################
--->
-<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN" "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd">
-<book lang="en_US">
-    <bookinfo>
-        <title>HPCC Source</title>
-
-        <mediaobject>
-            <imageobject>
-                <imagedata fileref="images/redswoo0.jpg" />
-            </imageobject>
-        </mediaobject>
-
-        <author>
-            <surname>Boca Documentation Team</surname>
-        </author>
-
-        <legalnotice>
-            <para>
-                We welcome your comments and feedback about this document via
-                email to <email>docfeedback@lexisnexis.com</email> Please include
-                <emphasis role="bold">Documentation Feedback</emphasis> in the subject
-                line and reference the document name, page numbers, and current Revision
-                Number in the text of the message.
-            </para>
-
-            <para>
-                LexisNexis and the Knowledge Burst logo are registered trademarks
-                of Reed Elsevier Properties Inc., used under license. Other products and
-                services may be trademarks or registered trademarks of their respective
-                companies. All names and example data used in this manual are
-                fictitious. Any similarity to actual persons, living or dead, is purely
-                coincidental.
-            </para>
-
-            <para></para>
-        </legalnotice>
-
-        <releaseinfo>
-               HPCC SYSTEMS software Copyright (C) 2012 HPCC Systems®.
-
-               Licensed under the Apache License, Version 2.0 (the "License");
-               you may not use this file except in compliance with the License.
-               You may obtain a copy of the License at
-
-                  http://www.apache.org/licenses/LICENSE-2.0
-
-               Unless required by applicable law or agreed to in writing, software
-               distributed under the License is distributed on an "AS IS" BASIS,
-               WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-               See the License for the specific language governing permissions and
-               limitations under the License.
-
-        </releaseinfo>
-
-        <date>2011-01-11</date>
-
-        <corpname>LexisNexis</corpname>
-
-        <copyright>
-            <year>2011 LexisNexis Risk Solutions. All rights reserved</year>
-        </copyright>
-
-        <mediaobject role="logo">
-            <imageobject>
-                <imagedata fileref="images/LN_Horz.gif" scale="45" />
-            </imageobject>
-        </mediaobject>
-    </bookinfo>
-
-    <chapter>
-        <title>Overview</title>
-
-        <para>
-            This manual contains a description of the HPCC sources.
-        </para>
-    </chapter>
-
-    <chapter>
-        <title>Getting the sources</title>
-        <para>
-            The HPCC Platform sources are hosted on GitHub at
-            https://github.com/hpcc-systems/HPCC-Platform. You can download a
-            snapshot of any branch using the download button there, or you can set
-            up a git clone of the repository. If you are planning to contribute
-            changes to the system, see the file CONTRIBUTORS at
-            https://github.com/hpcc-systems/HPCC-Platform/blob/master/CONTRIBUTORS
-            for information about how to set up a GitHub fork of the project
-            through which pull-requests can be made.
-        </para>
-    </chapter>
-
-    <chapter>
-        <title>Building the system from sources
-        </title>
-
-        <sect1>
-            <title>Requirements</title>
-            <para>
-                The HPCC platform requires a number of third party tools and libraries in order to build.
-
-                On Ubuntu 12.04, the following commands will install the required libraries
-                <programlisting>
-                  sudo apt-get install cmake bison flex libicu-dev libboost-regex-dev \
-                                       binutils-dev libxerces-c2-dev libxalan110-dev zlib1g-dev \
-                                       libssl-dev libldap2-dev expect libarchive-dev \
-                                       libapr1-dev libaprutil1-dev
-                </programlisting>
-	    </para>
-
-	    <para>
-		For building any documentation, the following are also required
-		<programlisting>
-                  sudo apt-get install docbook
-                  sudo apt-get install xsltproc
-                  sudo apt-get install fop
-		</programlisting>
-	    </para>
-	    <para>
-	<emphasis role="bold">	NOTE:</emphasis> Installing the above via alternative methods (i.e. from source) may place installations outside of searched paths.
-            </para>
-        </sect1>
-   
-        <sect1>
-            <title>Building the system</title>
-            <para>
-                The HPCC system is built using the cross-platform build tool cmake,
-                which is available for Windows, virtually all flavors of Linux, 
-                FreeBSD, and other platforms. You should install cmake version 
-                2.8.3 or later before building the sources.
-                
-                On some distros you will need to build cmake from sources if the version
-                of cmake in the standard repositories for that distro is not modern enough.
-                It is good practice in cmake to separate the build directory where
-                objects and executables are made from the source directory, and the 
-                HPCC cmake scripts will enforce this.
-                
-                To build the sources, create a directory where the built files should 
-                be located, and from that directory, run
-            
-                <programlisting>
-                    cmake &lt;source directory&gt;
-                </programlisting>
-                
-                Depending on your operating system and the compilers installed on it,
-                this will create a makefile, Visual Studio .sln file, or other build
-                script for building the system. If cmake was configured to create a 
-                makefile, then you can build simply by typing
-
-                <programlisting>
-                    make
-                </programlisting>
-
-                If a Visual Studio solution file was created, you can load it simply
-                by typing the name:
-                
-                <programlisting>
-                    lexisnexisrs.sln
-                </programlisting>
-
-                This will load the solution in Visual Studio where you can build in the
-                usual way.
-            </para>
-        </sect1>
-        <sect1>
-            <title>Packaging</title>
-            <para>
-                To make an installation package on a supported linux system, use the
-                command
-            
-                <programlisting>
-                    make package
-                </programlisting>
-                
-                This will first do a make to ensure everything is up to date, then will 
-                create the appropriate package for your operating system. Currently supported
-                package formats are rpm (for RedHat/Centos) and  .deb (for Debian and
-                Ubuntu). If the operating system is not one of the above, or is not recognized,
-                make package will create a tarball.
-            </para>
-            <para>
-                The package installation does not start the service on the machine, so if you
-                want to give it a go or test it (see below), make sure to start the service manually
-                and wait until all services are up (mainly wait for EclWatch to come up on port 8010).
-            </para>
-        </sect1>
-        <sect1>
-            <title>Testing the system</title>
-            <para>
-                After compiling, installing the package and starting the services, you can test
-                the HPCC platform on a single-node setup.
-            </para>
-            <sect2>
-                <title>Unit Tests</title>
-                <para>
-                    Some components have their own unit-tests. Once you have compiled (no need to
-                    start the services), you can already run them. Supposing you build a Debug
-                    version, from the build directory you can run:
-                    <programlisting>./Debug/bin/roxie -selftest</programlisting>
-                    and
-                    <programlisting>./Debug/bin/eclagent -selftest</programlisting>
-                </para>
-                <para>
-                    You can also run the Dali regression self-tests:
-                    <programlisting>./Debug/bin/daregress localhost</programlisting>
-                </para>
-            </sect2>
-            <sect2>
-                <title>Regression Tests</title>
-                <para>
-                    After the initial batch of unit-tests, which are quick and show only the most
-                    basic errors in the system, you can run the more complete regression tests.
-                    These tests are located in the source directory 'testing/ecl' and you'll need
-                    the HPCC platform up and running to execute them.
-                </para>
-                <para>
-                    In order for the regression suite to work, there are perl modules that need to be installed as well.
-                    The most efficient method for their installation is to use cpanm.  This itself can be installed using the command line below
-                    and following the prompted setup instructions. In most cases the suggested defaults are applicable.
-
-                    <programlisting>
-                    sudo cpan App::cpanminus
-                    </programlisting>
-
-                    Then install the following list of perl modules:
-
-                    <programlisting>
-                    sudo cpanm Config::Simple        (Required)
-                    sudo cpanm Cwd                   (Required)
-                    sudo cpanm Exporter              (Required)
-                    sudo cpanm File::Compare         (Required)
-                    sudo cpanm File::Copy            (Required)
-                    sudo cpanm File::Path            (Required)
-                    sudo cpanm File::Spec::Functions (Required)
-                    sudo cpanm Getopt::Long          (Required)
-                    sudo cpanm IPC::Run              (Required)
-                    sudo cpanm Pod::Usage            (Required)
-                    sudo cpanm POSIX                 (Required - However, this is typically
-                                                                         installed by default)
-                    sudo cpanm Text::Diff            (Required by the Diff and DiffFull
-                                                                                 report types)
-                    sudo cpanm HTML::Entities        (Required by the HTML report type)
-                    sudo cpanm Text::Diff::HTML      (Required by the HTML report type)
-                    sudo cpanm Template              (Required by the HTML report type)
-                    sudo cpanm Term::Prompt          (Required if you do not specify a
-                                                           password in the configuration file)
-                    sudo cpanm Sys::Hostname         (Recommended: if available, and it can
-                                                      find the hostname, the hostname will be
-                                                                                       logged)
-                    sudo cpanm Text::Wrap            (Optional: if available, makes output of
-                                                                          -listreports neater)
-                    </programlisting>
-                </para>
-                <para>
-                    Step 1: Configure your regression suites. This need only be done once.
-                    <programlisting>./runregress -ini=environment.xml</programlisting>
-                    The file 'environment.xml' is normally located in your '/etc/HPCCPlatform'
-                    directory and contains information on how your cluster is set-up, so the
-                    regression engine can reach it. You should see a new file, 'regress.ini'.
-                    Edit it to accommodate to your preferred setup.
-                </para>
-		<para>
-                    Note: There is a current issue with Roxie tests, so you should comment out
-                    the 'roxie' from 'setup_clusters'. That will leave you about 650 tests to run.
-                </para>
-                <para>
-                    Note 2: There is another issue with eclplus having to live in the current
-                    testing directory. For now, you have to copy or symlink 'eclplus' into that
-                    directory. You can get it from your build directory.
-                </para>
-                <para>
-                    Step 2: Create test files. You'll need some files created as part of the
-                    tests. This should also only need to be run once, unless you have cleaned
-                    the files for any reason.
-                    <programlisting>./runregress -setup</programlisting>
-                    There is no reason for this to fail, you should get all queries executed
-                    successfully.
-                </para>
-                <para>
-                    Step 3: Run the regression tests. This takes about 5-10 minutes on a machine
-                    with multiple CPUs/cores. There is an optimum value on the number of parallel
-                    queries, not necessarily more is faster. Start with 50 and work your way up
-                    and down to a better number for your machine.
-                    <programlisting>./runregress -pq 50 hthor_suite</programlisting>
-                    If some of the queries get locked, pressing CTRL+C won't help. You need to abort
-                    them from the EclWatch interface, or restart the service.
-                </para>
-                <para>
-                    If, after it finishes, you want to see the report again, just run:
-                    <programlisting>./runregress -n -report Summary hthor_suite</programlisting>
-                </para>
-                <para>
-                    If you want to re-run a single test, just run:
-                    <programlisting>./runregress -n -query anytest.ecl hthor_suite</programlisting>
-                </para>
-                <para>
-                    All test results and their expected files are in the suite's directory (like
-                    hthor_suite), on 'out' and 'key' respectively.
-                </para>
-            </sect2>
-            <sect2>
-                <title>Compiler Tests</title>
-                <para>
-                    The ECLCC compiler tests rely on two distinct runs: a known good one and your
-                    test build. For normal development, you can safely assume that the OSS/master
-                    branch in github is good. For overnight testing, golden directories need to
-                    be maintained according to the test infrastructure. There are Bash (Linux)
-                    and Batch (Windows) scripts to run the regressions:
-                </para>
-                <para>
-                    The basic idea behind these tests is to compare the output files (logs and
-                    XML files) between runs. The log files should change slightly (the comparison
-                    should be good enough to filter most irrelevant differences), but the XML
-                    files should be identical if nothing has changed. You should only see
-                    differences in the XML where you have changed in the code, or new tests
-                    were added as part of your development.
-                </para>
-                <para>
-                    On Linux, there are two steps:
-                </para>
-                <para>
-                    Step 1: Check-out OSS/master, compile and run the regressions to populate
-                    the 'golden' directory:
-
-                    <programlisting>
-                        ./regress.sh -t golden -e buildDir/Debug/bin/eclcc
-                    </programlisting>
-
-                    This will run the regressions in parallel, using as many CPUs as you have,
-                    and using your just-compiled ECLCC, assuming you compiled for Debug version.
-                </para>
-                <para>
-                    Step 2: Make your changes (or check-out your branch), compile and run again,
-                    this time output to a new directory and compare to the 'golden' repo.
-
-                    <programlisting>
-                        ./regress.sh -t my_branch -c golden -e buildDir/Debug/bin/eclcc
-                    </programlisting>
-
-                    This will run the regressions in the same way, output to 'my_branch' dir
-                    and compare it to the golden version, highlighting the differences.
-
-                    NOTE: If you changed the headers that the compiled binaries will use, you
-                    must re-install the package (or provide -i option to the script to the new
-                    headers).
-                </para>
-                <para>
-                    Step 3: Step 2 only listed the differences, now you need to see what they are.
-                    For that, re-run the regression script omitting the compiler, since the only
-                    thing we'll do is to compare verbosely.
-
-                    <programlisting>
-                        ./regress.sh -t my_branch -c golden
-                    </programlisting>
-
-                    This will show you all differences, using the same ignore filters as before,
-                    between your two branches. Once you're happy with the differences, commit and
-                    issue a pull-request.
-                </para>
-                <para>
-                    TODO: Describe compiler tests on Windows.
-                </para>
-            </sect2>
-        </sect1>
-        <sect1>
-            <title>Debugging the system</title>
-            <para>
-                On linux systems, the makefile generated by cmake will build a specific
-                version (debug or release) of the system depending on the options selected 
-                when cmake is first run in that directory. The default is to build a release
-                system. In order to build a debug system instead, use
-                command
-            
-                <programlisting>
-                    cmake -DCMAKE_BUILD_TYPE=Debug &lt;source directory&gt;
-                </programlisting>
-                
-                You can then run make or make package in the usual way to build the system.
-            </para>
-            <para>
-                On a Windows system, cmake always generates a solution file with both debug and 
-                release target platforms in it, so you can select which one to build within
-                Visual Studio.
-            </para>
-        </sect1>
-    </chapter>
-
-    <chapter>
-        <title>Coding conventions</title>
-        <sect1>
-            <title>Why coding conventions</title>
-            <para>
-                Everyone has their own ideas of what the best code formatting style is, but most
-                would agree that code in a mixture of styles is the worst of all worlds. A
-                consistent coding style makes unfamiliar code easier to understand and navigate.
-                
-                In an ideal world, the HPCC sources would adhere to the coding standards described
-                perfectly. In reality, there are many places that do not. These are being cleaned up 
-                as and when we find time.
-            </para>
-        </sect1>
-        <sect1>
-            <title>C++ coding conventions</title>
-            <para>
-                Unlike most software projects around, HPCC has some very specific
-                constraints that make most basic design decisions difficult, and often
-                the results are odd to developers getting acquainted with its code base.
-                For example, when HPCC was initially developed, most common-place
-                libraries we have today (like STL and Boost) weren't available or stable
-                enough at the time.
-            </para>
-            <para>
-                Also, at the beginning, both C++ and Java were being considered as
-                the language of choice, but development started with C++. So a C++
-                library that copied most behaviour of the Java standard library (At the
-                time, Java 1.4) was created (see jlib below) to make the transition, if
-                ever taken, easier. The transition never happened, but the decisions
-                were taken and the whole platform is designed on those terms.
-            </para>
-            <para>
-                Most importantly, the performance constraints in HPCC can make
-                no-brainer decisions look impossible in HPCC. One example is the use of
-                traditional smart pointers implementations (such as boost::shared_ptr or
-                C++'s auto_ptr), that can lead to up to 20% performance hit if used
-                instead of our internal shared pointer implementation.
-            </para>
-            <para>
-                The last important point to consider is that some
-                libraries/systems were designed to replace older ones but haven't got
-                replaced yet. There is a slow movement to deprecate old systems in
-                favour of consolidating a few ones as the elected official ways to use
-                HPCC (Thor, Roxie) but old systems still could be used for years in
-                tests or legacy sub-systems.
-            </para>
-            <para>
-                In a nutshell, expect re-implementation of well-known containers
-                and algorithms, expect duplicated functionality of sub-systems and
-                expect to be required to use less-friendly libraries for the sake of
-                performance, stability and longevity.
-            </para>
-            <para>
-                For the most part our coding style conventions match those
-                described at http://geosoft.no/development/cppstyle.html, with a few
-                exceptions or extensions as noted below.
-            </para>
-            <sect2>
-                <title>Source files</title>
-                <para>
-                    We use the extension .cpp for C++ source files, and .h or .hpp for header files.
-                    Header files with the .hpp extension should be used for headers that are internal
-                    to a single library, while header files with the .h extension should be used for 
-                    the interface that the library exposes. There will typically be one .h file per 
-                    library, and one .hpp file per cpp file.
-                    
-                    Source file names within a single shared library should share a common prefix to aid 
-                    in identifying where they belong.
-                    
-                    Header files with extension .ipp (i for internal) and .tpp (t for template) will 
-                    be phased out in favour of the scheme described above.
-                </para>
-            </sect2>
-            <sect2>
-                <title>Java-style</title>
-                <para>
-                    We adopted a Java-like inheritance model, with macro
-                    substitution for the basic Java keywords. This changes nothing in the
-                    code, but makes it clearer to the reader what the recipient of
-                    the inheritance is doing with its base.
-                </para>
-                <para>
-                    <itemizedlist>
-                        <listitem>
-                            <para>
-                                interface (struct): declares an interface (pure virtual class)
-                            </para>
-                        </listitem>
-
-                        <listitem>
-                            <para>
-                                extends (public): One interface extending another, both are pure virtual
-                            </para>
-                        </listitem>
-
-                        <listitem>
-                            <para>
-                                implements (public): Concrete class implementing an interface
-                            </para>
-                        </listitem>
-                    </itemizedlist>
-                </para>
-                <para>
-                    There is no semantic check, which makes it difficult to enforce
-                    such scheme, which has led to code not using it intermixed with code
-                    using it. You should use it when possible, most importantly on code
-                    that already uses it.
-                </para>
-                <para>
-                    We also tend to write methods inline, which matches well with
-                    C++ Templates requirements. We, however, do not enforce the
-                    one-class-per-file rule.
-                </para>
-                <para>
-                    See chapter 3.2 for more information on our implementation of
-                    interfaces.
-                </para>
-            </sect2>
-            <sect2>
-                <title>Identifiers</title>
-                <para>
-                    Class and interface names are in CamelCase with a leading
-                    capital letter. Interface names should be prefixed capital I followed
-                    by another capital. Class names may be prefixed with a C if there is a
-                    corresponding I-prefixed interface name, but need not be
-                    otherwise.
-                </para>
-                <para>
-                    Variables, function and method names, and parameters use
-                    camelCase starting with a lower case letter. Parameters may be
-                    prefixed with underscore, normally when overwritten by local
-                    variables.
-                </para>
-                <para>Example:</para>
-                <para>
-                  <programlisting>    class MySQLSuperClass {
-        void mySQLFunctionIsCool(int _haslocalcopy, bool enablewrite) {
-        bool haslocalcopy = false;
-            if (enablewrite)
-                haslocalcopy = _haslocalcopy;
-        }
-    };
-                  </programlisting>
-                </para>
-            </sect2>
-            <sect2>
-                <title>Pointers</title>
-                <para>
-                    Use real pointers when you can, and smart pointers when you have
-                    to. Take extra care on understanding the needs of your pointers and
-                    their scope. Most programs can afford a few dangling pointers, but a
-                    high-performance clustering platform cannot.
-                </para>
-                <para>
-                    Most importantly, use common sense and a lot of thought. Here
-                    are a few guidelines:
-                </para>
-                <para>
-                    <itemizedlist>
-                        <listitem>
-                            <para>
-                                Use real pointers for return values, parameter passing
-                            </para>
-                        </listitem>
-                        <listitem>
-                          <para>
-                              For local variables use real pointers if their lifetime is
-                              guaranteed to be longer than the function (and no exception
-                              is thrown from functions you call), shared pointers otherwise.
-                          </para>
-                        </listitem>
-                        <listitem>
-                            <para>
-                                Use Shared pointers for member variables - unless there is
-                                a strong guarantee the object has a longer lifetime.
-                            </para>
-                        </listitem>
-                        <listitem>
-                            <para>
-                                Create Shared&lt;&gt; with either:
-                            </para>
-                            <itemizedlist>
-                                <listitem>
-                                    <para>
-                                        Owned&lt;&gt;: if your new pointer will own the
-                                        pointer alone (transfer)
-                                    </para>
-                                </listitem>
-                                <listitem>
-                                    <para>
-                                        Linked&lt;&gt;: if you still want to share the
-                                        ownership (shared)
-                                    </para>
-                                </listitem>
-                            </itemizedlist>
-                        </listitem>
-                        <listitem>
-                            <para>
-                                Consider whether your code is critical and use
-                                link/release when necessary
-                            </para>
-                        </listitem>
-                    </itemizedlist>
-                </para>
-                <para>
-                    Warning: Direct manipulation of the ownership might
-                    cause Shared&lt;&gt; pointers to lose their pointers, so subsequent
-                    calls through them (like o2-&gt;doIt() after o3 gets ownership) *will* cause
-                    segmentation faults.
-                  </para>
-                <para>
-                    Refer to chapter 5.3 for more information on our smart pointer
-                    implementation, Shared&lt;&gt;.
-                </para>
-                <para>
-                    Methods that return Shared&lt;&gt; pointers, or that use them,
-                    should have a common naming standard.
-                </para>
-                <para>
-                    <itemizedlist>
-                        <listitem>
-                            <para>
-                                Foo * queryFoo(): does not return a linked pointer since
-                                lifetime is guaranteed for a set period. Caller should link if it
-                                needs to retain it for longer.
-                            </para>
-                        </listitem>
-                    </itemizedlist>
-                    <itemizedlist>
-                        <listitem>
-                            <para>
-                                Foo * getFoo(): the returned value is linked, so it should be
-                                assigned to an owned pointer or returned directly.
-                            </para>
-                        </listitem>
-                    </itemizedlist>
-                    <itemizedlist>
-                        <listitem>
-                            <para>
-                                void setFoo(Foo * x): generally parameters to functions are
-                                assumed to not be linked, the callee needs to link them if they
-                                are retained.
-                            </para>
-                        </listitem>
-                    </itemizedlist>
-                    <itemizedlist>
-                        <listitem>
-                            <para>
-                                void setownFoo(Foo * ownedX): Some functions do take
-                                pointers that are linked - where you are implicitly transferring
-                                ownership.
-                            </para>
-                        </listitem>
-                    </itemizedlist>
-                </para>
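-                <para>
-                    The naming convention above can be sketched in a small, self-contained
-                    example. This is only an illustration under stated assumptions: RefCounted
-                    and Holder are hypothetical stand-ins written for this sketch (they are not
-                    the HPCC classes), and only the Link()/Release() names and the four method
-                    names are taken from this document.
-                </para>

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an intrusively reference-counted base class.
// It is NOT the HPCC implementation; it only mimics Link()/Release().
class RefCounted
{
public:
    virtual ~RefCounted() = default;
    void Link() { ++count; }
    void Release()
    {
        if (--count == 0)
            delete this;
    }
    int links() const { return count; }
private:
    std::atomic<int> count{1};   // the creator holds the first link
};

class Foo : public RefCounted
{
};

// Hypothetical class that retains a Foo, following the naming rules.
class Holder
{
public:
    ~Holder()
    {
        if (foo)
            foo->Release();
    }

    // queryFoo(): not linked; valid only while the Holder keeps its Foo.
    Foo * queryFoo() const { return foo; }

    // getFoo(): linked; the caller must Release() it or pass it to an owner.
    Foo * getFoo() const
    {
        if (foo)
            foo->Link();
        return foo;
    }

    // setFoo(): the parameter is not linked, so the callee links what it retains.
    void setFoo(Foo * x)
    {
        if (x)
            x->Link();          // link first so self-assignment is safe
        if (foo)
            foo->Release();
        foo = x;
    }

    // setownFoo(): the caller's link is transferred to the Holder.
    void setownFoo(Foo * ownedX)
    {
        if (foo)
            foo->Release();
        foo = ownedX;
    }
private:
    Foo * foo = nullptr;
};
```

-                <para>
-                    In this sketch a caller that uses setFoo() still holds its own link and
-                    must release it, whereas a caller that uses setownFoo() must not touch
-                    the pointer's link again after the call.
-                </para>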
-            </sect2>
-            <sect2>
-                <title>Indentation</title>
-                <para>
-                    We use 4 spaces to indent each level. TAB characters should not be used. There is
-                    some discussion about possibly changing to a 2-space indentation convention at some
-                    point in the future.
-                </para>
-                <para>
-                    The { that starts a new scope and the corresponding } to close it are placed on a
-                    new line by themselves, and are not indented. This is sometimes known as the Allman
-                    or ANSI style.
-                </para>
-            </sect2>
-            <sect2>
-                <title>Comments</title>
-                <para>
-                    We generally believe in the philosophy that well written code is self-documenting.
-                    Javadoc-formatted comments for classes and interfaces are being added.
-                </para>
-            </sect2>
-            <sect2>
-                <title>Namespaces</title>
-                <para>
-                    We do not use namespaces. We probably should, following the Google style guide&apos;s
-                    guidelines - see http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Namespaces
-                </para>
-            </sect2>
-            <sect2>
-                <title>Other</title>
-                <para>
-                    We often pretend we are coding in Java and write all our class members inline. 
-                </para>
-            </sect2>
-        </sect1>
-        <sect1>
-            <title>Other coding conventions</title>
-            <sect2>
-                <title>ECL code</title>
-                <para>
-                    The ECL style guide is published separately.
-                </para>
-            </sect2>
-            <sect2>
-                <title>JavaScript, XML, XSL etc.</title>
-                <para>
-                    We use the commonly accepted conventions for formatting these files.
-                </para>
-            </sect2>
-        </sect1>
-    </chapter>
-
-    <chapter>
-        <title>Design Patterns</title>
-        <sect1>
-            <title>Why Design Patterns?</title>
-            <para>
-                Consistent use of design patterns helps make the code easy to understand.
-            </para>
-        </sect1>
-        <sect1>
-            <title>Interfaces</title>
-            <para>
-                While C++ does not have explicit support for interfaces (in the Java sense), an
-                abstract class with no data members and all functions pure virtual can be used
-                in the same way.
-            </para>
-            <para>
-                Interfaces are pure virtual classes. They are similar in concept to
-                Java's interfaces and should be used on public APIs. If you need common
-                code, use policies (see below).
-            </para>
-            <para>
-                An interface's name must start with an 'I' and the base class for
-                its concrete implementations should start with a 'C' and have the same
-                name, for example:
-            </para>
-            <programlisting>    class CFoo : implements IFoo { };</programlisting>
-            <para>
-                When an interface has multiple implementations, stay as
-                close as possible to this rule. For example:
-            </para>
-            <programlisting>    class CFooCool : implements IFoo { };
-    class CFooWarm : implements IFoo { };
-    class CFooALot : implements IFoo { };
-            </programlisting>
-            <para>
-                Or, for partial implementation, use something like this:
-            </para>
-            <programlisting>    class CFoo : implements IFoo { };
-    class CFooCool : public CFoo { };
-    class CFooWarm : public CFoo { };
-            </programlisting>
-            <para>
-                Extend current interfaces only on an 'is-a' basis, not to
-                aggregate functionality. Avoid pollution of public interfaces by having
-                only the public methods on the most-base interface in the header, and
-                internal implementation in the source file. Prefer pImpl idiom
-                (pointer-to-implementation) for functionality-only requirements and
-                policy based design for interface requirements.
-            </para>
-            <para>
-                Example 1: You want to decouple part of the implementation from
-                your class, and this part does not implement the interface your
-                contract requires.
-            </para>
-            <programlisting>    interface IFoo {
-        virtual void foo()=0;
-    };
-    class CFoo : implements IFoo {
-        MyImpl *pImpl;
-    public:
-        void foo() { pImpl-&gt;doSomething(); }
-    };
-            </programlisting>
-            <para>
-                Example 2: You want to implement the common part of one (or more)
-                interface(s) in a range of sub-classes.
-            </para>
-            <programlisting>    interface ICommon {
-        virtual void common()=0;
-    };
-    interface IFoo : extends ICommon {
-        virtual void foo()=0;
-    };
-    interface IBar : extends ICommon {
-        virtual void bar()=0;
-    };
-
-    template &lt;class IFACE&gt;
-    class Base : implements IFACE {
-    public:
-        void common() { ... };
-    }; // Still abstract: foo() or bar() is not implemented yet
-
-    class CFoo : public Base&lt;IFoo&gt; {
-    public:
-        void foo() { 1+1; };
-    };
-    class CBar : public Base&lt;IBar&gt; {
-    public:
-        void bar() { 2+2; };
-    };
-            </programlisting>
-        </sect1>
-        <sect1>
-            <title>Reference counted objects</title>
-            <para>
-                Shared&lt;&gt; is an in-house smart pointer implementation. It's
-                close to boost's intrusive_ptr. It has two derived implementations:
-                Linked and Owned, which are used to control whether the pointer is
-                linked when a shared pointer is created from a real pointer or not,
-                respectively. For example:
-            </para>
-            <programlisting>    Owned&lt;Foo&gt; foo = new Foo; // Owns the pointer
-    Linked&lt;Foo&gt; sharedFoo = myFooParameter; // Shared ownership
-            </programlisting>
-            <para>
-                Shared&lt;&gt; is thread-safe and uses an atomic reference count
-                held by each object (rather than by the smart pointer itself, like
-                boost's shared_ptr).
-            </para>
-            <para>
-                This means that, to use Shared&lt;&gt;, your class must implement
-                the IInterface interface, most commonly by extending the CInterface
-                class (and using the IMPLEMENT_IINTERFACE macro in the public section of
-                your class declaration).
-            </para>
-            <para>
-                This interface controls how you Link() and Release() the pointer.
-                This is necessary because in some inner parts of HPCC, the use of a
-                "really smart" smart pointer would add too many links and releases (on
-                temporaries, local variables, members, etc.) that could add up to a
-                significant performance hit.
-            </para>
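-            <para>
-                The difference between taking over a link and adding one can be sketched
-                as follows. This is a minimal illustration, not the Shared&lt;&gt;
-                implementation: Counted, OwnedPtr and LinkedPtr are hypothetical names
-                invented for this sketch, and only the Link()/Release() names and the
-                owned-versus-linked behaviour come from the text above.
-            </para>

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object that carries its own atomic reference count.
class Counted
{
public:
    virtual ~Counted() = default;
    void Link() { ++count; }
    void Release()
    {
        if (--count == 0)
            delete this;
    }
    int links() const { return count; }
private:
    std::atomic<int> count{1};   // the creator holds the first link
};

// Hypothetical "owned" wrapper: takes over the caller's link, adds none.
template <class T>
class OwnedPtr
{
public:
    OwnedPtr(T * p) : ptr(p) {}
    ~OwnedPtr()
    {
        if (ptr)
            ptr->Release();
    }
    OwnedPtr(const OwnedPtr &) = delete;
    OwnedPtr & operator=(const OwnedPtr &) = delete;
    T * operator->() const { return ptr; }
private:
    T * ptr;
};

// Hypothetical "linked" wrapper: adds its own link, so ownership is shared.
template <class T>
class LinkedPtr
{
public:
    LinkedPtr(T * p) : ptr(p)
    {
        if (ptr)
            ptr->Link();
    }
    ~LinkedPtr()
    {
        if (ptr)
            ptr->Release();
    }
    LinkedPtr(const LinkedPtr &) = delete;
    LinkedPtr & operator=(const LinkedPtr &) = delete;
    T * operator->() const { return ptr; }
private:
    T * ptr;
};
```

-            <para>
-                Because the count lives in the object, both wrappers in this sketch can be
-                built from the same raw pointer without losing track of the total number
-                of links.
-            </para>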
-        </sect1>
-        <sect1><title>STL</title><para/></sect1>
-    </chapter>
-
-    <chapter>
-        <title>Structure of the HPCC source tree</title>
-        <section>
-            <title>Introduction</title>
-            <para/>
-        </section>
-
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="cmake_modules/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="common/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="dali/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="deployment/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ecl/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ecllibrary/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="esp/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="initfiles/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="plugins/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="roxie/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="rtl/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="services/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="system/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="testing/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="thorlcr/sourcedoc.xml" />
-        <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="tools/sourcedoc.xml" />
-
-    </chapter>
-</book>
-
-
-
-
-