README.md

Description / Rationale

HPCC Systems offers an enterprise-ready, open source supercomputing platform to solve big data problems. Compared to Hadoop, the platform analyzes big data using less code and fewer nodes for greater efficiency, and it offers a single programming language, a single platform and a single architecture for efficient processing. HPCC Systems is a technology division of LexisNexis Risk Solutions.

Getting Started

Architecture

The HPCC Systems architecture incorporates the Thor and Roxie clusters as well as common middleware components, an external communications layer, client interfaces that provide both end-user services and system management tools, and auxiliary components that support monitoring and facilitate loading and storing filesystem data from external sources. An HPCC environment can include only Thor clusters, or both Thor and Roxie clusters. Each of these cluster types is described in more detail in the sections that follow the architecture diagram.

Thor

Thor (the Data Refinery Cluster) is responsible for consuming vast amounts of data and then transforming, linking and indexing that data. It functions as a distributed file system with parallel processing power spread across the nodes. A cluster can scale from a single node to thousands of nodes.

  • Single-threaded
  • Distributed parallel processing
  • Distributed file system
  • Powerful parallel processing programming language (ECL)
  • Optimized for Extraction, Transformation, Loading, Sorting, Indexing and Linking
  • Scales from one node to thousands of nodes
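
As a rough illustration of the distributed sort pattern described above, the following Python sketch spreads records across simulated nodes, sorts each partition locally, and merges the results. It is a conceptual sketch only, not how Thor is implemented; the function names (`distribute`, `local_sort`, `global_merge`) and the sample records are hypothetical.

```python
import heapq
import zlib


def distribute(records, num_nodes, key):
    """Assign each record to a simulated node by hashing its key (hypothetical helper)."""
    parts = [[] for _ in range(num_nodes)]
    for rec in records:
        # crc32 gives a stable hash, so the partitioning is repeatable across runs.
        node = zlib.crc32(str(key(rec)).encode("utf-8")) % num_nodes
        parts[node].append(rec)
    return parts


def local_sort(parts, key):
    """Sort each node's partition independently; on a real cluster this step runs in parallel."""
    return [sorted(part, key=key) for part in parts]


def global_merge(sorted_parts, key):
    """Merge the already-sorted partitions into one globally ordered list."""
    return list(heapq.merge(*sorted_parts, key=key))


def by_name(rec):
    return rec[0]


# Made-up sample data for illustration.
records = [("carol", 45), ("alice", 34), ("dave", 19), ("bob", 27)]
parts = distribute(records, num_nodes=3, key=by_name)
result = global_merge(local_sort(parts, by_name), by_name)
```

Here `result` holds the records ordered by name regardless of how many simulated nodes are used, which is the property that lets this style of processing scale out by adding nodes.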

Roxie

Roxie (the Query Cluster) provides separate high-performance online query processing and data warehouse capabilities. Roxie (Rapid Online XML Inquiry Engine) is the data delivery engine used in HPCC to serve data quickly and can support many thousands of requests per node per second.

  • Multi-threaded
  • Distributed parallel processing
  • Distributed file system
  • Powerful parallel processing programming language (ECL)
  • Optimized for concurrent query processing
  • Scales from one node to thousands of nodes
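
To make the idea of concurrent query processing against indexed data more concrete, here is a small Python sketch that builds an in-memory index once and then answers several lookups from a thread pool. It only illustrates the general pattern and says nothing about Roxie's internals; `build_index`, `run_query` and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def build_index(records, key):
    """Group records by key so each lookup is a single dictionary access (hypothetical helper)."""
    index = {}
    for rec in records:
        index.setdefault(key(rec), []).append(rec)
    return index


def run_query(index, wanted):
    """Answer one query; the index is only read, so concurrent calls do not conflict."""
    return index.get(wanted, [])


# Made-up sample data for illustration.
records = [("alice", 34), ("bob", 27), ("carol", 45)]
index = build_index(records, key=lambda rec: rec[0])

queries = ["alice", "zoe", "bob"]
with ThreadPoolExecutor(max_workers=4) as pool:
    # pool.map returns results in the same order as the submitted queries.
    answers = list(pool.map(lambda q: run_query(index, q), queries))
```

The index is built once, ahead of time, and the query path is read-only. That separation is what allows many requests to be served at the same time.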

ECL

ECL (Enterprise Control Language) is a powerful programming language ideally suited to the manipulation of Big Data.

  • Transparent and implicitly parallel programming language
  • Non-procedural and dataflow oriented
  • Modular, reusable, extensible syntax
  • Combines data representation and algorithm implementation
  • Easily extended using C++ libraries
  • ECL is compiled into optimized C++

ECL IDE

ECL IDE is a modern IDE used to code, debug and monitor ECL programs.

  • Access to shared source code repositories
  • Complete development, debugging and testing environment for ECL dataflow programs
  • Access to the ECLWatch tool is built-in, allowing developers to watch job graphs as they are executing
  • Access to current and historical job workunits

ESP

ESP (Enterprise Services Platform) provides an easy-to-use interface to access ECL queries using XML, HTTP, SOAP and REST.

  • Standards-based interface to access ECL functions