This is the ECL plugin to utilize the persistent key-value cache Redis. It utilises the C API hiredis.
To build the redis plugin with the HPCC-Platform, libhiredis-dev is required.
sudo apt-get install libhiredis-dev
The redis server and client software can be obtained as binaries, from source, or via the preferred method:
sudo apt-get install redis-server
Note: redis-server 2.6.12 or greater is required to use this plugin as intended. For efficiency, this version requirement is not checked, as doing so would only be possible at runtime. The use of a lesser version will result in an exception, normally indicating that either a given command does not exist or that the wrong number of arguments was passed to it. The `Set` plugin functions will not work when setting with an expiration for a version less than 2.6.12. In addition, whilst it is possible to use `Expire` with a version less than 2.1.3, it is not advised due to the change in its semantics.
Note: The minimum version requirement for the API hiredis to allow for redis connections to be cached is 0.13.0.
The server can be started by typing `redis-server` within a terminal. To run with a non-default configuration, run `redis-server redis.conf`, where `redis.conf` is the configuration file supplied with the redis-server package.
For example, to require the server to password authenticate, locate and copy `redis.conf` to a desired directory, then locate and alter the 'requirepass' variable within the file. The server port can be altered here in the same way. Note: the default port is 6379, and if multiple individual caches are required then they are by definition redis-servers on different ports.
The redis-server package comes with the redis client redis-cli. This can be used to send and receive commands to and from the server, invoked by `redis-cli` or, for example, `redis-cli -p 6380` to connect to the redis cache on port 6380 (assuming one has been started).
Perhaps one of the handiest uses of redis-cli is the ability to monitor all commands issued to the server via the redis command `MONITOR`. `INFO ALL` is also a useful command for listing the server and cache settings and statistics. Note: if requirepass is activated, redis-cli will require you to authenticate via `AUTH <passcode>`.
Further documentation is available with a full list of redis commands.
The bulk of this redis plugin for ECL is made up of the various `SET` and `GET` commands, e.g. `GetString` or `SetReal`. They are accessible via the module `redis` from the redis plugin ECL library `lib_redis`, i.e.
IMPORT redis FROM lib_redis;
Here is a list of the core plugin functions.
###Set
SetUnicode( CONST VARSTRING key, CONST UNICODE value, CONST VARSTRING options, INTEGER4 database = 0, UNSIGNED4 expire = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
SetString( CONST VARSTRING key, CONST STRING value, CONST VARSTRING options, INTEGER4 database = 0, UNSIGNED4 expire = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
SetUtf8( CONST VARSTRING key, CONST UTF8 value, CONST VARSTRING options, INTEGER4 database = 0, UNSIGNED4 expire = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
SetBoolean( CONST VARSTRING key, BOOLEAN value, CONST VARSTRING options, INTEGER4 database = 0, UNSIGNED4 expire = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
SetReal( CONST VARSTRING key, REAL value, CONST VARSTRING options, INTEGER4 database = 0, UNSIGNED4 expire = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
SetInteger( CONST VARSTRING key, INTEGER value, CONST VARSTRING options, INTEGER4 database = 0, UNSIGNED4 expire = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
SetUnsigned(CONST VARSTRING key, UNSIGNED value, CONST VARSTRING options, INTEGER4 database = 0, UNSIGNED4 expire = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
SetData( CONST VARSTRING key, CONST DATA value, CONST VARSTRING options, INTEGER4 database = 0, UNSIGNED4 expire = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
###Get
INTEGER8 GetInteger(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
UNSIGNED8 GetUnsigned(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
STRING GetString(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
UNICODE GetUnicode(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
UTF8 GetUtf8(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
BOOLEAN GetBoolean(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
REAL GetReal(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
DATA GetData(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
###Numeric
INTEGER8 INCRBY(CONST VARSTRING key, INTEGER8 value = 1, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
###Utility
BOOLEAN Exists(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000)
FlushDB(CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
Delete(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
Persist(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
Expire(CONST VARSTRING key, CONST VARSTRING options, INTEGER4 database = 0, UNSIGNED4 expire, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
INTEGER DBSize(CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN cacheConnections = TRUE)
###PUB-SUB
UNSIGNED Publish(CONST VARSTRING keyOrChannel, CONST STRING message, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN lockedKey = FALSE, BOOLEAN cacheConnections = TRUE)
STRING Subscribe(CONST VARSTRING keyOrChannel, CONST VARSTRING options, INTEGER4 database = 0, CONST VARSTRING password = '', UNSIGNED4 timeout = 1000, BOOLEAN lockedKey = FALSE, BOOLEAN cacheConnections = TRUE)
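As a minimal sketch of how the signatures above might be called from ECL, the following stores and reads back two values through the `redis` module. It assumes a redis-server listening locally on the default port with no password; the key names, values, and the `serverOptions` attribute are made up for illustration.

```ecl
IMPORT redis FROM lib_redis;

// Assumption: a local, password-free server on the default port.
STRING serverOptions := '--SERVER=127.0.0.1:6379';

SEQUENTIAL(
    // key, value, options; database falls back to its default of 0.
    redis.SetString('greeting', 'hello', serverOptions),
    OUTPUT(redis.GetString('greeting', serverOptions)),
    OUTPUT(redis.Exists('greeting', serverOptions)),

    // The 4th Set argument (3rd for Get) selects the database, here 1.
    redis.SetInteger('counter', 42, serverOptions, 1),
    OUTPUT(redis.GetInteger('counter', serverOptions, 1)),

    // Remove the illustrative key from database 0 again.
    redis.Delete('greeting', serverOptions)
);
```

All remaining parameters (password, timeout, cacheConnections) are left at the defaults shown in the signatures.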
The core points to note here are:

* `CONST VARSTRING options` passes the server IP and port to the plugin in the strict format `--SERVER=<ip>:<port>`. If `options` is empty, the default 127.0.0.1:6379 is used. Note: 6379 is the default port for redis-server.
* `UNSIGNED4 timeout` has units ms and has a default value of 1 second (0 := infinity). c.f. 'Timeout Values' below for advice on choosing appropriate values.
* `UNSIGNED4 expire` has units ms and a default of 0, i.e. forever.
* `Publish` and `Subscribe` have a flag `BOOLEAN lockedKey = FALSE` which, when TRUE, will encode `CONST VARSTRING keyOrChannel` as if it were a key, allowing key-channel encoding compatibility with the `GetOrLockString` and `SetAndPublishString` functions. For this reason, they both also take a database value, as this is used in the encoding of the lock and channel. Please note, however, that the redis pub-sub paradigm is actually irrespective of database.

###Connection Caching
To prevent unnecessary opening and closing of connections between subsequent function calls, these connections are cached and reused. This is only true if the connection is free of errors; otherwise it is closed and a new one opened. There are three cached instances per thread, storing a single connection for subscriptions, one for publishes, and one for everything else. The caching of connections is only possible for versions of hiredis no lower than the minimum version noted in the section Installation and Dependencies.
The caching can be turned ON and OFF on a per function basis using the `cacheConnections` boolean passed as a function parameter.
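A short sketch of the per-call switch, assuming a local server on the default port and an existing key `myKey` (both illustrative). The positional arguments follow the `GetString` signature listed earlier: key, options, database, password, timeout, cacheConnections.

```ecl
IMPORT redis FROM lib_redis;

// Assumption: a local, password-free server on the default port.
STRING serverOptions := '--SERVER=127.0.0.1:6379';

// cacheConnections is left at its default of TRUE.
cachedRead := redis.GetString('myKey', serverOptions);

// Same call with the defaults spelled out and cacheConnections set to FALSE,
// asking the plugin not to reuse a cached connection for this call.
uncachedRead := redis.GetString('myKey', serverOptions, 0, '', 1000, FALSE);

SEQUENTIAL(
    OUTPUT(cachedRead),
    OUTPUT(uncachedRead)
);
```

Whether the flag is honoured also depends on the environment-level caching setting, which can force caching ON or OFF regardless of the parameter.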
In addition, the system environment setting `HPCC_REDIS_PLUGIN_CONNECTION_CACHING_LEVEL` can be set to:

* `0` - force any & all caching OFF.
* `1` - allow caching of connections (default).
* `2` - force any & all caching ON.
* `< 0` and `> 2` - undefined.

This environment variable must be set for the user group hpcc and can be done by editing /etc/profile (service restart required).
Regarding the caching of connections used purely for subscriptions, it is essential that they are successfully unsubscribed before being reused. The attempt to unsubscribe and confirm this is limited in effort; if it does not succeed, the plugin gives up, closes the connection, and opens a new one. To aid in tuning this for both payload and system/network constraints, the following three system environment settings exist:

* `HPCC_REDIS_PLUGIN_CACHE_SUB_CONNECTIONS` - turn the caching of subscription connections ON or OFF. Default = ON.
* `HPCC_REDIS_PLUGIN_UNSUBSCRIBE_READ_ATTEMPTS` - the maximum number of socket reads to attempt to receive the required unsubscribe confirmation before giving up. Default = 2.
* `HPCC_REDIS_PLUGIN_UNSUBSCRIBE_TIMEOUT` - the timeout value (ms) used for such socket reads. Default = 100.

Note: For further implementation details refer to the comment associated with the definition of `SubConnection::unsubscribe(...)` in HPCC-Platform/plugins/redis/redis.cpp.
###The redisServer MODULE
To avoid the cumbersome and unnecessary need to constantly pass `options`, `password`, `timeout`, and `cacheConnections` with each function call, the module `redisServer` can be imported to effectively wrap the above functions.
IMPORT redisServer FROM lib_redis;
myRedis := redisServer('--SERVER=127.0.0.1:6379', 'foobared', 2000, FALSE);
myRedis.SetString('myKey', 'supercalifragilisticexpialidocious');
myRedis.GetString('myKey');
###A Redis 'Database'
The notion of a database within a redis cache is that of a partition, such that the same key may exist independently in each database, e.g.
myRedis.SetString('myKey', 'foo', 0);
myRedis.SetString('myKey', 'bar', 1);
myRedis.GetString('myKey', 0);//returns 'foo'
myRedis.GetString('myKey', 1);//returns 'bar'
Note: the default database is 0. The maximum number of databases allowed by Redis is 2147483647 (int32).
A common use of external caching systems such as redis is for temporarily storing data that may be expensive, computationally or otherwise, to obtain, so that obtaining it only once is paramount. In such a scenario it is possible (and in some cases usual) for multiple clients/requests to hit the cache simultaneously. Upon finding that the requested data has not yet been stored, it is desirable that only one of these requests obtains the new value and stores it, so that the others can then obtain it from the cache.
This plugin offers a solution to such a problem via the `GetOrLock` and `SetAndPublish` functions within the `redisServer` and `redis` modules of lib_redis.
This module contains only three function categories: the `SET` and `GET` functions for STRING, UTF8, and UNICODE (i.e. only those that return empty strings) and lastly, an auxiliary function `Unlock` used to manually unlock locked keys, as will be discussed.
The principle here is based around a cache miss: when a requested key does not exist, the first requester (race winner) 'locks' the key in an atomic fashion. Any other simultaneous requester (race loser) finds that the key exists but has been locked, and thus SUBSCRIBES to the key, awaiting a PUBLICATION message from the race winner that the value has been set. Redis is well suited to such a paradigm due to its efficiently implemented PUB-SUB infrastructure.
###An ECL Example
IMPORT redisServer FROM lib_redis;
myRedis := redisServer('--SERVER=127.0.0.1:6379');
STRING poppins := 'supercalifragilisticexpialidocious'; //Value to externally compute/retrieve from 3rd party vendor.
myFunc(STRING key, INTEGER4 database) := FUNCTION //Function for computing/retrieving a value.
    RETURN myRedis.GetString(key, database);
END;
//If the key does not exist it will 'lock' the key and return an empty STRING.
STRING value := myRedis.GetOrLockString('supercali- what?');
SEQUENTIAL(
    myRedis.SetString('poppins', poppins, 3),
    //All SetAndPublish<type>() return the value passed in as the 2nd parameter.
    OUTPUT(IF(LENGTH(value) = 0, myRedis.SetAndPublishString('supercali- what?', myFunc('poppins', 3)), value))
);
Note: further ECL examples can be found in the following files regarding the locking and non-locking functions.
###Timeout Values
The timeout durations are effectively for the entire duration of a call to each of the functions exported by this plugin library. By 'effectively', it is meant that a timer is initiated at the start of each call and, upon each internal communication with the redis server, any time remaining at that point is the timeout value passed to the redis API (hiredis) for that communication call. Since some plugin functions make more calls to the server than others (c.f. 'Behaviour and Implementation Details' below), it is possible for those functions with more server calls to time out more often than those with fewer. To avoid this, it is advised to set the timeouts to a multiple of the anticipated latency of the client-server IO, where the multiple should be at least the maximum expected number of internal redis calls made by these plugin functions, e.g. 12.
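The advice above can be sketched as simple arithmetic in ECL. The 5 ms latency is purely an assumed figure for illustration, and the attribute names are hypothetical; 12 is the example multiple given in the text.

```ecl
IMPORT redisServer FROM lib_redis;

UNSIGNED4 latencyMs      := 5;   // Assumption: anticipated client-server round trip in ms.
UNSIGNED4 maxServerCalls := 12;  // Maximum expected number of internal redis calls per function.
UNSIGNED4 timeoutMs      := latencyMs * maxServerCalls;  // 5 * 12 = 60 ms

// Pass the derived timeout once via the redisServer module:
// options, password, timeout, cacheConnections.
myRedis := redisServer('--SERVER=127.0.0.1:6379', '', timeoutMs, TRUE);

OUTPUT(timeoutMs);
```

In practice the latency figure would come from measurement of the actual deployment, and a safety margin on top of the bare multiple may be sensible.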
When using the ECL pattern described in the above section An ECL Example, it is required to set the timeout and lock expiration to be equal to the timeout (if any) of `myFunc` plus that passed to `SetAndPublish<type>`, such that both the lock and waiting subscribers live long enough for a value to be set/published.
It should also be noted that, whilst it is possible to set different values for `timeout` and `expire` in the function `GetOrLock<type>`, it is advisable not to. Keeping them equal ensures that the lock does not outlive the waiting subscribers once they have all timed out, and so does not block any subsequent retries of the locking pattern.
A few notes to point out here:

* [...] (`DELETE` the key) via the `Unlock(<key>)` function. Note: this function will fail on any communication or reply error; however, it will silently fail, leaving the lock to expire, if the server observes any change to the key during the function call duration.
* [...] `GET` and possible further race conditions in doing so. Note: This does, however, mean that it is possible for the actual redis `SET` to fail on one client/process, have the key-value received on another, and yet the key-value still does not exist on the cache.
* [...] `GET` to wait and subscribe. I.e. the locked key can be deleted and re-set just as any other key can be.

Operation/Function | Nominal | Maximum | Diff due to... |
---|---|---|---|
A new connection | 3 | 4 | database |
Cached connection | 0 | 2 | database, timeout |
Get | 1 | 5 | new connection |
Set | 1 | 5 | new connection |
FlushDB | 1 | 5 | new connection |
Delete | 1 | 5 | new connection |
Persist | 1 | 5 | new connection |
Exists | 1 | 5 | new connection |
DBSize | 1 | 5 | new connection |
Expire | 1 | 5 | new connection |
GetOrLock | 7 | 11 | new connection |
GetOrLock (locked) | 8 | 12 | new connection |
SetAndPublish (value length > 29) | 1 | 5 | new connection |
SetAndPublish (value length < 29) | 4 | 8 | new connection |
Unlock | 5 | 9 | new connection |
Publish | 1 | 4 | new connection |
Subscribe | 5 | 5 | new connection always needed |