
Intro to Redis Scripting with Lua

Need to scale your Redis with server-side Lua scripts? This crash-course will get you started.

Noah Zucker

In a previous post, we reviewed how SEI Novus has optimized its use of Redis for its portfolio analytics product, Alpha Platform.  In this article, we'll dig deeper into an aspect of our implementation: scripting Redis with Lua.

Before embarking on this task, I was apprehensive about learning Lua—let alone running it embedded in Redis. As it turns out, it wasn't bad at all; I documented my journey in this article to help you get started and bring similar benefits to your applications.

Use Case Overview

Before getting into the details of our usage of Lua, let's review our use case. In order to optimize the memory profile of data we store in our Redis, we devised a normalization scheme in which we substitute string values (column and group names) with integers.

So, a Scala object with values like:

com.novus.analytics.core.database.api.redis.APIRedisRecord(
  columns = List(
    "Asset",
    "AssetTypeTag",
    "AssetTypeTagCode",
    "ClientProvidedBeta",
    "CustomIndex",
    "DefaultSymbol",
    "Issuer",
    "PnL_userConfigured",
    "Pnl",
    "PositionName",
    "PositionPnl",
    "PositionPnlExFx"), ...)

is serialized to Redis as a binary object built from a much more compact tuple of Ints:

scala.Tuple8(List(1,2,3,4,5,6,7,8,9,10))

...where the numbers correspond to keys in a Hash "dictionary." Because our application is distributed, with worker nodes processing these tuples, we need to store the hash dictionary centrally so the workers can fetch it and decode the Ints back to string values. Our central dictionary store? Of course, Redis.

The Redis Hash for each "dictionary" looks something like this:

"1" -> "Asset" 
"2" -> "AssetTypeTag" 
"3" -> "AssetTypeTagCode" 
"4" -> "ClientProvidedBeta" 
"5" -> "CustomIndex" 
"6" -> "DefaultSymbol" 
"7" -> "Issuer" 
"8" -> "PnL_userConfigured" 
"9" -> "Pnl" 
"10" -> "PositionName" 
"11" -> "PositionPnl" 
"12" -> "PositionPnlExFx"

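To make the substitution concrete, here is a minimal sketch of encoding and decoding against such a dictionary. It assumes the dictionary has already been fetched into memory; the object name DictionaryCodec and its methods are hypothetical and are not part of our production code.

```scala
object DictionaryCodec {
  // Hypothetical in-memory copy of the Redis Hash: id -> column name.
  val idsToFields: Map[Int, String] = Map(
    1 -> "Asset",
    2 -> "AssetTypeTag",
    3 -> "AssetTypeTagCode"
  )

  // Reverse view used when encoding: column name -> id.
  val fieldsToIds: Map[String, Int] = idsToFields.map(_.swap)

  /** Replaces each column name with its id; throws if a name is not registered. */
  def encode(columns: List[String]): List[Int] =
    columns.map(fieldsToIds)

  /** Restores the column names from their ids; throws if an id is unknown. */
  def decode(ids: List[Int]): List[String] =
    ids.map(idsToFields)
}
```

A worker would call encode before writing a record and decode after reading one, so only the small Ints travel with each record.
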
This approach provides enormous benefits in terms of space savings. However, it also introduces considerable complexity: our hash string values can't be known in advance, so we have to map them to Int keys on the fly. Furthermore, for a single user request, we might have hundreds of workers processing different chunks of data, each with distinct string values that need mapping, yet all of them share a single hash dictionary and contend for updates to it. We need to update the hash efficiently while avoiding collisions between entries.

Put another way: we have hundreds of worker nodes processing a user request, all trying to update the shared hash dictionary for that request. One worker might seek to register "1" as "Asset" while another wants "1" to be "DefaultSymbol." Obviously, we can't have one update stomp the other; we have to assign key/value mappings as they come in. How do we solve this problem?

First Cut: Brute Force with HGETALL

As described in our previous post, our first pass at this problem was to use the Redis HGETALL operation. The logic was as follows:

  1. HGETALL all the fields for a hash.
  2. Check if the value we want is in the hash.
  3. If so, use the existing int → value mapping.
  4. If not, add and use a new int → value mapping with HSETNX.
  5. Finally, check the result of HSETNX; if it returned "false" (value not set), retry from the beginning.
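
The retry loop described by these steps can be sketched as follows. This is an illustration only: a ConcurrentHashMap stands in for the Redis Hash, reading the whole map plays the role of HGETALL, and putIfAbsent plays the role of HSETNX. The name BruteForceDictionary and the choice to start ids at 1 are assumptions made for the example.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.annotation.tailrec
import scala.jdk.CollectionConverters._

object BruteForceDictionary {
  // In-memory stand-in for the Redis Hash (id -> value); this is not a Redis client.
  private val hash = new ConcurrentHashMap[Int, String]()

  /** Returns the id for `value`, registering a new one if needed. */
  @tailrec
  def idFor(value: String): Int = {
    // Read the whole hash (the HGETALL analogue).
    val snapshot: Map[Int, String] = hash.asScala.toMap
    // Reuse an existing mapping if the value is already present.
    snapshot.collectFirst { case (id, v) if v == value => id } match {
      case Some(id) => id
      case None =>
        // Claim the next id only if nobody else has taken it (the HSETNX analogue).
        val candidate = snapshot.size + 1
        val previous = hash.putIfAbsent(candidate, value)
        // A non-null result means another writer won that id, so start over.
        if (previous == null) candidate else idFor(value)
    }
  }
}
```

Note that every call copies the entire map before doing anything else, which mirrors the cost of HGETALL on a large hash.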

This approach certainly works, but suffers from poor performance. HGETALL is a known CPU hog for large hashes, and we saw large, sustained CPU spikes and request timeouts during peak usage. Furthermore, if a hash key collision does occur (i.e. HSETNX returned false), resolving it requires multiple round trips, which is not ideal. Aside from the latency of those network round trips, the logic is difficult to reason about.

We need a solution that adds a hash entry and assigns it a key in a single step. This is where Lua scripting comes in.

Devising our Script Logic

Before writing a line of Lua code, we need to understand our logic. Conceptually it's very simple—a classic "get-or-put if-absent" algorithm. Rather than multiple steps with HGETALL and HSETNX, we need a function that looks something like this pseudocode:

function hash_id_get_or_put_if_absent(value)
    if hash_contains(value)
        return hash_get(value)
    else
        id = hash_put(value)
        return id
    end if
end function

We want this to be treated as a transaction: only one update to the hash should occur at a time. Fortunately, Redis executes commands on a single thread and runs each Lua script atomically, so this simple logic will be correct without having to account for simultaneous requests or locking.
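
As a rough in-memory model of that behavior, the sketch below implements get-or-put-if-absent in Scala, with a synchronized block standing in for the atomicity Redis gives a script. The object name AtomicDictionary, the second value → id map, and the rule "next id = current number of entries" are assumptions made for this illustration.

```scala
import scala.collection.mutable

object AtomicDictionary {
  // In-memory stand-ins for the two lookups: id -> value and value -> id.
  private val idsToValues = mutable.Map.empty[Int, String]
  private val valuesToIds = mutable.Map.empty[String, Int]

  /** Returns the existing id for `value`, or assigns the next free one. */
  def getOrPutIfAbsent(value: String): Int = synchronized {
    valuesToIds.getOrElse(value, {
      // Assumed id scheme: the next id is the current number of entries.
      val id = idsToValues.size
      idsToValues(id) = value
      valuesToIds(value) = id
      id
    })
  }
}
```

Because the check and the write happen inside one critical section, there is no window in which two callers can claim the same id, and no retry loop is needed.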

Lua Crash Course

At this point, I must confess: I had been hearing about Lua for over a decade (most recently as the scripting language of choice for the popular online game Roblox), but I had never actually written a line of it. Fortunately, I found the language very approachable. In particular, the site Learn X in Y Minutes, my go-to resource for quickly learning new technologies, was especially helpful.

Rather than looking at contrived examples, let's dive right into our actual production script:

-- Author: Noah Zucker 
-- Copyright © 2022 SEI Novus 
-- Given a number of hash fields for KEYS[1], return the numeric ids 
-- already assigned to those fields, or generate new ones for previously unseen fields. 

local ids_to_fields = KEYS[1]                   -- lookup: 1 => foobar 
local fields_to_ids = ids_to_fields.."-reverse" -- reverse lookup: foobar => 1 
 
local result = {} 

for _,field in ipairs(ARGV) do 
    -- Reverse lookup is only used to check if we have already 
    -- assigned an id to the given string value. 
    local id = redis.call("hget", fields_to_ids, field) 

    if not id then 
        id = redis.call("hlen", ids_to_fields) 
        redis.call("hset", ids_to_fields, id, field) 
        redis.call("hset", fields_to_ids, field, id) 
    end 

    table.insert(result, id) 
end 

return result

In case the code above isn't self-explanatory, here is a brief walkthrough.

This script assigns new numeric ids to given string values, taking our pseudocode a step further by handling multiple arguments per invocation. Let's break down what's going on:

  1. Lua has only one data structure: Tables, something like an associative array in PHP or an Object in JavaScript.
  2. "KEYS" and "ARGV" are two global variables, both Tables, provided by the Redis runtime.
  3. "KEYS[1]" is the first element of the KEYS table (Lua tables are indexed from 1), which we assign to "ids_to_fields".
  4. "local" declares a variable local to the script.
  5. "local result = {}" initializes a new Table in which we accumulate the results of our function.
  6. The Lua string concatenation operator is two dots (".."); we use it to define the key for the reverse-lookup Hash.
  7. The function "ipairs" returns an iterator over the index/value pairs of a Table, in order.
  8. The block "for _, field in ipairs(ARGV)" iterates over the parameters passed via the ARGV global variable (the underscore discards the table index, which we don't need).
  9. Inside the for block, we invoke Redis commands via "redis.call()".
  10. "table.insert" appends a record to the result table.
  11. "return" sends the final result back to Redis and onwards to the client application.

For populating our Hash with id → value lookups, the operation 

redis.call("hlen", ids_to_fields)

uses the current length of the hash as our next ID. Thus, as long as entries are never removed from the hash, our IDs are monotonically increasing (i.e. 0, 1, 2, 3...).

Finally, as you might have guessed, comments are indicated by two dashes ("--"), not unlike SQL.

Testing the Script

Now it's time to see if our script works.

A great thing about developing Lua scripts with Redis is that you can do so completely independently of your actual application. The versatile Redis CLI supports invoking Lua scripts via the "--eval" flag:

$ redis-cli --eval persist_api_dictionary.lua foo , a b c d e f 

1) (integer) 0 
2) (integer) 1 
3) (integer) 2 
4) (integer) 3 
5) (integer) 4 
6) (integer) 5 

This one-liner invokes our Lua script, passing in foo as the single key parameter and a b c d e f as the ARGV parameters.

Note that you need a space on both sides of that comma! This is the convention for separating the KEYS value (foo) from the ARGV values (a b c d e f) in the example.

The response from Redis lists the id assigned to each of our ARGV values. In another redis-cli session, we can see the server-side operations via the MONITOR command:

$ redis-cli 
127.0.0.1:6379> monitor 
OK 
... 
1658190531.849940 [0 lua] "hget" "foo-reverse" "a" 
1658190531.849956 [0 lua] "hlen" "foo" 
1658190531.849974 [0 lua] "hset" "foo" "0" "a" 
1658190531.849991 [0 lua] "hset" "foo-reverse" "a" "0" 
1658190531.850004 [0 lua] "hget" "foo-reverse" "b" 
1658190531.850015 [0 lua] "hlen" "foo" 
1658190531.850022 [0 lua] "hset" "foo" "1" "b" 
1658190531.850033 [0 lua] "hset" "foo-reverse" "b" "1" 
1658190531.850046 [0 lua] "hget" "foo-reverse" "c"

On subsequent invocations of the same script, we can see that it correctly returns previously assigned values, while creating new entries as needed:


$ redis-cli --eval persist_api_dictionary.lua foo , a b c 

1) "0" 
2) "1" 
3) "2" 

$ redis-cli --eval persist_api_dictionary.lua foo , d e f 

1) "3" 
2) "4" 
3) "5" 

$ redis-cli --eval persist_api_dictionary.lua foo , x y z 

1) (integer) 6 
2) (integer) 7 
3) (integer) 8 

$ redis-cli --eval persist_api_dictionary.lua foo , g h a b 

1) (integer) 9 
2) (integer) 10 
3) "0" 
4) "1"

One detail to note is that ids assigned on a previous invocation come back as strings (Redis stores hash values as strings), while newly assigned ids are returned as integers. This is perhaps something to clean up via the tostring() function (but our client application handles both for now).
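
For illustration, client-side handling of both shapes might look like the sketch below. It assumes the client surfaces integer replies as Int or Long and string replies as String; the object name IdNormalizer is hypothetical and this is not our actual client code.

```scala
object IdNormalizer {
  /** Accepts either reply shape and returns a plain Int, or None if unrecognized. */
  def normalize(raw: Any): Option[Int] = raw match {
    case n: Int    => Some(n)        // newly assigned id, already numeric
    case n: Long   => Some(n.toInt)  // integer reply surfaced as a Long (assumption)
    case s: String => s.toIntOption  // previously stored id, returned as a string
    case _         => None           // anything else is treated as unusable
  }
}
```

Normalizing at the edge like this keeps the rest of the application working with plain Ints regardless of which branch of the script produced the id.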


Integrating with Your Scala Application

Now that our script is working, it's time to integrate it into our Scala application. As mentioned in a previous post, we are using the battle-tested scala-redis library by @debasishg.

View it on GitHub: https://github.com/debasishg/scala-redis

Calling a script from this library is straightforward. You simply send the entire contents of the script as a string to Redis via an API call. It returns a SHA value your application can subsequently use to pre-check if the script is loaded.

/** 
  * Checks if the Lua script exists in the Redis (per its SHA hash value), and if not, loads it. 
  * 
  * Calls SCRIPT EXISTS followed by SCRIPT LOAD if not. 
  */ 
 def ensureScript(sha: String, script: String): com.redis.RedisCommand => Option[String] = client => { 
   // SHA might be null because that's the initial value of the AtomicReference that holds it.
   if (sha == null || client.scriptExists(sha).forall(_ == 0)) { 
     client.scriptLoad(script) 
   } else Option(sha) 
 }
 

After loading the script, your application can call it by SHA instead of sending the entire script over to Redis each time:

/** 
 * Executes a given script referenced by its SHA hash. 
 * You should use `ensureScript` before calling this function to 
 * ensure the script is loaded and cached in the Redis server. 
 * 
 * Calls EVALSHA with the keys and values as arguments. 
 * 
 * @return results of the function as a list of string values. 
 */ 
def evalScriptMultiBySHA( 
    sha: String, 
    keys: List[Any], 
    values: List[Any] 
): com.redis.RedisCommand => List[Option[String]] = client => { 
  import com.redis.serialization.Parse.Implicits.parseString 
  client.evalMultiSHA[String](sha, keys, values).getOrElse(List.empty) 
}

With these two functions defined, your application can start making EVALSHA calls to Redis. You should of course store the SHA value in a thread-safe variable (e.g. a java.util.concurrent.atomic.AtomicReference) so your application only has to call ensureScript once, e.g. at startup.

Example code:

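// Assumed imports: Resources and Charsets are taken to be Guava's; adjust if yours differ.
import java.util.concurrent.atomic.AtomicReference
import com.google.common.base.Charsets
import com.google.common.io.Resources
import com.redis.RedisCommand

// Script text, read once from the classpath.
private lazy val script = {
  val resource = Resources.getResource("lua/persist_api_dictionary.lua")
  Resources.toString(resource, Charsets.UTF_8)
}

private val shaCache: AtomicReference[String] = new AtomicReference[String]

private def load(client: RedisCommand): String = {
  val sha = shaCache.updateAndGet((cachedSha: String) => {
    // "cachedSha" is the SHA value previously stored in the AtomicReference, if any.
    ensureScript(cachedSha, script)(client).getOrElse(
      sys.error("?!? Failed to load SHA for Lua script"))
  })

  sha
}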

We package the Lua script as a classpath resource in our Scala application JAR. Loading it as a string from the classpath ensures that all worker nodes use the same script text and therefore get the same SHA, so each worker only needs to call ensureScript once during its lifetime.

Once ensureScript is working, the Redis MONITOR output will show the script being called by its SHA rather than printing the full script text:
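
The SHA in question is the SHA-1 digest of the script text, which is why identical text always yields an identical value. As a minimal sketch, the digest can be computed locally with the JDK alone; the object name ScriptSha is hypothetical, and the result should match what SCRIPT LOAD returns only if the bytes sent to Redis are exactly the same.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object ScriptSha {
  /** Hex-encoded SHA-1 digest of the script text, the form EVALSHA expects. */
  def sha1Hex(script: String): String =
    MessageDigest
      .getInstance("SHA-1")
      .digest(script.getBytes(StandardCharsets.UTF_8))
      .map(b => f"${b & 0xff}%02x") // each byte as two lowercase hex digits
      .mkString
}
```

A local digest like this can be handy in tests or logs to confirm that two builds ship the same script, without contacting Redis.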

1671223771.125772 [0 10.78.91.75:38524] "EVALSHA" "3680766b7818263c72c242f29c6e09ab12397479"

Looking Forward: Redis Functions and Modules

At this point, you have everything you need to get started with Lua in Redis and your Scala application. However, before closing we should mention Redis Functions and Redis Modules.

Loading your scripts from your Scala application is slightly inefficient and a bit of a hack. If your script is sufficiently complex—or you have thousands of nodes—you may find your Redis spending far too much time on loading and parsing scripts.

Redis offers several solutions to these challenges:

  • Redis Functions – new in Redis version 7, you can define Lua scripts as "functions," loaded and managed on the server-side (instead of clients having to call SCRIPT LOAD repeatedly).
  • Redis Modules – you can write shared objects in C that execute in Redis, providing richer, custom commands and data types.


Conclusion

Hopefully, the explanation above helps you with scripting Redis with Lua. If you enjoy working with new technologies and solving challenges of distributed, scalable applications—we're hiring! Be sure to check out our careers page.
