Sorted Set is some of the most advanced data structures in Redis. Sorted set is essentially a unique collection of ordered Redis Strings that have a numeric score associated with them. Ordering is based on scores and the string lexicographical order (more on this later). The strings must be unique while the scores might be repeated.
Besides Lists, it is the only ordered data structure in Redis and are preferred to Lists when access to any part of the list is important (unlike List which provide access to ends of list). The average case insertion, removal and search in sorted sets are O(N), where N is the number of elements in the set.
The scores in a sorted set are double 64-bit floating point numbers in the range -(2^53) and +(2^53) included. Sorted sets are sorted by their score in an ascending order. Scores are can be updated for existing keys. To break score ties, strings in a sorted set are ordered lexicographically ascending order. In Redis 2.8, a new feature was implemented to exploit this lexicographic ordering: lexicographic range querying. This has fascinating use cases that we will see later.
Redis sorted sets support a variety of operations from simple set, get, member count to complex lexicographic range calculations. There are about 25+ operations supported. We will look at a subset of them. Let’s start with the basic ones:
# ZADD key [NX|XX] [CH] [INCR] score member [score member ...] Add elements into the set # O(log(N) where N is the number of elements in the set # [NX|XX], [CH] & [INCR] available on > 3.0.2 127.0.0.1:6379> zadd scoreboard 999 "anorak" (integer) 1 # Analogous: use ZREM key member [member ...] to remove members from a sorted set. # ZCARD key O(1): Cardinality of the set 127.0.0.1:6379> zcard scoreboard (integer) 1 # Insert multi 127.0.0.1:6379> zadd scoreboard 99 "daito" 99 "shoto" 199 "aech" 299 "art3mis" 399 "parzival" (integer) 5 # ZSCORE key member. O(1) Returns the score of member in the sorted set at key. 127.0.0.1:6379> zscore scoreboard parzival "399" # ZRANK key member O(log(N)) Get the rank of the member. 127.0.0.1:6379> zrank scoreboard anorak (integer) 5 127.0.0.1:6379> zrank scoreboard shoto (integer) 1 # ZREVRANK key member O(log(N)) Get the rank of the member with scores ordered high to low 127.0.0.1:6379> zrevrank scoreboard anorak (integer) 0 127.0.0.1:6379> zrevrank scoreboard shoto (integer) 4 # ZINCRBY key increment member O(log(N)) Increments the score of member in the sorted set stored at key by increment. 127.0.0.1:6379> zincrby scoreboard 100 parzival "499"
The ones above were some of the basic operations possible on a sorted set. The real value of the sorted sets shines in it’s range based on queries within the set. Let’s take a look at them.
# ZRANGE key start stop [WITHSCORES]. O(log(N)+M) with N being the number of elements in the sorted set and M the number of elements returned. # start and stop are inclusive zero based indexes. 127.0.0.1:6379> zrange scoreboard 0 -1 WITHSCORES 1) "daito" 2) "99" 3) "shoto" 4) "99" 5) "aech" 6) "199" 7) "art3mis" 8) "299" 9) "parzival" 10) "499" 11) "anorak" # 0: 1st member. -1: last member # Analogous: ZREVRANGE key start stop [WITHSCORES] 127.0.0.1:6379> zrevrange scoreboard 0 -1 WITHSCORES 1) "anorak" 2) "999" 3) "parzival" 4) "499" 5) "art3mis" 6) "299" 7) "aech" 8) "199" 9) "shoto" 10) "99" 11) "daito" 12) "99" # ZRANGEBYSCORE key min max [WITHSCORES] [LIMIT offset count]. O(log(N)+M) Returns all the elements in the sorted set at key with a score between min and max (inclusive) # Analogous: ZREVRANGEBYSCORE key max min [WITHSCORES] [LIMIT offset count] # 499 <= score <= +inf 127.0.0.1:6379> zrangebyscore scoreboard 499 +inf 1) "parzival" 2) "anorak" # 499 < score <= +inf 127.0.0.1:6379> zrangebyscore scoreboard (499 +inf 1) "anorak" # ZCOUNT key min max O(log(N)) Returns the number of elements in the sorted set at key with a score between min and max. 127.0.0.1:6379> zcount scoreboard -inf 499 (integer) 5 127.0.0.1:6379> zcount scoreboard -inf +inf (integer) 6
There are another set of set query commands that were introduced in Redis 2.8: the lexicographic range (*BYLEX) commands. Details of how intervals are specified for these commands is documented here. The requirement for these commands to work correctly is that the members of the sorted set should have an identical score. The main commands to operate with lexicographical ranges are ZRANGEBYLEX, ZREVRANGEBYLEX, ZREMRANGEBYLEX and ZLEXCOUNT. Let’s see a couple of examples:
127.0.0.1:6379> zadd lexscores 0 dd 0 aa 0 ccc 0 aaa 0 bb 0 d (integer) 6 # Once inserted, lexicographic sorting for free! 127.0.0.1:6379> zrange lexscores 0 -1 1) "aa" 2) "aaa" 3) "bb" 4) "ccc" 5) "d" 6) "dd" # ZRANGEBYLEX key min max [LIMIT offset count]. O(log(N)+M). min max specify range. LIMIT is like limit in SQL. # min: exclude a max: exclude c 127.0.0.1:6379> ZRANGEBYLEX lexscores (a (c 1) "aa" 2) "aaa" 3) "bb" # min: exclude a max: include c 127.0.0.1:6379> ZRANGEBYLEX lexscores (a [c 1) "aa" 2) "aaa" 3) "bb" # min: exclude a max: include ccc 127.0.0.1:6379> ZRANGEBYLEX lexscores (a [ccc 1) "aa" 2) "aaa" 3) "bb" 4) "ccc" # min: exclude aa max: include cccc 127.0.0.1:6379> ZRANGEBYLEX lexscores (aa [ccc 1) "aaa" 2) "bb" 3) "ccc" # min: exclude aa max: upto all 127.0.0.1:6379> ZRANGEBYLEX lexscores (aa + 1) "aaa" 2) "bb" 3) "ccc" 4) "d" 5) "dd"
Sorted sets are implemented as a dual data structure: It is a combination of both a hash and skip list. The hash part maps objects to scores and the skip list maps scores to objects. We already know how hashes are implemented in Redis from our previous post. The Skip list ensures that searches are fast and most operations on averages are O(log N). The Skip list is implemented in file t_zset.c.
Like most other Redis data structures even Sorted sets are optimized for size when they are small. Sorted sets are stored as only hashes until they grow to a certain size. The config parameters controlling this size are: zset-max-ziplist-entries (default 128) and zset-max-ziplist-value (default 64).
Size estimation: https://stackoverflow.com/questions/6076342/is-there-a-practical-limit-to-the-number-of-elements-in-a-sorted-set-in-redis
Being the advanced data structure that it is, Sorted sets have varied use cases:
- Its most obvious use case is as a scoreboard: maintaining an ordered list of unique members sorted by their scores. For e.g. a leader scoreboard for a MMORPG as explained in the official Redis documentation.
- Sorted sets with identical scores are used as indexes in Redis. This can range from very simple use case to really complex ones. Redis documentation has a wonderful article on Indexing using sorted sets.
- A special case of indexing using sorted sets is the GEO API for Redis that was introduced in Redis 3.2.0.
- Size is a consideration when using sorted sets. Given the complex backing data structures sorted sets will have a relatively larger memory footprint. Exact numbers will depend on the nature of the data set. This StackOverflow post mentions a ballpark number for a decently sized data set.
Since sorted sets have fairly advanced use cases, we will discussed such use cases for sorted sets in subsequent posts. For now, let’s see a simple example.
Gamify your MOOC!
In our previous post on Redis bitmaps, we were the developers supporting a very popular MOOC. The facilitators decide that they want to gamify the course with a leader board tracking the top students contributing to the community posts. The students with the top number of accepted answers to problems posted on the course community posts will receive a special mention every week.
This will be the classic use case for a score based ordering of unique keys aka the Redis sorted set. The student IDs will be the members while the scores will the number of accepted answers. We might update the score using ZINCRBY in real time or in a background job that can run daily or weekly. Obviously updating scores in real time is required for our use case. For initial insertion ZADD with one of the new flags will come in handy. Displaying the scoreboard to the students will need to utilize the reverse range queries (ZREVRANGE etc)
Other posts in our Redis data structures series: