High-Availability with Redis Sentinels: Connecting to Redis Master/Slave Sets

Connecting to single, standalone Redis server is simple enough: simply point to the host, port and provide the authentication password, if any. Most Redis clients even provide support for some sort of URI connection specification as well.
However, in order to achieve High Availability (HA) you need to deploy a master and slave(s) configuration.

In this post, we will show you how to connect to Redis servers in a HA configuration through a single endpoint.

High Availability in Redis

High-Availability in Redis is achieved through master-slave replication. A master Redis server can have multiple Redis servers as slaves, preferably deployed on different nodes across multiple data centers. When the master is unavailable, one of the slaves can be promoted to become the new master and continue to serve data with little or no interruption.

Given the simplicity of Redis, there are many high-availability tools available that can monitor and manage a master-slave replica configuration. However the most common HA solution that comes bundled with Redis is Redis Sentinels. Redis Sentinels run as a set of separate processes that in combination monitor Redis master-slave sets and provide automatic failover and reconfiguration.

Connecting via Redis Sentinels

Redis Sentinels also act as configuration providers for master-slave sets. That is, a Redis client can connect to the Redis sentinels to find out the current master and general health of the master/slave replica set. Redis documentation provides details on how clients should interact with the sentinels. However this mechanism of connecting to Redis has some drawbacks:

  • Needs client support: Connection to Redis sentinels needs a sentinel “aware” client. Most popular Redis clients have now started supporting Redis Sentinels but some still don’t. For example node_redis (Node.js), phpredis (PHP), scala-redis (Scala) are some recommend clients that still don’t have Redis sentinel support.

 

  • Complexity: Configuring and Connecting to Redis Sentinels isn’t always straightforward, especially when the deployment is across data centers or availability zones. For example, sentinels remember IP addresses (not DNS names ) of all data servers and sentinels they ever come across and can get misconfigured when nodes dynamically moved within the data centers. The Redis sentinels also share IP information with other sentinels. Unfortunately they pass around local IP’s which can be problematic if the client is in a separate data center. These issues can add significant complexity to both the Operations and Development.

 

  • Security: The Redis server itself provides for primitive authentication through the server password, the sentinels themselves have no such feature. So a Redis sentinel that is open to the Internet exposes the entire configuration information of all the masters it is configured to manage. Thus Redis sentinels should always be deployed behind correctly configured firewalls. Getting the firewall configuration right, especially for multi-zone configurations can be really tricky.

Single Endpoint

A single network connection endpoint for a master-slave set can be provided in many ways. It could be done through virtual IPs or remapping DNS names or by using a proxy server (E.g. HAProxy) in front of the redis servers. Whenever a failure of the current master is detected (by the Sentinel), the IP or DNS name is failed over to the slave that has been promoted to become the new master by the Redis sentinels. Note that this takes time and the network connection to the endpoint will need to be reestablished. The Redis sentinels recognize a master as down only after it has been down for a period of time (default 30 secs) and then vote to promote a slave. Upon promotion of a slave, the IP address/DNS entry/proxy needs to change to point to new master.

Connecting to Master-Slave sets

The important consideration while connecting to master-slave replica sets using a single endpoint is that one must provision for retries on connection failures to accommodate for any connection failures during an automatic failover of the replica set.

We will show this with examples in Java, Ruby and Node.js. In each example we alternatively write and read from a HA Redis cluster while a failover occurs in the background. In the real world, the retry attempts will be limited to particular duration or count.

Connecting with Java

Jedis is the recommended Java client for Redis.

Single Endpoint Example

public class JedisTestSingleEndpoint {
...
    public static final String HOSTNAME = "SG-cluster0-single-endpoint.example.com";
    public static final String PASSWORD = "foobared";
...
    private void runTest() throws InterruptedException {
        boolean writeNext = true;
        Jedis jedis = null;
        while (true) {
            try {
                jedis = new Jedis(HOSTNAME);
                jedis.auth(PASSWORD);
                Socket socket = jedis.getClient().getSocket();
                printer("Connected to " + socket.getRemoteSocketAddress());
                while (true) {
                    if (writeNext) {
                        printer("Writing...");
                        jedis.set("java-key-999", "java-value-999");
                        writeNext = false;
                    } else {
                        printer("Reading...");
                        jedis.get("java-key-999");
                        writeNext = true;
                    }
                    Thread.sleep(2 * 1000);
                }
            } catch (JedisException e) {
                printer("Connection error of some sort!");
                printer(e.getMessage());
                Thread.sleep(2 * 1000);
            } finally {
                if (jedis != null) {
                    jedis.close();
                }
            }
        }
    }
...
}

The output of this test code during a failover looks like:

Wed Sep 28 10:57:28 IST 2016: Initializing...
Wed Sep 28 10:57:31 IST 2016: Connected to SG-cluster0-single-endpoint.example.com/54.71.60.125:6379 << Connected to node 1
Wed Sep 28 10:57:31 IST 2016: Writing...
Wed Sep 28 10:57:33 IST 2016: Reading...
..
Wed Sep 28 10:57:50 IST 2016: Reading...
Wed Sep 28 10:57:52 IST 2016: Writing...
Wed Sep 28 10:57:53 IST 2016: Connection error of some sort! << Master went down!
Wed Sep 28 10:57:53 IST 2016: Unexpected end of stream.
Wed Sep 28 10:57:58 IST 2016: Connected to SG-cluster0-single-endpoint.example.com/54.71.60.125:6379
Wed Sep 28 10:57:58 IST 2016: Writing...
Wed Sep 28 10:57:58 IST 2016: Connection error of some sort!
Wed Sep 28 10:57:58 IST 2016: java.net.SocketTimeoutException: Read timed out  << Old master is unreachable
Wed Sep 28 10:58:02 IST 2016: Connected to SG-cluster0-single-endpoint.example.com/54.71.60.125:6379
Wed Sep 28 10:58:02 IST 2016: Writing...
Wed Sep 28 10:58:03 IST 2016: Connection error of some sort!
...
Wed Sep 28 10:59:10 IST 2016: Connected to SG-cluster0-single-endpoint.example.com/54.214.164.243:6379  << New master ready. Connected to node 2
Wed Sep 28 10:59:10 IST 2016: Writing...
Wed Sep 28 10:59:12 IST 2016: Reading...

This is a simple test program. In real life, the number of retries will be fixed by duration or count.

Redis Sentinel Example

Jedis supports Redis sentinels as well. So here is the code that does the same thing as the above example but by connecting to sentinels.

public class JedisTestSentinelEndpoint {
    private static final String MASTER_NAME = "mymaster";
    public static final String PASSWORD = "foobared";
    private static final Set sentinels;
    static {
        sentinels = new HashSet();
        sentinels.add("mymaster-0.servers.example.com:26379");
        sentinels.add("mymaster-1.servers.example.com:26379");
        sentinels.add("mymaster-2.servers.example.com:26379");
    }

    public JedisTestSentinelEndpoint() {
    }

    private void runTest() throws InterruptedException {
        boolean writeNext = true;
        JedisSentinelPool pool = new JedisSentinelPool(MASTER_NAME, sentinels);
        Jedis jedis = null;
        while (true) {
            try {
                printer("Fetching connection from pool");
                jedis = pool.getResource();
                printer("Authenticating...");
                jedis.auth(PASSWORD);
                printer("auth complete...");
                Socket socket = jedis.getClient().getSocket();
                printer("Connected to " + socket.getRemoteSocketAddress());
                while (true) {
                    if (writeNext) {
                        printer("Writing...");
                        jedis.set("java-key-999", "java-value-999");
                        writeNext = false;
                    } else {
                        printer("Reading...");
                        jedis.get("java-key-999");
                        writeNext = true;
                    }
                    Thread.sleep(2 * 1000);
                }
            } catch (JedisException e) {
                printer("Connection error of some sort!");
                printer(e.getMessage());
                Thread.sleep(2 * 1000);
            } finally {
                if (jedis != null) {
                    jedis.close();
                }
            }
        }
    }
...
}

Let’s see the behavior of the above program during a sentinel managed failover:

Wed Sep 28 14:43:42 IST 2016: Initializing...
Sep 28, 2016 2:43:42 PM redis.clients.jedis.JedisSentinelPool initSentinels
INFO: Trying to find master from available Sentinels...
Sep 28, 2016 2:43:42 PM redis.clients.jedis.JedisSentinelPool initSentinels
INFO: Redis master running at 54.71.60.125:6379, starting Sentinel listeners...
Sep 28, 2016 2:43:43 PM redis.clients.jedis.JedisSentinelPool initPool
INFO: Created JedisPool to master at 54.71.60.125:6379
Wed Sep 28 14:43:43 IST 2016: Fetching connection from pool
Wed Sep 28 14:43:43 IST 2016: Authenticating...
Wed Sep 28 14:43:43 IST 2016: auth complete...
Wed Sep 28 14:43:43 IST 2016: Connected to /54.71.60.125:6379
Wed Sep 28 14:43:43 IST 2016: Writing...
Wed Sep 28 14:43:45 IST 2016: Reading...
Wed Sep 28 14:43:48 IST 2016: Writing...
Wed Sep 28 14:43:50 IST 2016: Reading...
Sep 28, 2016 2:43:51 PM redis.clients.jedis.JedisSentinelPool initPool
INFO: Created JedisPool to master at 54.214.164.243:6379
Wed Sep 28 14:43:52 IST 2016: Writing...
Wed Sep 28 14:43:55 IST 2016: Reading...
Wed Sep 28 14:43:57 IST 2016: Writing...
Wed Sep 28 14:43:59 IST 2016: Reading...
Wed Sep 28 14:44:02 IST 2016: Writing...
Wed Sep 28 14:44:02 IST 2016: Connection error of some sort!
Wed Sep 28 14:44:02 IST 2016: Unexpected end of stream.
Wed Sep 28 14:44:04 IST 2016: Fetching connection from pool
Wed Sep 28 14:44:04 IST 2016: Authenticating...
Wed Sep 28 14:44:04 IST 2016: auth complete...
Wed Sep 28 14:44:04 IST 2016: Connected to /54.214.164.243:6379
Wed Sep 28 14:44:04 IST 2016: Writing...
Wed Sep 28 14:44:07 IST 2016: Reading...
...

As evident from the logs, a client that supports Sentinels can recover from a failover event fairly quickly.

Connecting with Ruby

Redis-rb is the recommended Ruby client for Redis.

Single Endpoint Example

require 'redis'

HOST = "SG-cluster0-single-endpoint.example.com"
AUTH = "foobared"
...

def connect_and_write
  while true do
    begin
      logmsg "Attempting to establish connection"
      redis = Redis.new(:host => HOST, :password => AUTH)
      redis.ping
      sock = redis.client.connection.instance_variable_get(:@sock)
      logmsg "Connected to #{sock.remote_address.ip_address}, DNS: #{sock.remote_address.getnameinfo}"
      while true do
        if $writeNext
          logmsg "Writing..."
          redis.set("ruby-key-1000", "ruby-value-1000")
          $writeNext = false
        else
          logmsg "Reading..."
          redis.get("ruby-key-1000")
          $writeNext = true
        end
        sleep(2)
      end
    rescue Redis::BaseError => e
      logmsg "Connection error of some sort!"
      logmsg e.message
      sleep(2)
    end
  end
end

...
logmsg "Initiaing..."
connect_and_write

Here’s the sample output during a failover:

"2016-09-28 11:36:42 +0530: Initiaing..."
"2016-09-28 11:36:42 +0530: Attempting to establish connection"
"2016-09-28 11:36:44 +0530: Connected to 54.71.60.125, DNS: [\"ec2-54-71-60-125.us-west-2.compute.amazonaws.com\", \"6379\"] " << Connected to node 1
"2016-09-28 11:36:44 +0530: Writing..."
"2016-09-28 11:36:47 +0530: Reading..."
...
"2016-09-28 11:37:08 +0530: Writing..."
"2016-09-28 11:37:09 +0530: Connection error of some sort!"  << Master went down!
...
"2016-09-28 11:38:13 +0530: Attempting to establish connection"
"2016-09-28 11:38:15 +0530: Connected to 54.214.164.243, DNS: [\"ec2-54-214-164-243.us-west-2.compute.amazonaws.com\", \"6379\"] " << Connected to node 2
"2016-09-28 11:38:15 +0530: Writing..."
"2016-09-28 11:38:17 +0530: Reading..."

Again, Actual code should contain a limited number of retries.

Redis Sentinel Example

Redis-rb supports sentinels as well.

AUTH = 'foobared'

SENTINELS = [
  {:host => "mymaster0.servers.example.com", :port => 26379},
  {:host => "mymaster0.servers.example.com", :port => 26379},
  {:host => "mymaster0.servers.example.com", :port => 26379}
]
MASTER_NAME = "mymaster0"

$writeNext = true
def connect_and_write
  while true do
    begin
      logmsg "Attempting to establish connection"
      redis = Redis.new(:url=> "redis://#{MASTER_NAME}", :sentinels => SENTINELS, :password => AUTH)
      redis.ping
      sock = redis.client.connection.instance_variable_get(:@sock)
      logmsg "Connected to #{sock.remote_address.ip_address}, DNS: #{sock.remote_address.getnameinfo} "
      while true do
        if $writeNext
          logmsg "Writing..."
          redis.set("ruby-key-1000", "ruby-val-1000")
          $writeNext = false
        else
          logmsg "Reading..."
          redis.get("ruby-key-1000")
          $writeNext = true
        end
        sleep(2)
      end
    rescue Redis::BaseError => e
      logmsg "Connection error of some sort!"
      logmsg e.message
      sleep(2)
    end
  end
end

Redis-rb manages sentinel failovers without any disruptions.

"2016-09-28 15:10:56 +0530: Initiaing..."
"2016-09-28 15:10:56 +0530: Attempting to establish connection"
"2016-09-28 15:10:58 +0530: Connected to 54.214.164.243, DNS: [\"ec2-54-214-164-243.us-west-2.compute.amazonaws.com\", \"6379\"] "
"2016-09-28 15:10:58 +0530: Writing..."
"2016-09-28 15:11:00 +0530: Reading..."
"2016-09-28 15:11:03 +0530: Writing..."
"2016-09-28 15:11:05 +0530: Reading..."
"2016-09-28 15:11:07 +0530: Writing..."
...
<<failover>>
...
"2016-09-28 15:11:10 +0530: Reading..."
"2016-09-28 15:11:12 +0530: Writing..."
"2016-09-28 15:11:14 +0530: Reading..."
"2016-09-28 15:11:17 +0530: Writing..."
...
# No disconnections noticed at all by the application

Connecting with Node.js

Node_redis is the recommended Node.js client for Redis.

Single Endpoint Example

...
var redis = require("redis");
var hostname = "SG-cluster0-single-endpoint.example.com";
var auth = "foobared";
var client = null;
...

function readAndWrite() {
  if (!client || !client.connected) {
    client = redis.createClient({
      'port': 6379,
      'host': hostname,
      'password': auth,
      'retry_strategy': function(options) {
        printer("Connection failed with error: " + options.error);
        if (options.total_retry_time > 1000 * 60 * 60) {
          return new Error('Retry time exhausted');
        }
        return new Error('retry strategy: failure');
      }});
    client.on("connect", function () {
      printer("Connected to " + client.address + "/" + client.stream.remoteAddress + ":" + client.stream.remotePort);
    });
    client.on('error', function (err) {
      printer("Error event: " + err);
      client.quit();
    });
  }

  if (writeNext) {
    printer("Writing...");
    client.set("node-key-1001", "node-value-1001", function(err, res) {
      if (err) {
        printer("Error on set: " + err);
        client.quit();
      }
      setTimeout (readAndWrite, 2000)
    });

    writeNext = false;
  } else {
    printer("Reading...");
    client.get("node-key-1001", function(err, res) {
      if (err) {
        client.quit();
        printer("Error on get: " + err);
      }
      setTimeout (readAndWrite, 2000)
    });
    writeNext = true;
  }
}
...
setTimeout(readAndWrite, 2000);
...

Here’s how a failover will look like:

2016-09-28T13:29:46+05:30: Writing...
2016-09-28T13:29:47+05:30: Connected to SG-meh0-6-master.devservers.mongodirector.com:6379/54.214.164.243:6379 << Connected to node 1
2016-09-28T13:29:50+05:30: Reading...
...
2016-09-28T13:30:02+05:30: Writing...
2016-09-28T13:30:04+05:30: Reading...
2016-09-28T13:30:06+05:30: Connection failed with error: null << Master went down
...
2016-09-28T13:30:50+05:30: Connected to SG-meh0-6-master.devservers.mongodirector.com:6379/54.71.60.125:6379 << Connected to node 2
2016-09-28T13:30:52+05:30: Writing...
2016-09-28T13:30:55+05:30: Reading...

You can also experiment with the ‘retry_strategy’ option during connection creation for tweaking the retry logic to meet your needs. The client documentation has an example.

Redis Sentinel Example

Node_redis currently doesn’t support sentinels. The popular Redis client for Node.js, ioredis support sentinels. Refer to it’s documentation on how to connect to sentinels from Node.js.

Ready to scale up? We offer Redis hosting and fully managed solutions on a cloud of your choice. Compare us to others and see why we save you hassle and cash.


Liked this post?

Join the ScaleGrid newsletter and never miss out!

Vaibhaw is a Member of the Technical Staff at ScaleGrid.io (Formerly MongoDirector). You can reach out to him at @_vaibhaw


28 Shares
+17
Tweet
Share
Share21
Pin