Caching tweets using Node.js, Redis and Socket.io

In this article, we will build a streaming list of tweets based on a search query entered by the user. The tweets will be fetched using Twitter’s Streaming API, stored in a Redis list and updated in the front-end using Socket.io. We will primarily be using Redis as a caching layer for the fetched tweets.

Introduction

Here is a brief description of the technologies we will be using:

Redis

Redis is an open-source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, and geospatial indexes with radius queries.

Node.js

Node.js is a platform built on Chrome’s JavaScript runtime for easily building fast and scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, and thus perfect for data-intensive real-time applications that run across distributed devices.

Express.js

Express.js is a web application framework for Node.js. It lets you write the server and its request-handling code in JavaScript, much as you would with server-side frameworks in other web languages.

Socket.IO

Socket.IO is a JavaScript library for real-time web applications. It enables real-time, bi-directional communication between web clients and servers. It has two parts: a client-side library that runs in the browser, and a server-side library for Node.js. Both components have nearly identical APIs.

Heroku

Heroku is a cloud platform that lets companies build, deliver, monitor, and scale apps without managing the underlying infrastructure themselves.

This article assumes that you already have Redis, Node.js, and the Heroku Toolbelt installed on your machine.

Setup

- Download the code from the following repository: https://github.com/Scalegrid/code-samples/tree/sg-redis-node-socket-twitter-search/node-socket-redis-twitter-hashtags

- Run npm install to install the necessary components

- Finally, start the Node server by running “node index.js”. You can also run “nodemon”, which restarts the server whenever a file changes.

You can also access a hosted version of this app here: https://node-socket-redis-stream-tweet.herokuapp.com/

The Process

Here is a brief description of the process that we will be using to build the demo application:

1. We will start by accepting a search query from the user. The query can be a Twitter mention, a hashtag or any other search text.

2. Once we have the search query, we will send it to Twitter’s Streaming API to fetch tweets. Since it is a stream, we will listen for tweets as the API sends them.

3. As soon as a tweet is retrieved, we will store it in a Redis list and broadcast it to the front-end.

What are Redis lists?

Redis lists are implemented via Linked Lists. This means that even if you have millions of elements inside a list, the operation of adding a new element at the head or at the tail of the list is performed in constant time. The speed of adding a new element with the LPUSH command to the head of a list with ten elements is the same as adding an element to the head of a list with 10 million elements.

In our application, we will be storing the tweets received via the API in a list called “tweets”. We will use LPUSH to push the newly received tweet to the list, trim it using LTRIM, which caps the amount of memory the list uses (a stream can otherwise grow without bound), fetch the latest tweet using LRANGE, and broadcast it to the front-end where it will be added to the streaming list.

What are LPUSH, LTRIM and LRANGE?

These are Redis commands for working with lists: LPUSH adds elements, LTRIM shortens a list, and LRANGE reads elements back. Here is a brief description:

LPUSH

Insert all the specified values at the head of the list stored at key. If key does not exist, it is created as an empty list before performing the push operations. When key holds a value that is not a list, an error is returned.

redis> LPUSH mylist "world"
(integer) 1

redis> LPUSH mylist "hello"
(integer) 2

redis> LRANGE mylist 0 -1
1) "hello"
2) "world"

LTRIM

Trim an existing list so that it will contain only the range of elements specified. Both start and stop are zero-based, inclusive indexes, where 0 is the first element of the list (the head), 1 the next element and so on.

redis> RPUSH mylist "one"
(integer) 1

redis> RPUSH mylist "two"
(integer) 2

redis> RPUSH mylist "three"
(integer) 3

redis> LTRIM mylist 1 -1
"OK"

redis> LRANGE mylist 0 -1
1) "two"
2) "three"
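
A common use of LTRIM, and the one this article relies on, is capping a list right after each LPUSH. The sketch below models that pattern on a plain JavaScript array instead of a real Redis connection; lpushCapped is a hypothetical helper name used only for illustration, not a Redis or client-library function. Because the stop index is inclusive, keeping N elements means trimming to the range 0 to N - 1.

```javascript
// In-memory model of "LPUSH key value" followed by "LTRIM key 0 maxLen-1".
// It only illustrates the semantics; it does not talk to Redis.
function lpushCapped(list, value, maxLen) {
  list.unshift(value);   // LPUSH: insert at the head
  list.splice(maxLen);   // LTRIM 0 (maxLen - 1): drop index maxLen and beyond
  return list.length;
}

var tweets = [];
['t1', 't2', 't3', 't4'].forEach(function (t) {
  lpushCapped(tweets, t, 3);
});
console.log(tweets); // newest first; the oldest entry has been dropped
```

With a cap of 3, the fourth push evicts the oldest entry, so the list never holds more than three items.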

LRANGE

Returns the specified elements of the list stored at key. The offsets start and stop are zero-based indexes, with 0 being the first element of the list (the head of the list), 1 being the next, and so on.

These offsets can also be negative numbers indicating positions from the end of the list. For example, -1 is the last element of the list, -2 the penultimate, and so on.

redis> RPUSH mylist "one"
(integer) 1

redis> RPUSH mylist "two"
(integer) 2

redis> RPUSH mylist "three"
(integer) 3

redis> LRANGE mylist 0 0
1) "one"

redis> LRANGE mylist -3 2
1) "one"
2) "two"
3) "three"
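
To make the index rules concrete, here is a small in-memory model of how LRANGE resolves its offsets. It is a sketch of the documented behaviour (inclusive stop, negative offsets counted from the tail, out-of-range values clamped) written against a plain array; the function is for illustration and is not part of any Redis client.

```javascript
// In-memory model of LRANGE index handling (not a Redis client call).
function lrange(list, start, stop) {
  var len = list.length;
  // Negative offsets count from the tail: -1 is the last element.
  if (start < 0) start = Math.max(len + start, 0);
  if (stop < 0) stop = len + stop;
  // A stop past the end is clamped to the last element.
  if (stop >= len) stop = len - 1;
  if (start > stop) return [];
  return list.slice(start, stop + 1); // stop is inclusive
}

var mylist = ['one', 'two', 'three'];
console.log(lrange(mylist, 0, 0));   // same query as LRANGE mylist 0 0
console.log(lrange(mylist, -3, 2));  // same query as LRANGE mylist -3 2
console.log(lrange(mylist, 0, -1));  // the whole list
```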

Building the application

Our demo requires both a front-end and a back-end. Our front-end is a pretty simple text box with a button that will be used to start the stream.

$('body').on('click', '.btn-search', function() {
   $('#tweets_area').empty();
   $(this).text('Streaming...').attr('disabled', true);
   $.ajax({
       url: '/search',
       type: 'POST',
       data: {
           val: $.trim($('.search-txt').val())
       }
   });
});

We need a helper function to build a tweet box once we receive the tweet from our back-end:

 var _buildTweetBox = function(status) {
     var html = '';
     html += '<div class="media tweet-single">';
     html += ' <div class="media-left">';
     html += ' <a href="https://twitter.com/' + status.user.screen_name + '" target="_blank" title="' + status.user.name + '">';
     html += ' <img class="media-object" src="' + status.user.profile_image_url_https + '" alt="' + status.user.name + '" />';
     html += ' </a>';
     html += ' </div>';
     html += ' <div class="media-body">';
     html += ' <h5 class="media-heading"><a href="https://twitter.com/' + status.user.screen_name + '" target="_blank">' + status.user.screen_name + '</a></h5>';
     html += '<p class="tweet-body" title="View full tweet" data-link="https://twitter.com/' + status.user.screen_name + '/status/' + status.id_str + '">' + status.text + '</p>';
     html += ' </div>';
     html += '</div>';
     $('#tweets_area').prepend(html);
     $('#tweets_area').find('.tweet-single').first().fadeIn('slow');
};
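
The helper above concatenates fields such as status.user.name straight into markup, including attribute values. Tweet data is untrusted input, so one possible hardening step is to escape it before building the HTML. The function below is a sketch with a hypothetical name, escapeHtml; it is not part of the original repository.

```javascript
// Hypothetical helper: escape characters that are significant in HTML
// text and attribute values before concatenating them into markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Possible usage inside _buildTweetBox:
//   '... title="' + escapeHtml(status.user.name) + '">'
console.log(escapeHtml('<b>"hi" & bye</b>'));
```

If a field arrives from the API already entity-encoded, escaping it again will show the entities literally, so it is worth checking each field before applying the helper to it.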

We also need a listener to stop the stream and prevent adding any more tweets to the streaming list:

socket.on('stream:destroy', function(status) {
    $('.btn-search').text('Start streaming').removeAttr('disabled');
    $('.alert-warning').fadeIn('slow');
    setTimeout(function() {
       $('.alert-warning').fadeOut('slow');
    }, STREAM_END_TIMEOUT * 1000);
});
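
The front-end also has to react to the savedTweetToRedis event that the back-end emits for every stored tweet. A minimal sketch of wiring both listeners together is shown here. The function name attachTweetListeners and the resetSearchButton callback in the usage comment are hypothetical; socket is assumed to be the Socket.IO client object already used above.

```javascript
// Sketch: register both Socket.IO listeners in one place.
// `socket` is anything with an on(event, handler) method; the two
// callbacks are supplied by the page.
function attachTweetListeners(socket, onTweet, onStreamEnd) {
  socket.on('savedTweetToRedis', function (status) {
    // Skip payloads that lack the fields the tweet box needs.
    if (status && status.user) {
      onTweet(status);
    }
  });
  socket.on('stream:destroy', function () {
    onStreamEnd();
  });
}

// Possible usage in the page:
//   attachTweetListeners(socket, _buildTweetBox, resetSearchButton);
```

The guard on status.user is a defensive choice: it keeps the rendering helper from throwing if a message arrives that is not a full tweet object.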

Let’s switch over to the back-end side of things and start writing our /search API.

/**
 * API - Search
 */
app.post('/search', function(req, res, next) {
   _searchTwitter(req.body.val);
   res.send({
       status: 'OK'
   });
});

/**
 * Stream data from Twitter for input text
 *
 * 1. Use the Twitter streaming API to track a specific value entered by the user
 * 2. Once we have the data from Twitter, add it to a Redis list using LPUSH
 * 3. After adding to list, limit the list using LTRIM so the stream doesn't grow without bound
 * 4. Use LRANGE to fetch the latest tweet and emit it to the front-end using Socket.io
 *
 * @param {String} val Query String
 * @return
 */
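var _searchTwitter = function(val) {
   twit.stream('statuses/filter', {track: val}, function(stream) {
       stream.on('data', function(data) {
           // Push the new tweet to the head of the list
           client.lpush('tweets', JSON.stringify(data), function(err) {
               if (err) { return console.error(err); }
               // LTRIM's stop index is inclusive: 0..N-1 keeps N tweets
               client.ltrim('tweets', 0, TWEETS_TO_KEEP - 1, function(err) {
                   if (err) { return console.error(err); }
                   // Read back only the head of the list (the newest tweet)
                   client.lrange('tweets', 0, 0, function(err, tweetListStr) {
                       if (err) { return console.error(err); }
                       io.emit('savedTweetToRedis', JSON.parse(tweetListStr[0]));
                   });
               });
           });
       });
       stream.on('destroy', function(response) {
           io.emit('stream:destroy');
       });
       stream.on('end', function(response) {
           io.emit('stream:destroy');
       });
       // Call destroy() on the stream itself so it keeps its `this` binding
       setTimeout(function() {
           stream.destroy();
       }, STREAM_TIMEOUT * 1000);
   });
};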

The above code contains the core of our back-end. Once a request has been received at /search, we start the stream using Twitter’s streaming API that returns a stream object.

twit.stream('statuses/filter', {track: val}, function(stream) {});

We can listen on the stream object for an event called “data”, which fires with a new tweet whenever one is available.

stream.on('data', function(data) {});

The “data” object contains the tweet JSON which may look something like this (part of the response has been omitted):

{
 "created_at": "Wed Jul 26 08:01:56 +0000 2017",
 "id": 890119982641803300,
 "id_str": "890119982641803264",
 "text": "RT @FoxNews: Jim DeMint: \"There is no better man than Jeff Sessions, and no greater supporter...of [President #Trump's] agenda.\"… ",
 "source": "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>",
 "truncated": false,
 "in_reply_to_status_id": null,
 "in_reply_to_status_id_str": null,
 "in_reply_to_user_id": null,
 "in_reply_to_user_id_str": null,
 "in_reply_to_screen_name": null,
 "user": {
 "id": 4833141138,
 "id_str": "4833141138",
 "name": "randy joe davis",
 "screen_name": "randyjoedavis1",
 "location": null,
 "url": null,
 "description": "Conservative Patriot, retired military, retired DOD civilian. cattle farmer, horseman, adventurer. Lovin Life ! GO HOGS !!",
 "protected": false,
 "verified": false,
 "followers_count": 226,
 "friends_count": 346,
 "listed_count": 0,
 "favourites_count": 3751,
 "statuses_count": 1339,
 "created_at": "Sat Jan 30 03:39:16 +0000 2016",
 "utc_offset": null,
 "time_zone": null,
 "geo_enabled": false,
 "lang": "en",
 "contributors_enabled": false,
 "is_translator": false,
 "profile_background_color": "F5F8FA",
 "profile_background_image_url": "",
 "profile_background_image_url_https": "",
 "profile_background_tile": false,
 "profile_link_color": "1DA1F2",
 "profile_sidebar_border_color": "C0DEED",
 "profile_sidebar_fill_color": "DDEEF6",
 "profile_text_color": "333333",
 "profile_use_background_image": true,
 "profile_image_url": "http://pbs.twimg.com/profile_images/883522005210943488/rqyyXlEX_normal.jpg",
 "profile_image_url_https": "https://pbs.twimg.com/profile_images/883522005210943488/rqyyXlEX_normal.jpg",
 "default_profile": true,
 "default_profile_image": false,
 "following": null,
 "follow_request_sent": null,
 "notifications": null
 }
}
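
Note that id and id_str in the sample above end in different digits. Tweet IDs are larger than Number.MAX_SAFE_INTEGER, so JavaScript cannot be relied on to print the numeric id back digit for digit; this is why the tweet link in _buildTweetBox is built from id_str. The short sketch below reproduces the difference using only the two fields from the sample.

```javascript
// Parse a trimmed-down version of the payload above. The numeric id is
// written with its full digits, as it appears in id_str.
var raw = '{"id": 890119982641803264, "id_str": "890119982641803264"}';
var tweet = JSON.parse(raw);

console.log(String(tweet.id)); // digits as JavaScript prints the number
console.log(tweet.id_str);     // digits as the API sent them
console.log(String(tweet.id) === tweet.id_str);
```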

We store this response in a Redis list called “tweets” using LPUSH:

client.lpush('tweets', JSON.stringify(data), function() {});

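Once the tweet has been saved, we trim the list using LTRIM to keep a maximum number of tweets (so the list doesn’t use an unbounded amount of memory). LTRIM’s stop index is inclusive, so we pass TWEETS_TO_KEEP - 1 to keep exactly TWEETS_TO_KEEP tweets:

client.ltrim('tweets', 0, TWEETS_TO_KEEP - 1, function() {});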

After trimming the list, we fetch the latest tweet using LRANGE and emit it to the front-end:

client.lrange('tweets', 0, 0, function(err, tweetListStr) {
 io.emit('savedTweetToRedis', JSON.parse(tweetListStr[0]));
});
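
The three Redis calls can also be wrapped in one function that reports errors instead of ignoring them. The following is a sketch under stated assumptions, not the repository's code: client is assumed to expose callback-style lpush, ltrim and lrange methods as in the snippets above, io is assumed to expose emit, and both the function name cacheAndBroadcast and its done callback are hypothetical.

```javascript
// Sketch of the LPUSH -> LTRIM -> LRANGE -> emit chain with error checks.
// `client` and `io` are passed in so the function has no hidden globals.
function cacheAndBroadcast(client, io, tweet, maxTweets, done) {
  client.lpush('tweets', JSON.stringify(tweet), function (err) {
    if (err) return done(err);
    // LTRIM's stop index is inclusive, so maxTweets - 1 keeps maxTweets items.
    client.ltrim('tweets', 0, maxTweets - 1, function (err) {
      if (err) return done(err);
      // 0, 0 returns only the head of the list, i.e. the newest tweet.
      client.lrange('tweets', 0, 0, function (err, items) {
        if (err) return done(err);
        if (!items || items.length === 0) return done(null, null);
        var latest = JSON.parse(items[0]);
        io.emit('savedTweetToRedis', latest);
        done(null, latest);
      });
    });
  });
}
```

Inside the “data” handler this could be called as cacheAndBroadcast(client, io, data, TWEETS_TO_KEEP, callback), where callback logs any error it receives.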

Since this is a demo application, we also need to manually destroy the stream after a specific time so it doesn’t keep writing to Redis indefinitely:

stream.on('end', function(response) {
 io.emit('stream:destroy');
});
setTimeout(function() {
 stream.destroy();
}, STREAM_TIMEOUT * 1000);
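
One possible refinement is to keep a handle on the timer so that a stream can be stopped early, for example when the user starts a new search before the previous stream has timed out. The helper below is a sketch; scheduleStreamStop is a hypothetical name, and stream is only assumed to have a destroy() method like the object used above.

```javascript
// Sketch: stop a stream after `seconds`, or earlier if the returned
// function is called. Repeated calls are harmless.
function scheduleStreamStop(stream, seconds) {
  var stopped = false;
  var timer = null;
  function stop() {
    if (stopped) return;
    stopped = true;
    clearTimeout(timer);   // cancel the pending timeout, if any
    stream.destroy();      // called as a method, so `this` is the stream
  }
  timer = setTimeout(stop, seconds * 1000);
  return stop;
}

// Possible usage inside the stream callback:
//   var stopStream = scheduleStreamStop(stream, STREAM_TIMEOUT);
```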

And you’re done! Fire up the server using npm start and enjoy the streaming experience.

A demo of the application is available here: https://node-socket-redis-stream-tweet.herokuapp.com/

For deploying this application on Heroku, check out their docs: https://devcenter.heroku.com/categories/deployment

The entire source code is also available on GitHub for you to fork and work on: https://github.com/Scalegrid/code-samples/tree/sg-redis-node-socket-twitter-search/node-socket-redis-twitter-hashtags

As always, if you build something awesome, do tweet us about it @scalegridio

If you need help with Redis hosting and management, reach out to us at support@scalegrid.io for further information.


Kunal is the UI guy at ScaleGrid.io. You can reach him at _kunalnagar

