I am sharing my experience of monitoring a Redis server that holds millions of keys. The number of keys keeps growing, and the server is used as a cache by multiple microservices. Below is the list of things that I monitor frequently.

Monitoring tools used for Redis server

  • redis-cli – INFO command
  • AWS CloudWatch

Besides the above tools, I also keep switching between Grafana, Kibana, New Relic and the Kubernetes dashboard to correlate events during an investigation.

1. Number of keys and their growth

We have to keep a close eye on the total number of keys and their rate of growth over time. A Redis instance has limited memory, and once that limit is approached the instance becomes a bottleneck. If your app/site is very popular and Redis is used as a cache, Redis itself will start to get slower under heavy memory pressure. When the configured memory limit (maxmemory) is reached, Redis evicts keys according to the eviction policy (maxmemory-policy). So, set your eviction policy based on how the keys are used.

Session caches are commonly kept in Redis. If we create a new session for every single request, we can easily run out of memory on the Redis instance.
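To make "rate of growth" concrete, here is a minimal sketch in Python that estimates how fast the key count grows and how long it would take to hit a chosen key budget. The function names, the sample numbers and the budget of 2,000,000 keys are all hypothetical; in practice the samples would come from whatever records your key count over time (for example, periodic INFO keyspace readings). It assumes roughly linear growth, which may not hold for your workload.

```python
from datetime import datetime, timedelta


def keys_per_hour(samples):
    """Average growth in keys per hour.

    samples: list of (datetime, key_count) pairs, oldest first.
    """
    (t0, k0), (t1, k1) = samples[0], samples[-1]
    hours = (t1 - t0).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("samples must span a positive time range")
    return (k1 - k0) / hours


def hours_until(limit, current, rate):
    """Hours until `current` reaches `limit` at `rate` keys/hour.

    Returns None when the key count is not growing.
    """
    if rate <= 0:
        return None
    return max(limit - current, 0) / rate


# Hypothetical readings taken six hours apart.
start = datetime(2024, 1, 1, 0, 0)
samples = [(start, 1_000_000), (start + timedelta(hours=6), 1_120_000)]

rate = keys_per_hour(samples)
remaining = hours_until(2_000_000, samples[-1][1], rate)
print(f"growth: {rate:.0f} keys/hour, budget reached in {remaining:.1f} hours")
```

A key budget is only a proxy for memory, since keys differ in size, so treat the result as an early warning rather than a precise forecast.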

Create sample data for test

Freeable Memory decreasing

root@8762425eed3f:/data# redis-cli -h $REDIS_HOST
127.0.0.1:6379> keys *
1) "tus:server:895f65a62d01b683b6322209740dbf57"
127.0.0.1:6379> INFO commandstats
# Commandstats
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00
127.0.0.1:6379> set test:redis:command 10
OK
127.0.0.1:6379> ttl test:redis:command
(integer) -1
127.0.0.1:6379> set test:redis:command2 10 EX 300
OK
127.0.0.1:6379> ttl test:redis:command2
(integer) 294
127.0.0.1:6379> info commandstats
# Commandstats
cmdstat_set:calls=2,usec=316,usec_per_call=158.00
cmdstat_info:calls=1,usec=143,usec_per_call=143.00
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00
cmdstat_ttl:calls=2,usec=49,usec_per_call=24.50

2. TTL for keys

Always set a TTL for keys, even when you think they don't need one. All data in Redis lives in memory. If we don't set a TTL, the data lives in memory forever and is never expired by Redis. Redis removes expired keys in two ways: lazily, when an expired key is accessed, and actively, through a background routine that periodically samples keys that have a TTL and deletes the expired ones. If there are millions of keys in Redis and most of them don't have a TTL, you are likely to run out of memory sooner than expected unless you clean them up manually. See the INFO keyspace section below for how to get TTL information from Redis metrics.

3. INFO – Collect metrics, data is your friend

Total Keys in Redis Memory

Always collect metrics and keep monitoring your Redis server. The INFO command gives us tons of information.

INFO all

INFO all gives us all available information about the Redis server at that instant. This can be overwhelming all at once, so we can pick out just the details we want to look at, e.g. the total number of keys in Redis or the number of times each command was used.

INFO commandstats

This command tells us the number of times each command was used on the server. This is helpful for finding which commands are used frequently and which are not. While investigating issues, there are times when we have to correlate events, and checking the use of various commands over time helps us find the bottleneck sooner. If we know that GET is the most popular command in our system and we suddenly see a spike in the use of SET, then we know that something is not working as expected: something is trying to SET a huge number of keys, which is not supposed to happen. This information can be obtained from INFO commandstats; see the example below.

127.0.0.1:6379> info commandstats
# Commandstats
cmdstat_set:calls=2,usec=316,usec_per_call=158.00
cmdstat_info:calls=1,usec=143,usec_per_call=143.00
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00
cmdstat_ttl:calls=2,usec=49,usec_per_call=24.50

How to read above output

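The output format is cmdstat_{command}:calls=...,usec=...,usec_per_call=... where calls is the number of times the command was executed, usec is the total CPU time it consumed in microseconds, and usec_per_call is the average per call. So from the above output, we can see that

  • SET was used 2 times
  • INFO was used 1 time
  • COMMAND was used 1 time
  • KEYS was used 1 time
  • TTL was used 2 times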
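Since the counters in INFO commandstats only ever increase, a spike shows up as the difference between two snapshots. The sketch below, in Python, parses the raw text into a dictionary and computes how many calls each command gained between two readings. The function names are my own, and the two snapshots are simply the two outputs shown earlier in this post; how you capture the text (redis-cli, a client library, a metrics agent) is up to you.

```python
def parse_commandstats(text):
    """Parse `INFO commandstats` text into {command: {field: number}}."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("cmdstat_"):
            continue  # skip the "# Commandstats" header and blank lines
        name, _, fields = line.partition(":")
        command = name[len("cmdstat_"):]
        stats[command] = {}
        for pair in fields.split(","):
            key, _, value = pair.partition("=")
            stats[command][key] = float(value)
    return stats


def call_deltas(before, after):
    """Calls gained per command between two parsed snapshots."""
    return {
        cmd: int(fields["calls"] - before.get(cmd, {}).get("calls", 0))
        for cmd, fields in after.items()
    }


BEFORE = """# Commandstats
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00"""

AFTER = """# Commandstats
cmdstat_set:calls=2,usec=316,usec_per_call=158.00
cmdstat_info:calls=1,usec=143,usec_per_call=143.00
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00
cmdstat_ttl:calls=2,usec=49,usec_per_call=24.50"""

deltas = call_deltas(parse_commandstats(BEFORE), parse_commandstats(AFTER))
for command, gained in sorted(deltas.items(), key=lambda item: -item[1]):
    print(f"{command}: +{gained} calls")
```

Taking such snapshots at a fixed interval and graphing the deltas is one way to spot a sudden jump in SET relative to GET.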

4. INFO keyspace

Below is how we can find the total number of keys that are yet to expire or were never set with a TTL in the first place.

The INFO keyspace command gives us the total number of keys currently residing in the memory of the Redis server, how many of them are set to expire, and their average TTL (avg_ttl, reported in milliseconds).

127.0.0.1:6379> info keyspace
# Keyspace
db0:keys=3,expires=1,avg_ttl=225936
127.0.0.1:6379> keys *
1) "tus:server:895f65a62d01b683b6322209740dbf57"
2) "test:redis:command2"
3) "test:redis:command"

In the above output:

db0:keys=3 means that database 0 currently holds 3 keys in total.

expires=1 means that out of the 3 keys only 1 key has a TTL set.

Total number of keys that do not have TTL set: 3 – 1 = 2 keys
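The same subtraction can be automated. Here is a small Python sketch that parses INFO keyspace text and reports, per database, how many keys have no TTL. The function name is hypothetical and the sample text is the output shown above.

```python
def parse_keyspace(text):
    """Parse `INFO keyspace` text into {db_name: {field: int}}."""
    databases = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("db"):
            continue  # skip the "# Keyspace" header and blank lines
        name, _, fields = line.partition(":")
        databases[name] = {}
        for pair in fields.split(","):
            key, _, value = pair.partition("=")
            databases[name][key] = int(value)
    return databases


SAMPLE = """# Keyspace
db0:keys=3,expires=1,avg_ttl=225936"""

keyspace = parse_keyspace(SAMPLE)
for db, fields in keyspace.items():
    without_ttl = fields["keys"] - fields["expires"]
    avg_ttl_seconds = fields["avg_ttl"] / 1000  # avg_ttl is in milliseconds
    print(f"{db}: {without_ttl} keys without TTL, avg TTL {avg_ttl_seconds} s")
```

If the number of keys without a TTL keeps climbing between readings, some service is most likely writing keys without an expiry.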

5. Use namespace for keys

Always namespace Redis keys based on the service, key usage or functionality. This makes it easier to find and target keys during investigation or cleanup. Note that a namespaced key such as test:redis:command is only a naming convention: Redis stores all keys of a database in one flat keyspace and does not treat the colon specially. The benefit is that a consistent prefix lets us select related keys with a pattern.

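To find all keys that belong to test:*, use SCAN 0 MATCH test:* and keep calling SCAN with the returned cursor until it comes back as 0.

To find all keys that belong to test:redis:*, use SCAN 0 MATCH test:redis:* in the same way.

GET only accepts a single exact key, so it cannot be used with a pattern. KEYS test:* also works, but it walks the entire keyspace in one blocking call, so use it only on a small test instance and not on a server with millions of keys.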
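Once key names follow a namespace convention, it is easy to see which service owns most of the keys. The Python sketch below counts keys per namespace prefix and filters by a pattern. The function name is hypothetical, the key list is the one from the example above, and in a real setup the names would be collected incrementally (for example with SCAN) rather than hard-coded. Note that fnmatchcase from the standard library is only an approximation of Redis glob matching; it behaves the same for simple prefix patterns like the ones here.

```python
from collections import Counter
from fnmatch import fnmatchcase


def count_by_namespace(keys, depth=1):
    """Count keys by their first `depth` colon-separated segments."""
    return Counter(":".join(key.split(":")[:depth]) for key in keys)


keys = [
    "tus:server:895f65a62d01b683b6322209740dbf57",
    "test:redis:command2",
    "test:redis:command",
]

by_service = count_by_namespace(keys)             # top-level namespace
by_feature = count_by_namespace(keys, depth=2)    # two levels deep
matching = [key for key in keys if fnmatchcase(key, "test:redis:*")]

for namespace, total in by_service.most_common():
    print(f"{namespace}: {total} keys")
```

A breakdown like this quickly points to the namespace responsible when the total key count starts growing unexpectedly.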

I have listed only five things here; there could be others depending on how Redis is used. If your Redis is configured as a queue system, you might want to monitor commandstats and watch GET vs SET on the queue, as well as how often jobs are being deleted. If there are too many jobs waiting to be processed, it could mean that something is wrong with the queue worker.

I hope this information will help people who are starting with Redis Monitoring.

Reference

When in doubt, always refer to the official documentation. https://redis.io/documentation#programming-with-redis has tons of information. Read it thoroughly if you are interested in Redis administration.