5 Things to Monitor on a Production Redis Server
I am sharing my experience monitoring a Redis server that holds millions of keys. The number of keys keeps growing, and the server is used as a cache by multiple microservices. Below is the list of things that I monitor frequently.
Monitoring tools used for Redis server
- Redis cli – INFO command
- AWS CloudWatch
Besides the above tools, I also switch between Grafana, Kibana, New Relic, and the Kubernetes dashboard to correlate events during an investigation.
1. Number of keys and their growth
We have to keep a close eye on the total number of keys and their rate of growth over time. A Redis instance has limited memory, and once usage approaches that limit the instance becomes a bottleneck. If your app or site is very popular and Redis is used as a cache, memory pressure builds quickly. When the configured maxmemory limit is reached, Redis evicts keys according to the maxmemory-policy setting; with the default noeviction policy it rejects writes that need more memory instead of evicting. So, set your eviction policy based on how the keys are used.
Session data is commonly cached in Redis. If we create a session for every single request, we can easily run out of memory on the Redis instance.
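To make the memory pressure described above measurable, here is a minimal sketch that reads the used_memory, maxmemory, maxmemory_policy and evicted_keys fields from raw INFO output. The function names, the warn_ratio threshold and the sample numbers are assumptions made for illustration; they do not come from a real server.

```python
def parse_info(text):
    """Parse raw INFO output into a flat dict of field -> string value."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def memory_pressure(info_text, warn_ratio=0.8):
    """Summarize memory usage from INFO output.

    warn_ratio is a hypothetical alert threshold, not a Redis setting.
    """
    fields = parse_info(info_text)
    used = int(fields["used_memory"])
    limit = int(fields.get("maxmemory", "0"))
    # maxmemory=0 means no limit is configured, so a ratio is undefined.
    ratio = used / limit if limit else None
    return {
        "used_bytes": used,
        "limit_bytes": limit,
        "ratio": ratio,
        "policy": fields.get("maxmemory_policy"),
        "evicted_keys": int(fields.get("evicted_keys", "0")),
        "warn": ratio is not None and ratio >= warn_ratio,
    }


# Made-up sample values, shaped like the relevant INFO fields.
sample = """# Memory
used_memory:900
maxmemory:1000
maxmemory_policy:allkeys-lru
# Stats
evicted_keys:12
"""
report = memory_pressure(sample)
print(report)
```

Sampling this at a fixed interval and plotting ratio and evicted_keys over time shows whether key growth is approaching the limit.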
Create sample data for test
root@8762425eed3f:/data# redis-cli -h $REDIS_HOST
127.0.0.1:6379> keys *
1) "tus:server:895f65a62d01b683b6322209740dbf57"
127.0.0.1:6379> INFO commandstats
# Commandstats
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00
127.0.0.1:6379> set test:redis:command 10
OK
127.0.0.1:6379> ttl test:redis:command
(integer) -1
127.0.0.1:6379> set test:redis:command2 10 EX 300
OK
127.0.0.1:6379> ttl test:redis:command2
(integer) 294
127.0.0.1:6379> info commandstats
# Commandstats
cmdstat_set:calls=2,usec=316,usec_per_call=158.00
cmdstat_info:calls=1,usec=143,usec_per_call=143.00
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00
cmdstat_ttl:calls=2,usec=49,usec_per_call=24.50
2. TTL for keys
Always set a TTL for keys, even when you think a key doesn't need one. All data in Redis lives in memory. If we don't set a TTL, the data stays in memory indefinitely; Redis will not remove it on its own unless the eviction policy forces it out under memory pressure. For keys that do have a TTL, Redis removes them in two ways: lazily, when an expired key is accessed, and actively, through a background routine that periodically samples keys with a TTL and deletes the expired ones. If there are millions of keys in Redis and most of them don't have a TTL, you are likely to run out of memory sooner than expected unless you clean them up manually. See INFO keyspace below for how to get TTL information from Redis metrics.
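To find the keys that were created without a TTL, one option is to walk the keyspace incrementally and check each key. The sketch below assumes a client object that offers scan_iter(match=..., count=...) and ttl(name), as the redis-py client does. StubClient is a hypothetical in-memory stand-in so the example runs without a server, and the key names are the ones from the sample session above.

```python
from fnmatch import fnmatch


def keys_without_ttl(client, pattern="*", batch=1000):
    """Yield keys matching pattern that have no expiry set."""
    for key in client.scan_iter(match=pattern, count=batch):
        # TTL returns -1 for a key with no expiry and -2 for a missing key.
        if client.ttl(key) == -1:
            yield key


class StubClient:
    """Hypothetical stand-in for a Redis client, for demonstration only."""

    def __init__(self, ttls):
        self._ttls = ttls  # key -> TTL in seconds, -1 means no expiry

    def scan_iter(self, match=None, count=None):
        for key in self._ttls:
            if match is None or fnmatch(key, match):
                yield key

    def ttl(self, name):
        return self._ttls.get(name, -2)


stub = StubClient({
    "tus:server:895f65a62d01b683b6322209740dbf57": -1,
    "test:redis:command": -1,
    "test:redis:command2": 294,
})
persistent = sorted(keys_without_ttl(stub, pattern="test:*"))
print(persistent)
```

On a real server this issues one TTL call per key, so it is better run against a replica or during quiet hours.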
3. INFO – Collect metrics, data is your friend
Always collect metrics and keep monitoring your Redis server. The INFO command gives us tons of information.
INFO all
INFO all returns every section of information about the Redis server at that instant. This can be overwhelming all at once, so we can pick out the details we want to look at, such as the number of times each command was called or the total number of keys in Redis.
INFO commandstats
This section reports the number of times each command was called on the server, along with the time spent executing it. This is helpful to find which commands are used frequently and which are not. While investigating issues, there are times when we have to correlate events, and checking the use of various commands over time helps us find the bottleneck sooner. If we know that GET commands dominate in our system and we suddenly see a spike in SET commands, then we know that something is not working as expected: something is setting a huge number of keys that it is not supposed to. This information comes from INFO commandstats; see the example below.
127.0.0.1:6379> info commandstats
# Commandstats
cmdstat_set:calls=2,usec=316,usec_per_call=158.00
cmdstat_info:calls=1,usec=143,usec_per_call=143.00
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00
cmdstat_ttl:calls=2,usec=49,usec_per_call=24.50
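How to read the above output
Each line starts with cmdstat_{command}. After it, calls is the number of times the command was called, usec is the total time spent on it in microseconds, and usec_per_call is the average time per call. So from the above output, we can see that
- SET was used 2 times.
- INFO was used 1 time.
- COMMAND was used 1 time.
- KEYS was used 1 time.
- TTL was used 2 times.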
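Because the counters in commandstats are cumulative, a spike only becomes visible when two snapshots are compared. Here is a minimal sketch that parses the output and computes the change in calls per command. The function names are hypothetical, and the two snapshots are the ones from the sample session earlier in this post.

```python
def parse_commandstats(text):
    """Parse INFO commandstats output into {command: {field: number}}."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("cmdstat_"):
            continue  # skips the "# Commandstats" header and blank lines
        name, _, rest = line.partition(":")
        fields = dict(pair.split("=", 1) for pair in rest.split(","))
        stats[name[len("cmdstat_"):]] = {
            "calls": int(fields["calls"]),
            "usec": int(fields["usec"]),
            "usec_per_call": float(fields["usec_per_call"]),
        }
    return stats


def call_deltas(before, after):
    """Return how many extra calls each command received between snapshots."""
    return {
        cmd: after[cmd]["calls"] - before.get(cmd, {}).get("calls", 0)
        for cmd in after
    }


first = """# Commandstats
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00
"""
second = """# Commandstats
cmdstat_set:calls=2,usec=316,usec_per_call=158.00
cmdstat_info:calls=1,usec=143,usec_per_call=143.00
cmdstat_command:calls=1,usec=2950,usec_per_call=2950.00
cmdstat_keys:calls=1,usec=803,usec_per_call=803.00
cmdstat_ttl:calls=2,usec=49,usec_per_call=24.50
"""
before = parse_commandstats(first)
after = parse_commandstats(second)
deltas = call_deltas(before, after)
print(deltas)
```

Feeding these deltas into a dashboard at a regular interval makes an unexpected jump in SET calls easy to spot.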
4. INFO keyspace
Below is how we can find the total number of keys that are yet to expire or were never given a TTL in the first place. The INFO keyspace command gives us the total number of keys currently residing in the memory of the Redis server, how many of them are set to expire, and their average TTL.
127.0.0.1:6379> info keyspace
# Keyspace
db0:keys=3,expires=1,avg_ttl=225936
127.0.0.1:6379> keys *
1) "tus:server:895f65a62d01b683b6322209740dbf57"
2) "test:redis:command2"
3) "test:redis:command"
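In the above output:
db0:keys=3 means that database 0 currently holds 3 keys in total.
expires=1 means that out of the 3 keys only 1 has a TTL set.
avg_ttl=225936 is the average remaining TTL of the expiring keys, in milliseconds (about 226 seconds here).
Total number of keys that do not have a TTL set: 3 - 1 = 2 keys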
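The same subtraction can be automated for every database. Here is a minimal sketch, assuming the db{N}:keys=...,expires=...,avg_ttl=... line format shown above. The function name and the no_ttl field are made up for this example.

```python
def parse_keyspace(text):
    """Parse INFO keyspace output into {db_name: {field: int}}."""
    dbs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("db"):
            continue  # skips the "# Keyspace" header and blank lines
        name, _, rest = line.partition(":")
        fields = {}
        for pair in rest.split(","):
            key, _, value = pair.partition("=")
            fields[key] = int(value)
        # Keys that will never expire on their own.
        fields["no_ttl"] = fields["keys"] - fields["expires"]
        dbs[name] = fields
    return dbs


sample = """# Keyspace
db0:keys=3,expires=1,avg_ttl=225936
"""
keyspace = parse_keyspace(sample)
print(keyspace["db0"]["no_ttl"])
```

A no_ttl value that keeps growing is an early sign that some service is writing keys without an expiry.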
5. Use namespace for keys
Always namespace Redis keys based on the service, key usage, or functionality. This makes it easier to find and target keys during investigation or cleanup. A common convention is to separate the parts with colons, e.g. test:redis:command. Note that Redis itself treats a key name as a flat string and does not group keys by namespace; the benefit is that we can match keys by prefix.
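To find all keys that belong to test:*, we can use SCAN 0 MATCH test:* COUNT 100 and keep calling SCAN with the returned cursor until the cursor comes back as 0.
To find all keys that belong to test:redis:*, use SCAN 0 MATCH test:redis:* COUNT 100 in the same way.
GET only accepts an exact key name, so it cannot be used with a pattern. KEYS test:* also works, but it walks the whole keyspace in one blocking call and should be avoided on a production server.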
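To see which namespace holds the most keys, one option is to group key names by prefix on the client side. Here is a minimal sketch, assuming colon-separated key names. The function name and the depth parameter are hypothetical; in practice the key names would come from an incremental scan of the server.

```python
from collections import Counter


def count_by_namespace(keys, depth=1, sep=":"):
    """Count keys per namespace, using the first `depth` parts of each name."""
    counts = Counter()
    for key in keys:
        parts = key.split(sep)
        counts[sep.join(parts[:depth])] += 1
    return counts


keys = [
    "tus:server:895f65a62d01b683b6322209740dbf57",
    "test:redis:command2",
    "test:redis:command",
]
top_level = count_by_namespace(keys)
second_level = count_by_namespace(keys, depth=2)
print(top_level)
print(second_level)
```

Comparing these counts over time shows which service is responsible for key growth.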
I have listed only five things here; there could be others depending on how Redis is used. If your Redis is configured as a queue system, you might want to monitor commandstats and compare the commands that add jobs to the queue against those that remove them (for example LPUSH and RPOP if the queue is a list), as well as how often jobs are being deleted. If there are too many jobs waiting to be processed, it could mean that something is wrong with the queue worker.
I hope this information helps people who are starting out with Redis monitoring.
Reference
When in doubt, always refer to the official documentation. https://redis.io/documentation#programming-with-redis has tons of information. Read it thoroughly if you are interested in Redis administration.