redis memory benchmarking

read this with a grain of salt (not literally)

recently i’ve been working on an app which uses redis as a database. i’m at the point where i want to put it on the internet, and i’m going with aws to host it. there’s a recurring theme here of me being cheap, and as cheap as i am, i don’t want to pay more than i have to for an ec2 instance if i’m not going to be effectively utilising more than 80% of its potential. so, in keeping with the theme of cheapness, here are the results of some work i did investigating how much memory my data uses up in redis.

the structure of my data looks something like this:

{"somerandomuuid":"{"key":"value",
                    "key":"value",
                    "key":"value",
                    "key":"value",
                    "key":"value",
                    "key":"value",
                    "key":"value",
                    "key":"value",
                    "key":"value",
                    "key":"value",
                    "key":"value",
                    "key":"value",
                    "transactions":"{"transaction1":"{
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value"
                                                    },
                                    "transaction2":"{
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value"
                                                    },
                                    "transaction3":"{
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value",
                                                     "key":"value"
                                                    }"
                                    }"
                   }"
}

i’m using the hash data structure in redis, but my value is one giant string that just looks like a dict; i read it back into a python dict using the ast.literal_eval utility.
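
for reference, here’s roughly how i write and read one of these records with redis-py (just a sketch; the "users" hash name and the field contents are made-up placeholders):

import ast
import uuid

import redis

r = redis.StrictRedis(host="localhost", port=6379, db=0, decode_responses=True)

user_id = str(uuid.uuid4())
user = {
    "name": "value",
    "transactions": {
        "transaction1": {"amount": "value", "date": "value"},
    },
}

# store the whole dict as one big string under a single hash field
r.hset("users", user_id, repr(user))

# read the string back and turn it into a dict again
restored = ast.literal_eval(r.hget("users", user_id))
print(restored["transactions"]["transaction1"])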

to check info about your redis instance from the command line:

$ redis-cli
127.0.0.1:6379> info

redis will print a whole bunch of useful information, with explanations of the exact meanings in the INFO command documentation.
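
you can also grab the same section from python if you’re scripting the benchmark (a quick sketch with redis-py, assuming redis is on localhost:6379):

import redis

r = redis.StrictRedis(host="localhost", port=6379, db=0)

# INFO's memory section, already parsed into a dict with the same fields as the CLI output
mem = r.info("memory")
print(mem["used_memory_human"], mem["used_memory_rss"], mem["mem_fragmentation_ratio"])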

here’s what i got

1 key with 44 transactions ~

# Memory
used_memory:829896
used_memory_human:810.45K
used_memory_rss:5689344
used_memory_peak:20633128
used_memory_peak_human:19.68M
used_memory_lua:36864
mem_fragmentation_ratio:6.86
mem_allocator:jemalloc-3.6.0

100 keys with 1 transaction ~

# Memory
used_memory:1063792
used_memory_human:1.01M
used_memory_rss:7282688
used_memory_peak:1151960
used_memory_peak_human:1.10M
used_memory_lua:36864
mem_fragmentation_ratio:6.85
mem_allocator:jemalloc-3.6.0

100 keys with 365 transactions ~

# Memory
used_memory:7940080
used_memory_human:7.57M
used_memory_rss:13688832
used_memory_peak:20633128
used_memory_peak_human:19.68M
used_memory_lua:36864
mem_fragmentation_ratio:1.72
mem_allocator:jemalloc-3.6.0

100 keys with 730 transactions ~

# Memory
used_memory:14567400
used_memory_human:13.89M
used_memory_rss:19951616
used_memory_peak:27370624
used_memory_peak_human:26.10M
used_memory_lua:36864
mem_fragmentation_ratio:1.37
mem_allocator:jemalloc-3.6.0

200 keys with 730 transactions ~

# Memory
used_memory:27899512
used_memory_human:26.61M
used_memory_rss:29638656
used_memory_peak:40621232
used_memory_peak_human:38.74M
used_memory_lua:36864
mem_fragmentation_ratio:1.06
mem_allocator:jemalloc-3.6.0

after returning to 1 key with 44 transactions ~

# Memory
used_memory:839880
used_memory_human:820.20K
used_memory_rss:3997696
used_memory_peak:40621232
used_memory_peak_human:38.74M
used_memory_lua:36864
mem_fragmentation_ratio:4.76
mem_allocator:jemalloc-3.6.0

redis says to look at used_memory_rss to know how much memory your operating system thinks redis is using.

so it looks like we go from ~5.7MB to ~7.3MB of used_memory_rss when going from 1 key to 100 keys with mostly 1 transaction in each. this makes me assume that each key is worth ~16KB.

(7,282,688 - 5,689,344) / 100 = 15,933

if 100 keys each have 365 transactions, memory usage goes up to ~13.7MB, which makes me think that each transaction is worth ~331B per user:

(13,688,832 - (7,282,688 - 5,689,344)) / 100 / 365 = 331

does it extrapolate linearly if the transactions are doubled?

(19,951,616 - (7,282,688 - 5,689,344)) / 100 / 730 = 251

looks like it’s shrinking… and what about doubling the keys as well?

(29,638,656 - (7,282,688 - 5,689,344)) / 200 / 730 = 192

331 - 251 = 80 => 24.2% decrease per 365 transactions
251 - 192 = 59 => 23.5% decrease per 100 keys
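
to sanity check the arithmetic, here’s the same back-of-the-envelope sums as a little python script (the numbers are the used_memory_rss values from above):

# used_memory_rss values from the runs above
baseline_rss = 5689344    # 1 key, 44 transactions
rss_100k_1t = 7282688     # 100 keys, 1 transaction each
rss_100k_365t = 13688832  # 100 keys, 365 transactions each
rss_100k_730t = 19951616  # 100 keys, 730 transactions each
rss_200k_730t = 29638656  # 200 keys, 730 transactions each

key_overhead = rss_100k_1t - baseline_rss  # ~1.6MB for 100 keys

per_key = key_overhead / 100.0                               # ~15,933 B => ~16KB per key
per_txn_365 = (rss_100k_365t - key_overhead) / 100.0 / 365   # ~331 B
per_txn_730 = (rss_100k_730t - key_overhead) / 100.0 / 730   # ~251 B
per_txn_200k = (rss_200k_730t - key_overhead) / 200.0 / 730  # ~192 B

print(per_key, per_txn_365, per_txn_730, per_txn_200k)
print((per_txn_365 - per_txn_730) / per_txn_365)   # ~24% drop going from 365 to 730 transactions
print((per_txn_730 - per_txn_200k) / per_txn_730)  # ~24% drop going from 100 to 200 keys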

so if i hypothetically had a redis instance with 1,000 keys, each with 730 transactions, would memory per transaction keep decreasing by 23.5% for every extra block of 100 keys, and by 24.2% for every extra 365 transactions?

my maths is not very good anymore :( if anyone knows the formula to work this type of problem out, that would be really cool. but i’m just gonna guess:

(251 * 100 * 730) * 10 * 0.235 = 43,059,050
(251 * 100 * 730) * 10 * 0.242 = 44,341,660
(251 * 100 * 730) * 10 - 43,059,050 = 140,170,950 = 140MB
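
the same guess as a tiny script, with the same caveat that the maths is probably off:

naive = 251 * 730 * 1000          # ~183MB if nothing shrank: 1,000 keys, 730 transactions each
key_discount = naive * 0.235      # ~43MB off for the per-key shrinkage
txn_discount = naive * 0.242      # ~44MB off for the per-transaction shrinkage
estimate = naive - key_discount   # ~140MB
print(naive, key_discount, txn_discount, estimate)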

seems alright… i’d bet my maths is wrong somewhere, but you get the idea.

it’s also worth noting that after removing all keys except one, used_memory_peak remains at 38.74MB while used_memory_rss drops back down to ~4MB.

there’s an interesting read here with a more in-depth benchmark that also compares the different data structures.

seeing as i don’t think i’ll need more than 1,000 keys for now, i think i’m gonna go with a small ec2 instance… and i’ll edit my calculations here when i become smarter :)