today i want to continue my little series about how to use redis as a rails cache.
i will show you how to build a caching system that does not rely on cache invalidation and is still able to constantly deliver up-to-date cached results.
if you haven’t read the articles before this one, you should at least skim them, since i’ll be referring to some parts of the setup.
here at rapidrabbit we deliver millions of json files to our customers every day. all of them come from rails applications, but none of the customers ever really hit the rails apps.
why? imagine you have a million users that get an alert asking them to open your app, all at the same time.
now imagine you just invalidated your cache for the controller they are about to hit. what would happen?
sadly…i can tell from my own experience…your rails app goes the way of the dodo and simply rolls over, waiting for it all to be over. even if your rails app is very fast, you can still hit the point where you simply can’t regenerate the cache before a few thousand more requests come through, effectively DoSing your own app.
so what do we do? simply put: we never invalidate (a.k.a. delete) the cache, we just refresh it.
today we’ll start with the ‘simpler’ form of the exercise by regenerating the cache with a cron job. don’t worry, the next article will be about how to trigger the regeneration process instead.
so before we begin i assume the following things:
- you have redis, nginx and the redis gem installed
- you have set up your nginx as described here
- your controllers are as described here
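the snippets below talk to redis through a couple of globals. in case your initializer from the earlier articles looks different, here is a minimal sketch of what i’m assuming (the connection details and the second name are placeholders, adjust them to your actual setup):

# config/initializers/redis.rb
require 'redis'

# the connection the controllers write to…
$redis = Redis.new(:host => "127.0.0.1", :port => 6379)
# …and the name the model below reads from.
# if your setup uses two separate connections, keep them separate.
$redis_cache = $redis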
by the way…if you haven’t run into the issue yet:
go check /etc/redis.conf
and set
maxmemory 3gb
to something sensible. if you forget this, it will bite you very soon.
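for completeness, the relevant part of /etc/redis.conf then looks like this. the eviction policy line is my own suggestion rather than part of the earlier setup, so treat it as an assumption and pick whatever fits your data:

maxmemory 3gb
# decides what redis does once the limit is hit;
# allkeys-lru evicts the least recently used keys first
maxmemory-policy allkeys-lru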
so first off we need to modify our controllers. if you truly want no real rails stack hits at any time, we have to remove the cache lifetime altogether. but this has some drawbacks: your cache will constantly grow, and even very rarely requested calls will be regenerated just like any other call.
so if your app allows an unbounded number of variable urls, it would be possible to bring your redis to a grinding halt by simply overloading it with data.
that’s why you should ask your admin to monitor the redis usage closely and report any problems to you.
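if you want a quick way to keep an eye on it yourself, a tiny rake task like this could be wired into your monitoring. it only uses the info and dbsize commands, everything beyond that (the name, the output format) is just a sketch:

# lib/tasks/redis_stats.rake
namespace :redis_stats do
  desc 'print redis memory usage and number of keys'
  task :report => :environment do
    info = $redis.info
    puts "used memory: #{info['used_memory_human']}"
    puts "keys in redis: #{$redis.dbsize}"
  end
end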
so here is our new app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
  protect_from_forgery

  def save_cache_to_redis
    # we just use set here now
    $redis.set(request.request_uri, response.body)
    # just like before
    $redis.sadd("#{@model_name.downcase}_instances_collection", @model_id)
    $redis.sadd("#{@model_name.downcase}_#{@model_id}_urls_collection", request.request_uri)
  end
end
in your other controllers you can now just remove the cache_lifetime variables.
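just to make that concrete, one of those controllers could now look roughly like this. the after_filter and the @model_name/@model_id assignments are my guess at the setup from the earlier article, so treat this as a sketch rather than the real thing:

# app/controllers/thingies_controller.rb
class ThingiesController < ApplicationController
  # no cache_lifetime anymore, we only write the cache after rendering
  after_filter :save_cache_to_redis, :only => [:show]

  def show
    @thingy = Thingy.find(params[:id])
    @model_name = "Thingy"
    @model_id = @thingy.id
    respond_to do |format|
      format.json { render :json => @thingy }
    end
  end
end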
next we need to modify our app/models/thingy.rb
class Thingy < ActiveRecord::Base
  # all the invalidation methods vanished, instead we just want
  # a list of the urls that were cached
  def self.cached_urls
    cached_urls = []
    $redis_cache.smembers("thingy_instances_collection").each do |instance_id|
      $redis_cache.smembers("thingy_#{instance_id}_urls_collection").each do |url|
        cached_urls << url
      end
    end
    cached_urls
  end
end
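a quick check in the rails console shows what it collects (the urls obviously depend on your routes, these are just examples):

Thingy.cached_urls
# => ["/thingies/1.json", "/thingies/2.json", ...]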
so now we know the urls that have to be refreshed and we can spider them using a rake task.
but wait, how will we be able to access the rails app if redis is sitting in front of it?
we just define another nginx server with a local name and no cache in front of it.
so we have a /etc/nginx/sites-enabled/yourapp-local
server {
  listen 80;
  # this name should be entered in /etc/hosts, e.g. '127.0.0.1 your.website.local'
  server_name your.website.local;

  root /home/appuser/app/current/public;

  # so we removed the redis cache
  location / {
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
    proxy_redirect off;

    if (!-f $request_filename) {
      # this upstream is the same as in the config for your external server
      proxy_pass http://yourunicornupstream;
      break;
    }
  }
}
now we can ask rails directly and it will overwrite the old redis cache entry whenever it is called.
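you can convince yourself of that in a rails console on the server. the path here is just an example, use any url your app actually serves:

require 'open-uri'

path = "/thingies/1.json"
# fetch through the local nginx server, which hits rails directly
open("http://your.website.local" + path).read
# the fresh response should now sit in redis under the same key
$redis.get(path)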
last but not least we need a spider rake task to crawl all your urls and give them a freshen up ;)
let's look at lib/tasks/spider.rake
require "./config/environment" require 'open-uri' namespace :spider do desc 'get all urls and spider them' task :crawl do @todo = [] @max_threads = 4 #try what works best for you #you either enter a list of classes you want to spider #or you use this. (only works in rails 3) ActiveRecord::Base.descendants.each do |k| @todo += k.cached_urls if k.respond_to?(:cached_urls) end #the actual spidering @threads = [] @max_threads.times do @threads << Thread.new do while @todo != [] url = "http://your.website.local" + @todo.pop p url #depending how fast your app is, this will become a bottleneck ;) open(url) end end end #you may have to set the timeout higher depending on your workload @threads.each { |thread| thread.join(640) } end end
tada...now all you have to do is set up a cronjob that refreshes the cache at an interval of your liking.
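a crontab entry for the app user could look like this, assuming the deploy path from the nginx config above and an hourly refresh (adjust both to your liking):

# crontab -e for appuser: refresh the cache at minute 0 of every hour
0 * * * * cd /home/appuser/app/current && RAILS_ENV=production bundle exec rake spider:crawl >> log/spider.log 2>&1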
i hope you learned something new yet again. if there are any questions, feel free to ask, i may even answer ;)
till the next time (when we do the cool triggered thingy)
have fun.