Here's an interesting problem for you: build some simple caching.
Let's say you have a server and an app. Your server has to do something every time you release a new app. Asking for reviews is a good example1. If you automate this, everyone will be happier.
So you keep track of the latest app version and run checks when users ping your API. The list of versions is going to grow fast-ish, and your SELECT
statement has to run in code because Postgres doesn't know how to compare version strings.
In Ruby, finding the latest app version looks like this:
AppVersion.where(platform: platform)
.sort_by{ |v| Gem::Version.new(v.version) }
.last
This looks innocent, but it builds an array of all AppVersion
models, then sorts it in Ruby, then takes the last one and discards the rest. It's kind of okay when the table is small, but it’s terrible when the table grows big.
Ruby on Rails's and your database's default caching strategies can't cache this call. You have to run it every time.
The result only changes every few days. Sometimes, it goes unchanged for weeks. But you still have to check every time a user logs in because maybe they're the first person with a new app version, and you need to know when it showed up.
?
A naive caching strategy
You should use caching, obviously. Somewhat expensive thing to calculate that is checked often and changes rarely. Cache!
So you implement the simplest approach: memoization.
In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
Your code looks something like this:
class AppVersion < ActiveRecord::Base
@current = {}
def self.latest(platform)
if @current[platform].nil?
@current[platform] = where(platform: platform)
.sort_by{ |v| Gem::Version.new(v.version) }
.last
end
@current[platform]
end
def self.check_app_version(device)
latest = self.latest(device.platform)
if latest
if Gem::Version.new(device.app_version) > Gem::Version.new(latest.version)
latest = new_latest(device)
@current[device.platform] = latest
else
AppVersion.find_or_create_by(version: device.app_version,
platform: device.platform)
.update!(last_seen_at: DateTime.now)
end
else
latest = new_latest(device)
@current[device.platform] = latest
end
latest
end
?
Overall, this is the AppVersion
model. It stores information about each new app version the server encounters. When a user logs in, we call AppVersion.check_app_version(user.device)
.
This function:
- Fetches the
latest
app version - 1.1. `latest` returns saved value if it exists
- 1.2. if not, `latest` saves its result in a class instance variable
- If we got a version, we check if the new app is of a newer version
- If user's device is newer, we create a new
AppVersion
entry - Then we update the class instance variable
- If device is not newer, we update the
last_seen_at
timestamp - If this is the first time ever that we're checking – there's no
latest
– then we make a new latest and update the class instance variable
Seems reasonable, right? Calculate value, save value, update value when needed. We rely on class instance variables persisting across requests.
You do some testing locally, you write some tests. All good. Feature works. Ship it.
Feature goes to production, you release a new app, it creates 18 entries in the database ?
What went wrong?
The correct caching strategy
The clue is in that 18
number. Our code thought exactly 18 times that it had encountered a new latest app version.
At the time, we had 9
Heroku dynos with 4
threads each running in production. 9*4 = 36
, which is not 18. But it's twice as much as 18 and class instance variables are meant to be shared between threads.
Perhaps the way Puma shares memory between threads, or potential race conditions in our code, means that it took 2 tries before every thread on a machine knew about the new version. It's hard to say why it works out that way, but it does.
Memoization does not work. Caching is hard.
In retrospect, it is obvious that this was never going to work. Our "server" is distributed among multiple virtual machines. They don't share memory. How would they ever have seen each other's class instance variables?
The answer is to stop trying to be clever. Rails has built-in caching that's been battle tested and developed by smart people.
Fixed code looks like this:
def self.latest(platform)
Rails.cache.fetch("latest_app_version/#{platform}") do
where(platform: platform).sort_by{ |v| Gem::Version.new(v.version) }.last
end
end
def self.check_app_version(device)
latest = self.latest(device.platform)
if latest
if Gem::Version.new(device.app_version) > Gem::Version.new(latest.version)
latest = new_latest(device)
Rails.cache.delete("latest_app_version/#{device.platform}")
else
AppVersion.find_or_create_by(version: device.app_version,
platform: device.platform)
.update!(last_seen_at: DateTime.now)
end
else
latest = new_latest(device)
end
latest
end
Much better! Logic is same as before, except that we use Rails.cache
as our caching mechanism.
.fetch
reads from cache and if there's a miss, it runs the provided block and stores its result. .delete
deletes the cached value so next time we use latest
, it reads from cache.
The fixed code works because Rails.cache
can be configured to use an external caching server – Memcache or Redis for instance. This creates a memory space shared between all server machines and server threads.
Problem solved, crisis averted, lesson learned. Don't be clever. Use the tools your frameworks give you. ?
- Apple promotes reviews for the latest app version. It hides previous reviews. If you have frequent updates, like you should, and you have good reviews, like you hope, then all your expensive traffic sees "This app has not received enough ratings to show an average". How encouraging! (By the way, you pay for traffic because this is the real world and if you build it they will come does not work.) ↩
Continue reading about Here's why you should never implement your own caching
Semantically similar articles hand-picked by GPT-4
- More messing with time: Deduping messages between iOS and JavaScript
- Logging 1,721,410 events per day with Postgres, Rails, Heroku, and a bit of JavaScript
- Lesson learned, test your migrations on the big dataset
- 90% of performance is data access patterns
- Time is funny in Ruby
Learned something new?
Read more Software Engineering Lessons from Production
I write articles with real insight into the career and skills of a modern software engineer. "Raw and honest from the heart!" as one reader described them. Fueled by lessons learned over 20 years of building production code for side-projects, small businesses, and hyper growth startups. Both successful and not.
Subscribe below 👇
Software Engineering Lessons from Production
Join Swizec's Newsletter and get insightful emails 💌 on mindsets, tactics, and technical skills for your career. Real lessons from building production software. No bullshit.
"Man, love your simple writing! Yours is the only newsletter I open and only blog that I give a fuck to read & scroll till the end. And wow always take away lessons with me. Inspiring! And very relatable. 👌"
Have a burning question that you think I can answer? Hit me up on twitter and I'll do my best.
Who am I and who do I help? I'm Swizec Teller and I turn coders into engineers with "Raw and honest from the heart!" writing. No bullshit. Real insights into the career and skills of a modern software engineer.
Want to become a true senior engineer? Take ownership, have autonomy, and be a force multiplier on your team. The Senior Engineer Mindset ebook can help 👉 swizec.com/senior-mindset. These are the shifts in mindset that unlocked my career.
Curious about Serverless and the modern backend? Check out Serverless Handbook, for frontend engineers 👉 ServerlessHandbook.dev
Want to Stop copy pasting D3 examples and create data visualizations of your own? Learn how to build scalable dataviz React components your whole team can understand with React for Data Visualization
Want to get my best emails on JavaScript, React, Serverless, Fullstack Web, or Indie Hacking? Check out swizec.com/collections
Did someone amazing share this letter with you? Wonderful! You can sign up for my weekly letters for software engineers on their path to greatness, here: swizec.com/blog
Want to brush up on your modern JavaScript syntax? Check out my interactive cheatsheet: es6cheatsheet.com
By the way, just in case no one has told you it yet today: I love and appreciate you for who you are ❤️