So what’s monkey patching anyway?
I’m glad you asked.
The term monkey patch only refers to dynamic modifications of a class or module at runtime, motivated by the intent to patch existing third-party code as a workaround to a bug or feature which does not act as desired.
You can think of monkey patching as a magic trick. Run a function and it changes how other functions behave without changing the source code.
Take JavaScript’s .flat for example. It recursively flattens an array, by default only one level deep.
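As a quick refresher, here is a small sketch of how the built-in .flat treats its depth argument. The nested array is made up for illustration.

```javascript
// Array.prototype.flat takes an optional depth argument that defaults to 1.
const nested = [1, [2, [3, [4]]]];

const oneLevel = nested.flat();            // flattens only the outermost level
const twoLevels = nested.flat(2);          // flattens two levels
const everything = nested.flat(Infinity);  // flattens all the way down

console.log(oneLevel);   // [1, 2, [3, [4]]]
console.log(twoLevels);  // [1, 2, 3, [4]]
console.log(everything); // [1, 2, 3, 4]
```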
Now let’s say you disagree with the levels of recursion argument. You want .flat to always completely flatten an array.
You can monkey patch the .flat implementation. Run the patch somewhere, anywhere, in your codebase.
Take Array.prototype.flat and replace it with a function of your own.
In that function, .reduce the array and use ... to combine values. When the current value is an array, go into recursion, otherwise use the value.
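Here’s a rough sketch of what such a patch could look like, not necessarily the exact code from the original demo. The nativeFlat variable and the sample array are made up for illustration.

```javascript
// Keep a handle on the native method in case you ever need to undo the patch.
const nativeFlat = Array.prototype.flat;

// Replace the built-in with a version that ignores depth and flattens fully.
Array.prototype.flat = function () {
  return this.reduce(
    (flattened, value) =>
      Array.isArray(value)
        ? [...flattened, ...value.flat()] // nested array: recurse, then spread
        : [...flattened, value],          // plain value: append as is
    []
  );
};

const flattenedCompletely = [1, [2, [3, [4, [5]]]]].flat();
console.log(flattenedCompletely); // [1, 2, 3, 4, 5]
```

To undo the trick, assign nativeFlat back to Array.prototype.flat.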
Every array in your codebase now has this function. 🧙‍♂️
When monkey patching goes rogue
Monkey patching on its own ain’t bad at all. It’s a great tool when used responsibly. You can make non-standard methods easy to use, add to your language’s standard library, and even fix bugs.
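One way to patch responsibly is to add a method only when the runtime doesn’t already have it. The sketch below does that for Array.prototype.at; picking .at as the example is my assumption, and the fallback is a simplified stand-in rather than a spec-complete polyfill.

```javascript
// Only patch when the runtime doesn't already provide the method.
if (typeof Array.prototype.at !== "function") {
  Object.defineProperty(Array.prototype, "at", {
    value: function (index) {
      const i = Math.trunc(index) || 0;                  // NaN and undefined become 0
      const position = i < 0 ? this.length + i : i;      // negative counts from the end
      return this[position];                             // undefined when out of range
    },
    writable: true,
    configurable: true,
    enumerable: false, // keep the method out of for...in loops
  });
}

console.log(["a", "b", "c"].at(-1)); // logs c
```

The guard means a native implementation always wins, so the patch can’t silently change behavior on runtimes that already ship the method.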
But we can take that .flat example from before and make it sinister.
Whoa where’d that wizard come from?
Some unscrupulous programmer monkey patched our .flat method to add a wizard after every 5th element. How dare they play such a trick on us!
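A sketch of how a prank like that could be written. The originalFlat and WIZARD names and the sample array are made up for illustration, and this is not necessarily the code from the original demo.

```javascript
const WIZARD = "🧙‍♂️";

// Hold on to the real implementation so the patch can delegate to it.
const originalFlat = Array.prototype.flat;

Array.prototype.flat = function (...args) {
  // Flatten normally first, so callers see nothing unusual at a glance.
  const flattened = originalFlat.apply(this, args);

  // Then sneak a wizard in after every 5th element.
  const tampered = [];
  flattened.forEach((value, index) => {
    tampered.push(value);
    if ((index + 1) % 5 === 0) tampered.push(WIZARD);
  });
  return tampered;
};

const tricked = [1, 2, [3, 4], [5, 6, 7]].flat();
console.log(tricked); // [1, 2, 3, 4, 5, "🧙‍♂️", 6, 7]
```

Nothing at the call site hints that .flat was replaced, which is exactly what makes this kind of patch hard to track down.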
And then monkey patching took 2 days off my life
That’s what happened to me one fateful day when a feature finally shipped to production after a million rounds of testing. We got it to work, product was happy, QA was happy, PR was happy. Ship it.
I mean the feature worked but …
Trouble started on Friday as soon as we deployed. Yes, we deployed on a Friday.
Our Sentry Slack channel started blowing up with ActiveRecord::ConnectionTimeoutError: could not obtain a database connection within 5.000 seconds errors.
That’s a bad error because it means your server couldn’t establish a connection to your database. When that happens, nothing works. The request fails and the user cries.
Luckily most errors happened in background processes and users didn’t notice. Think we got 1 actual user complaint?
But a 3% error rate every time you do anything is no joke.
We like to keep jobs.failure at a cool 0.1% or so. Now it shot up to almost 3%. 30 times worse 😬
So what happened?
We couldn’t figure it out for the life of us. The database wasn’t out of memory and there were plenty of connections left in the connection pool. Running out of either is a common cause of connection timeouts.
What’s worse, our memory and connection usage went down because of the issue. With 3% less load the database was absolutely thriving. She was loving it!
And yet our application was failing.
I spent that entire Monday poring over logs, looking at graphs, smashing my face against New Relic graphs, even looking through the entire code diff between new production and old production.
We didn’t change how we talk to the database. We didn’t add a bunch of background processes competing for resources. We didn’t even change any configuration. One of our queries just happened to start taking 10x longer.
Maybe it was a coincidence?
Graphs don’t just change color like that at the exact same timestamp your deploy went through because of a coincidence.
At wit’s end we tried a thing.
What if we remove that gem we added? That’s the only thing left. Could that library that enables CSV data imports have something to do with this?
It worked. We lost a feature and gained a working system.
Digging through the gem’s codebase we found this in a dependency.
and this …
You know what that means? It means this gem messes with every single Postgres operation in your codebase.
Every single one of them. They go through this gem whether you like it or not.
And that’s how I lost 2 days of my life that I’m never getting back. A gem monkey patching my database connection.