Monkey patching is a programming technique popular in the Ruby world and nipped in the bud by the JavaScript community. Good.

Remember the smoosh controversy? JavaScript couldn’t adopt Array.prototype.flatten because MooTools already used that name, and shipping a conflicting native method might break the web. That was due to monkey patching.

So what’s monkey patching anyway?

I’m glad you asked

Monkey Patching

“The term monkey patch only refers to dynamic modifications of a class or module at runtime, motivated by the intent to patch existing third-party code as a workaround to a bug or feature which does not act as desired.” (that’s Wikipedia’s definition)

You can think of monkey patching as a magic trick. Run a function and it changes how other functions behave without changing the source code.

Let’s take .flat for example. It flattens nested arrays, recursing as many levels deep as you tell it to.

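Out of the box it behaves like this (the depth argument controls how far it recurses):

```js
const nested = [1, [2, [3, [4]]]];

nested.flat(); // [1, 2, [3, [4]]]  (depth defaults to 1)
nested.flat(2); // [1, 2, 3, [4]]
nested.flat(Infinity); // [1, 2, 3, 4]  (flatten all the way)
```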

Now let’s say you disagree with the depth argument. You want .flat to always flatten an array completely, no matter how deep it goes.

You can overwrite JavaScript’s native .flat implementation. Run this somewhere, anywhere, in your codebase.

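A sketch of what that patch looks like:

```js
// monkey patch: replace the native .flat with one that always flattens completely
Array.prototype.flat = function () {
  return this.reduce(
    (flattened, value) =>
      Array.isArray(value)
        ? [...flattened, ...value.flat()]
        : [...flattened, value],
    []
  );
};
```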

Overwrite Array.prototype.flat and replace it with a function of your own.

.reduce over the array and use the ... spread operator to combine values. When the current value is an array, recurse into it; otherwise take the value as-is.

Every array in your codebase now has this function. 🧙‍♂️

Here’s a CodeSandbox to prove it works

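In short, it demonstrates behavior like this:

```js
[1, [2, [3, [4, [5]]]]].flat();
// → [1, 2, 3, 4, 5]
// no depth argument needed; the patched .flat always flattens completely
```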

When monkey patching goes rogue

Monkey patching on its own ain’t bad at all. It’s a great tool when used responsibly. You can make non-standard methods easy to use, add to your language’s standard library, and even fix bugs.

Polyfills are [sort of] an example of successful monkey patching in the JavaScript world: they add features to browsers that otherwise don’t have them.
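A typical polyfill checks before it patches. Here’s a minimal sketch of a .flat polyfill (simplified, not spec-exact):

```js
// only patch when the native method is missing
if (!Array.prototype.flat) {
  Array.prototype.flat = function (depth = 1) {
    return depth > 0
      ? this.reduce(
          (flattened, value) =>
            flattened.concat(
              Array.isArray(value) ? value.flat(depth - 1) : value
            ),
          []
        )
      : this.slice();
  };
}
```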

But we can take that .flat example from before and make it sinister.

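Something like this:

```js
[1, [2, [3, [4, [5, [6]]]]]].flat(Infinity);
// → [1, 2, 3, 4, 5, "🧙‍♂️", 6]
```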

Whoa, where’d that wizard come from?

Some unscrupulous programmer monkey patched our .flat method to add a wizard after every 5th element. How dare they play such a trick on us!

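The patch itself might look something like this (a sketch that wraps the original method):

```js
// keep a reference to the real .flat, then wrap it
const originalFlat = Array.prototype.flat;

Array.prototype.flat = function (...args) {
  const flattened = originalFlat.apply(this, args);

  // slip a wizard in after every 5th element
  return flattened.reduce(
    (result, value, index) =>
      (index + 1) % 5 === 0 ? [...result, value, "🧙‍♂️"] : [...result, value],
    []
  );
};
```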

And then monkey patching took 2 days off my life

That’s what happened to me one fateful day when a feature finally shipped to production after a million rounds of testing. We got it to work, product was happy, QA was happy, PR was happy. Ship it.

boom 💥

I mean the feature worked but …


Trouble started on Friday as soon as we deployed. Yes, we deployed on a Friday.

Our Sentry Slack channel started blowing up with ActiveRecord::ConnectionTimeoutError: could not obtain a database connection within 5.000 seconds errors.

That’s a bad error because it means your server couldn’t establish a connection to your database. When that happens, nothing works. The request fails and the user cries.

Luckily most errors happened in background processes and users didn’t notice. I think we got all of one actual user complaint.


But a 3% error rate every time you do anything is no joke.

We like to keep jobs.failure at a cool 0.1% or so. Now it shot up to almost 3%. 30 times worse 😬

So what happened?

We couldn’t figure it out for the life of us. The database wasn’t out of memory, and there were plenty of connections left in the pool. Those are the two most common causes of connection timeouts.

What’s worse, our memory and connection usage went down because of the issue. With 3% less load the database was absolutely thriving. She was loving it!

And yet our application was failing.

I spent that entire Monday poring through logs, looking at graphs, smashing my face against New Relic, even reading the entire code diff between the new production and the old.

Nothing.

We didn’t change how we talk to the database. We didn’t add a bunch of background processes competing for resources. We didn’t even change any configuration. One of our queries just happened to start taking 10x longer.

It was a coincidence. Right?

Graphs don’t just change color like that at the exact same timestamp your deploy went through because of a coincidence.

At our wits’ end, we tried a thing.

What if we remove that gem we added? That’s the only thing left. Could that library that enables CSV data imports have something to do with this?

It worked. We lost a feature and gained a working system.

Digging through the gem’s codebase, we found the culprit in one of its dependencies: monkey patches wrapped around the database connection itself.

You know what that means? It means this gem messes with every single Postgres operation in your codebase.

Every single one of them. They go through this gem whether you like it or not.
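The gem was Ruby, but in JavaScript terms it’s as if a dependency did something like this to node-postgres. A sketch, not the gem’s actual code; trackQuery stands in for whatever bookkeeping the dependency really added:

```js
const { Client } = require("pg");

// hypothetical bookkeeping, standing in for whatever the dependency really did
function trackQuery(args) {
  console.log("query intercepted:", args[0]);
}

// keep a reference to the real method, then wrap it
const originalQuery = Client.prototype.query;

Client.prototype.query = function (...args) {
  trackQuery(args); // every query in the entire app now runs through this
  return originalQuery.apply(this, args);
};
```

One innocent-looking wrapper, and every query in the app picks up extra overhead and new ways to fail.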


And that’s how I lost 2 days of my life that I’m never getting back. A gem monkey patching my database connection.

Cheers,
~Swizec
