This morning there was a link going around listing the 15 papers you should read to understand node.js's background. A large portion of the list is devoted to the comparison of thread- and event- based models of execution. Since I hear a lot about event loops being better than threads, I read The problem with threads written in 2006 by one Edward A. Lee, a professor at Berkeley.
If the next generation of programmers makes more intensive use of multithreading, then the next generation of computers will become nearly unusable.
And this bleak future is almost inevitable when you think about it. The rule of Moore's law has been changing from faster chips, to more and more cores. And yet the only software making good use of multiple cores are what Lee calls "embarrassingly parallel" applications - those that simply split themselves into independent processes.
Problems with threads stem from sharing memory space. Without proper care this can lead to a situation where one thread changes a piece of data, while the other is reading it. Now you have two threads that both have a different understanding of what the value of a certain variable is.
While in theory solvable with semaphores, locks and so on, without very proper care this can then lead to deadlocks - more importantly, taking care of interleaving threads is _very hard. _You are essentially facing a nondeterministic machine and trying to understand it through a poor abstraction that doesn't make use of your natural ability to deal with concurrency.
Lee also makes the case that this isn't merely a problem of syntax or tools, but that threads are a fundamentally flawed model of computation. I won't burden you with his whole proof/argumentation (page 3 to 5). The gist is that
/../ given a sequental program and an initial state, you have a defined sequence of events. Any two programs can be compared - they are equivalent if they halt for the same initial states and the final state is the same. When threads are introduced these essential properties are lost /../ if two threads can provide the next action, we can no longer compare two programs, we might be able to compare all interleavings, but on a multithreaded envrionment even that is lost since we'd have to know about all the other programs as well.
Essentially the problem is this - the core abstraction of computation is a deterministic assembly of deterministic components, but threads are inherently nondeterministic. The car analogy Lee provides is trying to compose an internal combustion engine out of a pot of iron, hydrocarbon and oxygen molecules randomly moving according to thermal forces.
This is so bad in practice even using very strict and rigid engineering principles, doesn't help. In early 2000 Lee started the Ptolemy Project, a project designing concurrent embedded systems. But this is also an experiment in battling threads with rigorous engineering discipline - they developed a code maturity rating system, design reviews, code reviews, nightly builds, the works. Reviewers included concurrency experts and there are regression tests with 100% code coverage.
When the system started being used in 2000 no problems were observed until a deadlock on April 26th, 2004. Just the fact they know the exact date this happened tells a lot about how rigorously engineering principles are applied. Still, a very serious problem took four years to be discovered.
Many other approaches exist that try to combat this problem, but they either wallow in obscurity or aren't very effective; perhaps the most effective is the use of higher-order principles like MapReduce, which aim to break the problem into tiny fractions that can be computed separately and then combining the results.
An interesting solution Lee proposes is using the so-called Rendezvous pattern, which as far as I can tell is a generalization of MapReduce, with the use of a nondeterministic merge, so that each part of the program is executing deterministically and nondeterminism is only used in a single spot on the network - where threads actually have to interact. Lee proposes using this with some sort of coordination language, so that programmers aren't burdened with learning new ways of implementing their solutions (a historically very slow process), but can assemble different solutions almost like Lego's.
Lee concludes that concurrency in software is difficult, but much of it is our own fault due to using poor abstractions. To deal with concurrency in a robust and predictable manner, we should stop using threads and focus on using nondeterminism only where it is warranted.
Continue reading about The problem with threads
Semantically similar articles hand-picked by GPT-4
- Python multiprocessing is fucking sweet
- Some life advice I learned from computer science
- Do you really need immutable data?
- Week 7: Time, Clocks, and Ordering of Events in a Distributed System
- 43 years of the actor model
Learned something new?
Read more Software Engineering Lessons from Production
I write articles with real insight into the career and skills of a modern software engineer. "Raw and honest from the heart!" as one reader described them. Fueled by lessons learned over 20 years of building production code for side-projects, small businesses, and hyper growth startups. Both successful and not.
Subscribe below 👇
Software Engineering Lessons from Production
Join Swizec's Newsletter and get insightful emails 💌 on mindsets, tactics, and technical skills for your career. Real lessons from building production software. No bullshit.
"Man, love your simple writing! Yours is the only newsletter I open and only blog that I give a fuck to read & scroll till the end. And wow always take away lessons with me. Inspiring! And very relatable. 👌"
Have a burning question that you think I can answer? Hit me up on twitter and I'll do my best.
Who am I and who do I help? I'm Swizec Teller and I turn coders into engineers with "Raw and honest from the heart!" writing. No bullshit. Real insights into the career and skills of a modern software engineer.
Want to become a true senior engineer? Take ownership, have autonomy, and be a force multiplier on your team. The Senior Engineer Mindset ebook can help 👉 swizec.com/senior-mindset. These are the shifts in mindset that unlocked my career.
Curious about Serverless and the modern backend? Check out Serverless Handbook, for frontend engineers 👉 ServerlessHandbook.dev
Want to Stop copy pasting D3 examples and create data visualizations of your own? Learn how to build scalable dataviz React components your whole team can understand with React for Data Visualization
Want to get my best emails on JavaScript, React, Serverless, Fullstack Web, or Indie Hacking? Check out swizec.com/collections
Did someone amazing share this letter with you? Wonderful! You can sign up for my weekly letters for software engineers on their path to greatness, here: swizec.com/blog
Want to brush up on your modern JavaScript syntax? Check out my interactive cheatsheet: es6cheatsheet.com
By the way, just in case no one has told you it yet today: I love and appreciate you for who you are ❤️