Swizec Teller - a geek with a hatswizec.com

    The problem with threads

    Regular Expression NFA

    This morning there was a link going around listing the 15 papers you should read to understand node.js's background. A large portion of the list is devoted to the comparison of thread- and event- based models of execution. Since I hear a lot about event loops being better than threads, I read The problem with threads written in 2006 by one Edward A. Lee, a professor at Berkeley.

    If the next generation of programmers makes more intensive use of multithreading, then the next generation of computers will become nearly unusable.

    And this bleak future is almost inevitable when you think about it. The rule of Moore's law has been changing from faster chips, to more and more cores. And yet the only software making good use of multiple cores are what Lee calls "embarrassingly parallel" applications - those that simply split themselves into independent processes.

    Problems with threads stem from sharing memory space. Without proper care this can lead to a situation where one thread changes a piece of data, while the other is reading it. Now you have two threads that both have a different understanding of what the value of a certain variable is.

    While in theory solvable with semaphores, locks and so on, without very proper care this can then lead to deadlocks - more importantly, taking care of interleaving threads is _very hard. _You are essentially facing a nondeterministic machine and trying to understand it through a poor abstraction that doesn't make use of your natural ability to deal with concurrency.

    Lee also makes the case that this isn't merely a problem of syntax or tools, but that threads are a fundamentally flawed model of computation. I won't burden you with his whole proof/argumentation (page 3 to 5). The gist is that

    /../ given a sequental program and an initial state, you have a defined sequence of events. Any two programs can be compared - they are equivalent if they halt for the same initial states and the final state is the same. When threads are introduced these essential properties are lost /../ if two threads can provide the next action, we can no longer compare two programs, we might be able to compare all interleavings, but on a multithreaded envrionment even that is lost since we'd have to know about all the other programs as well.

    Essentially the problem is this - the core abstraction of computation is a deterministic assembly of deterministic components, but threads are inherently nondeterministic. The car analogy Lee provides is trying to compose an internal combustion engine out of a pot of iron, hydrocarbon and oxygen molecules randomly moving according to thermal forces.

    This is so bad in practice even using very strict and rigid engineering principles, doesn't help. In early 2000 Lee started the Ptolemy Project, a project designing concurrent embedded systems. But this is also an experiment in battling threads with rigorous engineering discipline - they developed a code maturity rating system, design reviews, code reviews, nightly builds, the works. Reviewers included concurrency experts and there are regression tests with 100% code coverage.

    When the system started being used in 2000 no problems were observed until a deadlock on April 26th, 2004. Just the fact they know the exact date this happened tells a lot about how rigorously engineering principles are applied. Still, a very serious problem took four years to be discovered.

    Many other approaches exist that try to combat this problem, but they either wallow in obscurity or aren't very effective; perhaps the most effective is the use of higher-order principles like MapReduce, which aim to break the problem into tiny fractions that can be computed separately and then combining the results.

    An interesting solution Lee proposes is using the so-called Rendezvous pattern, which as far as I can tell is a generalization of MapReduce, with the use of a nondeterministic merge, so that each part of the program is executing deterministically and nondeterminism is only used in a single spot on the network - where threads actually have to interact. Lee proposes using this with some sort of coordination language, so that programmers aren't burdened with learning new ways of implementing their solutions (a historically very slow process), but can assemble different solutions almost like Lego's.

    Lee concludes that concurrency in software is difficult, but much of it is our own fault due to using poor abstractions. To deal with concurrency in a robust and predictable manner, we should stop using threads and focus on using nondeterminism only where it is warranted.

    Enhanced by Zemanta

    Did you enjoy this article?

    Published on December 21st, 2011 in Java, Lego, MapReduce, Programming, Ptolemy Project, Threads, Uncategorized

    Learned something new?
    Want to become an expert?

    Here's how it works 👇

    Leave your email and I'll send you thoughtfully written emails every week about React, JavaScript, and your career. Lessons learned over 20 years in the industry working with companies ranging from tiny startups to Fortune5 behemoths.

    Join Swizec's Newsletter

    And get thoughtful letters 💌 on mindsets, tactics, and technical skills for your career. Real lessons from building production software. No bullshit.

    "Man, love your simple writing! Yours is the only newsletter I open and only blog that I give a fuck to read & scroll till the end. And wow always take away lessons with me. Inspiring! And very relatable. 👌"

    ~ Ashish Kumar

    Join over 14,000 engineers just like you already improving their careers with my letters, workshops, courses, and talks. ✌️

    Have a burning question that you think I can answer? I don't have all of the answers, but I have some! Hit me up on twitter or book a 30min ama for in-depth help.

    Ready to Stop copy pasting D3 examples and create data visualizations of your own?  Learn how to build scalable dataviz components your whole team can understand with React for Data Visualization

    Curious about Serverless and the modern backend? Check out Serverless Handbook, modern backend for the frontend engineer.

    Ready to learn how it all fits together and build a modern webapp from scratch? Learn how to launch a webapp and make your first 💰 on the side with ServerlessReact.Dev

    Want to brush up on your modern JavaScript syntax? Check out my interactive cheatsheet: es6cheatsheet.com

    By the way, just in case no one has told you it yet today: I love and appreciate you for who you are ❤️

    Created bySwizecwith ❤️