Swizec Teller - a geek with a hatswizec.com

Senior Mindset Book

Get promoted, earn a bigger salary, work for top companies

Senior Engineer Mindset cover
Learn more

    I was wrong about databases

    [name|Friend], a few folks wrote in to say I was wrong about databases on Tuesday – they're smarter than I thought.

    This is the part I got wrong:

    With the arrays approach, your database has to make room for the new reaction. Every post in the entire table has to move out of the way, then it can add a new reaction. The more data you add, the more everything has to move.

    This makes writes slow. You're adding 2 integers, but moving and rewriting gigabytes of data. Plus updating any indexes.

    As one commenter pointed out: "no sane RDBMS would move gigabytes on insert"

    What do they do instead? TOAST records and tombstones. Although I can't tell from documentation if tombstones are something you implement by hand or the database does for you. For Postgres at least.

    TOAST lets you split a wide table row into chunks.

    When a row that is to be stored is "too wide" (the threshold for that is 2KB by default), the TOAST mechanism first attempts to compress any wide field values. If that isn't enough to get the row under 2KB, it breaks up the wide field values into chunks that get stored in the associated TOAST table. Each original field value is replaced by a small pointer that shows where to find this "out of line" data in the TOAST table. TOAST will attempt to squeeze the user-table row down to 2KB in this way, but as long as it can get below 8KB, that's good enough and the row can be stored successfully.

    The limit is 4 billion chunks per row – 32bit pointers. Reading and writing these rows sounds to me like it would be slower than effectively using relational data modeling.

    Tombstones let you mark a row as dead so it can be rewritten in a more convenient place (at the end of the table). Moving a whole row to add new data still feels slower than effectively using relational data modeling. Way faster than moving the whole table of course.

    And as one commenter mentioned: Data packing matters less now that we have SSDs with random access. Sequential reads used to be hugely faster than random reads. Now they're only a little faster :)

    You should measure what works better for your production use-case. Performance depends on your data access patterns.

    Here are more benefits of relational data I didn't mention.

    The best part about writing is that you get to learn new and exciting ways you were wrong.

    Cheers,
    ~Swizec

    Published on February 13th, 2025 in Databases, Backend, Software Engineering, Architecture

    Did you enjoy this article?

    Continue reading about I was wrong about databases

    Semantically similar articles hand-picked by GPT-4

    Senior Mindset Book

    Get promoted, earn a bigger salary, work for top companies

    Learn more

    Have a burning question that you think I can answer? Hit me up on twitter and I'll do my best.

    Who am I and who do I help? I'm Swizec Teller and I turn coders into engineers with "Raw and honest from the heart!" writing. No bullshit. Real insights into the career and skills of a modern software engineer.

    Want to become a true senior engineer? Take ownership, have autonomy, and be a force multiplier on your team. The Senior Engineer Mindset ebook can help 👉 swizec.com/senior-mindset. These are the shifts in mindset that unlocked my career.

    Curious about Serverless and the modern backend? Check out Serverless Handbook, for frontend engineers 👉 ServerlessHandbook.dev

    Want to Stop copy pasting D3 examples and create data visualizations of your own? Learn how to build scalable dataviz React components your whole team can understand with React for Data Visualization

    Want to get my best emails on JavaScript, React, Serverless, Fullstack Web, or Indie Hacking? Check out swizec.com/collections

    Did someone amazing share this letter with you? Wonderful! You can sign up for my weekly letters for software engineers on their path to greatness, here: swizec.com/blog

    Want to brush up on your modern JavaScript syntax? Check out my interactive cheatsheet: es6cheatsheet.com

    By the way, just in case no one has told you it yet today: I love and appreciate you for who you are ❤️

    Created by Swizec with ❤️