Swizec Teller - a geek with a hatswizec.com

Senior Mindset Book

Get promoted, earn a bigger salary, work for top companies

Senior Engineer Mindset cover
Learn more

    Code Review Practices for Refactoring Changes

    Do you review refactoring pull requests differently than you do others? An empirical study of OpenStack, an open source cloud platform, says that you do and finds the 6 criteria you care about.

    In Code Review Practices for Refactoring Changes: An Empirical Study on OpenStack AlOmar et al aimed to answer:

    • Do refactoring code reviews take longer?
    • What are the criteria reviewers look for?
    • Could tools be developed to make it easier?

    They studied 11,010 code changes on OpenStack and created a taxonomy of 28 criteria used to accept or reject code refactors. Tools exist, but can't replace a human.

    The purpose of code review

    Under related research, the authors explored why we do code review. You and I have gut feels about this, but research boils it down to 3 factors:

    1. Finding defects
    2. Improving the code
    3. Increasing knowledge transfer

    Whether that matches outcomes, they don't say. The authors do list a study that found 95% of code review doesn't decrease code smells. A fun future read. 😅

    Other research shows that 63% of refactoring happens while implementing new features and only 31% of changes that contain refactoring are explicitly about refactoring.

    The long discussion and disagreement on refactoring code reviews increases design degradation. 🤔

    Study design

    The authors combined a qualitative and quantitative approach. Numbers to see if refactoring takes longer to review, manual review to see what people look for.

    They examined multiple projects and landed on OpenStack because of its high level of activity, full review coverage (100% of code is reviewed), and a high number of refactoring code review. Changes that explicitly say they're about refactoring.

    The authors focused on pull requests that contained the word "refactor" in both subject and body. This feels limiting, but makes sense for the focus of this research.

    Do refactoring code reviews take longer?

    As you may have guessed, reviewing a refactor indeed takes longer. There's more discussion, more disagreement, and more code thrash.

    Statistics of code review activity efforts
    Statistics of code review activity efforts

    The table compares various metrics between refactoring and non-refactoring code reviews. On average you get

    • 1 more reviewer
    • 60% more comments
    • 2 more code changes
    • more files changed
    • 75% longer comments in total
    • 82% more code churn

    Refactoring leads to higher code churn and that creates longer discussion. Interestingly, having more reviewers does not increase review length. Code changes do.

    What are the criteria reviewers look for?

    Two researchers manually reviewed thousands of pull requests to build a taxonomy of important criteria. An engineer from industry validated their findings with "Sounds about right".

    Refactoring review criteria in modern code review
    Refactoring review criteria in modern code review

    They found 6 categories of review criteria and what makes them difficult to achieve.

    1. Quality (of design) is a vital part of refactoring review. There is no systemic way to gain a full understanding of the software evolution, which leads to locally optimal refactors that don't fit. The more you change the same set of files, the better your design gets.

    2. Refactoring that leads to a safe and stable design is considered "correct". Because this often happens as part of other code changes, bugs are easy to miss.

    3. Objective challenges mean that someone eventually asked "What's the purpose of this pull request?". You can't tell if the change is appropriate, if you don't understand its intent. Better PR descriptions help.

    4. Testing is meant to ensure that refactoring doesn't change behavior. Ideally relying on the existing test suite. Adding new tests is fine when a refactor reduces coverage.

    5. Integration of refactoring changes can run into challenges with complex merges, changes in configuration, and subtle changes in API surface. Hyrum's Law looms large. This includes library upgrades, which leads to even longer discussions about whether the upgrade is worth it.

    6. Management is a common source of longer reviews. Sometimes people just forget you're waiting.

    Can we have tools for this?

    There are tools that can help with refactoring. Nothing that stands out as magical yet.

    Until then strict type coverage makes refactoring easier as does a decent test suite. The better your software design, the easier your refactoring.

    The authors recommend establishing common guidelines for refactoring reviews, avoid refactoring as part of other changes, use continuous integration, and agree on a convention for structuring your code. All that saves time.

    Good luck.

    Cheers,
    ~Swizec

    Published on April 20th, 2022 in Software Engineering, Refactoring, Papers, Research

    Did you enjoy this article?

    Continue reading about Code Review Practices for Refactoring Changes

    Semantically similar articles hand-picked by GPT-4

    Senior Mindset Book

    Get promoted, earn a bigger salary, work for top companies

    Learn more

    Have a burning question that you think I can answer? Hit me up on twitter and I'll do my best.

    Who am I and who do I help? I'm Swizec Teller and I turn coders into engineers with "Raw and honest from the heart!" writing. No bullshit. Real insights into the career and skills of a modern software engineer.

    Want to become a true senior engineer? Take ownership, have autonomy, and be a force multiplier on your team. The Senior Engineer Mindset ebook can help 👉 swizec.com/senior-mindset. These are the shifts in mindset that unlocked my career.

    Curious about Serverless and the modern backend? Check out Serverless Handbook, for frontend engineers 👉 ServerlessHandbook.dev

    Want to Stop copy pasting D3 examples and create data visualizations of your own? Learn how to build scalable dataviz React components your whole team can understand with React for Data Visualization

    Want to get my best emails on JavaScript, React, Serverless, Fullstack Web, or Indie Hacking? Check out swizec.com/collections

    Did someone amazing share this letter with you? Wonderful! You can sign up for my weekly letters for software engineers on their path to greatness, here: swizec.com/blog

    Want to brush up on your modern JavaScript syntax? Check out my interactive cheatsheet: es6cheatsheet.com

    By the way, just in case no one has told you it yet today: I love and appreciate you for who you are ❤️

    Created by Swizec with ❤️