Do you review refactoring pull requests differently than you do others? An empirical study of OpenStack, an open source cloud platform, says that you do and finds the 6 criteria you care about.
In Code Review Practices for Refactoring Changes: An Empirical Study on OpenStack AlOmar et al aimed to answer:
- Do refactoring code reviews take longer?
- What are the criteria reviewers look for?
- Could tools be developed to make it easier?
They studied 11,010 code changes on OpenStack and created a taxonomy of 28 criteria used to accept or reject code refactors. Tools exist, but can't replace a human.
The purpose of code review
Under related research, the authors explored why we do code review. You and I have gut feels about this, but research boils it down to 3 factors:
- Finding defects
- Improving the code
- Increasing knowledge transfer
Whether that matches outcomes, they don't say. The authors do list a study that found 95% of code review doesn't decrease code smells. A fun future read. 😅
Other research shows that 63% of refactoring happens while implementing new features and only 31% of changes that contain refactoring are explicitly about refactoring.
The long discussion and disagreement on refactoring code reviews increases design degradation. 🤔
The authors combined a qualitative and quantitative approach. Numbers to see if refactoring takes longer to review, manual review to see what people look for.
They examined multiple projects and landed on OpenStack because of its high level of activity, full review coverage (100% of code is reviewed), and a high number of refactoring code review. Changes that explicitly say they're about refactoring.
The authors focused on pull requests that contained the word "refactor" in both subject and body. This feels limiting, but makes sense for the focus of this research.
Do refactoring code reviews take longer?
As you may have guessed, reviewing a refactor indeed takes longer. There's more discussion, more disagreement, and more code thrash.
The table compares various metrics between refactoring and non-refactoring code reviews. On average you get
- 1 more reviewer
- 60% more comments
- 2 more code changes
- more files changed
- 75% longer comments in total
- 82% more code churn
Refactoring leads to higher code churn and that creates longer discussion. Interestingly, having more reviewers does not increase review length. Code changes do.
What are the criteria reviewers look for?
Two researchers manually reviewed thousands of pull requests to build a taxonomy of important criteria. An engineer from industry validated their findings with "Sounds about right".
They found 6 categories of review criteria and what makes them difficult to achieve.
1. Quality (of design) is a vital part of refactoring review. There is no systemic way to gain a full understanding of the software evolution, which leads to locally optimal refactors that don't fit. The more you change the same set of files, the better your design gets.
2. Refactoring that leads to a safe and stable design is considered "correct". Because this often happens as part of other code changes, bugs are easy to miss.
3. Objective challenges mean that someone eventually asked "What's the purpose of this pull request?". You can't tell if the change is appropriate, if you don't understand its intent. Better PR descriptions help.
4. Testing is meant to ensure that refactoring doesn't change behavior. Ideally relying on the existing test suite. Adding new tests is fine when a refactor reduces coverage.
5. Integration of refactoring changes can run into challenges with complex merges, changes in configuration, and subtle changes in API surface. Hyrum's Law looms large. This includes library upgrades, which leads to even longer discussions about whether the upgrade is worth it.
6. Management is a common source of longer reviews. Sometimes people just forget you're waiting.
Can we have tools for this?
There are tools that can help with refactoring. Nothing that stands out as magical yet.
Until then strict type coverage makes refactoring easier as does a decent test suite. The better your software design, the easier your refactoring.
The authors recommend establishing common guidelines for refactoring reviews, avoid refactoring as part of other changes, use continuous integration, and agree on a convention for structuring your code. All that saves time.
Learned something new?
Read more Software Engineering Lessons from Production
I write articles with real insight into the career and skills of a modern software engineer. "Raw and honest from the heart!" as one reader described them. Fueled by lessons learned over 20 years of building production code for side-projects, small businesses, and hyper growth startups. Both successful and not.
Subscribe below 👇
Software Engineering Lessons from Production
Join Swizec's Newsletter and get insightful emails 💌 on mindsets, tactics, and technical skills for your career. Real lessons from building production software. No bullshit.
"Man, love your simple writing! Yours is the only newsletter I open and only blog that I give a fuck to read & scroll till the end. And wow always take away lessons with me. Inspiring! And very relatable. 👌"
Senior Mindset Book
Get promoted, earn a bigger salary, work for top companiesLearn more
Have a burning question that you think I can answer? Hit me up on twitter and I'll do my best.
Who am I and who do I help? I'm Swizec Teller and I turn coders into engineers with "Raw and honest from the heart!" writing. No bullshit. Real insights into the career and skills of a modern software engineer.
Want to become a true senior engineer? Take ownership, have autonomy, and be a force multiplier on your team. The Senior Engineer Mindset ebook can help 👉 swizec.com/senior-mindset. These are the shifts in mindset that unlocked my career.
Curious about Serverless and the modern backend? Check out Serverless Handbook, for frontend engineers 👉 ServerlessHandbook.dev
Want to Stop copy pasting D3 examples and create data visualizations of your own? Learn how to build scalable dataviz React components your whole team can understand with React for Data Visualization
Did someone amazing share this letter with you? Wonderful! You can sign up for my weekly letters for software engineers on their path to greatness, here: swizec.com/blog
By the way, just in case no one has told you it yet today: I love and appreciate you for who you are ❤️