TL;DR: As soon as possible.
I recently came across advice by James Clear on the timing of problem-solving. He observed that many problems are minor when they first emerge but grow into enormous issues when you let them linger. He then suggested: “As a rule of thumb, fix it now.”
James Clear’s observation is as valid for generic problems as it is for software bugs. The sooner a bug or defect is caught and fixed, the less impact it will have on the company’s bottom line. That is, the cost of a defect depends, in part, on the phase of the software development life-cycle (SDLC) in which it is discovered.
When you are writing code and notice you introduced a bug, fixing it is easy and cheap because all the context is already loaded in your working memory. Moreover, if the bug is in the code you just added, no one has experienced it yet. No harm done.
When bugged code leaves your machine, its impact begins to increase. If a colleague spots the issue during code review, you pay the price of some back and forth, which might delay the feature delivery but generally won’t affect the bottom line. However, as soon as the code ships to a beta version, or worse, production, the experience of actual users risks being compromised. Depending on the bug’s nature, the consequences could range from tweets of complaint to increased load on customer support and even loss of revenue.
As an aside, the increase in the cost of bug fixing is one of the reasons I recommend writing unit tests during the code’s development, ideally by practicing Test-Driven Development.
We can all agree that issues in new code should be resolved as soon as possible. No developer in their right mind willingly submits bugged code for code review without a plan to tackle it in a follow-up PR.
The matter of timing becomes trickier when the issue is discovered in production. When is the best time to fix a bug discovered in production?
“Fix it as soon as possible” is the best rule of thumb for both bugs discovered during development and in production. We can call this approach ASAP-bug-fixing.
The alternative to ASAP-bug-fixing is to treat bugs as any other task in the product development flow, adding them to the project management system of choice, each with a priority value. We can call this approach Scheduled-bug-fixing.
Scheduled-bug-fixing requires criteria that determine how soon each bug should be fixed. For example, a bug that affects only a portion of users without compromising their core experience should have low priority and might stay in the app for a long time. On the other hand, a bug that prevents users from signing up ought to jump in front of everything else because, if new users cannot sign up, they will never convert to paying customers.
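To make the idea of priority criteria concrete, here is a minimal sketch of how a Scheduled-bug-fixing triage rule might look in Python. The Bug fields, the priority levels, and the 50% threshold are all hypothetical, chosen only to mirror the two examples above; a real team would define its own.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    # All fields are hypothetical, made up for this illustration.
    title: str
    blocks_signup: bool           # users cannot sign up at all
    breaks_core_experience: bool  # the main use case is compromised
    affected_user_share: float    # fraction of users affected, 0.0 to 1.0


def priority(bug: Bug) -> int:
    """Return a priority level; a lower number means fix sooner."""
    if bug.blocks_signup:
        return 0  # jumps in front of everything else
    if bug.breaks_core_experience:
        return 1
    if bug.affected_user_share >= 0.5:  # arbitrary threshold
        return 2
    return 3  # low priority, might stay in the app for a long time


backlog = [
    Bug("Misaligned avatar on the profile screen", False, False, 0.1),
    Bug("Sign-up button does nothing", True, False, 1.0),
]

# Order the backlog so the most urgent bug comes first.
queue = sorted(backlog, key=priority)
```

Sorting the backlog by this function puts the sign-up bug ahead of the cosmetic one, which is exactly the kind of ordering Scheduled-bug-fixing relies on.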
ASAP-bug-fixing as a risk-mitigation strategy
There is merit in prioritizing bug fixing, as it helps to identify which issues to address first. On the other hand, Scheduled-bug-fixing fails to account for the degree of non-determinism inherent in production software.
It’s impossible to predict the true impact of a bug, no matter how many dashboards with usage data and trends we build. Something that looks harmless today might degenerate into a show stopper tomorrow for all sorts of reasons. A new feature might be built on top of that flawed code, or there might be a sudden surge of new users, or a bugged code path previously used by only a few users might become the main one because of a change in design.
Addressing issues as soon as they are reported is a risk mitigation strategy. In the moment, it might seem wasteful to address a low-impact bug when there are new features to build. But, when thinking long-term, it’s clear that this practice prevents each bug from degenerating into an expensive incident.
ASAP-bug-fixing becomes even more valuable when you consider its effect at scale. Venture capital firms invest in numerous startups, knowing that most will fail but that the few that succeed will generate enough returns to cover all losses and make healthy profits. In the same way, when fixing all bugs as soon as possible, the savings that result from stopping one bug from exploding into a major incident pay for the cost, in terms of delayed feature development, of fixing all the trivial bugs that would never have materialized into serious issues.
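The portfolio argument can be sketched as a back-of-the-envelope expected-cost comparison. Every number below is invented for illustration, not measured, and the model is deliberately crude: it assumes each bug costs the same to fix, escalates with the same probability, and it ignores the cost of eventually fixing escalated bugs.

```python
def cost_of_asap(n_bugs: int, fix_cost: float) -> float:
    """Total cost of fixing every bug as soon as it is reported."""
    return n_bugs * fix_cost


def expected_cost_of_deferring(
    n_bugs: int, p_escalation: float, incident_cost: float
) -> float:
    """Expected cost of leaving bugs alone, where each one has a
    probability p_escalation of turning into a major incident."""
    return n_bugs * p_escalation * incident_cost


def break_even_probability(fix_cost: float, incident_cost: float) -> float:
    """Escalation probability above which fixing everything is cheaper."""
    return fix_cost / incident_cost


# Hypothetical inputs, with costs expressed in engineering hours.
n_bugs = 100
fix_cost = 2          # hours to fix one trivial bug
p_escalation = 0.02   # chance that a deferred bug becomes an incident
incident_cost = 200   # hours lost to one major incident

asap = cost_of_asap(n_bugs, fix_cost)
deferred = expected_cost_of_deferring(n_bugs, p_escalation, incident_cost)
threshold = break_even_probability(fix_cost, incident_cost)

print(f"ASAP: {asap} h, deferring (expected): {deferred} h")
print(f"Fixing everything pays off when p_escalation > {threshold}")
```

Under these made-up inputs, fixing everything costs less than the expected cost of deferring, because the assumed escalation probability sits above the break-even point. With different inputs the comparison can go the other way, which is why this is a rule of thumb and not a law.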
Of course, there might be bugs that do not deserve to be fixed. Examples include a bug that only affects users on an older browser or OS version scheduled to be dropped soon, or a UI glitch in an area that will soon be redesigned. ASAP-bug-fixing is a rule of thumb, not a rule engraved in stone. It’s a behavior to default to, but one that can be bypassed if necessary.
Software development is a continuous management of different tradeoffs. Fixing bugs as soon as they are reported trades speed of new feature development for stable software and long-term safety.
An early-stage startup still looking for product-market fit can cope with a buggy product because it hasn’t yet validated its idea. It should invest capital in that direction rather than in building bulletproof software that no one might use. But as soon as you have software people are actually paying for and depending upon, you should think carefully about the number of bugs you are willing to live with.
What is your approach to bug fixing? Where do you sit in the ASAP vs. scheduled approach? Let me know via Twitter.
Image credits: Composite by the author.