How We Fix Bugs In 36 Minutes Or Less

A couple of months ago I was working on my Carving Up Legacy With Microservices presentation when I was notified of a new GitHub issue raised against the jobs platform my presentation was about. The development team were a few men down, so despite being on leave, I decided to take a look.

The bug was a simple oversight to do with dates. An upstream system wasn't validating date ranges correctly and we had trafficked a job advert before the job was live. I self-assigned the issue and posted a message to the team chat room saying I'd pick it up.

  • Pulled the latest commits
  • Re-ran npm install in case there were dependency changes
  • Ran the tests to ensure they passed
  • Wrote a failing test for the bug
  • Fixed the bug by adjusting the advert dates so they were always within range
  • Re-ran the tests (they passed)
  • Committed and pushed

Our build for this service takes about 50 seconds, and the continuous deployment to our first test environment takes a further 20. Booking a job into the test environment confirmed the fix had worked, so I logged into Jenkins, checked there had been no other builds since the last live release, and clicked the promote-to-staging button. A short test later I promoted the service to live using the same process, reindexed the affected job, verified the advert had been removed and closed the issue. The time from bug report to closure was 36 minutes.

The first thing to come clean about is that we don't always fix our bugs in 36 minutes or less, but at least we have a system that makes this possible. The team has a high degree of autonomy, so we don't need to ask permission to perform releases. Our promotion process is fast and triggered by a single click, so occasionally even business users trigger releases. However, the most significant factor in making this possible is the microservice architecture. Writing microservices means

  • There's less code to check out
  • Fewer dependencies to install
  • Fewer tests to run

but this is just the cherry on the cake. When you get microservices right, you can deploy them independently without fear. We are in the habit of deploying changes immediately, so we have no software inventory. Consequently, when fixing a bug, the code you have locally is the code that's in live. No branching or merging, no investigation to see if the unreleased changes are safe.

Adopting a microservice architecture comes with many problems. One of the principles of systems is that you never reduce complexity, you only redistribute it, and as John Pither remarks in his excellent blog post, "The tightrope is to avoid bringing in complexity to manage complexity".

Developing software has never been without challenge, but the 36-minute bug fix testifies that if you manage to walk the tightrope successfully, the benefits of microservices are real.