When Should You Rewrite?
By Adrian Sutton
Greg picks up on my previous post about XP principles and how they help avoid rewrites. I thought I should explain in more detail why rewrites are a bad thing and my thoughts about when and how you should do them anyway.
Programmers for some reason seem to think any code they didn’t write, and often any code they wrote some time ago, is poor quality, misguided and generally crap. Often this is quite true, but the degree to which the code is bad is usually significantly less than the initial impression it gives. That is, when you first look at a piece of code and start working your way through it, you feel as if the programmer was completely brain-dead and it’s amazing the software worked at all. Except on very rare occasions, the code does work with just minor bugs or even no bugs (at least none that have been discovered), so the inclination to think that it can’t possibly work is just a form of panic reaction from your brain while it struggles to comprehend the new code. It’s very easy to condemn a piece of code in those first few moments when you don’t actually understand it and are just seeing a mess of symbols and a bunch of bad coding practices, but doing so also condemns your rewrite, because you end up discarding knowledge you haven’t yet understood.
XP has a notion that the code is the documentation (this particularly applies to test cases in XP) and while you may debate whether or not that’s a good idea, it’s hard to disagree that the code contains knowledge: information about little gotchas, user requirements and previously fixed bugs. That information must be preserved if you want to successfully fix the code either through refactoring or rewriting. In the first few panicked moments where you just can’t believe how unbelievably bad this code is, you can’t possibly detect and document all the knowledge that resides in the existing code. That code is currently your only lifeline and your greatest asset – even if you do wind up throwing it out, treat it like your best friend right up until the point that you do.
It’s this concept of extracting the knowledge from the code that drives people (including Martin Fowler in the early chapters of “Refactoring”) to make writing tests for the code the first step of any refactoring. The same applies for a rewrite: first write tests to capture exactly what the current code does in as much detail as possible. If at all possible, write automated tests, but for things you can’t automate, write manual tests and make sure you run through them regularly while you work on the code. For your first pass, write the tests assuming that the old code is absolutely perfect and bug free – whatever output it gives you should be considered the right answer. If you have a test that looks like it shows a bug in the old code, by all means add a note about it right there in the test. Then, when you come to make that test pass in your rewrite, or when that test starts failing after your refactoring, you can work through the impacts of the change you’ve made and decide what the best result actually is in that case.
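As a rough illustration of that first pass, here is a minimal sketch in Python. The function legacy_format_price is a hypothetical stand-in for whatever old code you have inherited, not anything from a real project. Each test simply pins down what the old code does today, and the suspicious case carries a note rather than a "fix".

```python
def legacy_format_price(cents):
    # Hypothetical stand-in for the old code we have inherited.
    dollars = cents // 100
    remainder = cents % 100
    return "$" + str(dollars) + "." + str(remainder).zfill(2)


def test_whole_dollars():
    # Pin the current behaviour: whatever the old code returns is "right".
    assert legacy_format_price(500) == "$5.00"


def test_single_digit_cents():
    assert legacy_format_price(1205) == "$12.05"


def test_negative_amount():
    # NOTE: this looks like a bug in the old code. -250 cents comes out as
    # "$-3.50" rather than "-$2.50" because of floor division. We record the
    # current output anyway and decide what the right answer is later, when
    # this test fails against the refactored or rewritten code.
    assert legacy_format_price(-250) == "$-3.50"
```

The test functions follow the naming convention that a runner such as pytest collects, but they are plain functions and can equally be called by hand.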
Once you’ve extracted the knowledge out of the old code, you’re finally ready to decide if you should rewrite or refactor. If after all this analysis you still think the old code is unfixable, then you are probably in the right situation for a rewrite. In most cases though you will now have an understanding of why the code was so complex and hard to understand and you will probably have a number of insights into better ways it could have been done – that’s the time when you should refactor. Start with the most obvious improvement you could make and do it, then look for the new most obvious improvement. Repeat that process over and over until you’re happy with the code. Just make sure you run all the tests after each change.
If you’ve decided to rewrite you have some more planning work to do. You want to avoid working away from the main branch for long, or other team members might end up relying on the old code that you’re about to change, and any of your bug fixes or improvements might then affect them. Besides that, bad code generally isn’t nicely modularized so you’re probably going to have to make changes to a number of areas of your code base – that makes merging back in difficult if you leave it too long. To avoid these problems, see if you can break the rewrite down into smaller chunks – keep most of the old code (possibly with an abstraction layer that provides the interface you eventually want to have) and replace one part of it, then merge back with the main branch. Repeat this process of replacing small chunks until you’re happy with the code. Note here that you don’t repeat until the old code is completely replaced: some of the smaller chunks might just need refactoring and can be kept around, and some of them might be quite good once you get rid of the bad code from around them and won’t need any major changes. Keep your eyes open for ways of reducing your workload while still ending up with the same quality (or better) code.
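To make the abstraction layer idea a little more concrete, here is a small sketch in Python. Every name in it (legacy_parse, legacy_total, new_total, OrderFacade) is invented for illustration. The facade exposes the interface you eventually want, starts out delegating entirely to the old code, and lets you swap in one rewritten piece at a time.

```python
def legacy_parse(line):
    # Hypothetical old parser: ugly but working, so we keep it for now.
    name, qty = line.split(",")
    return {"name": name.strip(), "qty": int(qty)}


def legacy_total(items):
    # Hypothetical old code that we have chosen to replace first.
    total = 0
    for i in range(len(items)):
        total = total + items[i]["qty"]
    return total


def new_total(items):
    # The rewritten chunk. It must match the old behaviour exactly.
    return sum(item["qty"] for item in items)


class OrderFacade:
    """The interface we eventually want; callers depend only on this."""

    def __init__(self, parse=legacy_parse, total=legacy_total):
        self._parse = parse
        self._total = total

    def total_quantity(self, lines):
        return self._total([self._parse(line) for line in lines])


lines = ["widget, 2", "gadget, 5"]
old = OrderFacade()                 # everything still runs on the old code
new = OrderFacade(total=new_total)  # one small chunk replaced
assert old.total_quantity(lines) == new.total_quantity(lines) == 7
```

Each swap like this is small enough to merge back to the main branch on its own, and the old parser can stay in place indefinitely if it turns out to be good enough.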
As you go about the rewrite, make sure you write lots of tests – you already have your acceptance tests written, they’re the ones that document the output of the previous code, but make sure you write integration and unit tests, plus tests for whatever new functionality you wind up adding. If you’re not doing XP that’s fine, you can still do your big design up front, you can even write your tests after you write your code, but don’t consider an acceptance test passing until you have all the lower-level tests for that code written. The last thing you want is to wind up with another piece of crap code after you’ve spent so much time and effort replacing the old, previously working code. You need to take it slow and do it right – the code was probably bad last time because it was rushed. I suggest something like acceptance test driven development.
Always remember, you’re probably no smarter than the programmer that came before you – your genius is not going to make the replacement better than the original, so you need to leverage some other advantage. That advantage might come from new knowledge and experience with the related libraries you’re using. It might come from using a new library to do most of the heavy lifting for you. It might come from actually thinking about the design ahead of time, or it might come from not thinking about the design ahead of time and using TDD. It might come from writing better documentation or more tests. It might just come from actually focussing on quality and not rushing. Ideally your advantage would come from a number of different things so that the replacement code is the best it can be. Your management will not be happy if this bit of code causes complaints in the future.
Speaking of management, it’s actually business needs that are most impacted by doing rewrites. The trouble for the business is that they wind up paying a number of really expensive developers to sit around and recreate things that the business has already paid to have developed. Even worse, while that’s happening much less, or potentially no, value-adding work is being done. To the business the rewrite might have strategic value, but it’s very hard to quantify and measure. Engineering will tell the business that they’ll be able to add features faster in the future after this rewrite but they can’t say how much faster or which features will benefit. There may not even be a significant number of users complaining about product quality, so there’s no easily visible reason for the business to support the cost of a rewrite. As part of deciding to do a rewrite, you need to weigh up these business interests. Will the company go broke before you complete the rewrite? Will the company miss a big, long-term opportunity because it can’t add features while the rewrite is happening? Has it been a long time since a new version went out and would it be better to delay the rewrite until you get a new version out so that sales and marketing have something to sell while you work? What does the business need to do to handle the increase in support after your rewrite ships (you should be expecting to lose stability in the short-term but gain it in the long-term in most cases)? How much of a reduction in support will you see after the new code has settled in? How much easier will it be to add new features?
Always remember, rewrites are hard: they take much longer than you thought and introduce more bugs than you thought. As Greg said:
Naturally, if the decision to rewrite is taken, it should only be done if there is a clear commitment to first learn the lessons from the failed implementation and to create a design and a methodology that have a reasonable likelihood of success.