Testing Your Way To Bug Diagnosis

By Adrian Sutton

June 7, 2006

Sometimes you run into a bug that you can reproduce only off and on, and you get the feeling that each time you try to reproduce it you’re doing something slightly different, and that difference is what makes it appear and disappear. I encountered just such a bug today.

The bug report came in: select a word at the end of a list item, hit backspace, and the word is deleted correctly, but the next list item is incorrectly moved up and appended to this one (as if you’d hit forward delete at the end of the list item with no selection). I’d seen this problem happen with my own eyes; the first time I tried it, I reproduced the problem. So I made a change to the code base to try to track down what caused it, and all of a sudden the problem disappeared.

Aha! Clearly I’ve isolated the problem code and fixed the issue. Being a good little XP developer I slapped myself on the wrist, undid my code change and set off to write a test first. For some reason, I first manually went through the same steps as before but this time, even with my changes reversed, the bug didn’t show up. I reverted all the files and tried again – still no sign of the bug. I put the fix back in and suddenly the bug came back.

Finally it occurred to me that I had made a change to the way I was manually running the test – the first time I ran the test I had created the list, selected the last word and hit backspace. After a while I got sick of creating the list so I did it once more and saved the result so that it would be preloaded next time I ran the application. The document was exactly the same, but the key step to reproduce the bug was actually inserting the list – the bug would never show up if you didn’t first apply a list to some content. As a side note, this makes perfect sense and isn’t actually a case of code that’s too tightly coupled – it’s the effect of a complex feature that is currently only partly implemented.

Even knowing this difference, I found that I kept doing things just differently enough that the bug didn’t show up, so I had no confidence that I had fixed it. Slapping my wrist again, I wrote an automated test that ran through the exact procedure I needed to reproduce the bug. Now I could run through the same steps any time I wanted and know that they were done exactly the same way (as an added bonus, it did them much faster).
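As a rough illustration of what such a test can look like, here is a minimal Python sketch. The real editor and the real test aren't shown in this post, so ToyEditor and every method on it are hypothetical stand-ins invented for this example. The point is only that the steps are scripted in a fixed order, including applying the list in the same session rather than loading a saved document.

```python
class ToyEditor:
    """Hypothetical stand-in for the real editor (an assumption, not its API)."""

    def __init__(self, paragraphs):
        self.items = list(paragraphs)  # one string per paragraph or list item
        self.is_list = False
        self.selection = None          # (item_index, start, end) or None

    def apply_list(self):
        # The key step from the story: the list is applied, not preloaded.
        self.is_list = True

    def select_last_word(self, index):
        text = self.items[index]
        start = text.rfind(" ") + 1    # 0 when the item is a single word
        self.selection = (index, start, len(text))

    def backspace(self):
        # With a selection, backspace should delete only the selected text.
        if self.selection is None:
            return
        index, start, end = self.selection
        text = self.items[index]
        self.items[index] = text[:start] + text[end:]
        self.selection = None


def reproduce_bug_steps():
    """Run the exact manual procedure, the same way every time."""
    editor = ToyEditor(["first item", "second item"])
    editor.apply_list()
    editor.select_last_word(0)
    editor.backspace()
    return editor


def test_backspace_over_selection_keeps_next_item():
    editor = reproduce_bug_steps()
    # The selected word is gone and the next list item stays where it was.
    assert editor.items == ["first ", "second item"]
```

A test runner such as pytest would pick up the test function by name, or it can simply be called directly. Because the procedure lives in one function, every run exercises the same sequence, which is exactly what I couldn't manage by hand.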

It turns out that my original attempted fix did nothing to fix the bug, and the actual cause was something quite different. A few minutes later it was all fixed up, and I could get back to implementing new stories, confident that we’d never see that problem again.