How many times must you rub your customer’s nose in it?

So far in this series of blog posts, I’ve discussed how LiveRecorder can solve reproducibility issues in two very different testing scenarios. This time we’re going out of the development lab/office and into the real world.

Software.Has.Bugs. Unless you’re one of the very small minority who work on software with sufficiently precisely defined operational parameters for which it is feasible to mathematically prove correctness, this is axiomatic. Even where such a feat is possible, only for a select number of highly sensitive applications (nuclear reactor control, military equipment and the like) is it cost effective.

So...Software Has Bugs. Let’s take that as a given. In spite of the best efforts of your QA team, this will remain true beyond the point where you release to customers, and those customers can and will trip over bugs. Since your QA team already tested all of the stuff they could think of, your customer will have done something a bit more obscure; it is quite on the cards that it will be correspondingly difficult both to diagnose and to fix.

A former colleague of mine once drew a diagram a bit like this on the whiteboard in a meeting.

[Diagram: cost of a bug against the point in the product cycle at which it is discovered]

I’ve probably not done it justice, but the point he was trying to convey is simply that the cost of a bug rises the later it’s discovered in the product cycle. The rise is steep, indeed cliff-like, if it gets as far as the customer, and the reasons are twofold:

  1. Diagnostic difficulty. Often bugs found by customers are just plain hard. This is to be expected - the customer has done something which was not anticipated. If it had been anticipated, the bug would have been found by QA and fixed before being released.
  2. Reputational damage. Each time a customer encounters a bug in software their opinion of that software, and of the vendor, takes a hit.

Either one would be bad enough on its own, but here’s the kicker: a side-effect of 1 is often repeated iterations of 2. This may simply be due to the passage of time: the longer the vendor takes to address a difficult issue, the more opportunities the customer has to run into it again. There is, however, an even more pernicious cycle that runs like this.

    • Customer hits a bug, and reports it.
    • Vendor is unable to reproduce the symptom.
    • Vendor requests customer to reproduce the issue with additional diagnostics.
    • Repeat until solved.

For each iteration, the customer is actually being asked to revisit the thing that’s currently pissing them off.

Naturally, vendors attempt to minimise both the number of iterations and the time it takes to give the engineers the information they need to resolve the issue. In an earlier post, I described how LiveRecorder can be used to reduce the number of times a failing test needs to be repeated in order to fix the failure. The same principle may be applied to resolving customer issues. If the customer can reproduce the issue just once while the application is being recorded, then the vendor’s support engineers can use that recording to identify the root cause of the customer’s issue.

The recording spares the customer from having to revisit the annoyance over and over. The ability to explore the captured process execution backwards as well as forwards further shortens the elapsed time to a solution.