Debugging Part One: Debugging the right way
Debugging can be hard - everyone wants to spend less time finding and fixing bugs and more time writing new code. That’s why we’re writing tools to make your programming life easier (we’re programmers too after all!). But it’s also why we’ve put together this series of posts sharing everything we know about debugging, including the best tools, tips and tricks to help you find those really annoying memory leaks, race conditions and stack crashes so that you can spend less time debugging and more time doing what you do best.
Lesson 1: fix the bug, not the symptom
What is debugging for a typical programmer these days?
Debugging involves finding the lines of code or configuration of your program which aren’t behaving as you expect or as you want, and changing them so that they behave as expected. That’s a fairly simple intention of the programmer - change things so that they are how you’d like them to be.
Anyone who has ever programmed (or indeed configured a computer) knows that the act of debugging can quickly become far more complex than the simple intention of changing things to be how you want them. Your intention might be to stop the user tapping the wrong button, but the act can involve reading unfamiliar code, learning new abstract ideas and potentially writing code to help you do the debugging.
Whether you’ve just graduated from a computing science degree or just found your first bug in Scratch [scratch.mit.edu], there comes a moment for all programmers when it dawns on you: I’m going to spend a whole lot more time debugging code than I am writing code.
Often, the success of a programmer from that point depends on their ability to debug well. And now we can think about what debugging really is: the ability to reason about the code, hypothesise about what may be happening, test the hypothesis and, finally, make the change required.
But that’s not how most people start debugging and, in fact, it’s often not how many developers solve problems.
How to avoid debugging
We’ve all sat in front of code not knowing how to fix it properly, but knowing how to make the symptom go away. Or we’ve put in a “quick fix” which then hangs around for five years.
It’s sometimes easier to avoid debugging. An individual programmer can avoid debugging by just hiding the problem or coding around it. If the page is slow to load, they create a cache system on top. The symptom has gone away but the underlying cause is still present, left for another programmer to solve another day.
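As a rough illustration of the caching example, here is a minimal Python sketch. The names load_page, load_page_cached and slow_calls are hypothetical, and the quadratic loop is an invented stand-in for whatever is really making the page slow.

```python
import functools

slow_calls = []  # records every time the slow path actually runs


def load_page(page_id):
    """Hypothetical page loader whose real problem is a quadratic loop."""
    slow_calls.append(page_id)
    seen = []
    for item in range(500):
        # Underlying cause: "in" on a list rescans it each time (O(n^2) overall).
        if item not in seen:
            seen.append(item)
    return f"page {page_id}: {len(seen)} items"


@functools.lru_cache(maxsize=None)
def load_page_cached(page_id):
    """The 'quick fix': repeat visits are fast, the loop above is untouched."""
    return load_page(page_id)
```

Repeat visits to the same page now skip the slow path, so the symptom seems to vanish. The first visit to every page still pays the full cost, and the loop itself is exactly as slow as before. Fixing the cause in this sketch would mean replacing the list with a set.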
Or the programmer might write code to handle a specific case so that the original code doesn’t have to handle it. While there are cases where this is a good idea, this is an excellent cause of code-sprawl. Code is written to handle ever more edge cases until there is no understandable structure in it and it becomes hard to “reason about” - a term which means you can think about the main structures and ideas in the code and visualise what’s going on.
The same thing happens at a department or team level and then at an ecosystem level. Teams may see a bug as not one for them to fix and so pass it along to someone else, or suggest the user find another way of fixing or working around it. If the database is slow, we’ll limit connections instead of finding out why it’s really slow.
Avoiding debugging can really hurt in the long term, so let’s contrast this with a proper debugging process...
First: Reason about the code
The first step is to reason about the code, which requires knowing what code and systems are running. For small projects this is easy, but as systems become large and complex the interrelated elements make an overview very hard. This is where practices like object-oriented programming, structuring project files and design patterns help the programmer hit the ground running with a well-known set of best practices.
The rule here is: be sure you’re clear what you’re reasoning about rather than treating it as just a big lump of code. Start from something you can see or control and work outwards from that, adding to your understanding.
During debugging, you often end up in a fog of war: it’s hard to see where you are in the code or in the behavior you were investigating. This is fine and to be expected, but always track back to something you’re sure of, so that everything you investigate connects to a known starting point.
Second: Hypothesise about the location of the bug
Next, the programmer thinks about where in the code the bug might be. There are two basic strategies here: reasoning down from the bug or up from the code.
Reasoning down starts by capturing something about the bug, such as an error message, and searching or guessing where it gives you a way into the code. Error codes - often long and unfriendly to users - are a good example of this. The programmer can grab the code and jump straight to the place where the program threw the error.
Reasoning up involves knowing a bit about how the program runs. For example, you might know that performance bugs tend to start in the database layer and you can visualise the parts of the code which are likely responsible. You would use this to jump into the part of the code that calls the database.
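To make reasoning down a little more concrete, here is a small Python sketch that starts from an error and asks the traceback which function raised it. Everything in it is hypothetical: the AppError class, the "E4012" code and the checkout and charge_card functions are invented for illustration.

```python
import traceback


class AppError(Exception):
    """Hypothetical application error that carries a short error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def charge_card(amount):
    if amount <= 0:
        raise AppError("E4012", "amount must be positive")
    return amount


def checkout(amount):
    return charge_card(amount)


def where_did_it_fail(func, *args):
    """Run func; on AppError return (code, function name, line number)."""
    try:
        func(*args)
    except AppError as exc:
        # The last traceback entry is the frame that executed the raise.
        frame = traceback.extract_tb(exc.__traceback__)[-1]
        return exc.code, frame.name, frame.lineno
    return None
```

Calling where_did_it_fail(checkout, 0) reports the code "E4012" together with the function charge_card, even though the call went in through checkout. That is the way into the code: the error tells you where to start reading, not yet why it happened.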
Most hypotheses start vague. It’s “something” in the cache, or “something” in the transaction logs which means you have to go hunting for information...
Third: Hunting for information
In the next step, the programmer probably needs to check their understanding of the code. This might be done by looking at logs, using breakpoint debugging, reverse debugging (or time-travel debugging) or by littering the code with debugging information. They might use automated tests to run the code for reproducible bugs, but not all bugs are easily reproduced.
Each of these techniques is trying to turn the programmer’s hypothesis about what’s going on into a confirmation or a correction and it’s here where many long hours can be lost if you don’t have the right tools or don’t read the information.
The naive programmer will read logs passively, not understanding what the logs say about the running system, or they will step through the code not really looking at what’s happening.
The key in this process is always to be examining a hypothesis about what you think should be happening versus what is happening.
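One way to compare expectation with reality, without scattering print statements everywhere, is to log what a suspect function actually receives and returns. The sketch below uses Python’s standard logging module. The traced decorator and the apply_discount function are hypothetical names chosen for this example.

```python
import functools
import logging

logger = logging.getLogger("debug_trace")


def traced(func):
    """Log each call's arguments and result at DEBUG level."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug("call %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Record the traceback, then let the error propagate unchanged.
            logger.exception("%s raised", func.__name__)
            raise
        logger.debug("return %s -> %r", func.__name__, result)
        return result
    return wrapper


@traced
def apply_discount(price, percent):
    """Hypothetical function under investigation."""
    return price - price * percent / 100
```

To see the output, enable it with logging.basicConfig(level=logging.DEBUG) before running. The useful habit is to write down what you expect first (say, that apply_discount(200, 10) returns 180.0) and only then read the log, so each line either confirms or corrects the hypothesis.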
Fourth: Learn, adjust course and repeat
Some bugs are just a few minutes of reasoning, hypothesising and investigation. Enjoy those!
Many others are far more complex, exposing layers and layers about the problem as new information comes to light, but also layers of distraction. The programmer will find some new information which may reveal more information about the bug. Equally likely is new information which has no relation to the bug.
With each new piece of information, the programmer reasons differently about the code, forms a new hypothesis and changes the course of their investigation. This repetition of reasoning, hypothesising and investigating may happen many times.
The key to successful debugging is to keep track of what’s known and what hypotheses have been tested. The less successful programmers will find themselves rerunning tests, blindly changing things and perhaps declaring the bug “unfixable”.
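Keeping track can be as simple as a text file, but the idea is easy to sketch in code too. The following is one possible shape for such a record in Python, not a prescribed tool. The Hypothesis and DebugNotebook classes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    statement: str
    test: str
    outcome: str = "untested"  # "untested", "confirmed" or "refuted"
    notes: str = ""


@dataclass
class DebugNotebook:
    known_facts: list = field(default_factory=list)
    hypotheses: dict = field(default_factory=dict)

    def propose(self, statement, test):
        """Record a hypothesis; return False if it was already recorded."""
        if statement in self.hypotheses:
            return False
        self.hypotheses[statement] = Hypothesis(statement, test)
        return True

    def record(self, statement, confirmed, notes=""):
        """Store the outcome of a test; confirmed hypotheses become facts."""
        hypothesis = self.hypotheses[statement]
        hypothesis.outcome = "confirmed" if confirmed else "refuted"
        hypothesis.notes = notes
        if confirmed:
            self.known_facts.append(statement)

    def untested(self):
        """List the hypotheses that still need checking."""
        return [h.statement for h in self.hypotheses.values()
                if h.outcome == "untested"]
```

When propose returns False, you have already been down that road, and the notes say what you found. That is what stops the same test being rerun an hour later.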
And eventually… Eureka!
Finally, after what can feel like reading runes and battling inner demons to find a spark of inspiration, you hit upon the solution. This may be as simple as tweaking a few configuration settings, but it might involve unifying several parts of the code so they work better together. Either way, for hard-to-reproduce bugs, implementing the fix may be as much work as the investigation.