6 Things You Need to Know About Time Travel Debugging

What is Time Travel Debugging / Reverse Debugging?

Reverse debuggers (or time travel debuggers) let developers record a program’s activity at runtime (every memory access, every computation, and every call to the operating system), then rewind and replay the execution to inspect the program state. This colossal amount of data is presented through a powerful metaphor: the ability to travel backwards in time (and forwards again) through the program’s execution. In practice, to keep the performance overhead manageable, a reverse debugger is tuned to collect only the information needed to replay the program under inspection accurately.

How do reverse debuggers compare with conventional debuggers?

Conventional debuggers let you step line by line through a program and watch for a bug. This approach works well for very simple bugs, where the crash happens on the same line as the error or immediately after it. The location of an error and a developer’s knowledge of the code may suggest one or more possible sites for the root cause, so when you restart the program in the debugger and run it forward, you can examine the logic around these sites to see where it goes off track. (For an evaluation of the best Linux C++ debuggers, see this very helpful discussion by Dr Dobb’s.) However, this strategy is much harder to apply to hard-to-reproduce bugs, because the developer has little or no information about why the bug occurred. Reverse debuggers are especially well suited to these failures: a programmer can walk through the program’s execution backwards as well as forwards to home in on a point of interest, approaching the root cause from both ends of the execution instead of only one.
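The idea of walking backwards from a failure can be sketched in a few lines of Python. This is a toy model, not how any real reverse debugger is implemented: the "program" is a hypothetical list of step functions acting on a dictionary, and the names run_and_record and last_write are invented for illustration. It records a snapshot after every step, then searches backwards from the failure for the step that last changed the suspect variable, much like a watchpoint run in reverse.

```python
def run_and_record(steps, state):
    """Apply each step in order, keeping a snapshot of the state after every step."""
    history = [dict(state)]          # history[0] is the state before any step ran
    for step in steps:
        step(state)
        history.append(dict(state))  # history[i] is the state after steps[i - 1]
    return history


def last_write(history, name, before):
    """Walk backwards from snapshot index `before` and return the index of the
    snapshot in which `name` last changed, or None if it never changed."""
    for i in range(before, 0, -1):
        if history[i].get(name) != history[i - 1].get(name):
            return i
    return None


# A hypothetical program: the bad value is written long before it is noticed.
def init(s): s["total"] = 0
def add_ok(s): s["total"] += 10
def corrupt(s): s["total"] = -1                 # the root cause
def unrelated(s): s["count"] = 3
def check(s): s["ok"] = s["total"] >= 0         # where the failure is observed

steps = [init, add_ok, corrupt, unrelated, check]
history = run_and_record(steps, {})

failure_at = len(history) - 1                   # start from the point of failure
culprit = last_write(history, "total", failure_at)
print("total last changed by:", steps[culprit - 1].__name__)
```

Starting from the failure and moving backwards goes straight to the offending write, without guessing where to place a breakpoint and rerunning the program.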

How does reverse debugging compare with runtime debugging?

Runtime debugging is the common practice of forwards-only debugging. It combines breakpoints, watchpoints and other standard debugger features that serve as markers to help you find bugs in your software. You can run your program until it hits a breakpoint and analyse any issues that appear in that particular section of code. In some cases this works perfectly well, but more often than not you do not know which section of code caused the error. This is a major problem for large programs, which are time-consuming to run again and again. With reverse debugging, a developer only has to run the program once to capture a complete record of its execution, including any software issue that appeared during runtime. The recording contains not only the bug itself but also the sequence of events that led up to it.

How does reverse debugging / time travel debugging work?

Developers have adopted different methods for reverse debugging and there is much discussion about how it works (see for example the StackExchange thread How does reverse debugging work?). Taking Undo as an example, Undo’s reverse debugger, UndoDB, lets you instruct the process to go to any previous point in its execution history. It relies on the fact that many operations in a computer are deterministic, which makes it possible to identify the sources of non-determinism that appear in compiled code (for more information, see introduction to reverse debugging). This allows developers to address the most time-consuming aspects of debugging today’s multilevel, multicomponent, multithreaded and multiprocess applications.
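The general principle of separating deterministic work from non-deterministic inputs can be illustrated with a small Python sketch. It is not a description of UndoDB's internals, which operate on compiled code; the names program, Recorder, Replayer and state_at are hypothetical. The only non-deterministic source here is a random number generator standing in for things like system calls. Its values are logged during the live run, and any earlier state is then reconstructed by re-executing the deterministic code against that log.

```python
import random


def program(read_input, n_steps):
    """A computation that is deterministic apart from read_input().
    Yields (step, state) after every step."""
    total = 0
    for step in range(n_steps):
        total = (total * 31 + read_input()) % 1_000_003
        yield step, total


class Recorder:
    """Wraps a non-deterministic source and logs every value it returns."""
    def __init__(self, source):
        self.source = source
        self.log = []

    def __call__(self):
        value = self.source()
        self.log.append(value)
        return value


class Replayer:
    """Feeds previously logged values back in their original order."""
    def __init__(self, log):
        self._values = iter(log)

    def __call__(self):
        return next(self._values)


def state_at(log, target_step):
    """'Travel' to target_step by re-executing from the start using the log."""
    total = None
    for _, total in program(Replayer(log), target_step + 1):
        pass
    return total


# Live run: the recorder captures only the non-deterministic inputs.
recorder = Recorder(lambda: random.randint(0, 255))
live = list(program(recorder, 20))

# Later: go "back in time" to step 7 without rerunning the live program.
print(state_at(recorder.log, 7) == live[7][1])
```

The log holds one small value per input rather than a copy of the whole program state at every step, which is why recording only the non-deterministic events keeps the amount of captured data manageable.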

How does reverse debugging affect performance?

Some slowdown must be expected when using reverse debugging (see for example the points raised on StackExchange in the thread Why is reverse debugging rarely used?). However, this varies hugely depending on the method of reverse debugging used. The reverse debugging function of GDB, for example, can slow a program down by 50,000x, which is far too slow to be considered by most developers. Undo’s reverse debugger, UndoDB, typically causes a slowdown of 2-3x (though for large, complex code systems this can be greater). It is also important to remember that with a conventional debugger you will have to restart your program multiple times, whereas with time travel debuggers such as UndoDB you can go backwards and forwards as often as you need without restarting.
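A rough back-of-the-envelope comparison makes the trade-off concrete. The slowdown factors below come from the figures quoted above; the 10-minute run time and the five restarts are hypothetical values chosen purely for illustration, and the time spent navigating a recording afterwards is ignored.

```python
def conventional_seconds(run_seconds, restarts):
    """Total time when every look at an earlier state needs a fresh run."""
    return run_seconds * restarts


def recorded_seconds(run_seconds, slowdown):
    """Total time for a single recorded run at the given slowdown factor."""
    return run_seconds * slowdown


RUN_SECONDS = 600   # hypothetical: the program takes 10 minutes natively
RESTARTS = 5        # hypothetical: restarts needed with a conventional debugger

conventional = conventional_seconds(RUN_SECONDS, RESTARTS)
at_3x = recorded_seconds(RUN_SECONDS, 3)            # upper end of the 2-3x figure
at_50000x = recorded_seconds(RUN_SECONDS, 50_000)   # the 50,000x figure
at_50000x_days = at_50000x / 86_400                 # seconds in a day

print(f"conventional, {RESTARTS} restarts: {conventional} s")
print(f"one recording at 3x: {at_3x} s")
print(f"one recording at 50,000x: {at_50000x_days:.0f} days")
```

Under these assumptions a single recording at 3x costs less than five plain reruns, while a 50,000x slowdown turns a 10-minute run into the better part of a year.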

Can you use Time Travel Debugging in any programming language?

Most time travel debuggers for compiled code are based on the GNU debugger, GDB, and therefore support the languages that GDB supports (e.g. C/C++, Go and even Fortran). For other languages, there are various bespoke projects such as:

Time Travel Debugging for Java (Chronon)
Reverse debugging for .NET and C# (RevDeBug)
Reverse debugging for Python (RevPDB)
Time Travel Debugging for Elm (Elm TTD)