gdbWatchPoint

Can't use a reversible debugger? Try these GDB commands.

Written by Dr Greg Law | Last updated 9th Oct 2020

In this GDB tutorial, we reimplement by hand, using only GDB commands, what tools like LiveRecorder, rr, UDB, or even GDB’s built-in reverse debugging automate for you.

Why would you do that? 

Sadly, full-featured reverse debugging is not always available (e.g. you’re running on an architecture or platform, or debugging a program, that rr can’t support, or you don’t have access to a LiveRecorder or UDB license).

Let’s dive in.

Deterministic versus non-deterministic

First, some theory. In this GDB tutorial, I use an entirely deterministic program, so I can track down a bug in my code step by step using only GDB commands.

Deterministic software program:

  • For a particular input, the program always gives the same output.
  • Defects in the program are traceable because it follows a predictable execution pattern; you can determine the next execution step.

Non-deterministic software program:

  • For a particular input, different runs of the program will give different outputs.
  • Tracing defects in the program is complicated because the program can take more than one path, so you can’t determine the preceding or next steps of the execution.

Most of the software programs that we work with these days are non-deterministic, which makes debugging them more difficult. It is good practice to break non-deterministic code down into smaller deterministic pieces that you can debug in isolation, for example with unit tests.

Square root

Let’s write a little program that we can use for this GDB tutorial. It calculates some square roots and stores them in a cache; but of course, it has a bug.

Open your favorite editor and type or copy-paste the following lines.

#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    unsigned char number;
    unsigned char sqroot;
} cache_entry_t;
 
static cache_entry_t cache[100];
static int cache_size = sizeof(cache)/ sizeof(cache[0]);
 
static int
cache_calculate(int operand)
{
    for (int i=0; i<cache_size; ++i)
    {
        if (cache[i].number == operand)
        {
            /* Cache hit. */
            return cache[i].sqroot;
        }
    }
    
    /* Cache miss. Find the correct result and populate a few cache entries. */
    int sqroot = 0;
    int op_adj;
    for (op_adj=operand-1; op_adj < operand+1; ++op_adj)
    {
        int sqroot_adj = (int) (sqrt(op_adj));
        int i = (int) (1.0 * cache_size * rand() / (RAND_MAX+1.0));
        cache[i].number = op_adj;
        cache[i].sqroot = sqroot_adj;
        if (op_adj==operand)
        {
            /* This is our return value. */
            sqroot = sqroot_adj;
        }
    }        
    return sqroot;
}
 
int
main(void)
{
    /* Disable srand() to make the program deterministic */
    //srand(time(NULL));
 
    /* Repeatedly check cache_calculate(). */
    for (int i=0; ; ++i)
    {
        if (i % 100 == 0)
        {
            printf("i=%i\n", i);
        }
        /* Check cache_calculate() with a random number. */
        int number = (int) (256.0 * rand() / (RAND_MAX+1.0));
        int sqroot_cache = cache_calculate(number);
        int sqroot_correct = (int) sqrt(number);
 
        if (sqroot_cache != sqroot_correct)
        {
            /* cache_calculate() returned incorrect value. */
            printf("i=%i: number=%i sqroot_cache=%i sqroot_correct=%i\n",
                i, number, sqroot_cache, sqroot_correct);
            assert(0);
        }
    }
    return 0;
}

Save the program as sqroot.c

Compile it.

$ gcc -g3 sqroot.c -lm

Note that I use the -lm option because we need to link the math library for the square root calculations.

Now, run it.

$ ./a.out

You’ll notice that the program is deterministic because every time we run the program, it fails after exactly 2011 iterations.


Let’s load it in GDB.  

$ gdb a.out

Start the program.

(gdb) run

No surprise here, the program fails after 2011 iterations; it’s deterministic after all.

Press Ctrl-x a, or type the GDB command tui enable, to switch to the GDB Text User Interface (TUI) mode. The advantage of working in TUI mode is that you can see where you are in the program, which helps in establishing why it fails.

If you need it, there is a GDB tutorial here to help you get up to speed with GDB TUI mode quickly.

In the next paragraphs, I walk you through my debugging flow and the GDB commands I use to identify the cause of the program failure. To see all the action, I recommend watching the entire video here after you have finished reading this GDB tutorial.

Put breakpoints

Stepping through the program in TUI mode, you’ll notice that it calls the function cache_calculate(number), but the function returns zero, which isn’t the square root of the input value 255.

So, the question is, why is cache_calculate() not returning the correct square root value?

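First, we put a breakpoint at line 76 because we want to determine what cache_calculate() is doing. That is the line in my copy of the file where main() calls cache_calculate(); the line numbers in your copy may differ.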

(gdb) b 76

Next, we use the GDB command ignore.

Use the ignore command

We use the GDB command ignore so that program execution doesn’t stop when it reaches breakpoint 1.

Note that GDB still counts the number of times the breakpoint is hit.

Type the following GDB command.

(gdb) ignore 1 100000000

The value 1 is the breakpoint number, and the value 100000000 is the number of times we want GDB to ignore that breakpoint. With a count this large, the program never stops at the breakpoint; it simply runs until it fails.

I use the GDB command info break to see the number of times the program hits breakpoint 1.

The program hits our breakpoint 2012 times before it eventually fails. Unfortunately, at this point, we have lost the state required to establish what happened.

I now use the ignore command again, but this time I change the ignore count to 2011, so that on the 2012th hit the program stops at line 76 with its state intact for our inspection.

(gdb) ignore 1 2011

Rerun the program.

The program stops at line 76 after ignoring the first 2011 hits. The value of the variable number is 255, as we expected. Now, we step into the function. Instead of manually stepping around the loop, I set two breakpoints at the points where we leave this loop, at lines 39 and 44 in my copy of the file, and continue the program.

The program leaves the loop at breakpoint 2 (line 39), and the function returns the 90th entry in the cache. We can see that this cache entry holds bad data.

The question is, why?

To find out, I use a watchpoint.

If you need a quick refresher on GDB watchpoints, then read my GDB tutorial on watchpoints here.

Set a conditional watchpoint

I add a conditional watchpoint with the following GDB command:

(gdb) watch cache[90].number if cache[90].number == 255

Note that I don’t need my other breakpoints any longer, so I disable them. With no arguments, disable turns off every breakpoint and watchpoint.

(gdb) disable

Next, I re-enable only my watchpoint, which is number 4.

(gdb) enable 4

Run the program again.

The program stops at the watchpoint, and we are now at the point where garbage is written to the cache.

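The program calculates the square root of -1. Everyone knows that you can’t take the square root of a negative number. The number field of a cache entry is an unsigned char, so storing -1 in it wraps around to 255, and a later lookup for 255 hits this bogus entry. With a little bit of code inspection, we determine that this happens when the program calls cache_calculate() with the value zero: the loop that populates neighboring cache entries starts at operand-1, which is -1.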

Up your game

The truth is that we live in a non-deterministic world, which means that testing our programs isn’t as straightforward. If you’re debugging non-deterministic software, first try to break the code down into deterministic pieces that are easier to debug, and, if you can, augment your testing with a reverse debugging tool such as rr or LiveRecorder. Having access to a purpose-built reverse debugging tool helps you up your debugging game. However, if you don’t have access to one, this GDB tutorial has shown you how to work backward to the cause of a bug with only GDB commands.

Make sure to watch my video to the end because, in the final two minutes, I couldn’t resist showcasing the goodness of UDB (formerly known as UndoDB). I am sure you will recognize the value a reverse debugging tool can add to your debugging flow.
