Tutorial: Getting Started with UDB

Overview

In this tutorial, we will walk you through how to use UDB and time travel debugging to identify the root cause of a small bug.

You will:

  • get familiar with using UDB
  • learn the basics of time travelling backward and forward through code in order to inspect and easily understand program state

The principles learned by following this guide can be used on more complex code.

Unpack the Tar file

If you have just downloaded UDB, you will need to install it to get going.

If you have already done this, you can skip this step.

It’s easy to install UDB. Here’s what you need to do:

  1. Open a terminal
  2. Unpack the .tgz file that you just downloaded with the following command: tar -xzf UDB-Individual-Evaluation-<version>.tgz
  3. This will create a folder named UDB-Individual-Evaluation-<version>

Build & run the example program

We will use the sample program cache.c (cache calculate) that can be found in the examples directory of the UDB-Individual-Evaluation-<version> folder you just created.

This sample program maintains a square root cache data structure in memory and validates it through repeatedly looking up values, caching additional new values on a cache miss.

  1. Change directory to the examples: cd UDB-<TAB>/examples
  2. Feel free to open cache.c in a text editor so you can follow along
  3. Build the cache example: make cache
  4. Run the program to see where it fails: ./cache
    The sample program crashes with an error message – the number that the program is pulling out of the cache is not the expected number.

Run UDB & diagnose the problem

  1. Let’s open the program in UDB to analyze its execution and diagnose the reason for this failure. Type ../udb cache

    Press Enter to page through the license and y to accept
  2. Next, run the application by typing run

    The application runs to the point where it crashes.
  3. Let’s now examine the call stack to see a summary of how our program got to where it is. Type backtrace
  4. Because UDB is a time travel debugger, we can run the execution of the program in reverse. Use the reverse-finish command twice to reverse up the stack to the abort() statement in main() at cache.c line 85.

    Note: in UDB, as in GDB, pressing Enter on an empty line repeats the previous command.
  5. It is best to switch into TUI (Text User Interface) mode now to see what is going on in the source code more easily. You can do this by pressing Ctrl+X and then A.
  6. With a time travel debugger, you can go back to any line of code that executed and see the complete program state. So type info locals to see the state of the variables at this point.
    We can see that the integer square root of 255 is 15 (in sqroot_correct), but sqroot_cache is 0, which is the wrong value.

    This is the point where the defect manifests, but it’s not the root cause of the defect. We need to find the point where the cache is populated with the incorrect value.
  7. Line 78 is where the sqroot_correct variable is set.
    The reverse-next command runs the program backwards to the previous source line in the current function, stepping back over any function calls. Use reverse-next 3 times to go back in time to line 78.
  8. The previous line is where the sqroot_cache variable is set to its incorrect value.
    The reverse-step command executes the program backwards until it reaches a different source line. So use the reverse-step command once to step back into the cache_calculate() function and again to go back to line 39 where it returns this incorrect value.
  9. Type print g_cache[i]

    We see that the square root stored in the cache for 255 is 0, which is incorrect.
  10. Now we need to find out where this cache entry was populated with this incorrect value. We can do this by setting a watchpoint (a.k.a. a data breakpoint) on the incorrect entry in the cache and running back in time to where it was set.
    First set the watchpoint: watch -l g_cache[i].sqroot

    Then type reverse-continue to run backwards in time to see where this value was written to.

    The incorrect value was written to g_cache[i].sqroot
  11. Type info locals again to see the state of the variables at this point.

    Here we see that sqroot_adj is a very large negative number, but when it is stored into g_cache[i].sqroot, it is stored as 0.
    This suggests a type casting error. We need to investigate why the large negative number is being stored as a 0.
  12. Let’s print the type of the variables.
    ptype sqroot_adj
    ptype g_cache


    The output shows that g_cache is an array made up of 100 pairs of unsigned char. An unsigned char can only hold values from 0 to 255, so when sqroot_adj is stored in the cache, its value is truncated.
  13. That explains the zero: only the low 8 bits of the value are kept, and for -2,147,483,648 they are all zero. Similarly, when -1 is cast to an unsigned char, it becomes 255. But where did the -2,147,483,648 actually come from?
  14. Line 48 shows that sqroot_adj is set to sqrt(number_adj), cast to an integer because sqrt() returns a double. Print the value before the cast:
    print sqrt(number_adj)
  15. You’ve discovered the root cause of this failure: the program attempts to put the square root of -1 into the cache, which was not intended. This happens because the for loop at line 46 runs from number-1 to number+1, and nothing handles the special case we just hit, where number is zero. A simple if (number_adj < 0) continue; at the start of the for loop would have avoided this error.

Tutorial Complete

Congratulations! You used time travel debugging successfully to diagnose the root cause of the error in no time!

You’re now a time travelling Bug Hunter!

Next steps

  1. Try it on your own code (See Docs)

Help and Support

If you get stuck, help is always at hand.

UDB Documentation
Community – ask a question
