Undo Cheat Sheet

Understand complex code and fix bugs faster

Overview

This cheat sheet covers two areas of interest:

  1. How to fix your hardest bugs with Undo
  2. A step-by-step guide to recording and replaying

How to Fix Hard Bugs with Undo

Memory Corruption

This is probably the easiest kind of bug to solve with UDB:

  • Go to the end of the recording: ugo end.
  • Observe the problem – print <variable>.
  • last <variable> on the variable causing the problem. This winds back to the line of code that most recently updated that variable.
  • Repeat last if needed, until you get to the root cause. Top tip: last with no argument keeps going back to the preceding change.
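
To make the idea behind repeated last concrete, here is a toy model in Python. It is only a sketch of the concept, not how UDB is implemented: the Write record, the time values and the file names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    time: int      # position in the recorded execution (hypothetical unit)
    location: str  # source line that did the write (made-up names)
    value: int


def last_write(history, now):
    """Return the most recent write strictly before `now`, or None."""
    earlier = [w for w in history if w.time < now]
    return max(earlier, key=lambda w: w.time) if earlier else None


def trace_back(history, end_time):
    """Walk backward from `end_time`, one write at a time, like repeating last."""
    chain = []
    now = end_time
    while True:
        w = last_write(history, now)
        if w is None:
            return chain
        chain.append(w)
        now = w.time


# Hypothetical history of one variable; the final value is the surprising one.
history = [
    Write(100, "init.c:12", 0),
    Write(250, "parser.c:88", 7),
    Write(900, "cache.c:301", -1),
]

chain = trace_back(history, end_time=1000)
for w in chain:
    print(w.time, w.location, w.value)
```

Each step lands on the write that precedes the current position, so walking the chain from the bad value leads back toward the root cause.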

Concurrency Issues Including Deadlocks

Deadlocks are relatively easy to spot – the program stops making progress:

  • Go to the end of the recording: ugo end.
  • thread apply all bt – prints the backtrace for all threads, which should show which threads are blocked on a lock.
  • break <locking function> – place a breakpoint on the function in which the threads are blocked trying to acquire the lock.
  • break <unlocking function> – if there is a locking function there must be an unlocking function; place a breakpoint there too.
  • reverse-continue (rc for short) to go back to when the function was called.
  • info threads – to confirm which thread stopped on the breakpoint.
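
A common pattern these steps uncover is a lock-order inversion: two threads take the same two locks in opposite order. The sketch below shows that pattern in Python purely for brevity (a recorded program would typically be native code). The thread and lock names are hypothetical, and a timeout is used so the example reports the problem instead of hanging.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

# Barriers make the interleaving deterministic for the demonstration.
both_hold_first = threading.Barrier(2)
both_tried = threading.Barrier(2)

acquired_second = {}


def worker(name, first, second):
    with first:
        # Wait until each thread holds its first lock.
        both_hold_first.wait()
        # In a real deadlock this call would block forever.
        got = second.acquire(timeout=0.2)
        acquired_second[name] = got
        # Keep holding the first lock until both attempts have finished.
        both_tried.wait()
        if got:
            second.release()


t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()

print(acquired_second)
```

Neither thread obtains its second lock, because the other thread holds it. In a recording, breakpoints on the locking and unlocking functions plus reverse-continue are what reveal which thread took which lock first.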

Understanding New Code

This section is necessarily generic, as code varies so much. Let’s assume you are interested in a specific part of the program, not the whole of it.

  • Place a breakpoint on the function where you want to start your investigation: break <function>.
  • continue to get to it.
  • layout src to see the source code in UDB – this gives you a much better idea of what is going on.
  • next to get to the next line of code.
  • reverse-next to go back to the previous line of code.
  • step to step into the next function.
  • reverse-step to step backward into the function called on the previous line, entering it at its end.
  • last <expression> when you want to know what code was responsible for updating a data structure.
  • ubookmark <name for relevant place/action> – bookmarks make it much easier to understand what is going on: they are always ordered by time, so you can build your graph of actions directly in UDB.

When you have visited enough interesting places (even out-of-order), info bookmarks (i bo for short) will show you the order in which those places were executed. This should help you understand what the software did.
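
The value of time-ordered bookmarks can be illustrated with a toy model. This is a sketch only: add_bookmark and bookmarks_in_time_order are hypothetical helper names, not UDB commands, and the bookmark names and times are invented.

```python
# Toy model: bookmarks are created in the order you happen to visit places,
# but are always listed in the order those places were executed.
bookmarks = {}


def add_bookmark(name, time):
    """Remember `name` at execution time `time` (hypothetical unit)."""
    bookmarks[name] = time


def bookmarks_in_time_order():
    """Return (name, time) pairs sorted by execution time."""
    return sorted(bookmarks.items(), key=lambda item: item[1])


# Visited out of order: the crash first, then an early event, then the cause.
add_bookmark("crash", 4562983)
add_bookmark("config loaded", 1200)
add_bookmark("bad write", 4015425)

for name, time in bookmarks_in_time_order():
    print(time, name)
```

Even though the crash was bookmarked first, the listing starts with the earliest event, which gives a timeline of what the program did.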

Stack Corruption

Stack corruptions are a subset of memory corruptions. They are interesting because in GDB you usually get an unusable stack trace and therefore have no idea where you are or what is happening.

  • Go to the end and see how the stack has been corrupted.
  • backtrace (bt for short) to see that the stack is indeed in bad shape.
  • reverse-stepi (rsi for short) to step backward one instruction.
  • Repeat until the backtrace is useful again.
  • Check what operation is about to be executed; that is likely to be the problem.
  • If there are parameters or variables involved, you might want to use last to see where they came from.

Step by Step Guide to Get Started with Time Travel Debugging

RECORD

Development and Testing: Recording a Program
$ live-record /path/to/myprog myarg1 myarg2

Use LiveRecorder to start myprog and record it.

 

$ live-record -p `pidof myprog`

Start recording the (only) running instance of myprog.

$ live-record --record-on symbol:listen myprog

Start myprog and start recording when it reaches the listen() function.

 

$ live-record --record-on program:myprog* mylauncher

Start mylauncher and follow all forks and execs, recording every process that execs a program whose name begins with myprog.

 

$ live-record -o myprog1.undo --record-on:program --retry-for 10min --save-on error myprog1

To reproduce a flaky failure, run myprog1 repeatedly for up to 10 minutes or until it fails.

CI and Scripting: Automating Recording
$ live-record --save-on error /path/to/mytest myarg1

Drop-in replacement for the command line /path/to/mytest myarg1, giving the same return value, stdout etc. Saves a recording if mytest fails (exits due to SIGSEGV, SIGABRT etc., or with a non-zero exit status).
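
In a CI script, the wrapping can be done programmatically. The following is a minimal sketch in Python, assuming live-record is on the PATH. The helper names live_record_command and run_recorded are hypothetical; only the --save-on error option shown above is used.

```python
import shlex
import subprocess


def live_record_command(test_command):
    """Prefix a test command (a list of arguments) with live-record."""
    return ["live-record", "--save-on", "error", *test_command]


def run_recorded(test_command):
    """Run the wrapped command and return its exit status.

    Assumes live-record is installed; not called in this sketch.
    """
    return subprocess.run(live_record_command(test_command)).returncode


cmd = live_record_command(["/path/to/mytest", "myarg1"])
print(shlex.join(cmd))
```

Because the wrapped command gives the same return value as the original test, a CI job can keep using the exit status of run_recorded to decide pass or fail.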

CI and Production: LiveRecorder Library API

Add this code where you want to start recording:

 undolr_error_t error;

 int e = undolr_start(&error);

 if (e) { /* add error handling */ }

 e = undolr_save_on_termination("rec.undo");

Start recording and save a recording when the program exits. Requires your program to be linked with libundolr_x64.so.

 

 e = undolr_save_point_recording("rec.undo", error);

Save a recording of the current point in time only (e.g. from error handler).

REPLAY

Replay Recordings
$ udb <recording>

Start UDB and load the recording of your application. This will set up the environment for the recording to be replayed. No other steps required.

 

start 1> continue

Replay the recording to see the output and reach the end, where, most likely, the problem manifested.

 

end 4,562,983> backtrace

Get a feeling of where you are, what functions are involved and what went wrong.

 

end 4,562,983> print <variable>

Gather information about the variables involved in the problem; usually one or more will have surprising values.

 

end 4,562,983> last <variable>

Go back to the point in time and line of code where <variable> was most recently modified.

 

88% 4,015,425> ubookmark <name>

Note down interesting places in the execution, with sensible names, so that at a glance you will be able to understand what is happening.

 

43% 1,962,082> info bookmarks

Take a look at all the bookmarks you placed, to see the order of relevant events and get an understanding of what happened, when.

Interactive Debugging – Attach to a Running Process
$ udb -p <pid>

Start UDB and attach to a running process.

 

recording> continue

Just like with recordings, when debugging a live process, the best option is to let it run and allow Undo to capture everything that is happening. Once the bug manifests itself, then you can start to investigate what happened.

 

recording> backtrace

Get a feeling of where you are, what functions are involved and what went wrong.

 

recording> print <variable>

Gather information about the variables involved in the problem; usually one or more will have surprising values.

 

recording> last <variable>

Go back to the point in time and line of code where <variable> was most recently modified.

 

88% 4,015,425> ubookmark <name>

Note down interesting places in the execution, with sensible names, so that at a glance you will be able to understand what is happening.

 

43% 1,962,082> info bookmarks

Take a look at all the bookmarks you placed, to see the order of relevant events and get an understanding of what happened, when.

 

43% 1,962,082> usave <recording name>

The bug has been reproduced and you want to show it to your colleagues. Save the recording and capture this specific run, for them to see or for you to get back to again, later.

Interactive Debugging – Start a New Process
$ udb my_process

Start UDB with my_process loaded.

 

not running> run <arguments of my_process>

Start my_process with its arguments. Just like with recordings, when debugging a live process, the best option is to let it run and allow Undo to capture everything that is happening. Once the bug manifests itself, you can start to investigate what happened.

 

recording> backtrace

Get a feeling of where you are, what functions are involved and what went wrong.

 

recording> print <variable>

Gather information about the variables involved in the problem; usually one or more will have values that are clearly wrong.

 

recording> last <variable>

Check when the variable with the wrong value was last modified, going back in time and observing how / where the problem is.

 

88% 4,015,425> ubookmark <name>

Note down interesting places in the execution, with sensible names, so that at a glance you will be able to understand what is happening.

 

43% 1,962,082> info bookmarks

Take a look at all the bookmarks you placed, to see the order of relevant events and get an understanding of what happened, when.

 

43% 1,962,082> usave <recording name>

The bug has been reproduced and you want to show it to your colleagues. Save the recording and capture this specific run, for them to see or for you to get back to again, later.