LiveRecorder 6.9 recordings are quicker to replay

One of the new features in LiveRecorder 6.9 is saving multiple snapshots into recording files.

You can see how this makes debugging a newly loaded LiveRecorder 6.9 recording up to 4x faster than debugging a newly loaded recording made with LiveRecorder 6.8:

Multiple Snapshots

I’ve illustrated the benefit here with command-line UDB, but it’s even more important to customers testing beta deployments of our new web-based LiveRecorder Observatory interface, where people often load a recording freshly generated from QA or CI to see where it went wrong.

The last command I used to find the erroneous line in the example program is a very helpful tool that is also new in LiveRecorder 6.9. See https://docs.undo.io/TrackingValueChanges.html for a description of how to benefit from it.

Data from the real world

The debugging session above used an example program, but the time to reverse-continue from the end of a recording all the way to the beginning also improved significantly for some very large recordings when tested in real-world situations on Undo customers’ own equipment.

Table: real world data

Testing real-world situations has its own challenges. But Undo’s technology allows us and our customers to precisely replay the execution of a program. So we took some customers’ recordings made with earlier versions of LiveRecorder, added the extra snapshots to simulate recordings made with LiveRecorder 6.9, and tested reverse-continue with both versions.

For those real-world recordings, the speed-up varied between 2.5x and 4x depending on what else the application was doing. Other improvements to replay performance mean that recordings generated with the new LiveRecorder 6.9 will more often see speed-ups close to the 4x maximum, especially for customers who need to record very large programs that run for hours, as with some of the examples above.

In theory, it may be faster to replay some applications’ recordings than it would have taken to execute the application without recording.

How did we achieve this speed-up?

When replaying a recording, the Undo Engine keeps a number of snapshots of the program’s state throughout its execution history so it can rapidly recreate the state of the program at whatever point in history the programmer needs to see.

It doesn’t store the complete state for every moment. Instead, it constructs the state at an arbitrary time by forking a new process from the closest prior snapshot and running the program forward until it reaches the desired time. See https://docs.undo.io/TechnicalDetails.html for a description of how this works.
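As a rough illustration of the "closest prior snapshot" idea, here is a minimal Python sketch. It is not the Undo Engine's implementation: the function names are hypothetical, and it assumes history can be modelled as integer time units with snapshots stored as a sorted list of times.

```python
from bisect import bisect_right


def closest_prior_snapshot(snapshot_times, target_time):
    """Return the latest snapshot time that is at or before target_time.

    snapshot_times must be sorted in ascending order (an assumption of
    this sketch, not something stated about the real engine).
    """
    index = bisect_right(snapshot_times, target_time)
    if index == 0:
        raise ValueError("no snapshot at or before the target time")
    return snapshot_times[index - 1]


def replay_distance(snapshot_times, target_time):
    """How much history must be re-executed to reach target_time."""
    return target_time - closest_prior_snapshot(snapshot_times, target_time)


# Hypothetical snapshot times for a history of 1000 time units.
snapshots = [0, 250, 500, 750]
print(closest_prior_snapshot(snapshots, 600))  # snapshot to fork from
print(replay_distance(snapshots, 600))         # history to run forward
```

The closer a snapshot sits to the requested time, the less of the program has to be run forward again, which is why the number and placement of snapshots matter so much for replay speed.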

The Undo Engine loads snapshots from the recording file and dynamically adds and removes additional snapshots as the recording is replayed.

Replay of the customer applications above is so much faster thanks to two key features that came out in LiveRecorder 6.8 and 6.9:

  • LiveRecorder 6.8 and earlier saved two snapshots into the recording file: one at the start of the program’s execution history and one near the end. LiveRecorder 6.9 saves four snapshots into the recording file, one for each quarter of the recorded history, erring toward saving the last snapshot closer to the end of history and keeping the others as equally spaced as possible. These snapshots are then present in memory as soon as a recording is loaded, without needing to wait until the recording has been replayed all the way through to dynamically produce additional snapshots.
  • LiveRecorder 6.8 introduced parallel search. This means that when searching throughout history to find the last time a variable changed, the Undo Engine replays four different parts of history at the same time, seeing which one contains the change we’re looking for. See this technical article for an explanation of why parallel search is so useful.
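To make the interaction between these two features concrete, here is a toy Python model. It is only a sketch under stated assumptions: replay runs at a constant rate, each span of history between two snapshots is replayed by its own worker, and the snapshot positions below are hypothetical values chosen to resemble the two layouts described above. The function names are invented for illustration.

```python
def spans_between_snapshots(snapshot_times, end_time):
    """Split history into spans that each start at a snapshot."""
    starts = sorted(snapshot_times)
    ends = starts[1:] + [end_time]
    return [(start, end) for start, end in zip(starts, ends) if end > start]


def parallel_replay_time(snapshot_times, end_time):
    """Time to replay all of history with one worker per span.

    All spans run at the same time, so the longest span determines
    the total (toy model: constant replay rate, no overheads).
    """
    spans = spans_between_snapshots(snapshot_times, end_time)
    return max(end - start for start, end in spans)


END = 1000  # hypothetical length of history, in arbitrary time units

# Hypothetical layouts: two snapshots (start and near the end)
# versus four snapshots spread across history.
two_snapshots = [0, 990]
four_snapshots = [0, 250, 500, 750]

for name, layout in [("two", two_snapshots), ("four", four_snapshots)]:
    time_needed = parallel_replay_time(layout, END)
    print(name, time_needed, round(END / time_needed, 2))
```

In this model the two-snapshot layout leaves one span covering almost all of history, so parallel search has nothing to divide up, while four evenly spread snapshots cut the longest span to a quarter of history. That matches the "up to 4x" ceiling described in the article. Real recordings fall below it when spans are uneven or other work dominates.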

Together, these features mean that even just after you’ve loaded a recording, the Undo Engine can search through multiple parts of history at the same time, giving up to a 4x speed-up compared to older versions of Undo:

Multiple snapshot diagram

Fiddly details

Note that this only helps to speed up the slowest debugging operations: the ones that need to replay large parts or the entirety of the program’s execution history.

Even in earlier versions of Undo, recordings were saved with a snapshot just before the end of execution history, meaning that after loading a recording a programmer could explore the very last part of history where any failure was most likely to be. Those operations which were already fairly quick have not got any faster with this change.

In our example program, if the line with the mistake that corrupted the pointer had been executed near the end of its execution, just before the print statement and after the long slow “calculation”, our last command would have uncovered it promptly, even using a recording produced by an older version of LiveRecorder.

If the pointer had been corrupted just before that snapshot, earlier versions of the Undo Engine would only have been able to construct the program’s state at that point by replaying the program’s execution from the beginning, filling in missing snapshots as it went.

The same applies if the instruction the programmer needed to find had been near the beginning of the program’s execution, as in our example. If the variable is assigned multiple times, a change to the pointer’s value early in history might not be the reason it has its final value, so the rest of history still has to be checked for later changes.
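The following Python sketch shows why an early change can only be confirmed as the last one after every later part of history has been ruled out. It is an illustrative model only: the write times, spans, and function names are hypothetical, and the real search works on a replayed program, not on a list of timestamps.

```python
def last_change_in_span(write_times, span):
    """Latest write inside [start, end), or None if the span has none."""
    start, end = span
    hits = [t for t in write_times if start <= t < end]
    return max(hits) if hits else None


def last_change(write_times, spans):
    """Find the final write by checking spans from latest to earliest.

    A write found in an early span only counts as the answer once
    every later span has been checked and found to contain no writes.
    """
    for span in sorted(spans, reverse=True):
        found = last_change_in_span(write_times, span)
        if found is not None:
            return found
    return None


# Hypothetical history: the pointer is written twice, both times early on.
spans = [(0, 250), (250, 500), (500, 750), (750, 1000)]
print(last_change([40, 120], spans))
```

Here the three later spans contain no writes, yet each one still has to be searched before the write at time 120 can be reported as the last change.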

Using LiveRecorder 6.8, those operations that needed to replay the entire history became quicker because the Undo Engine could replay four parts of history simultaneously. But that only helped once another operation had already regenerated sufficiently many snapshots. With LiveRecorder 6.9, you get that increased speed as soon as you sit down and load a recording.

By Jack Vickeridge, a software engineer at Undo
