Undo's co-founder and CTO, Dr. Greg Law, sat down with InfoQ's Daniel Bryant at QCon 2019 in San Francisco. That discussion was covered in a recent InfoQ feature.
Undo’s LiveRecorder captures all nondeterministic data, down to the instruction level, which can then be used to recreate an application’s complete state. Recording typically slows the target application by a factor of 2 to 5. LiveRecorder is used only to record failed processes; it is not intended to run constantly like a profiler or other observability tooling.
Logging is valuable because it provides additional context about the business problem or underlying algorithm. Logging can 'state what you know'. An exception or stack trace can also give good clues as to what’s gone wrong. But often, as an engineer, you simply need to replay the scenario that caused an issue and build, step by step, a mental model of the issue and its underlying causes.
Data capture is optimized so that the recorded data can be held in memory, and the program’s entire memory and register state can be recreated on demand with minimal overhead. The recorded data can be persisted to a file and shared amongst engineers. LiveRecorder comes with an integrated reversible debugger, so a recording can be debugged by replaying it backwards and forwards.
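The record-and-replay idea can be illustrated with a toy sketch in Python. This is not how LiveRecorder is implemented, and it works at a far coarser level than individual instructions; the names Recorder, Replayer, run and state_at are hypothetical and exist only for this illustration. It assumes the only nondeterministic input is a random number source, logs each value consumed, serializes the log (a JSON string stands in for a recording file), and then re-runs the same deterministic logic against the log to recreate the program state at any step.

```python
import json
import random


class Recorder:
    """Hypothetical recorder: logs every nondeterministic value consumed."""

    def __init__(self):
        self.log = []

    def rand(self, lo, hi):
        value = random.randint(lo, hi)  # the nondeterministic input
        self.log.append(value)          # captured so the run can be replayed
        return value


class Replayer:
    """Hypothetical replayer: returns the recorded values in the same order."""

    def __init__(self, log):
        self._values = iter(log)

    def rand(self, lo, hi):
        return next(self._values)


def run(source, steps):
    """Deterministic program logic; returns the running total after each step."""
    total = 0
    history = []
    for _ in range(steps):
        total += source.rand(1, 100)
        history.append(total)
    return history


def state_at(log, step):
    """Recreate the state after `step` steps by replaying from the start."""
    return run(Replayer(log), steps=step)[-1] if step > 0 else 0


# Record a live run and serialize the log (a stand-in for a recording file).
recorder = Recorder()
live_history = run(recorder, steps=5)
saved = json.dumps(recorder.log)

# Replay from the saved log: the run is reproduced exactly.
log = json.loads(saved)
replayed_history = run(Replayer(log), steps=5)
assert replayed_history == live_history

# "Stepping backwards" from step 4 to step 3 is just recreating an earlier state.
assert state_at(log, 3) == live_history[2]
```

In this sketch, moving backwards means replaying from the beginning up to an earlier step. That is affordable only because the program is tiny, but it shows why capturing the nondeterministic inputs alone is enough to reconstruct every intermediate state.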
In discussion with InfoQ, Law made clear that LiveRecorder should not be seen as an alternative to emitting metrics or logging. LiveRecorder does not aim to alert engineers to the presence of faults or failures within an application; that is best accomplished through metric collection and analysis, exception handling code, and alerting tooling. Engineers can also use logging to provide more context about what is, or should be, occurring within an application at an arbitrary point in execution.