How to identify race conditions in multithreaded code

Rare race conditions in multithreaded code can be difficult to identify. Undo Suite 8.0 introduces Feedback-Directed Thread Fuzzing, which lets you capture more of those bugs in Undo recordings, and capture them sooner, so you can diagnose and fix them before release.

The new live-record --thread-fuzzing-analyze option analyzes an Undo recording to identify instructions that access memory shared between multiple threads, and increases the frequency of thread switches around those instructions when re-recording the program.

Read how one engineer finds value in Undo’s Feedback-Directed Thread Fuzzing feature below:

“Undo and its directed thread fuzzing capability has become essential for debugging our heavily multithreaded codebase and making sure we identify and fix known bugs during development – before they get anywhere near customers.” 

David – Software Engineer at a leading networking company

The challenge

David works on the infrastructure code for a networking device (complex C code running on Linux) which orchestrates all network operations.

During development, David encountered a problem in a complex piece of concurrent code: the program crashed intermittently when running multithreaded.

Introducing printf statements in several locations made the bug disappear, which made it even harder to pin down. Integrating sanitizers was not convenient because of the way the data structures are allocated. So David went back to running the program normally, again and again, but learned nothing new. It was a typical heisenbug, lurking in the code and only appearing from time to time.

Traditional methods were not satisfactory

David tried GDB, but reproducing the issue consistently took multiple runs. Even once the program crashed, the root cause was hard to identify: certain data structures contained data that should not have been there, which indicated that the structures had somehow been modified.

David spent two days trying to debug the issue with conventional methods but did not get anywhere.

The solution

Finally, David tried Undo’s new functionality called Feedback-Directed Thread Fuzzing. Undo’s thread fuzzing is a key feature that enables engineers to expose concurrency bugs in large-scale multithreaded codebases. It does that by manipulating the way the threads are scheduled, making concurrency bugs that are rare in normal circumstances become statistically more common. It’s a bit like shaking your codebase and seeing what comes out. 

Feedback-Directed Thread Fuzzing is a new enhancement that uses a 2-pass technique:

  • 1st pass: Identifies memory accessed by multiple threads
  • 2nd pass: Uses the information from the 1st pass to force thread switches around these data accesses, increasing the likelihood of race conditions occurring.
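
To make the idea concrete, here is a minimal sketch in C of the kind of bug this targets. It is an illustration only, not Undo’s implementation and not David’s code: the names run_counter, worker and worker_args are made up for this example, and the sched_yield() call is just a crude stand-in for a forced thread switch. Two threads update a shared counter with a separate read and write; if the other thread runs between the two, an update is lost.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Shared state touched by both threads: the kind of memory the
 * 1st pass is described as identifying. */
static atomic_long counter;

/* Hypothetical knobs for this sketch only. */
struct worker_args {
    long iterations;
    bool yield_in_window;  /* crude stand-in for a forced thread switch */
    bool use_atomic_rmw;   /* true selects the race-free variant */
};

static void *worker(void *p)
{
    const struct worker_args *args = p;

    for (long i = 0; i < args->iterations; i++) {
        if (args->use_atomic_rmw) {
            atomic_fetch_add(&counter, 1);  /* one indivisible update */
            continue;
        }
        long seen = atomic_load(&counter);  /* read ... */
        if (args->yield_in_window)
            sched_yield();                  /* invite a switch inside the window */
        atomic_store(&counter, seen + 1);   /* ... then write: updates can be lost */
    }
    return NULL;
}

/* Runs two workers and returns the final counter value,
 * or -1 if a thread could not be started. */
long run_counter(long iterations, bool yield_in_window, bool use_atomic_rmw)
{
    struct worker_args args = { iterations, yield_in_window, use_atomic_rmw };
    pthread_t t[2];

    atomic_store(&counter, 0);
    if (pthread_create(&t[0], NULL, worker, &args) != 0)
        return -1;
    if (pthread_create(&t[1], NULL, worker, &args) != 0) {
        pthread_join(t[0], NULL);
        return -1;
    }
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return atomic_load(&counter);
}
```

Built with cc -pthread and called from a main function, run_counter(100000, true, false) will usually come back short of 200000, by a different amount on each run, while run_counter(100000, false, true) always returns exactly 200000 because atomic_fetch_add leaves no window between the read and the write.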

David used the following debugging procedure:

  • Obtained a “reference” Undo recording file of the program with: live-record -o recording.undo --disable-aslr program-name. This recording run did not need to exhibit the crash.
  • Fed the recording into Undo with directed thread fuzzing on: live-record --thread-fuzzing-analyze recording.undo program-name, repeating the run 20 times via a shell script. An alternative that avoids shell scripting is the --retry-for option: live-record --thread-fuzzing-analyze recording.undo --retry-for 5m --save-on error program-name
  • Obtained a crashing execution (in less than a minute!) in the form of a recording file, which he could then load into Undo UDB and debug by travelling back and forth through the program’s execution to see what happened.

Analysis of the failing process captured in the recording showed that the proximate cause of the crash was the heap’s internal bookkeeping data being overwritten by a memcpy.

Want to try Feedback-Directed Thread Fuzzing on your own application and see what you can find?

Get started in under two minutes with our free trial.
