WatchPoint

Debugging multithreaded code with GDB: thread names

Threading bugs can be tricky to diagnose! Thankfully, GDB has some great functionality for helping to debug threads. In this tutorial, we’ll look at how to debug threads using GDB, along with some helpful examples.

 

Debugging C/C++ with GDB & Pthreads

The following program is used for demonstration. It creates an array holding four letters and four threads, each of which endlessly advances one letter of the array to the next letter of the alphabet, wrapping from 'z' back to 'a'.
After each increment, the thread prints the array followed by a carriage return, so the output keeps overwriting the same terminal line.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <assert.h>
#include <string.h>
#include <pthread.h>

#define THREAD_COUNT 4

char spinners[THREAD_COUNT + 1]; /* one letter per thread, plus a terminating NUL for printf("%s") */

static void* looper(void *p) {
    int idx = (intptr_t) p;
    while (1) {
        spinners[idx]++;
        if (spinners[idx] > 'z') spinners[idx] = 'a';
        printf(" %s\r", spinners);
        fflush(stdout);
    }
    return p;
}

int main() {
    memset(spinners, 'a', THREAD_COUNT);
    pthread_t threads[THREAD_COUNT];
    intptr_t i; for (i = 0; i < THREAD_COUNT; i++) {
        int e = pthread_create(&threads[i], NULL, looper, (void*)i);
        assert(!e);
        char name[64];
        snprintf(name, sizeof name, "worker%ld", (long) i);
        pthread_setname_np(threads[i], name);
    }
    pthread_join(threads[0], NULL);
    return 0;
}

This is compiled with:

$ cc -Wall -pipe -std=c89 -Og -ggdb3 -pthread -o threads threads.c

Running it in GDB gives:

GNU gdb (GDB) 16.2
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./threads...
(gdb) run
Starting program: /home/user/Downloads/pthreads/threads 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7bff6c0 (LWP 2980)]
[New Thread 0x7ffff73fe6c0 (LWP 2981)]
[New Thread 0x7ffff6bfd6c0 (LWP 2982)]
[New Thread 0x7ffff63fc6c0 (LWP 2983)]
 pzsr

A line such as [New Thread 0x7ffff7bff6c0 (LWP 2980)] reports the creation of a new thread: LWP stands for "light-weight process", and 2980 is that thread's ID. You can press Ctrl+C to interrupt the running program, then use the commands info threads and backtrace (or bt) to list the threads and see where the selected one is:

 ^C
Thread 1 "threads" received signal SIGINT, Interrupt.
__syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
warning: 56	../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S: No such file or directory
(gdb) info threads
  Id   Target Id                                  Frame 
* 1    Thread 0x7ffff7f96740 (LWP 2977) "threads" __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
  2    Thread 0x7ffff7bff6c0 (LWP 2980) "worker0" __GI___lll_lock_wake_private (futex=0x7ffff7e127b0 <_IO_stdfile_1_lock>)
    at ./nptl/lowlevellock.c:57
  3    Thread 0x7ffff73fe6c0 (LWP 2981) "worker1" __GI___lll_lock_wake_private (futex=0x7ffff7e127b0 <_IO_stdfile_1_lock>)
    at ./nptl/lowlevellock.c:57
  4    Thread 0x7ffff6bfd6c0 (LWP 2982) "worker2" __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
  5    Thread 0x7ffff63fc6c0 (LWP 2983) "worker3" futex_wait (futex_word=0x7ffff7e127b0 <_IO_stdfile_1_lock>, expected=2, private=0)
    at ../sysdeps/nptl/futex-internal.h:146
(gdb) bt
#0  __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
#1  0x00007ffff7c9eae3 in __internal_syscall_cancel (a1=a1@entry=140737349941648, a2=<optimized out>, a3=<optimized out>, a4=a4@entry=0, a5=a5@entry=0, a6=a6@entry=4294967295, nr=202)
    at ./nptl/cancellation.c:49
#2  0x00007ffff7c9f237 in __futex_abstimed_wait_common64 (private=128, futex_word=0x7ffff7bff990, expected=<optimized out>, op=265, abstime=0x0, cancel=true) at ./nptl/futex-internal.c:57
#3  __futex_abstimed_wait_common (futex_word=0x7ffff7bff990, expected=<optimized out>, clockid=0, abstime=0x0, private=128, cancel=true) at ./nptl/futex-internal.c:87
#4  __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7ffff7bff990, expected=<optimized out>, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=128)
    at ./nptl/futex-internal.c:139
#5  0x00007ffff7ca4614 in __pthread_clockjoin_ex (threadid=140737349940928, thread_return=thread_return@entry=0x0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, block=block@entry=true)
    at ./nptl/pthread_join_common.c:108
#6  0x00007ffff7ca4483 in ___pthread_join (threadid=<optimized out>, thread_return=thread_return@entry=0x0) at ./nptl/pthread_join.c:24
#7  0x000055555555530f in main () at threads.c:33
The parts of each line of the info threads output are:

Section                     Description
* 1                         The asterisk marks the currently selected thread; 1 is its GDB thread Id
LWP 2977                    Light-weight process (thread) and its ID
"worker0"                   Thread name, set by pthread_setname_np(threads[0], name);
__syscall_cancel_arch ()    The function the thread is currently executing

The backtrace shows the call stack of the main thread because that is the selected thread. To view the backtrace of another thread, switch to it with thread n, where n is the Id of the thread, and run backtrace again:

(gdb) thread 2
[Switching to thread 2 (Thread 0x7ffff7bff6c0 (LWP 2980))]
#0  __GI___lll_lock_wake_private (futex=0x7ffff7e127b0 <_IO_stdfile_1_lock>) at ./nptl/lowlevellock.c:57
warning: 57	./nptl/lowlevellock.c: No such file or directory
(gdb) bt
#0  __GI___lll_lock_wake_private (futex=0x7ffff7e127b0 <_IO_stdfile_1_lock>) at ./nptl/lowlevellock.c:57
#1  0x00007ffff7c8b667 in _IO_acquire_lock_fct (p=<synthetic pointer>) at ./libio/libioP.h:991
#2  __GI__IO_fflush (fp=0x7ffff7e115c0 <_IO_2_1_stdout_>) at ./libio/iofflush.c:39
#3  0x000055555555525f in looper (p=<optimized out>) at threads.c:18
#4  0x00007ffff7ca27f1 in start_thread (arg=<optimised out>) at ./nptl/pthread_create.c:448
#5  0x00007ffff7d33c9c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

We can use the display command (abbreviated disp) to print the spinners array every time the program stops, which makes it easy to watch the array change as the program executes.

Depending on your platform and GDB configuration, when using next, GDB may either run just the selected thread…

(gdb) disp spinners
1: spinners = "vqoh"
(gdb) next
17	    if (spinners[idx] > 'z') spinners[idx] = 'a';
1: spinners = "vqoh"
(gdb) next
19	    printf("%s\r", spinners);
1: spinners = "wqoh"
(gdb) next
20	    fflush(stdout);
1: spinners = "wqoh"
(gdb) next
wqoh
14	  while (1)
1: spinners = "wqoh"
(gdb) next

…or, as is typical on current systems, all the threads together. In the latter situation, the other threads run further ahead than the current thread, because GDB waits for the current thread to stop before it stops the other threads.

You can see this as the second, third and fourth characters increment much further than the first character (which is being altered by the currently selected thread):

(gdb) disp spinners
1: spinners = "qkzb"
(gdb) next
 qcpe
17			printf(" %s\r", spinners);
1: spinners = "qcpf"
(gdb) next
 qfqa
18			fflush(stdout);
1: spinners = "qgqb"
(gdb) next
 qlel
15			spinners[idx]++;
1: spinners = "qlel"
(gdb) next
 rnys
16			if (spinners[idx] > 'z') spinners[idx] = 'a';
1: spinners = "rnzs"

To stop the other threads from running in parallel, use the command set scheduler-locking on, which lets you focus on the execution of a single thread:

(gdb) set scheduler-locking on
(gdb) next
19	    printf("%s\r", spinners);
1: spinners = "xqoh"
(gdb) 
20	    fflush(stdout);
1: spinners = "xqoh"
(gdb) 
xqoh
14	  while (1)
1: spinners = "xqoh"
(gdb) 
16	    spinners[idx]++;
1: spinners = "xqoh"
(gdb) 
17	    if (spinners[idx] > 'z') spinners[idx] = 'a';
1: spinners = "xqoh"
(gdb) 
19	    printf("%s\r", spinners);
1: spinners = "yqoh"
(gdb) 
20	    fflush(stdout);
1: spinners = "yqoh"
(gdb) 
yqoh
14	  while (1)
1: spinners = "yqoh"
(gdb) 
16	    spinners[idx]++;
1: spinners = "yqoh"
(gdb) 
17	    if (spinners[idx] > 'z') spinners[idx] = 'a';
1: spinners = "yqoh"

As you can see, none of the other characters change alongside the first, which means that all the other threads are paused. Another mode is set scheduler-locking step, in which only the current thread runs while you single-step with step or next, but all threads are free to run during commands such as continue, until, and finish.

Breakpoints

When you set a breakpoint, GDB reports whichever thread hits it first. If your system runs only the selected thread even with scheduler-locking disabled, that will always be the selected thread; otherwise it can be any thread. Below is the former case:

(gdb) b 20
Breakpoint 1 at 0x2019f9: file threads.c, line 20.
(gdb) set scheduler-locking off
(gdb) info threads
  Id   Target Id                            Frame 
  1    LWP 100298 of process 1667           0x0000000800253e7c in ?? ()
   from /lib/libthr.so.3
* 2    LWP 100370 of process 1667 "worker0" 0x000000080025cf5a in ?? ()
   from /lib/libthr.so.3
  3    LWP 100371 of process 1667 "worker1" 0x0000000800253e7a in ?? ()
  from /lib/libthr.so.3
  4    LWP 100373 of process 1667 "worker3" 0x0000000800253e7c in ?? ()
   from /lib/libthr.so.3
  5    LWP 100372 of process 1667 "worker2" 0x0000000800253e7c in ?? ()
   from /lib/libthr.so.3
(gdb) continue
Continuing.
Thread 2 "worker0" hit Breakpoint 1, looper (p=<optimized out>) at threads.c:20
20	    fflush(stdout);
1: spinners = "rqtg"
(gdb) continue
Continuing.
rqtg
Thread 2 "worker0" hit Breakpoint 1, looper (p=<optimized out>) at threads.c:20
20	    fflush(stdout);
1: spinners = "sqtg"
(gdb) continue
Continuing.
sqtg
Thread 2 "worker0" hit Breakpoint 1, looper (p=<optimized out>) at threads.c:20
20	    fflush(stdout);
1: spinners = "tqtg"
(gdb) continue
Continuing.
tqtg
Thread 2 "worker0" hit Breakpoint 1, looper (p=<optimized out>) at threads.c:20
20	    fflush(stdout);
1: spinners = "uqtg"

Running commands on multiple threads

You can use thread apply all to run a command on all threads. Here is that command being used to print the program counter of all threads:

(gdb) thread apply all print $pc

Thread 5 (Thread 0x7ffff63fc6c0 (LWP 2983) "worker3"):
$1 = (void (*)()) 0x7ffff7caafb0 <__syscall_cancel_arch>

Thread 4 (Thread 0x7ffff6bfd6c0 (LWP 2982) "worker2"):
$2 = (void (*)()) 0x7ffff7c9f3fb <__GI___lll_lock_wait_private+43>

Thread 3 (Thread 0x7ffff73fe6c0 (LWP 2981) "worker1"):
$3 = (void (*)()) 0x7ffff7c9f3fb <__GI___lll_lock_wait_private+43>

Thread 2 (Thread 0x7ffff7bff6c0 (LWP 2980) "worker0"):
$4 = (void (*)()) 0x7ffff7c9f3fb <__GI___lll_lock_wait_private+43>

Thread 1 (Thread 0x7ffff7f96740 (LWP 2977) "threads"):
$5 = (void (*)()) 0x7ffff7caafe2 <__syscall_cancel_arch+50>

To run the command in order of ascending thread Id, use the -ascending flag:

(gdb) thread apply all -ascending print $pc

Thread 1 (Thread 0x7ffff7f96740 (LWP 2977) "threads"):
$6 = (void (*)()) 0x7ffff7caafe2 <__syscall_cancel_arch+50>

Thread 2 (Thread 0x7ffff7bff6c0 (LWP 2980) "worker0"):
$7 = (void (*)()) 0x7ffff7c9f3fb <__GI___lll_lock_wait_private+43>

Thread 3 (Thread 0x7ffff73fe6c0 (LWP 2981) "worker1"):
$8 = (void (*)()) 0x7ffff7c9f3fb <__GI___lll_lock_wait_private+43>

Thread 4 (Thread 0x7ffff6bfd6c0 (LWP 2982) "worker2"):
$9 = (void (*)()) 0x7ffff7c9f3fb <__GI___lll_lock_wait_private+43>

Thread 5 (Thread 0x7ffff63fc6c0 (LWP 2983) "worker3"):
$10 = (void (*)()) 0x7ffff7caafb0 <__syscall_cancel_arch>

Instead of the all keyword, you can give a list or range of thread Ids to apply the command to:

(gdb) thread apply 4-5 print &errno

Thread 4 (Thread 0x7ffff6bfd6c0 (LWP 2982) "worker2"):
$11 = (int *) 0x7ffff6bfd648

Thread 5 (Thread 0x7ffff63fc6c0 (LWP 2983) "worker3"):
$12 = (int *) 0x7ffff63fc648

To keep going through the threads while silently ignoring errors, use the -s flag. This is useful when a command only works for some threads.
A shortened form of thread apply all is taa. A shortened form of thread apply all -s is taas, used below:

(gdb) taas print &errno

Thread 5 (Thread 0x7ffff63fc6c0 (LWP 2983) "worker3"):
$13 = void

Thread 4 (Thread 0x7ffff6bfd6c0 (LWP 2982) "worker2"):
$14 = void

Thread 3 (Thread 0x7ffff73fe6c0 (LWP 2981) "worker1"):
$15 = void

Thread 2 (Thread 0x7ffff7bff6c0 (LWP 2980) "worker0"):
$16 = void

Thread 1 (Thread 0x7ffff7f96740 (LWP 2977) "threads"):
$17 = void

Non-stop mode

In non-stop mode, when one thread stops (for example, at a breakpoint), GDB stops only that thread and leaves the others running. Because this program prints continuously from its concurrent threads, and GDB and the program normally share the terminal, it helps to start the program with r >/dev/null to discard its standard output. Alternatively, redirect the program's input and output to a separate terminal emulator (e.g. run >/dev/pts/4 </dev/pts/4).

(gdb) set non-stop on
Cannot change this setting while the inferior is running.
(gdb) kill
Kill the program being debugged? (y or n) y
[Inferior 1 (process 2977) killed]
(gdb) set non-stop on
(gdb) run
Starting program: /home/user/Downloads/pthreads/threads 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7bff6c0 (LWP 2991)]
Thread 2 "worker0" hit Breakpoint 1, __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:31
warning: 31	../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S: No such file or directory
1: spinners = "baaa"
(gdb) [New Thread 0x7ffff73fe6c0 (LWP 2992)]
[New Thread 0x7ffff6bfd6c0 (LWP 2993)]
[New Thread 0x7ffff63fc6c0 (LWP 2994)]
Thread 1 "threads" hit Breakpoint 1, __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:31
31	in ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S
1: spinners = "bbbb"
Quit
(gdb) delete
Delete all breakpoints, watchpoints, tracepoints, and catchpoints? (y or n) y
(gdb) continue
Continuing.
^C
Thread 1 "threads" received signal SIGINT, Interrupt.
__syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
56	in ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S
1: spinners = "bbbb"
(gdb) info threads
  Id   Target Id                                  Frame 
* 1    Thread 0x7ffff7f96740 (LWP 2990) "threads" __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
  2    Thread 0x7ffff7bff6c0 (LWP 2991) "worker0" __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:31
  3    Thread 0x7ffff73fe6c0 (LWP 2992) "worker1" (running)
  4    Thread 0x7ffff6bfd6c0 (LWP 2993) "worker2" (running)
  5    Thread 0x7ffff63fc6c0 (LWP 2994) "worker3" (running)

You can see that threads 3 through 5 are shown as (running), while threads 1 and 2 are stopped.
Note that non-stop mode can only be changed while the inferior program is not running, and that it is only supported on certain targets:

(gdb) set non-stop on
Cannot change this setting while the inferior is running.
(gdb) kill
Kill the program being debugged? (y or n) y
[Inferior 1 (process 1667) killed]
(gdb) set non-stop on
(gdb) run
The target does not support running in non-stop mode.

 
