Understanding Pointer Authentication Effects in Debugger Output

or, why is UDB showing me mad things?

Author: Andrew Collier, Senior Software Engineer at Undo

 

When you are debugging an application running on an ARM CPU, if you examine the register contents you may sometimes notice that pointers have values that seem to be higher than the expected possible range. This is particularly visible in the link register (X30) which, by convention, contains the instruction address where code execution will continue after the current function is completed. The extra high-value bits appear to make its value invalid, not pointing to real code. Why is this?

This is Pointer Authentication, a security feature introduced as FEAT_PAuth in ARMv8.3-A, which aims to mitigate certain kinds of control-flow attack by cryptographically signing pointers, and later authenticating that signature before the pointer is used again. The signature (Pointer Authentication Code, or PAC) is stored in some otherwise-unused upper bits of the 64-bit pointer.

Your application may already be using Pointer Authentication, without you having made any explicit changes. At the moment you are most likely to see this in system libraries, including glibc, on certain relatively modern Linux distributions (e.g. Fedora Linux since version 33).

Additionally, to get the benefits of pointer authentication throughout your application, you can make a relatively small change to your compiler command line (which should require no changes to your source code) by adding the following:

-mbranch-protection=pac-ret+leaf

On its own, this will use only the subset of instructions that are fully backward-compatible with the very earliest 64-bit ARM chips. However, ARMv8.3-A has some additional related instructions which enable better code size and performance. If you know that your application will only ever run on a CPU that implements them, you can take advantage of them by also adding the following:

-march=armv8.3-a

Effects visible in a debugging session

When GDB or UDB displays a backtrace, it is generally able to automatically take account of PAC and recover the original code addresses. The stripped address is displayed in the backtrace with a label [PAC] where this needed to be done:

[Screenshot: backtrace in which a stripped address is labelled [PAC]]

However, if (for whatever reason) you just have a pointer and need to find out which code it is associated with, the debugger does not have enough context to do this automatically and you will need to strip off the PAC manually.

Strip PAC manually

Unfortunately it is system-dependent which address bits need to be removed, but the target exposes a pseudo-register $pauth_cmask to describe the required bitmask. It can be used like this for example:

[Screenshot: debugger session using $pauth_cmask to strip the PAC from a pointer]
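As a rough illustration of the arithmetic involved, the sketch below clears the PAC bits from a pointer in plain C. It assumes that the mask has a 1 in every bit position that carries PAC data and that the pointer is a user-space address; the function name strip_pac is hypothetical. In a GDB or UDB session the equivalent should be an expression along the lines of `$x30 & ~$pauth_cmask`.

```c
#include <stdint.h>

/* Clear the PAC bits from a signed user-space code pointer.
 * Assumption: `cmask` has a 1 in every bit position that holds PAC data,
 * which is how this sketch interprets the value of $pauth_cmask. */
static uint64_t strip_pac(uint64_t signed_ptr, uint64_t cmask)
{
    return signed_ptr & ~cmask;
}
```

For example, with an assumed mask of 0x007f000000000000 (illustrative only, since the real value is system-dependent), strip_pac(0x0023aaaab0000784, mask) gives 0x0000aaaab0000784.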

Can pointer authentication have an effect on the operation of the program itself?

In a very few circumstances, yes, but it is probably only noticeable if you are mixing compiled code with either hand-written assembler routines or JIT compilation.

Generally, on entry to your hand-written routine the Link Register will contain an unmodified address, and it is your choice whether or not to sign it. When the routine returns, you need to consistently make the same choice of whether to authenticate the Link Register value. This is usually trivial to do correctly, but you need to take extra care if you implement tail call optimisations: in this case a different function may have chosen to sign the return address, and you need to ensure that your routine makes the same choice. It is only possible to authenticate a pointer that has been signed.

Note that the encryption key used for pointer signing will be different every time your application runs. This means that the PAC values will be different in each invocation, which may be confusing if you are trying to debug a problem with signed pointers. If you can use Undo to record your application, the encryption keys are saved along with the recording, so the PAC values generated will be the same every time the recording is replayed.

What challenges does pointer authentication create for tools like UDB which rely on dynamic binary translation?

Dynamic Binary Translation is an environment in which, rather than directly executing the instructions of a program, the instructions are examined just before they are needed and a modified program is written elsewhere in memory, which is actually run instead. This might allow you to run an application that was compiled for an entirely different kind of CPU, for example.

UDB is an interactive time travel debugger for Linux. We intercept each instruction and record any potentially non-deterministic results into an execution history, which you can replay to get instant visibility into what the process just did, and why. UDB allows you to travel back and forth in the recording to inspect program state at any point in time, following the trail from symptom to root cause in a single debug cycle.

Because the instructions that are actually executed are at a different address from the original application code, care always needs to be taken whenever there is a branch from one instruction into any other part of the application. In UDB’s case, any address that appears on the stack or in the link register is the address as it is seen by the original program (we call this an A-address). We translate it into the modified address of the instrumented code (we call this a B-address) at the time it is used. When an application signs a pointer, or authenticates a PAC, it is an A-address that is being checked.

It is often advantageous to be able to predict, before executing an instruction, what effects it might have (in terms of accessing certain memory regions, causing exceptions to be taken etc.) but interestingly this is not always possible in the case of pointer authentication. There is no direct mechanism for a program to determine whether or not a PAC is valid, other than to attempt authenticating it: on some CPUs (e.g. AWS’s Graviton 3) a failure just results in the pointer value being made out of range, which we can compare against the original pointer value and determine that failure occurred. But on other CPUs (e.g. Graviton 4, or more generally those that implement ARM architecture feature FEAT_FPAC), a failure will immediately result in a synchronous exception which is reported by Linux as a SIGILL signal.
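The comparison described for CPUs without FEAT_FPAC can be sketched as follows. This is only an illustration, not Undo's implementation: the function name pac_auth_failed is hypothetical, and it assumes a user-space pointer and a mask with a 1 in every PAC bit position, so that a successful authentication is expected to yield the signed pointer with those bits cleared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether an authenticate instruction failed on a CPU that does not
 * raise an exception on failure.
 *   signed_ptr:    the pointer value before authentication
 *   authenticated: the value left behind by the authenticate instruction
 *   mask:          assumed to have a 1 in every PAC bit position
 * On success the result should be the plain address; on failure the CPU
 * leaves a deliberately out-of-range value, so the two differ. */
static bool pac_auth_failed(uint64_t signed_ptr, uint64_t authenticated,
                            uint64_t mask)
{
    uint64_t expected = signed_ptr & ~mask;
    return authenticated != expected;
}
```

On a CPU with FEAT_FPAC this check is never reached, because the authenticate instruction itself raises the exception.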

It is very important that, should the application ever try to authenticate a pointer that has an invalid PAC, then the effect must be the same inside the Undo environment as it would have been if the application were run normally. (If this were not the case, and we allowed the use of pointers whose signature is not valid, this might result in weakening the security of the application and allow bugs to be exploited in ways that would otherwise have been prevented by the Pointer Authentication model.)

One challenge is the pair of instructions retaa and retab, which combine, in a single instruction, authenticating a PAC and jumping to the pointer’s original value. Undo needs to perform these steps separately because the A-address pointer that we authenticate is never the same as the B-address pointer from which we continue executing.
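To make the two-step idea concrete, here is a minimal model in portable C. Everything in it is hypothetical and purely illustrative (the addr_pair table, the ret_outcome values and the split_auth_return function are not Undo's data structures or API), and the hardware authentication step is represented only by its result value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record pairing an original (A) address with the address of
 * its instrumented copy (B). */
struct addr_pair {
    uint64_t a_addr;
    uint64_t b_addr;
};

enum ret_outcome {
    RET_CONTINUE,     /* authentication passed and *next_pc has been set */
    RET_AUTH_FAILED,  /* the native failure behaviour must be reproduced */
    RET_UNTRANSLATED  /* no instrumented code is known for this address */
};

/* Model an authenticate-and-return as two separate steps.
 * `authenticated_lr` stands in for the value produced by authenticating
 * `signed_lr`; `cmask` is assumed to have a 1 in every PAC bit position. */
static enum ret_outcome split_auth_return(const struct addr_pair *map,
                                          size_t count,
                                          uint64_t signed_lr,
                                          uint64_t authenticated_lr,
                                          uint64_t cmask,
                                          uint64_t *next_pc)
{
    /* Step 1: the authentication result applies to the A-address. */
    if (authenticated_lr != (signed_lr & ~cmask))
        return RET_AUTH_FAILED;

    /* Step 2: continue at the matching B-address, never at the A-address. */
    for (size_t i = 0; i < count; i++) {
        if (map[i].a_addr == authenticated_lr) {
            *next_pc = map[i].b_addr;
            return RET_CONTINUE;
        }
    }
    return RET_UNTRANSLATED;
}
```

The linear search is only there to keep the sketch short; the point is that the check in step 1 and the jump target in step 2 use different addresses.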
