On libunwind and dynamically generated code on x86-64
libunwind is a (supposedly portable) library for performing native stack unwinding. In simple scenarios the library does its job fairly well; however, things get more interesting in the presence of dynamically-generated (i.e. JIT-compiled) code. In fact, on any platform, and in any context, unwinding native stacks in the presence of dynamically-generated code is an interesting topic. It so happens that the Windows API gets this right, presenting you with two options, either of which is sufficient on its own:
- Construct some RUNTIME_FUNCTION data structures up front, and then call RtlAddFunctionTable.
- Call RtlInstallFunctionTableCallback, and then construct RUNTIME_FUNCTION data structures lazily on demand (this is exactly the interface that a JIT compiler would want, and perhaps it was designed with one in mind).
I feel that Linux doesn't really get this right. Rather than there being a single OS-supplied interface, every tool has its own way of doing things:
- libunwind: Construct ELF .debug_frame data structures, and table_entry data structures, and a unw_dyn_table_info data structure, and a unw_dyn_info_t structure, then call _U_dyn_register.
- C++ exception unwinder: Construct ELF .eh_frame data structures, then call __register_frame (I'd like to link to some documentation on __register_frame, but Google doesn't immediately find anything, so I assume that there isn't any).
- GDB: Construct a full in-memory ELF object, manually maintain a doubly-linked list of all such objects in a global variable called __jit_debug_descriptor, and call a global function called __jit_debug_register_code when this list is changed (though it seems baroque, this is documented quite well).
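To make the GDB item concrete, here is a minimal sketch of the list maintenance it asks for. The type and global declarations follow the ones given in the GDB manual's JIT interface section as far as I recall them; the helpers jit_gdb_register and jit_gdb_unregister are hypothetical names of my own, and building the in-memory ELF object itself is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum { JIT_NOACTION = 0, JIT_REGISTER_FN, JIT_UNREGISTER_FN } jit_actions_t;

struct jit_code_entry {
    struct jit_code_entry *next_entry;
    struct jit_code_entry *prev_entry;
    const char *symfile_addr;   /* start of the in-memory ELF object */
    uint64_t symfile_size;      /* its size in bytes */
};

struct jit_descriptor {
    uint32_t version;
    uint32_t action_flag;       /* one of jit_actions_t */
    struct jit_code_entry *relevant_entry;
    struct jit_code_entry *first_entry;
};

/* GDB places a breakpoint here, so the function must really exist. */
void __attribute__((noinline)) __jit_debug_register_code(void) { __asm__ volatile(""); }

/* The version is set statically so a debugger can read it at any time. */
struct jit_descriptor __jit_debug_descriptor = { 1, JIT_NOACTION, NULL, NULL };

/* Hypothetical helper: push a new ELF image onto the list and notify GDB. */
struct jit_code_entry *jit_gdb_register(const char *elf, uint64_t size) {
    struct jit_code_entry *e = calloc(1, sizeof *e);
    if (!e) return NULL;
    e->symfile_addr = elf;
    e->symfile_size = size;
    e->next_entry = __jit_debug_descriptor.first_entry;
    if (e->next_entry) e->next_entry->prev_entry = e;
    __jit_debug_descriptor.first_entry = e;
    __jit_debug_descriptor.relevant_entry = e;
    __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
    __jit_debug_register_code();
    return e;
}

/* Hypothetical helper: unlink an entry, notify GDB, then free it. */
void jit_gdb_unregister(struct jit_code_entry *e) {
    if (e->prev_entry) e->prev_entry->next_entry = e->next_entry;
    else __jit_debug_descriptor.first_entry = e->next_entry;
    if (e->next_entry) e->next_entry->prev_entry = e->prev_entry;
    __jit_debug_descriptor.relevant_entry = e;
    __jit_debug_descriptor.action_flag = JIT_UNREGISTER_FN;
    __jit_debug_register_code();
    __jit_debug_descriptor.relevant_entry = NULL;  /* avoid a dangling pointer */
    free(e);
}
```

A real JIT would also need to serialise access to the list if code can be generated on several threads; this sketch ignores that.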
All three of these interfaces require data structures to be generated up-front, which isn't ideal for JIT compilers, but I digress (though possibly in some cases, creative use of PROT_NONE pages and SIGSEGV handlers could allow some degree of lazy on-demand generation). That all three interfaces consume different data is annoying. That .eh_frame and .debug_frame are subtly different is also annoying, but I digress again.
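For the curious, the PROT_NONE trick mentioned in passing might look something like the following sketch. It reserves one inaccessible page, and the first access faults into a SIGSEGV handler which makes the page accessible and fills it in. Every name here (lazy_page_init, lazy_fill, lazy_read) is hypothetical, the fill routine writes a placeholder byte rather than real unwind data, and calling mprotect and memset from a signal handler is not formally async-signal-safe, though it tends to work for synchronous faults on Linux. I have not tried feeding such a page to any of the unwinders above.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *lazy_page;                 /* the reserved page */
static size_t lazy_page_size;
static volatile sig_atomic_t lazy_fault_count;   /* how many times we filled it */

/* Placeholder for "generate the unwind data now". */
static void lazy_fill(unsigned char *page, size_t size) {
    memset(page, 0xAB, size);
}

static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)ctx;
    unsigned char *addr = (unsigned char *)info->si_addr;
    if (lazy_page && addr >= lazy_page && addr < lazy_page + lazy_page_size &&
        mprotect(lazy_page, lazy_page_size, PROT_READ | PROT_WRITE) == 0) {
        lazy_fill(lazy_page, lazy_page_size);
        lazy_fault_count++;
        return;  /* the faulting instruction is retried and now succeeds */
    }
    /* Not our page: restore the default action so the retry terminates us. */
    signal(sig, SIG_DFL);
}

/* Reserve the page and install the handler; returns 0 on success. */
int lazy_page_init(void) {
    lazy_page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, lazy_page_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    lazy_page = p;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Read one byte through a volatile pointer so the access is not optimised away. */
unsigned char lazy_read(size_t i) {
    return ((volatile unsigned char *)lazy_page)[i];
}
```

The first lazy_read triggers the fill; later reads hit ordinary memory.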
Though libunwind presents an interface, it happens to be poorly documented and poorly implemented for x86-64. In particular, if you read the documentation, then you'd be left with the impression that unw_dyn_info_t can refer to either a unw_dyn_proc_info_t structure (UNW_INFO_FORMAT_DYNAMIC), or a unw_dyn_table_info structure (UNW_INFO_FORMAT_REMOTE_TABLE or UNW_INFO_FORMAT_TABLE). On the former structure, the documentation has the following to say:
This is the preferred dynamic unwind-info format and it is generally the one used by full-blown runtime code-generators. In this format, the details of a procedure are described by a structure of type unw_dyn_proc_info_t.
Let me save you some time by pointing out that unwind directives for unw_dyn_proc_info_t structures plain aren't implemented on x86-64. As such, using unw_dyn_proc_info_t is a non-starter if you actually want to do any unwinding. Consequently, the only option is to use unw_dyn_table_info. The most interesting field of unw_dyn_table_info is table_data, the documentation for which states:
A pointer to the actual data encoding the unwind-info. The exact format is architecture-specific (see architecture-specific sections below).
Of course, there is no such section below for x86-64. Let me save you some time by pointing out that the table_data field should refer to an array of table_entry structures (which aren't documented, or present in any header, but can be found in the source). In turn, the fde_offset field of that structure should refer to a DWARF FDE in .debug_frame style.
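Since table_entry isn't in any header, you end up declaring your own copy. The sketch below assumes the two-field layout I believe the libunwind source uses (a start_ip_offset alongside the fde_offset mentioned above, both 32-bit); check it against the version you build against. It also assumes the table is kept sorted by start_ip_offset so that it can be binary-searched. The table_sort and table_lookup helpers are hypothetical and only imitate what an unwinder would do with such a table; they are not libunwind functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local mirror of libunwind's internal table_entry (assumed layout). */
struct table_entry {
    int32_t start_ip_offset;  /* start of the function, relative to a base address */
    int32_t fde_offset;       /* where that function's .debug_frame-style FDE lives */
};

static int cmp_entry(const void *a, const void *b) {
    int32_t x = ((const struct table_entry *)a)->start_ip_offset;
    int32_t y = ((const struct table_entry *)b)->start_ip_offset;
    return (x > y) - (x < y);
}

/* Sort entries by start address so the table can be binary-searched. */
void table_sort(struct table_entry *t, size_t n) {
    qsort(t, n, sizeof *t, cmp_entry);
}

/* Return the last entry whose start is <= rel_ip, or NULL if there is none. */
const struct table_entry *table_lookup(const struct table_entry *t, size_t n,
                                       int32_t rel_ip) {
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (t[mid].start_ip_offset <= rel_ip) lo = mid + 1;
        else hi = mid;
    }
    return lo ? &t[lo - 1] : NULL;
}
```

The lookup only finds a candidate entry; whether the address really falls inside that function is something the FDE's address range has to confirm.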
After supplying unwind information via UNW_INFO_FORMAT_TABLE, libunwind is capable of unwinding over call frames for dynamically-generated code on x86-64. After getting basic unwinding working, one might like to make libunwind supply a useful function name for the call frame. The unw_dyn_table_info structure contains a name_ptr field which looks perfect for this task, but the code which should read this field instead just returns UNW_ENOINFO for UNW_INFO_FORMAT_TABLE (or, it would, but UNW_EINVAL is also likely, as the enclosing switch statement should be on pi.format rather than di->format). The observant reader will spot that this name-related logic is fully implemented for UNW_INFO_FORMAT_DYNAMIC, leaving us in a sad situation on x86-64: use UNW_INFO_FORMAT_DYNAMIC and get names but no unwinding, or use UNW_INFO_FORMAT_TABLE and get unwinding but no names.
I'd like to finish on a positive note instead of that sad note, but alas, this is a tale of woe.