What Is Still Missing in System Call Tracing
- Track: Kernel
- Room: UA2.114 (Baudoux)
- Day: Sunday
- Start: 15:20
- End: 15:40
- Video only: ua2114
- Chat: Join the conversation!
This talk follows last year's presentation "Status and Desiderata for Syscall Tracing and Virtualization Support" and reports on progress and remaining gaps in Linux system call tracing.
The talk presents a set of Linux kernel patches, intended for upstream submission, that address the following limitations and aim to make system call tracing and virtualization more expressive, portable, and efficient.
Over the past year, support for PTRACE_SET_SYSCALL_INFO has been merged into the mainline kernel. While developing a portable version of VUOS across multiple architectures, several limitations of the current tracing interfaces became evident. In particular, skipping a system call by setting its number to -1 is insufficient, as it does not allow the tracer to control the return value or errno, nor to adjust the program counter. As a consequence, the current VUOS proof-of-concept replaces skipped system calls with getpid and fixes up the return value at PTRACE_SYSCALL_INFO_EXIT, doubling the number of context switches and incurring a measurable performance cost. Updating the program counter currently requires non-portable, architecture-specific code using PTRACE_POKEUSER or PTRACE_SETREGSET.
Additional issues arise with seccomp_unotify. Tracing all system calls is difficult because file descriptors must be transferred from the traced task to the tracer; common techniques based on UNIX domain sockets and ancillary messages require sendmsg and recvmsg themselves to be excluded from tracing. Furthermore, there is currently no support for virtualizing the F_DUPFD command of fcntl, nor for allowing a tracer to atomically close a file descriptor in the traced process.
Speakers
| Renzo Davoli | |
| Davide Berardi |