Brussels / 3 & 4 February 2024


Linux' receive_fd_replace() semantics confusing

The current implementation of seccomp's ADDFD uses the receive_fd_replace() helper, whose semantics are quite confusing. It does what it says on the tin: it replaces a file descriptor in the fd table of the trapped process. However, various kernel subsystems (e.g. epoll) record the file descriptor number in their internal data structures, in addition to a reference to the struct file. This creates all sorts of problems: because the epoll instance is not updated to point at the new struct file *, epoll does not report events correctly and may generate errors.
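The stale-registration effect can be reproduced from plain userspace. The sketch below is only an illustration, not the seccomp code path: it assumes that dup2() is a reasonable stand-in for the fd-table replacement that receive_fd_replace() performs, and the names run_stale_epoll_demo and struct demo_result are hypothetical. It registers one pipe with epoll, replaces that fd number with a different pipe, and records what epoll does afterwards.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical result record for this demo (not a kernel or libc type). */
struct demo_result {
    int watched_fd;            /* fd number that was registered with epoll */
    int ready_after_new_write; /* epoll_wait() result after writing to the replacement pipe */
    int ready_after_old_write; /* epoll_wait() result after writing to the original pipe */
    int reported_fd;           /* data.fd carried by the event, or -1 if none arrived */
    int del_errno;             /* errno from EPOLL_CTL_DEL on the replaced fd, 0 on success */
};

static void close_if_open(int fd)
{
    if (fd >= 0)
        close(fd);
}

/* Returns 0 if the demo ran to completion, -1 if a setup call failed. */
int run_stale_epoll_demo(struct demo_result *res)
{
    int old_pipe[2] = { -1, -1 }, new_pipe[2] = { -1, -1 };
    int epfd = -1, keep_old = -1, rc = -1;
    struct epoll_event reg = { .events = EPOLLIN };
    struct epoll_event got = { .data.fd = -1 };

    if (pipe(old_pipe) < 0 || pipe(new_pipe) < 0)
        goto cleanup;
    epfd = epoll_create1(0);
    if (epfd < 0)
        goto cleanup;

    /* Register the read end of the original pipe under its fd number. */
    res->watched_fd = old_pipe[0];
    reg.data.fd = old_pipe[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, old_pipe[0], &reg) < 0)
        goto cleanup;

    /* Keep the original file alive through a second descriptor, then
     * replace the registered fd number with the other pipe's read end.
     * Assumption: dup2() stands in for the ADDFD replacement here. */
    keep_old = dup(old_pipe[0]);
    if (keep_old < 0 || dup2(new_pipe[0], old_pipe[0]) < 0)
        goto cleanup;

    /* Data on the replacement file: epoll never registered this file. */
    if (write(new_pipe[1], "x", 1) != 1)
        goto cleanup;
    res->ready_after_new_write = epoll_wait(epfd, &got, 1, 100);

    /* Data on the original file: epoll still watches it, and labels the
     * event with an fd number that now refers to something else. */
    if (write(old_pipe[1], "y", 1) != 1)
        goto cleanup;
    res->ready_after_old_write = epoll_wait(epfd, &got, 1, 100);
    res->reported_fd = got.data.fd;

    /* Removing the fd looks up the (new file, fd) pair, which is unknown. */
    res->del_errno =
        epoll_ctl(epfd, EPOLL_CTL_DEL, old_pipe[0], NULL) < 0 ? errno : 0;
    rc = 0;

cleanup:
    close_if_open(keep_old);
    close_if_open(epfd);
    close_if_open(old_pipe[0]);
    close_if_open(old_pipe[1]);
    close_if_open(new_pipe[0]);
    close_if_open(new_pipe[1]);
    return rc;
}
```

Calling run_stale_epoll_demo() from a small main() and printing the fields should show no event for the write to the replacement pipe, one event for the write to the original pipe that is tagged with the now-repurposed fd number, and ENOENT from EPOLL_CTL_DEL. The dup() call is what keeps the original file open in this sketch; without another reference, the original file would be released on replacement and its epoll registration would be dropped with it, so no events would arrive at all.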

We'll go over epoll's representation in detail, along with our current in-production solution that fixes all of this up from userspace. We'll also go over [1], a proposed solution for fixing up epoll in particular; that approach would require each instance of this problem to be fixed manually.

We're looking for feedback on how to make this faster, and on what a potential kernel fix would look like.


Tycho Andersen