* [GIT PULL] io_uring cBPF filter support
@ 2026-02-06 18:58 Jens Axboe
From: Jens Axboe @ 2026-02-06 18:58 UTC (permalink / raw)
To: Linus Torvalds; +Cc: io-uring, LKML, Christian Brauner
Hi Linus,
On top of the core io_uring changes, this adds support for cBPF filters
for io_uring, as well as task-inherited restrictions and filters.
seccomp and io_uring don't play nicely together, as most of the interesting
data to filter on resides out-of-band, in the submission queue ring. As a
result, things like containers and systemd that apply seccomp filters
can't filter individual io_uring operations. That leaves them with just
one choice if filtering is critical - filter the actual
io_uring_setup(2) system call to simply disallow io_uring. That's rather
unfortunate, and it has limited us.
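For reference, the all-or-nothing approach described above can be sketched as a classic seccomp filter. This is only an illustration of what a container runtime might install, not code from this series; the names deny_uring_insns and deny_io_uring_setup are made up for the example, and a production filter would also validate seccomp_data.arch before looking at the syscall number.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/*
 * Classic seccomp program: fail io_uring_setup(2) with EPERM and allow
 * every other system call. Illustrative only: a real filter should
 * check seccomp_data.arch first.
 */
static struct sock_filter deny_uring_insns[] = {
	/* A = syscall number */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
	/* if (A == __NR_io_uring_setup) fall through, else skip one insn */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_io_uring_setup, 0, 1),
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Install the filter on the calling task. Returns 0 on success, -1 on error. */
static int deny_io_uring_setup(void)
{
	struct sock_fprog prog = {
		.len = sizeof(deny_uring_insns) / sizeof(deny_uring_insns[0]),
		.filter = deny_uring_insns,
	};

	/* Needed so an unprivileged task is allowed to install a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -1;
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

Once deny_io_uring_setup() has succeeded, the task and its children can no longer create rings at all, which is exactly the coarse outcome the paragraph above calls unfortunate.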
io_uring already has some filtering support. It requires the ring to be
set up in a disabled state, and then a filter set can be applied. This
filter set is completely binary - an opcode is either enabled or it's
not. Once a filter set is registered, the ring can be enabled. This is
very restrictive, and it's not useful to systemd or containers, which
really want both broader and more specific control.
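As a rough sketch of that existing mechanism: the ring is created with IORING_SETUP_R_DISABLED, an array of struct io_uring_restriction is registered with IORING_REGISTER_RESTRICTIONS, and the ring is then enabled with IORING_REGISTER_ENABLE_RINGS (liburing wraps the last two as io_uring_register_restrictions() and io_uring_enable_rings()). The helper below only builds the allowlist array; its name, build_sqe_op_allowlist, is made up for this example.

```c
#include <stddef.h>
#include <string.h>
#include <linux/io_uring.h>

/*
 * Fill 'res' with one IORING_RESTRICTION_SQE_OP entry per allowed opcode.
 * The caller provides room for 'nr_ops' entries. Returns the number of
 * entries written. Any opcode not listed is rejected once the set is
 * registered - there is no way to look at the arguments of an opcode.
 */
static size_t build_sqe_op_allowlist(struct io_uring_restriction *res,
				     const __u8 *ops, size_t nr_ops)
{
	/* Reserved fields must be zero. */
	memset(res, 0, nr_ops * sizeof(*res));
	for (size_t i = 0; i < nr_ops; i++) {
		res[i].opcode = IORING_RESTRICTION_SQE_OP;
		res[i].sqe_op = ops[i];
	}
	return nr_ops;
}
```

A caller would pass the resulting array to io_uring_register(2) on the still-disabled ring. Note that the only expressible policy is a per-opcode yes or no.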
This patchset first adds support for cBPF filters for opcodes, which
enables tighter control over what exactly a specific opcode may do. Two
examples are included: specific support is added for
IORING_OP_OPENAT/OPENAT2, allowing filtering on resolve flags, and for
IORING_OP_SOCKET, allowing filtering on domain/type/protocol. These are
both common use cases. cBPF was chosen rather than eBPF, because the
latter is often restricted in containers as well.
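To give a feel for what a per-opcode cBPF filter on socket parameters might look like, here is a sketch. Everything prefixed demo_ or DEMO_ is hypothetical: the context layout, the verdict encoding (1 = permit, 0 = deny), and native-endian word loads are assumptions made for illustration, and the real definitions live in include/uapi/linux/io_uring/bpf_filter.h. Only the instruction macros from linux/filter.h are the real thing. A tiny user-space evaluator covering just the three instruction forms used is included so the program can be exercised without a kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/filter.h>

/* HYPOTHETICAL layout of the data a socket filter gets to look at. */
struct demo_socket_ctx {
	uint32_t domain;
	uint32_t type;
	uint32_t protocol;
};

/* ASSUMED verdict encoding, for this sketch only. */
enum { DEMO_DENY = 0, DEMO_ALLOW = 1 };

/* Permit only AF_INET + SOCK_STREAM sockets, deny everything else. */
static const struct sock_filter demo_socket_filter[] = {
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct demo_socket_ctx, domain)),
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AF_INET, 0, 3),     /* else -> deny */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct demo_socket_ctx, type)),
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SOCK_STREAM, 0, 1), /* else -> deny */
	BPF_STMT(BPF_RET | BPF_K, DEMO_ALLOW),
	BPF_STMT(BPF_RET | BPF_K, DEMO_DENY),
};

/* Minimal evaluator: handles LD|W|ABS, JMP|JEQ|K and RET|K, nothing else. */
static uint32_t demo_run(const struct sock_filter *prog, size_t len,
			 const void *data, size_t data_len)
{
	uint32_t acc = 0;

	for (size_t pc = 0; pc < len; pc++) {
		const struct sock_filter *insn = &prog[pc];

		switch (insn->code) {
		case BPF_LD | BPF_W | BPF_ABS:
			/* Out-of-bounds loads terminate with a deny. */
			if (insn->k > data_len || data_len - insn->k < sizeof(acc))
				return DEMO_DENY;
			memcpy(&acc, (const uint8_t *)data + insn->k, sizeof(acc));
			break;
		case BPF_JMP | BPF_JEQ | BPF_K:
			/* Offsets are relative to the next instruction. */
			pc += (acc == insn->k) ? insn->jt : insn->jf;
			break;
		case BPF_RET | BPF_K:
			return insn->k;
		default:
			return DEMO_DENY; /* unsupported instruction */
		}
	}
	return DEMO_DENY; /* fell off the end of the program */
}

static uint32_t demo_socket_verdict(uint32_t domain, uint32_t type,
				    uint32_t protocol)
{
	struct demo_socket_ctx ctx = { domain, type, protocol };

	return demo_run(demo_socket_filter,
			sizeof(demo_socket_filter) / sizeof(demo_socket_filter[0]),
			&ctx, sizeof(ctx));
}
```

With this policy, demo_socket_verdict(AF_INET, SOCK_STREAM, 0) permits the request while an AF_UNIX or SOCK_DGRAM socket is denied, which is the kind of distinction the plain opcode allowlist cannot express.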
These filters are run after the init phase of the request, which allows
filters to even look at data that is passed in a struct in user memory,
as the init side of a request makes that data stable by bringing it into
the kernel. This allows filtering without needing to copy this data
twice, or having filters know about the exact layout of the user data.
The filters are passed the already copied and sanitized data.
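The ordering matters because data in user memory can change under the kernel at any time. The user-space model below sketches the idea: the request takes its own copy of struct open_how during init, and the filter only ever looks at that copy. The demo_ names are hypothetical and the plain struct assignment merely stands in for the kernel's copy from user memory; struct open_how and RESOLVE_BENEATH are the real uapi definitions from linux/openat2.h.

```c
#include <stdbool.h>
#include <linux/openat2.h>

/* HYPOTHETICAL request: holds the copy made during the init phase. */
struct demo_open_req {
	struct open_how how; /* stable, no longer aliases user memory */
};

/* Init phase: bring the user's struct into the request exactly once. */
static void demo_open_prep(struct demo_open_req *req,
			   const struct open_how *user_how)
{
	req->how = *user_how; /* stand-in for the copy from user memory */
}

/* Example policy: only permit opens that ask for RESOLVE_BENEATH. */
static bool demo_open_filter(const struct demo_open_req *req)
{
	return (req->how.resolve & RESOLVE_BENEATH) != 0;
}
```

If the caller clears RESOLVE_BENEATH in its own struct after demo_open_prep() has run, demo_open_filter() still sees the value that the request will actually execute with, so the verdict and the operation cannot disagree.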
On top of that, support is added for per-task filters, meaning that any
ring created by a task that has a per-task filter will get those
filters applied when it's created. These filters are inherited across
fork as well. Once a filter has been registered, any filters added
later may only further restrict what operations are permitted. Filters
cannot change the return value of an operation; they can only permit or
deny it based on its contents.
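The "only further restrict" rule can be modeled in a few lines of user-space C: if every registered filter has to permit an operation, then adding a filter can never widen what is allowed. This is a conceptual model only; the demo_ types and functions are hypothetical and say nothing about how the kernel side stores or reference-counts filters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_FILTERS 8

/* HYPOTHETICAL: a filter looks at an opcode and permits or denies it. */
typedef bool (*demo_filter_fn)(uint8_t opcode);

struct demo_task {
	demo_filter_fn filters[DEMO_MAX_FILTERS];
	size_t nr_filters;
};

/* Append a filter. Existing filters can never be removed or replaced. */
static bool demo_add_filter(struct demo_task *t, demo_filter_fn fn)
{
	if (t->nr_filters == DEMO_MAX_FILTERS)
		return false;
	t->filters[t->nr_filters++] = fn;
	return true;
}

/* fork: the child starts out with a copy of the parent's filters. */
static struct demo_task demo_fork(const struct demo_task *parent)
{
	return *parent;
}

/* An operation is permitted only if every filter permits it. */
static bool demo_permitted(const struct demo_task *t, uint8_t opcode)
{
	for (size_t i = 0; i < t->nr_filters; i++)
		if (!t->filters[i](opcode))
			return false;
	return true;
}

/* Two toy policies used to exercise the model. */
static bool demo_deny_op_5(uint8_t opcode) { return opcode != 5; }
static bool demo_only_below_10(uint8_t opcode) { return opcode < 10; }
```

In this model a child created by demo_fork() inherits the parent's denial of opcode 5, and adding demo_only_below_10 in the child shrinks the permitted set further without affecting the parent.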
Please pull!
The following changes since commit 0105b0562a5ed6374f06e5cd4246a3f1311a65a0:
  io_uring: split out CQ waiting code into wait.c (2026-01-22 09:21:16 -0700)
are available in the Git repository at:
  https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git tags/io_uring-bpf-restrictions.4-20260206
for you to fetch changes up to ed82f35b926b2e505c14b7006473614b8f58b4f4:
  io_uring: allow registration of per-task restrictions (2026-02-06 07:29:19 -0700)
----------------------------------------------------------------
io_uring-bpf-restrictions.4-20260206
----------------------------------------------------------------
Jens Axboe (7):
      io_uring: add support for BPF filtering for opcode restrictions
      io_uring/net: allow filtering on IORING_OP_SOCKET data
      io_uring/bpf_filter: allow filtering on contents of struct open_how
      io_uring/bpf_filter: cache lookup table in ctx->bpf_filters
      io_uring/bpf_filter: add ref counts to struct io_bpf_filter
      io_uring: add task fork hook
      io_uring: allow registration of per-task restrictions
 include/linux/io_uring.h                 |  14 +-
 include/linux/io_uring_types.h           |  13 +
 include/linux/sched.h                    |   1 +
 include/uapi/linux/io_uring.h            |  10 +
 include/uapi/linux/io_uring/bpf_filter.h |  62 +++++
 io_uring/Kconfig                         |   5 +
 io_uring/Makefile                        |   1 +
 io_uring/bpf_filter.c                    | 430 +++++++++++++++++++++++++++++++
 io_uring/bpf_filter.h                    |  48 ++++
 io_uring/io_uring.c                      |  48 ++++
 io_uring/io_uring.h                      |   1 +
 io_uring/net.c                           |   9 +
 io_uring/net.h                           |   6 +
 io_uring/openclose.c                     |   9 +
 io_uring/openclose.h                     |   3 +
 io_uring/register.c                      |  91 +++++++
 io_uring/tctx.c                          |  42 ++-
 kernel/fork.c                            |   6 +
 18 files changed, 789 insertions(+), 10 deletions(-)
 create mode 100644 include/uapi/linux/io_uring/bpf_filter.h
 create mode 100644 io_uring/bpf_filter.c
 create mode 100644 io_uring/bpf_filter.h
--
Jens Axboe