From: Jens Axboe <axboe@kernel.dk>
To: io-uring@vger.kernel.org
Subject: [PATCHSET RFC v3] Inherited restrictions and BPF filtering
Date: Thu, 15 Jan 2026 09:36:31 -0700
Message-ID: <20260115165244.1037465-1-axboe@kernel.dk>
Hi,
Followup to v2 here:
https://lore.kernel.org/io-uring/20260109185155.88150-1-axboe@kernel.dk/
While this is a followup, it takes a different approach to the problem.
What remains is task inheritance - if a set of restrictions is
registered with a task, any children will inherit it too.
What's new is basic support for BPF filters, so that anything
can be filtered. Filters are added per opcode, and an opcode can have
several of them. As the filtering is done after the prep phase, it's even
possible to support filtering based on user structs that are copied in
to the kernel. For now, only IORING_OP_SOCKET is done, and allows
filtering on domain/type/protocol. This is done as an example. A sample
filter for that could look like:
SEC("io_uring_filter")
int socket_filter(struct io_uring_bpf_ctx *ctx)
{
	/* Only allow AF_INET and AF_INET6 */
	if (ctx->socket.family != AF_INET && ctx->socket.family != AF_INET6)
		return 0; /* Reject */

	/* Only allow SOCK_STREAM (TCP) */
	if (ctx->socket.type != SOCK_STREAM)
		return 0; /* Reject */

	/* Only allow IPPROTO_TCP or default (0) */
	if (ctx->socket.protocol != IPPROTO_TCP && ctx->socket.protocol != 0)
		return 0; /* Reject */

	return 1; /* Accept */
}
to restrict certain families, types, or protocols.
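The accept/reject logic of that sample can be tried without a patched
kernel by modelling it in plain userspace C. What follows is only a
sketch under assumptions: socket_allowed(), struct filter_case,
filter_cases[] and count_mismatches() are hypothetical names made up for
illustration, the family/type/protocol values are passed as plain ints
rather than through struct io_uring_bpf_ctx, and the argument tuples in
the table are guesses that match the test case names, since the loader's
exact arguments aren't part of this posting.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Userspace model of the sample filter's checks (hypothetical helper,
 * not part of the patchset). Returns 1 for accept and 0 for reject,
 * following the convention used in the sample.
 */
int socket_allowed(int family, int type, int protocol)
{
	/* Only allow AF_INET and AF_INET6 */
	if (family != AF_INET && family != AF_INET6)
		return 0;
	/* Only allow SOCK_STREAM (TCP) */
	if (type != SOCK_STREAM)
		return 0;
	/* Only allow IPPROTO_TCP or default (0) */
	if (protocol != IPPROTO_TCP && protocol != 0)
		return 0;
	return 1;
}

/* One row per test case; the tuples are assumed, see the note above */
struct filter_case {
	const char *name;
	int family;
	int type;
	int protocol;
	int expect;	/* 1 = accepted, 0 = rejected */
};

const struct filter_case filter_cases[] = {
	{ "AF_INET TCP (explicit)",		AF_INET,  SOCK_STREAM, IPPROTO_TCP, 1 },
	{ "AF_INET TCP (default)",		AF_INET,  SOCK_STREAM, 0,           1 },
	{ "AF_INET6 TCP (explicit)",		AF_INET6, SOCK_STREAM, IPPROTO_TCP, 1 },
	{ "AF_INET6 TCP (default)",		AF_INET6, SOCK_STREAM, 0,           1 },
	{ "AF_INET UDP",			AF_INET,  SOCK_DGRAM,  IPPROTO_UDP, 0 },
	{ "AF_INET RAW",			AF_INET,  SOCK_RAW,    IPPROTO_RAW, 0 },
	{ "AF_UNIX",				AF_UNIX,  SOCK_STREAM, 0,           0 },
	{ "AF_INET TCP socket with UDP proto",	AF_INET,  SOCK_STREAM, IPPROTO_UDP, 0 },
};

/* Number of table rows where the model disagrees with the expectation */
int count_mismatches(void)
{
	size_t i;
	int bad = 0;

	for (i = 0; i < sizeof(filter_cases) / sizeof(filter_cases[0]); i++) {
		const struct filter_case *c = &filter_cases[i];

		if (socket_allowed(c->family, c->type, c->protocol) != c->expect)
			bad++;
	}
	return bad;
}
```

Calling count_mismatches() from a small main() gives a quick check of
the table against the model. It says nothing about how the kernel side
behaves.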
This only supports SQE opcodes for this kind of filtering, but it's easily
extendable to cover REGISTER opcodes as well, including arguments.
Sending this out as an RFC for comments. I think this provides most of
the functionality needed to filter basically anything. There are still
some rough edges here, notably the BPF support, as I really don't know
what I'm doing there. But it works for testing, at least... I don't have
a liburing branch for this just yet, let me know if you want some
test/sample code and I'll be happy to toss it over the wall. I'll add a
liburing branch over the weekend for easier experimentation.
Sample based on the above filter:
axboe@m2max-kvm ~> ./io_uring_bpf_loader io_uring_bpf_filter.c.bpf.o
io_uring BPF Socket Filter Test (C-based)
==========================================
io_uring initialized
BPF program loaded successfully from io_uring_bpf_filter.c.bpf.o, fd=4
BPF filter registered for opcode 45
Running tests...
Testing AF_INET TCP (explicit): PASSED (fd=5)
Testing AF_INET TCP (default): PASSED (fd=5)
Testing AF_INET6 TCP (explicit): PASSED (fd=5)
Testing AF_INET6 TCP (default): PASSED (fd=5)
Testing AF_INET UDP: PASSED (correctly rejected)
Testing AF_INET RAW: PASSED (correctly rejected)
Testing AF_UNIX: PASSED (correctly rejected)
Testing AF_INET TCP socket with UDP proto: PASSED (correctly rejected)
or running t/io_uring with IORING_OP_NOP and a filter set. This filter
just allows the opcode, but it's still run on each NOP issued:
axboe@m2max-kvm ~/g/fio (master)> sudo taskset -c 0 t/io_uring -N1 -n1 -E ~/noop_filter.bpf.c.o -B0 -F0 trim.json
submitter=0, tid=2287, file=trim.json, nfiles=1, node=-1
BPF program loaded successfully from /home/axboe/noop_filter.bpf.c.o, fd=5
BPF filter registered for opcode 0
polled=1, fixedbufs=0, register_files=0, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=13.89M, IOS/call=32/32
IOPS=13.90M, IOS/call=32/32
[...]
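The source of the NOP filter isn't included in this posting. Going by
the description (it just allows the opcode) and the return convention of
the socket sample, an allow-everything filter could be as small as the
sketch below. The function name noop_filter is a guess, and the
fallbacks at the top are assumptions that only let the file compile as
plain C on the host. A real BPF build would presumably take SEC() from
libbpf's <bpf/bpf_helpers.h> and struct io_uring_bpf_ctx from the uapi
header added by this series.

```c
/*
 * Host-build fallback (assumption): for a real BPF object, SEC() is
 * provided by libbpf and should not be defined away like this.
 */
#ifndef SEC
#define SEC(name)
#endif

/* Opaque here; this filter never looks at the context */
struct io_uring_bpf_ctx;

SEC("io_uring_filter")
int noop_filter(struct io_uring_bpf_ctx *ctx)
{
	(void)ctx;	/* unused */
	return 1;	/* Accept, same convention as the socket sample */
}
```

Even a filter this small is still invoked for every request with the
filtered opcode, which is why it works for measuring per-request filter
overhead.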
Comments welcome! Kernel branch can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git/log/?h=io_uring-bpf-restrictions
and sits on top of for-7.0/io_uring.
 include/linux/bpf.h            |   1 +
 include/linux/bpf_types.h      |   4 +
 include/linux/io_uring.h       |   2 +-
 include/linux/io_uring_types.h |  20 +++-
 include/linux/sched.h          |   1 +
 include/uapi/linux/bpf.h       |   1 +
 include/uapi/linux/io_uring.h  |  46 +++++++
 io_uring/Makefile              |   1 +
 io_uring/bpf_filter.c          | 212 +++++++++++++++++++++++++++++++++
 io_uring/bpf_filter.h          |  41 +++++++
 io_uring/io_uring.c            |  33 ++++-
 io_uring/net.c                 |   9 ++
 io_uring/net.h                 |   5 +
 io_uring/register.c            | 133 +++++++++++++++++++--
 io_uring/register.h            |   2 +
 io_uring/tctx.c                |  26 ++--
 kernel/bpf/syscall.c           |   9 ++
 kernel/fork.c                  |   4 +
 18 files changed, 527 insertions(+), 23 deletions(-)
--
Jens Axboe
Thread overview: 8+ messages
2026-01-15 16:36 Jens Axboe [this message]
2026-01-15 16:36 ` [PATCH 1/3] io_uring: move ctx->restrictions to be dynamically allocated Jens Axboe
2026-01-15 16:36 ` [PATCH 2/3] io_uring: add support for BPF filtering for opcode restrictions Jens Axboe
2026-01-15 20:11 ` Jonathan Corbet
2026-01-15 21:02 ` Jens Axboe
2026-01-15 21:05 ` Jonathan Corbet
2026-01-15 21:08 ` Jens Axboe
2026-01-15 16:36 ` [PATCH 3/3] io_uring: allow registration of per-task restrictions Jens Axboe