public inbox for io-uring@vger.kernel.org
* [PATCHSET RFC v3] Inherited restrictions and BPF filtering
@ 2026-01-15 16:36 Jens Axboe
From: Jens Axboe @ 2026-01-15 16:36 UTC
  To: io-uring

Hi,

Followup to v2 here:

https://lore.kernel.org/io-uring/20260109185155.88150-1-axboe@kernel.dk/

While this is a followup, it takes a different approach to the problem.
What remains is task inheritance - if a set of restrictions is
registered with a task, any children will get them too.

What's new is basic support for BPF filters, so that anything can be
filtered. You can add filters for each opcode, and several of them as
well. As the filtering is done after the prep phase, it's even possible
to support filtering based on user structs that are copied into the
kernel. For now, only IORING_OP_SOCKET is wired up, which allows
filtering on domain/type/protocol. This is done as an example. A sample
filter for that could look like:

SEC("io_uring_filter")
int socket_filter(struct io_uring_bpf_ctx *ctx)
{
	/* Only allow AF_INET and AF_INET6 */
	if (ctx->socket.family != AF_INET && ctx->socket.family != AF_INET6)
		return 0;  /* Reject */

	/* Only allow SOCK_STREAM (TCP) */
	if (ctx->socket.type != SOCK_STREAM)
		return 0;  /* Reject */

	/* Only allow IPPROTO_TCP or default (0) */
	if (ctx->socket.protocol != IPPROTO_TCP && ctx->socket.protocol != 0)
		return 0;  /* Reject */

	return 1; /* Accept */
}

to restrict certain families, types, or protocols.

For now, this kind of filtering only supports SQE opcodes, but it is
easily extendable to cover REGISTER opcodes as well, including arguments.

Sending this out as an RFC for comments. I think this provides most of
the functionality needed to filter basically anything. There are still
some rough edges here, notably the BPF support, as I really don't know
what I'm doing there. But it works for testing, at least... I don't have
a liburing branch for this just yet, let me know if you want some
test/sample code and I'll be happy to toss it over the wall. I'll add a
liburing branch over the weekend for easier experimentation.

Sample based on the above filter:

axboe@m2max-kvm ~> ./io_uring_bpf_loader io_uring_bpf_filter.c.bpf.o
io_uring BPF Socket Filter Test (C-based)
==========================================

io_uring initialized
BPF program loaded successfully from io_uring_bpf_filter.c.bpf.o, fd=4
BPF filter registered for opcode 45

Running tests...

Testing AF_INET TCP (explicit): PASSED (fd=5)
Testing AF_INET TCP (default): PASSED (fd=5)
Testing AF_INET6 TCP (explicit): PASSED (fd=5)
Testing AF_INET6 TCP (default): PASSED (fd=5)
Testing AF_INET UDP: PASSED (correctly rejected)
Testing AF_INET RAW: PASSED (correctly rejected)
Testing AF_UNIX: PASSED (correctly rejected)
Testing AF_INET TCP socket with UDP proto: PASSED (correctly rejected)

or running t/io_uring with IORING_OP_NOP and a filter set. This filter
just allows the opcode, but it's still run on each NOP issued:

axboe@m2max-kvm ~/g/fio (master)> sudo taskset -c 0 t/io_uring -N1 -n1 -E ~/noop_filter.bpf.c.o -B0 -F0 trim.json
submitter=0, tid=2287, file=trim.json, nfiles=1, node=-1
BPF program loaded successfully from /home/axboe/noop_filter.bpf.c.o, fd=5
BPF filter registered for opcode 0
polled=1, fixedbufs=0, register_files=0, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=13.89M, IOS/call=32/32
IOPS=13.90M, IOS/call=32/32
[...]

Comments welcome! Kernel branch can be found here:

https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git/log/?h=io_uring-bpf-restrictions

and sits on top of for-7.0/io_uring

 include/linux/bpf.h            |   1 +
 include/linux/bpf_types.h      |   4 +
 include/linux/io_uring.h       |   2 +-
 include/linux/io_uring_types.h |  20 +++-
 include/linux/sched.h          |   1 +
 include/uapi/linux/bpf.h       |   1 +
 include/uapi/linux/io_uring.h  |  46 +++++++
 io_uring/Makefile              |   1 +
 io_uring/bpf_filter.c          | 212 +++++++++++++++++++++++++++++++++
 io_uring/bpf_filter.h          |  41 +++++++
 io_uring/io_uring.c            |  33 ++++-
 io_uring/net.c                 |   9 ++
 io_uring/net.h                 |   5 +
 io_uring/register.c            | 133 +++++++++++++++++++--
 io_uring/register.h            |   2 +
 io_uring/tctx.c                |  26 ++--
 kernel/bpf/syscall.c           |   9 ++
 kernel/fork.c                  |   4 +
 18 files changed, 527 insertions(+), 23 deletions(-)

-- 
Jens Axboe


Thread overview: 8+ messages
2026-01-15 16:36 [PATCHSET RFC v3] Inherited restrictions and BPF filtering Jens Axboe
2026-01-15 16:36 ` [PATCH 1/3] io_uring: move ctx->restrictions to be dynamically allocated Jens Axboe
2026-01-15 16:36 ` [PATCH 2/3] io_uring: add support for BPF filtering for opcode restrictions Jens Axboe
2026-01-15 20:11   ` Jonathan Corbet
2026-01-15 21:02     ` Jens Axboe
2026-01-15 21:05       ` Jonathan Corbet
2026-01-15 21:08         ` Jens Axboe
2026-01-15 16:36 ` [PATCH 3/3] io_uring: allow registration of per-task restrictions Jens Axboe
