public inbox for [email protected]
From: Pavel Begunkov <[email protected]>
To: [email protected]
Cc: [email protected]
Subject: [RFC 0/3] Add BPF for io_uring
Date: Mon, 11 Nov 2024 01:50:43 +0000
Message-ID: <[email protected]>

WARNING: it's an early prototype and could likely be broken and unsafe
to run. Also, most probably it doesn't do the right thing from the
modern BPF perspective, but that's fine as I want to get some numbers
first and only then consult with BPF folks and brush it up.

A comeback of the io_uring BPF proposal, put on top of new infrastructure.
Instead of executing BPF as a new request type, it's now run in the io_uring
waiting loop. The program is called to react every time we get a new
event like a queued task_work or an interrupt. Patch 3 adds some helpers
the BPF program can use to interact with io_uring, like submitting new
requests and looking at CQEs. The program also controls when to return
control back to user space by returning one of IOU_BPF_RET_{OK,STOP}, and
sets the task_work batching size, i.e. how many CQEs to wait for before it
is run again, via a kfunc helper. We need to be able to sleep to submit
requests, hence only sleepable BPF is allowed.

BPF can help to create arbitrary relations between requests from
within the kernel and later help with tuning the wait loop batching.
E.g. with minor extensions we can implement batch wait timeouts.
We can also use it to let the user safely access internal resources
and maybe even do a more elaborate request setup than an SQE allows.

The benchmark is primitive: the non-BPF baseline issues one link of 2 nop
requests at a time and waits for it to complete. The BPF version runs
them (2 * N requests) one by one. Numbers with mitigations on:

# nice -n -20 taskset -c 0 ./minimal 0 50000000
type 2-LINK, requests to run 50000000
sec 10, total (ms) 10314
# nice -n -20 taskset -c 0 ./minimal 1 50000000
type BPF, requests to run 50000000
sec 6, total (ms) 6808

It needs to be better tested, especially with asynchronous requests
like reads and with other hardware. It can also be further optimised, e.g.
we can avoid extra locking by taking the lock once for BPF/task_work_run.

The test (see examples-bpf/minimal[.bpf].c):
https://github.com/isilence/liburing.git io_uring-bpf
https://github.com/isilence/liburing/tree/io_uring-bpf

Pavel Begunkov (3):
  bpf/io_uring: add io_uring program type
  io_uring/bpf: allow to register and run BPF programs
  io_uring/bpf: add kfuncs for BPF programs

 include/linux/bpf.h               |   1 +
 include/linux/bpf_types.h         |   4 +
 include/linux/io_uring/bpf.h      |  10 ++
 include/linux/io_uring_types.h    |   4 +
 include/uapi/linux/bpf.h          |   1 +
 include/uapi/linux/io_uring.h     |   9 ++
 include/uapi/linux/io_uring/bpf.h |  22 ++++
 io_uring/Makefile                 |   1 +
 io_uring/bpf.c                    | 205 ++++++++++++++++++++++++++++++
 io_uring/bpf.h                    |  43 +++++++
 io_uring/io_uring.c               |  16 +++
 io_uring/register.c               |   7 +
 kernel/bpf/btf.c                  |   3 +
 kernel/bpf/syscall.c              |   1 +
 kernel/bpf/verifier.c             |  10 +-
 15 files changed, 336 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/io_uring/bpf.h
 create mode 100644 include/uapi/linux/io_uring/bpf.h
 create mode 100644 io_uring/bpf.c
 create mode 100644 io_uring/bpf.h

-- 
2.46.0



Thread overview: 8+ messages
2024-11-11  1:50 Pavel Begunkov [this message]
2024-11-11  1:50 ` [RFC 1/3] bpf/io_uring: add io_uring program type Pavel Begunkov
2024-11-11  1:50 ` [RFC 2/3] io_uring/bpf: allow to register and run BPF programs Pavel Begunkov
2024-11-13  8:21   ` Ming Lei
2024-11-13 13:09     ` Pavel Begunkov
2024-11-11  1:50 ` [RFC 3/3] io_uring/bpf: add kfuncs for " Pavel Begunkov
2024-11-13  8:13 ` [RFC 0/3] Add BPF for io_uring Ming Lei
2024-11-13 13:09   ` Pavel Begunkov
