public inbox for io-uring@vger.kernel.org
From: Pavel Begunkov <asml.silence@gmail.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	io-uring@vger.kernel.org,
	Caleb Sander Mateos <csander@purestorage.com>,
	Akilesh Kailash <akailash@google.com>,
	bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>
Subject: Re: [PATCH 0/5] io_uring: add IORING_OP_BPF for extending io_uring
Date: Thu, 6 Nov 2025 16:03:29 +0000
Message-ID: <58c0e697-2f6a-4b06-bf04-c011057cd6c7@gmail.com>
In-Reply-To: <aQtz-dw7t7jtqALc@fedora>

On 11/5/25 15:57, Ming Lei wrote:
> On Wed, Nov 05, 2025 at 12:47:58PM +0000, Pavel Begunkov wrote:
>> On 11/4/25 16:21, Ming Lei wrote:
>>> Hello,
>>>
>>> Add IORING_OP_BPF for extending io_uring operations, follows typical cases:
>>
>> BPF requests were tried long time ago and it wasn't great. Performance
> 
> Care to share the link so I can learn from the lesson? Maybe things have
> changed now...

https://lore.kernel.org/io-uring/a83f147b-ea9d-e693-a2e9-c6ce16659749@gmail.com/T/#m31d0a2ac6e2213f912a200f5e8d88bd74f81406b

There were some extra features and testing from folks, but I don't
think it was ever posted to the list.

>> for short BPF programs is not great because of io_uring request handling
>> overhead. And flexibility was severely lacking, so even simple use cases
> 
> What is the overhead? In this patch, OP's prep() and issue() are defined in

The overhead of creating, freeing and executing a request. If you use
it with links, there is the overhead of linking as well. That prototype
could also optionally wait for completions, and that wasn't free either.

> bpf prog, but in typical use case, the code size is pretty small, and bpf
> prog code is supposed to run in fast path.
>
>> were looking pretty ugly, internally, and for BPF writers as well.
> 
> I am not sure what `simple use cases` you are talking about.

As an example, creating a loop reading a file:
read N bytes; wait for completion; repeat
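
To spell that loop out, here is a minimal userspace sketch in C. It is
not taken from either patchset: blocking pread(2) is used as a stand-in
for "submit one read request, then wait for its completion", and the
helper name read_file_in_chunks() is made up for illustration. The point
is only the control flow: each iteration depends on the result of the
previous one.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Reads the file at `path` in chunks of at most `chunk` bytes and
 * returns the total number of bytes read, or -1 on error.
 */
ssize_t read_file_in_chunks(const char *path, size_t chunk)
{
	ssize_t total = 0;
	off_t off = 0;
	char *buf;
	int fd;

	if (chunk == 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	buf = malloc(chunk);
	if (!buf) {
		close(fd);
		return -1;
	}

	for (;;) {
		/* "read N bytes; wait for completion" (blocking stand-in) */
		ssize_t ret = pread(fd, buf, chunk, off);

		if (ret < 0) {
			total = -1;
			break;
		}
		if (ret == 0)	/* end of file: stop repeating */
			break;

		/* the next request's offset depends on this completion */
		off += ret;
		total += ret;
	}

	free(buf);
	close(fd);
	return total;
}
```

With io_uring, the pread() call becomes a read request plus a wait for
its CQE, and the question is how cheaply a BPF program can express the
same "issue, wait, decide, repeat" sequence.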

>> I'm not so sure about your criteria, but my requirement was to at least
>> being able to reuse all io_uring IO handling, i.e. submitting requests,
>> and to wait/process completions, otherwise a lot of opportunities are
>> wasted. My approach from a few months back [1] controlling requests from
> 
> Please read the patchset.
> 
> This patchset defines new IORING_BPF_OP code, which's ->prep(), ->issue(), ...,
> are hooked with struct_ops prog, so all io_uring core code is used, just the
> exact IORING_BPF_OP behavior is defined by struct_ops prog.

Right, but I'm talking about what the io_uring BPF program is capable
of doing.

>> the outside was looking much better. At least it covered a bunch of needs
>> without extra changes. I was just wiring up io_uring changes I wanted
>> to make BPF writer lifes easier. Let me resend the bpf series with it.
>>
>> It makes me wonder if they are complementary, but I'm not sure what
> 
> I think the two are orthogonal in function, and they can co-exist.
> 
>> your use cases are and what capabilities it might need.
> 
> The main use cases are described in cover letter and the 3rd patch, please
> find the details there.
> 
> So far the main case is to access the registered (kernel)buffer
> from issue() callback of struct_ops, because the buffer doesn't have
> userspace mapping. The last two patches adds support to provide two
> buffers(fixed, plain) for IORING_BPF_OP, and in future vectored buffer
> will be added too, so IORING_BPF_OP can handle buffer flexibly, such as:
> 
> - use exported compress kfunc to compress data from kernel buffer
> into another buffer or inplace, then the following linked SQE can be submitted
> to write the built compressed data into storage
> 
> - in raid use case, calculate IO data parity from kernel buffer, and store
> the parity data to another plain user buffer, then the following linked SQE
> can be submitted to write the built parity data to storage
> 
> Even for userspace buffer, the BPF_OP can support similar handling for saving
> one extra io_uring_enter() syscall.

Sure, registered buffer handling was one of the use cases for
the recent re-iterations too, and David Wei had some thoughts
on it as well. Though, it was not exactly about copying.
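
For reference, the data processing in the raid example quoted above is
small. Below is a minimal plain-C sketch, assuming simple XOR (RAID-5
style) parity; the quoted text does not say which scheme is meant, and
xor_parity() is a hypothetical name, not a kfunc from the patchset. It
ignores where the buffers live (registered kernel buffer or plain user
buffer), which is the part that actually matters for the design.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * XORs `nr_data` buffers of `len` bytes each into `parity`.
 */
void xor_parity(const uint8_t *const *data, size_t nr_data,
		size_t len, uint8_t *parity)
{
	for (size_t i = 0; i < len; i++) {
		uint8_t p = 0;

		/* fold byte i of every data buffer into one parity byte */
		for (size_t d = 0; d < nr_data; d++)
			p ^= data[d][i];
		parity[i] = p;
	}
}
```

XOR-ing the parity with all but one of the data buffers gives back the
missing buffer, so the same helper covers reconstruction too.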

>> [1] https://lore.kernel.org/io-uring/cover.1749214572.git.asml.silence@gmail.com/
> 
> I looked at your patches, in which SQE is generated in bpf prog(kernel),

Quick note: userspace and BPF are both allowed to submit
requests / generate SQEs.

> and it can't be used in my case.

Hmm, how so? Let's say ublk registers a buffer and posts a
completion. Then BPF runs, sees the completion and does the
necessary processing, probably using some kfuncs like the ones
you introduced. After that, it can optionally queue up requests
writing it to the storage or anything else.

The reason I'm asking is because it's supposed to be able to
do anything the userspace can already achieve (and more). So,
if it can't be used for this use case, there should be some
problem in my design.

-- 
Pavel Begunkov

