From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>,
[email protected]
Cc: io-uring <[email protected]>,
[email protected], [email protected]
Subject: Re: [LSF/MM/BPF TOPIC] programmable IO control flow with io_uring and BPF
Date: Fri, 31 Jan 2020 14:30:06 -0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 1/24/20 7:18 AM, Pavel Begunkov wrote:
> Apart from concurrent IO execution, io_uring allows issuing a sequence
> of operations, a.k.a. links, where requests are executed sequentially one
> after another. If an "error" happens, the rest of the link is
> cancelled.
>
> The problem is what to consider an "error". For example, if we
> read fewer bytes than were asked for, the link will be cancelled.
> It's necessary to play safe here, but this implies a lot of overhead if
> that isn't the desired behaviour. The user would need to reap all
> cancelled requests, analyse the state, resubmit them and suffer from
> context switches and all in-kernel preparation work. And there are
> dozens of possibly desirable patterns, so it's just not viable to
> hard-code them into the kernel.
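To make the cancellation rule described above concrete, here is a minimal sketch in C. It is a toy model under stated assumptions, not kernel or liburing code: `toy_req`, `req_state` and `run_link` are hypothetical names, and the results are fed in by the caller instead of coming from real IO. The one detail taken from io_uring itself is that cancelled link members complete with -ECANCELED.

```c
#include <errno.h>
#include <stddef.h>

/* Toy model only: these types are NOT the kernel's or liburing's. */
enum req_state { REQ_PENDING, REQ_DONE, REQ_CANCELLED };

struct toy_req {
    size_t want;            /* bytes requested */
    long res;               /* bytes transferred, or a negative errno value */
    enum req_state state;
};

/*
 * Walk a link in submission order. results[i] is the outcome request i
 * would have produced. The first negative result OR short transfer counts
 * as an "error": that request still completes, but every later request in
 * the link is marked cancelled with -ECANCELED.
 * Returns the number of requests that completed in full.
 */
static int run_link(struct toy_req *link, int n, const long *results)
{
    int ok = 0;
    int failed = 0;

    for (int i = 0; i < n; i++) {
        if (failed) {
            link[i].state = REQ_CANCELLED;
            link[i].res = -ECANCELED;
            continue;
        }
        link[i].res = results[i];
        link[i].state = REQ_DONE;
        if (link[i].res < 0 || (size_t)link[i].res < link[i].want)
            failed = 1;     /* short read/write breaks the link */
        else
            ok++;
    }
    return ok;
}
```

With three 4096-byte requests and results of 4096, 1000 and 4096, only the first counts as successful; the second completes short and the third is cancelled without being issued, which is the case userspace then has to reap and resubmit.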
>
> The other problem is to keep it running even when a request depends on
> the result of the previous one. It could be simply passing a return code or
> something fancier, like reading from userspace.
>
> And that's where BPF will be extremely useful. It will control the flow
> and do steering.
>
> The concept is to be able to run a BPF program after a request's
> completion, taking the request's state, and doing some of the following:
> 1. drop a link/request
> 2. issue new requests
> 3. link/unlink requests
> 4. do fast calculations / accumulate data
> 5. emit information to userspace (e.g. via the ring's CQ)
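The list above can be sketched as a post-completion hook. This is a hedged illustration in plain C, not a proposed interface: a function pointer stands in for the BPF program, and `toy_action`, `toy_cqe_view`, `retry_short_reads` and `drive_request` are all hypothetical names. It covers only the "drop a link", "issue new requests" and "accumulate data" cases, using a policy that treats a short read as a reason to resubmit, not as an error.

```c
#include <stddef.h>

/* Hypothetical names throughout: nothing here is a kernel, liburing or BPF API. */
enum toy_action { TOY_CONTINUE, TOY_DROP_LINK, TOY_RESUBMIT };

struct toy_cqe_view {
    long res;       /* result of the request that just completed */
    size_t want;    /* bytes originally requested */
    size_t done;    /* bytes accumulated so far (program-owned state) */
};

/* Stand-in for a BPF program run after each completion. */
typedef enum toy_action (*toy_prog_t)(struct toy_cqe_view *v);

/* Example policy: errors and EOF drop the link, short reads are retried. */
static enum toy_action retry_short_reads(struct toy_cqe_view *v)
{
    if (v->res <= 0)
        return TOY_DROP_LINK;
    v->done += (size_t)v->res;
    return v->done < v->want ? TOY_RESUBMIT : TOY_CONTINUE;
}

/*
 * Drive one request: hand each simulated result to the program and stop
 * as soon as it no longer asks for a resubmit. *submissions counts how
 * often the request was issued, *done_out reports the accumulated bytes.
 * Returns the program's last verdict (TOY_DROP_LINK if nresults is 0).
 */
static enum toy_action drive_request(toy_prog_t prog, size_t want,
                                     const long *results, int nresults,
                                     int *submissions, size_t *done_out)
{
    struct toy_cqe_view v = { .res = 0, .want = want, .done = 0 };
    enum toy_action act = TOY_DROP_LINK;

    *submissions = 0;
    for (int i = 0; i < nresults; i++) {
        v.res = results[i];
        (*submissions)++;
        act = prog(&v);
        if (act != TOY_RESUBMIT)
            break;
    }
    *done_out = v.done;
    return act;
}
```

In this model a 4096-byte read that arrives as 1024, 1024 and 2048 bytes is issued three times and then lets the link continue, with no round trip to userspace in between. Which state such a program may see and which actions it may return are exactly the open questions in the discussion list.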
>
> With that, it will be possible to have almost context-switch-less IO,
> and that's really tempting considering how fast current devices are.
>
> What to discuss:
> 1. use cases
> 2. control flow for non-privileged users (e.g. allowing some popular
> pre-registered patterns)
> 3. what input the program needs (e.g. last request's
> io_uring_cqe) and how to pass it.
> 4. whether we need notification via CQ for each cancelled/requested
> request, because sometimes they only add noise
> 5. BPF access to user data (e.g. allow reading only registered buffers)
> 6. implementation details. E.g.
> - how to ask to run BPF (e.g. with a new opcode)
> - having global BPF, bound to an io_uring instance or mixed
> - program state and how to register
> - rework notion of draining and sequencing
> - live-lock avoidance (e.g. double check io_uring shut-down code)

I think this is a key topic that we should absolutely discuss at LSFMM.

--
Jens Axboe