From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>, [email protected]
Subject: Re: [PATCH RFC] io_uring: limit inflight IO
Date: Fri, 8 Nov 2019 10:45:19 -0700 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 11/8/19 7:05 AM, Jens Axboe wrote:
> On 11/8/19 2:56 AM, Pavel Begunkov wrote:
>> On 08/11/2019 03:19, Jens Axboe wrote:
>>> On 11/7/19 4:21 PM, Jens Axboe wrote:
>>>> I'd like some feedback on this one. Even tith the overflow backpressure
>>>> patch, we still have a potentially large gap where applications can
>>>> submit IO before we get any dropped events in the CQ ring. This is
>>>> especially true if the execution time of those requests are long
>>>> (unbounded).
>>>>
>>>> This adds IORING_SETUP_INFLIGHT, which if set, will return -EBUSY if we
>>>> have more IO pending than we can feasibly support. This is normally the
>>>> CQ ring size, but of IORING_SETUP_CQ_NODROP is enabled, then it's twice
>>>> the CQ ring size.
>>>>
>>>> This helps manage the pending queue size instead of letting it grow
>>>> indefinitely.
>>>>
>>>> Note that we could potentially just make this the default behavior -
>>>> applications need to handle -EAGAIN returns already, in case we run out
>>>> of memory, and if we change this to return -EAGAIN as well, then it
>>>> doesn't introduce any new failure cases. I'm tempted to do that...
>>>>
>>>> Anyway, comments solicited!
>> What's wrong with giving away overflow handling to the userspace? It
>> knows its inflight count, and it shouldn't be a problem to allocate
>> large enough rings. The same goes for the backpressure patch. Do you
>> have a particular requirement/user for this?
>
> There are basically two things at play here:
>
> - The backpressure patch helps with users that can't easily size the
> ring. This could be library use cases, or just normal use cases that
> don't necessarily know how big the ring needs to be. Hence they need
> to size for the worst case, which is wasteful.
>
> - For this case, it's different. I just want to avoid having the
> application having potential tons of requests in flight. Before the
> backpressure patch, if you had more than the CQ ring size inflight,
> you'd get dropped events. Once you get dropped events, it's game over,
> the application has totally lost track. Hence I think it's safe to
> unconditionally return -EBUSY if we exceed that limit.
>
> The first one helps use cases that couldn't really io_uring before, the
> latter helps protect against use cases that would use a lot of memory.
> If the request for the latter use case takes a long time to run, the
> problem is even worse.
>
>> Awhile something could be done {efficiently,securely,etc} in the
>> userspace, I would prefer to keep the kernel part simpler.
>
> For this particular patch, the majority of issues will just read the sq
> head and mask, which we will just now anyway. The extra cost is
> basically a branch and cmp instruction. There's no way you can do that
> in userspace that efficiently, and it helps protect the kernel.
>
> I'm not a fan of adding stuff we don't need either, but I do believe we
> need this one.
I've been struggling a bit with how to make this reliable, and I'm not
so sure there's a way to do that. Let's say an application sets up a
ring with 8 sq entries, which would then default to 16 cq entries. With
this patch, we'd allow 16 ios inflight. But what if the application does
for (i = 0; i < 32; i++) {
sqe = get_sqe();
prep_sqe();
submit_sqe();
}
And then directly proceeds to:
do {
get_completions();
} while (has_completions);
As long as fewer than 16 requests complete before we start reaping,
we don't lose any events. Hence there's a risk of breaking existing
setups with this, even though I don't think that's a high risk.
We probably want to add some sysctl limit for this instead. But then
the question is, what should that entry (or entries) be?
--
Jens Axboe
next prev parent reply other threads:[~2019-11-08 17:45 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-07 23:21 [PATCH RFC] io_uring: limit inflight IO Jens Axboe
2019-11-08 0:19 ` Jens Axboe
2019-11-08 9:56 ` Pavel Begunkov
2019-11-08 14:05 ` Jens Axboe
2019-11-08 17:45 ` Jens Axboe [this message]
2019-11-09 11:16 ` Pavel Begunkov
2019-11-09 14:23 ` Jens Axboe
2019-11-09 15:15 ` Jens Axboe
2019-11-09 19:24 ` Pavel Begunkov
2019-11-09 10:33 ` Pavel Begunkov
2019-11-09 14:12 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox