From: Jens Axboe <[email protected]>
To: Jann Horn <[email protected]>
Cc: [email protected],
"[email protected]" <[email protected]>
Subject: Re: [RFC] io_uring CQ ring backpressure
Date: Wed, 6 Nov 2019 13:08:54 -0700
Message-ID: <[email protected]>
In-Reply-To: <CAG48ez1_91Lk73sdpp1SiufOQShdP2zX6g9gMLW46gAvMioKOA@mail.gmail.com>
On 11/6/19 12:51 PM, Jann Horn wrote:
> On Wed, Nov 6, 2019 at 5:23 PM Jens Axboe <[email protected]> wrote:
>> Currently we drop completion events, if the CQ ring is full. That's fine
>> for requests with bounded completion times, but it may make it harder to
>> use io_uring with networked IO where request completion times are
>> generally unbounded. Or with POLL, for example, which is also unbounded.
>>
>> This patch adds IORING_SETUP_CQ_NODROP, which changes the behavior a bit
>> for CQ ring overflows. First of all, it doesn't overflow the ring, it
>> simply stores backlog of completions that we weren't able to put into
>> the CQ ring. To prevent the backlog from growing indefinitely, if the
>> backlog is non-empty, we apply back pressure on IO submissions. Any
>> attempt to submit new IO with a non-empty backlog will get an -EBUSY
>> return from the kernel.
>>
>> I think that makes for a pretty sane API in terms of how the application
>> can handle it. With CQ_NODROP enabled, we'll never drop a completion
>> event (well unless we're totally out of memory...), but we'll also not
>> allow submissions with a completion backlog.
> [...]
>> +static void io_cqring_overflow(struct io_ring_ctx *ctx, u64 ki_user_data,
>> + long res)
>> + __must_hold(&ctx->completion_lock)
>> +{
>> + struct cqe_drop *drop;
>> +
>> + if (!(ctx->flags & IORING_SETUP_CQ_NODROP)) {
>> +log_overflow:
>> + WRITE_ONCE(ctx->rings->cq_overflow,
>> + atomic_inc_return(&ctx->cached_cq_overflow));
>> + return;
>> + }
>> +
>> + drop = kmalloc(sizeof(*drop), GFP_ATOMIC);
>> + if (!drop)
>> + goto log_overflow;
>> +
>> + drop->user_data = ki_user_data;
>> + drop->res = res;
>> + list_add_tail(&drop->list, &ctx->cq_overflow_list);
>> +}
>
> This could potentially consume moderately large amounts of atomic
> memory quickly and without any guarantee that the memory will be freed
> anytime soon, right? That seems moderately bad. Is there no way to
> e.g. pre-reserve memory for completion events, or something like that?
As soon as there's even one entry in that backlog, the ring won't accept
any more new IO, so the backlog can only grow by what is already in
flight. So I don't think it's a huge concern. If we pre-reserve, we
haven't really made much progress in making sure we don't drop events,
and we'll be tying up that memory all the time.
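
To make the bound concrete, here is a rough userspace sketch of the
backpressure rule, not the kernel code. All names (toy_ring, toy_submit,
toy_complete, toy_reap, CQ_ENTRIES) are hypothetical, and flushing the
backlog when the application reaps CQEs is an assumption made for
illustration. It only models the counters: once one completion is
backlogged, submission returns -EBUSY, so the backlog cannot exceed the
number of requests that were already in flight.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define CQ_ENTRIES 4	/* hypothetical CQ ring size */

/* Toy model of the counters involved; not the kernel's structures. */
struct toy_ring {
	unsigned cq_used;	/* completions sitting in the CQ ring */
	unsigned backlog;	/* completions that did not fit in the ring */
	unsigned inflight;	/* submitted but not yet completed */
};

/* Submission is refused while any completion is backlogged. */
static int toy_submit(struct toy_ring *r)
{
	if (r->backlog)
		return -EBUSY;
	r->inflight++;
	return 0;
}

/* Complete one in-flight request: CQ ring if there is room, else backlog. */
static void toy_complete(struct toy_ring *r)
{
	assert(r->inflight > 0);
	r->inflight--;
	if (r->cq_used < CQ_ENTRIES)
		r->cq_used++;
	else
		r->backlog++;
}

/*
 * Application reaps up to n CQEs. Assumption: backlogged completions are
 * then moved into the freed CQ slots.
 */
static void toy_reap(struct toy_ring *r, unsigned n)
{
	if (n > r->cq_used)
		n = r->cq_used;
	r->cq_used -= n;
	while (r->backlog && r->cq_used < CQ_ENTRIES) {
		r->backlog--;
		r->cq_used++;
	}
}
```

With six requests in flight and a four-entry ring, the fifth completion
lands in the backlog, further submissions fail with -EBUSY, and the
backlog peaks at two (the requests that were still in flight) until the
application reaps.
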
The alternative, as Pavel also mentioned, is to re-use the io_kiocb
for this. But that'll tie up more memory, and it's a bit tricky with
the lifetimes. Just because the request has completed doesn't mean
that someone isn't still holding a reference to it, and who knows
what they will do with it.
--
Jens Axboe