public inbox for [email protected]
From: Stefan Hajnoczi <[email protected]>
To: Jens Axboe <[email protected]>, Ming Lei <[email protected]>
Cc: [email protected]
Subject: Re: Resizing io_uring SQ/CQ?
Date: Fri, 10 Mar 2023 08:44:00 -0500	[thread overview]
Message-ID: <20230310134400.GB464073@fedora> (raw)
In-Reply-To: <[email protected]>


On Thu, Mar 09, 2023 at 07:58:31PM -0700, Jens Axboe wrote:
> On 3/9/23 6:38 PM, Ming Lei wrote:
> > On Thu, Mar 09, 2023 at 08:48:08AM -0500, Stefan Hajnoczi wrote:
> >> Hi,
> >> For block I/O an application can queue excess SQEs in userspace when the
> >> SQ ring becomes full. For network and IPC operations that is not
> >> possible because deadlocks can occur when socket, pipe, and eventfd SQEs
> >> cannot be submitted.
> > 
> > Can you explain a bit the deadlock in case of network application? io_uring
> > does support to queue many network SQEs via IOSQE_IO_LINK, at least for
> > send.
> > 
> >>
> >> Sometimes the application does not know how many SQEs/CQEs are needed upfront
> >> and that's when we face this challenge.
> > 
> > When running out of SQEs, the application can call io_uring_enter() to submit
> > the queued SQEs immediately without waiting for events; once
> > io_uring_enter() returns, you get free SQEs for moving on.
> > 
> >>
> >> A simple solution is to call io_uring_setup(2) with a higher entries
> >> value than you'll ever need. However, if that value is exceeded then
> >> we're back to the deadlock scenario and that worries me.
> > 
> > Can you please explain the deadlock scenario?
> 
> I'm also curious about what these deadlocks are. As Ming says, you
> generally never run out of SQEs, as you can always just submit what you
> have pending and then have a full queue size worth of them available
> again.
> 
> I do think resizing the CQ ring may have some merit, as for networking
> you may want to start smaller and resize it if you run into overflows,
> as those will be less efficient. But I'm somewhat curious about the
> reasoning for wanting to resize the SQ ring?

Hi Ming and Jens,
Thanks for the response. I'll try to explain why I worry about
deadlocks.

Imagine an application has an I/O operation that must complete in order
to make progress. If io_uring_enter(2) fails then the application is
unable to submit that critical I/O.

The io_uring_enter(2) man page says:

  EBUSY  If  the IORING_FEAT_NODROP feature flag is set, then EBUSY will
	 be returned if there were overflow entries,
	 IORING_ENTER_GETEVENTS flag is set and not all of the overflow
	 entries were able to be flushed to the CQ ring.

	 Without IORING_FEAT_NODROP the application is attempting to
	 overcommit the number of requests it can have pending. The
	 application should wait for some completions and try again. May
	 occur if the application tries to queue more requests than we
	 have room for in the CQ ring, or if the application attempts to
	 wait for more events without having reaped the ones already
	 present in the CQ ring.

Some I/O operations can take forever (e.g. reading an eventfd), so there
is no guarantee that the I/Os already in flight will complete. If
in-flight I/O operations accumulate to the point where io_uring_enter(2)
returns EBUSY, then the application is starved and unable to submit more
I/O.

Starvation turns into a deadlock when the completion of the already
in-flight I/O depends on the yet-to-be-submitted I/O. For example, the
application is supposed to write to a socket and another process will
then signal the eventfd that the application is reading, but we're
unable to submit the write.

I asked about resizing the rings because if we can size them
appropriately, then we can ensure there are sufficient resources for all
I/Os that will be in flight. This prevents EBUSY, starvation, and
deadlock.

Maybe I've misunderstood the man page?

Stefan



Thread overview: 13+ messages
2023-03-09 13:48 Resizing io_uring SQ/CQ? Stefan Hajnoczi
2023-03-10  1:38 ` Ming Lei
2023-03-10  2:58   ` Jens Axboe
2023-03-10  3:42     ` Vito Caputo
2023-03-10 13:44     ` Stefan Hajnoczi [this message]
2023-03-10 15:14       ` Ming Lei
2023-03-10 16:39         ` Stefan Hajnoczi
2023-03-10 16:56         ` Stefan Hajnoczi
2023-03-15 15:18           ` Jens Axboe
2023-03-15 15:15         ` Stefan Hajnoczi
2023-03-15 15:19           ` Jens Axboe
2023-03-15 19:01             ` Stefan Hajnoczi
2023-03-15 19:10               ` Jens Axboe
