From: Caleb Sander Mateos <csander@purestorage.com>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Subject: Re: [PATCH 2/2] io_uring: introduce non-circular SQ
Date: Tue, 14 Oct 2025 12:46:20 -0700
Message-ID: <CADUfDZqVG6sd-VChW3CxM+dgY7t7MRg3mqth038P0aYjjCsycA@mail.gmail.com>
In-Reply-To: <fdff4e0c-0d26-4e19-8671-1f98e1c526a6@gmail.com>
On Tue, Oct 14, 2025 at 12:25 PM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 10/14/25 19:37, Caleb Sander Mateos wrote:
> > On Tue, Oct 14, 2025 at 3:57 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> ...
> >> + * SQEs always start at index 0 in the submission ring instead of using a
> >> + * wrap around indexing.
> >> + */
> >> +#define IORING_SETUP_SQ_REWIND (1U << 19)
> >
> > Keith's mixed-SQE-size patch series is already planning to use this
> > flag: https://lore.kernel.org/io-uring/20251013180011.134131-3-kbusch@meta.com/
>
> I'll rebase it as usual if that gets merged first.
> >> - /*
> >> - * Ensure any loads from the SQEs are done at this point,
> >> - * since once we write the new head, the application could
> >> - * write new data to them.
> >> - */
> >> - smp_store_release(&rings->sq.head, ctx->cached_sq_head);
> >> + if (ctx->flags & IORING_SETUP_SQ_REWIND) {
> >> + ctx->cached_sq_head = 0;
> >
> > The only awkward thing about this interface seems to be if
> > io_submit_sqes() aborts early without submitting all the requested
> > SQEs. Does userspace then need to memmove() the remaining SQEs to the
> > start of the ring? It's certainly an unlikely case but something
> > userspace has to handle because io_alloc_req() can fail for reasons
> > outside its control. Seems like it might simplify the userspace side
> > if cached_sq_head wasn't rewound if not all SQEs were consumed.
> This kind of special rule is what usually makes interfaces a pain to
> work with. What if you want to abort all un-submitted requests
> instead? You can empty the queue, but then the next syscall will
> still start from the middle. Or what if the application wants to
> queue more requests before resubmitting previous ones? Those are
> the reasons the kernel would need to handle it in a less elegant way
> than it potentially could otherwise. memmove sounds appropriate.
The most convenient option might be a way for userspace to pass both a
head and a nr/tail value to the syscall instead of assuming the head is
always 0. But it's probably difficult to modify the existing syscall
interface without an indirection to the head value, which seems to be
a main point of this series. So always resetting to 0 and requiring
userspace to memmove() the remaining SQEs in the rare case that
io_uring_enter() doesn't consume all of them seems like a reasonable
approach.
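
To make the memmove() handling concrete, here is a minimal userspace sketch
of what an application might do after a partial submit. It is only an
illustration: struct demo_sqe and demo_compact_sq() are hypothetical names
that do not come from the patch or from liburing, and the 64-byte entry is a
stand-in for a real struct io_uring_sqe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a 64-byte SQE; real code would use
 * struct io_uring_sqe from <linux/io_uring.h>. */
struct demo_sqe {
	unsigned char bytes[64];
};

/*
 * Assumed scenario: the application queued `queued` entries starting at
 * index 0 and the submit call consumed only the first `consumed` of them.
 * Slide the leftovers down to index 0 so that the next submit, which under
 * the proposed flag always starts reading at index 0, sees them first.
 * Returns the number of entries still queued.
 */
static size_t demo_compact_sq(struct demo_sqe *sqes, size_t queued,
			      size_t consumed)
{
	size_t remaining;

	assert(consumed <= queued);
	remaining = queued - consumed;

	/* memmove() rather than memcpy(): the two ranges may overlap. */
	if (consumed != 0 && remaining != 0)
		memmove(sqes, sqes + consumed, remaining * sizeof(*sqes));

	return remaining;
}
```

New entries could then be appended at index `remaining`, which also covers
the case of queueing more requests before resubmitting the old ones.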
>
> >> @@ -3678,6 +3687,12 @@ static int io_uring_sanitise_params(struct io_uring_params *p)
> >> {
> >> unsigned flags = p->flags;
> >>
> >> + if (flags & IORING_SETUP_SQ_REWIND) {
> >> + if ((flags & IORING_SETUP_SQPOLL) ||
> >> + !(flags & IORING_SETUP_NO_SQARRAY))
> >
> > Is there a reason IORING_SETUP_NO_SQARRAY is required? It seems like
> > the implementation would work just fine with the SQ indirection ring;
> > the rewind would just apply to the indirection ring instead of the SQE
> > array. The cache hit rate benefit would probably be smaller since many
> > more SQ indirection entries fit in a single cache line, but I don't
> > see a reason to explicitly forbid it.
>
> B/c I don't care about sqarray setups, they are on the way out for soft
> deprecation with liburing defaulting to NO_SQARRAY, and once you try
> to optimise the kernel IORING_SETUP_SQ_REWIND handling it might turn
> out that !NO_SQARRAY is in the way... or not, but you can always allow
> it later, whereas restricting it later would break uapi. In short, it's
> weighing the chances of (micro) optimisations in the future vs supporting
> a case which is unlikely to be used.
Fair point, tradeoffs either way.
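
For reference, the flag rule being discussed can be sketched as a standalone
check. This is a simplified illustration, not the kernel code: the DEMO_*
names and demo_check_rewind_flags() are hypothetical, the SQ_REWIND bit is
the value proposed in the quoted patch, and returning -EINVAL is an
assumption since the quoted hunk does not show the body of the condition.

```c
#include <errno.h>

#define DEMO_SETUP_SQPOLL	(1U << 1)	/* same bit as IORING_SETUP_SQPOLL */
#define DEMO_SETUP_NO_SQARRAY	(1U << 16)	/* same bit as IORING_SETUP_NO_SQARRAY */
#define DEMO_SETUP_SQ_REWIND	(1U << 19)	/* bit proposed in the quoted patch */

/*
 * Mirrors the condition in the quoted hunk: SQ_REWIND is rejected when
 * combined with SQPOLL, and also rejected unless NO_SQARRAY is set.
 * Returns 0 if the combination is allowed, -EINVAL (assumed) otherwise.
 */
static int demo_check_rewind_flags(unsigned int flags)
{
	if (!(flags & DEMO_SETUP_SQ_REWIND))
		return 0;	/* rule only applies when the new flag is set */

	if ((flags & DEMO_SETUP_SQPOLL) || !(flags & DEMO_SETUP_NO_SQARRAY))
		return -EINVAL;

	return 0;
}
```

Relaxing the NO_SQARRAY requirement later would only turn a rejected
combination into an accepted one, which is why it can be allowed later
without breaking existing users.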
Best,
Caleb
>
> --
> Pavel Begunkov
>