From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, io-uring <io-uring@vger.kernel.org>
Subject: Re: [PATCH for-next] io_uring/query: check for loops in in_query()
Date: Tue, 16 Sep 2025 16:05:56 +0100 [thread overview]
Message-ID: <437ebe86-3183-470a-b5d3-1d5ff8557e01@gmail.com> (raw)
In-Reply-To: <3acf3cdc-8ace-42e6-a8a8-974442d98092@kernel.dk>
On 9/15/25 19:41, Jens Axboe wrote:
> On 9/11/25 5:40 AM, Pavel Begunkov wrote:
>> On 9/11/25 10:02, Pavel Begunkov wrote:
>>> On 9/11/25 01:13, Jens Axboe wrote:
>>>> io_query() loops over query items that the application or liburing
>>>> passes in. But it has no checking for a max number of items, or if a
>>>> loop could be present. If someone were to do:
>>>>
>>>> struct io_uring_query_hdr hdr1, hdr2, hdr3;
>>>>
>>>> hdr3.next_entry = &hdr1;
>>>> hdr2.next_entry = &hdr3;
>>>> hdr1.next_entry = &hdr2;
>>>>
>>>> io_uring_register(fd, IORING_REGISTER_QUERY, &hdr1, 0);
>>>>
>>>> then it'll happily loop forever and process hdr1 -> hdr2 -> hdr3 and
>>>> then loop back to hdr1.
>>>>
>>>> Add a max cap for these kinds of cases, which is arbitrarily set to
>>>> 1024 as well. Since there's now a cap, it seems that it would be saner
>>>> to have this interface return the number of items processed. Eg 0..N
>>>> for success, and < 0 for an error. Then if someone does need to query
>>>> more than the supported number of items, they can do so iteratively.
>>>
>>> That worsens usability. The user would have to know / count how
>>> many entries there was in the first place, retry, and do all
>>> handling. It'll be better to:
>>>
>>> if (nr > (1U << 20))
>>> return -ERANGE;
>>> if (fatal_signal_pending())
>>> return -EINTR;
>>> ...
>>> return 0;
>>>
>>>
>>> 1M should be high enough for future proofing and to protect from
>>> mildly insane users (and would still be fast enough). I also had
>>> cond_resched() in some version, but apparently it got lost as
>>> well.
>>
>> Tested the diff below, works well enough. 1M breaks out after a
>> second even in a very underpowered VM.
>
> Honestly I'm not sure which of the two I dislike more, I think both are
> not great in terms of API. In practice, nobody is going to ask for 1000
> entries. In practice, people will do one at the time. At the same time,
> I do like having the ability to process multiple in one syscall, even if
> it doesn't _really_ matter. Normally for interfaces like that, returning
> number of processed is the right approach. Eg when you get a signal or
> run into an error, you know where that happened. At the same time, it's
In which case you lose the error code, which can be more important.
In either case, the idea is that it should only fail if the app is
buggy, in which case the user can do it one by one while debugging.
> also a pain in the butt to use for an application if it did to hundreds
> of then. But let's be real, it will not. It'll do a a handful at most,
> and then it's pretty clear where to continue. The only real error here
> would be -EINTR, as anything would be the applications fault because
I'd rather delay non fatal signals and even more so task work
processing, it can't ever be reliable in general case otherwise
and would always need to be executed in a loop. And the execution
should be brief, likely shorter than non-interruptible sections
of many syscalls. In this sense capping at 1000 can be a better
choice.
> it's dumb or malicious, hence the only thing it'd do is submit the whole
> thing again. It's not like it's going to say "oh I got 2, which is less
> than the 5, let me restart at 3". But it now might have to, because it
> doesn't know what the error is.
>
> Anyway, that's a long winded way of saying I kind of hate needing any
> kind of limit, but at least with returning the number of entries
> processed, we can make the limit low and meaningful rather than some
> random high number "which is surely enough for everyone" with the sole
> idea behind that being that we need to be able to detect loops. And even
> if the interface is idempotent, it's still kind of silly to need to redo
> everything in case of a potentially valid error like an -EINVAL.
It's not a valid syscall return for a correct application (when
querying is available). Query errors are returned individually.
> Are we perhaps better off just killing this linking and just doing
> single items at the time? That nicely avoids any issues related to this,
> makes the 0/-ERROR interface sane, and makes the app interface simpler
> too. The only downside is needing a few more syscalls at probe time,
> which is not something worth optimizing for imho.
You was pretty convincing insisting that extra waiting on teardown is
not tolerable. In the same spirit there were discussions on how fast
you can create rings. I wouldn't care if it's one or two extra
syscalls, but I reasonably expect that it might grow to low double
digit queries at some point, and 10-15 syscalls doesn't sound that
comfortable while that can be easily avoided.
--
Pavel Begunkov
next prev parent reply other threads:[~2025-09-16 15:04 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-11 0:13 [PATCH for-next] io_uring/query: check for loops in in_query() Jens Axboe
2025-09-11 9:02 ` Pavel Begunkov
2025-09-11 11:40 ` Pavel Begunkov
2025-09-15 18:41 ` Jens Axboe
2025-09-16 15:05 ` Pavel Begunkov [this message]
2025-09-16 18:33 ` Jens Axboe
2025-09-16 18:35 ` Jens Axboe
2025-09-18 9:50 ` Pavel Begunkov
2025-09-18 10:27 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=437ebe86-3183-470a-b5d3-1d5ff8557e01@gmail.com \
--to=asml.silence@gmail.com \
--cc=axboe@kernel.dk \
--cc=io-uring@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox