From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>, io-uring@vger.kernel.org
Cc: Dylan Yudaken <dyudaken@gmail.com>
Subject: Re: [PATCH 1/1] io_uring/zctx: separate notification user_data
Date: Tue, 17 Feb 2026 08:15:13 -0700 [thread overview]
Message-ID: <56a3e17b-8ad1-4623-bc8c-e8f4e9f4e265@kernel.dk> (raw)
In-Reply-To: <5eeb233d-74e4-453c-ad18-f30382dc44e7@gmail.com>
On 2/17/26 8:03 AM, Pavel Begunkov wrote:
> On 2/17/26 13:12, Jens Axboe wrote:
>> On 2/17/26 4:15 AM, Pavel Begunkov wrote:
>>> On 2/16/26 17:27, Jens Axboe wrote:
>>> ...
>>>>> There are already 6, it'll be 7th. I also have one or two more in mind,
>>>>> that's already over the half. The same was probably thought about
>>>>> sqe->flags, and even though it's twice as many bits for net, those
>>>>> are taken faster as potential cost of redesign is lower.
>>>>>
>>>>> Fwiw, the code is nastier as well, more branchy and away from
>>>>> other notification init because of dependency on reading the
>>>>> flags.
>>>>>
>>>>> @@ -1331,7 +1333,7 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>>>> zc->done_io = 0;
>>>>> - if (unlikely(READ_ONCE(sqe->__pad2[0]) || READ_ONCE(sqe->addr3)))
>>>>> + if (unlikely(READ_ONCE(sqe->__pad2[0])))
>>>>> return -EINVAL;
>>>>> /* we don't support IOSQE_CQE_SKIP_SUCCESS just yet */
>>>>> if (req->flags & REQ_F_CQE_SKIP)
>>>>> @@ -1358,6 +1360,13 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>>>> }
>>>>> }
>>>>> + if (zc->flags & IORING_SEND_ZC_NOTIF_USER_DATA) {
>>>>> + notif->cqe.user_data = READ_ONCE(sqe->addr3);
>>>>> + } else {
>>>>> + if (READ_ONCE(sqe->addr3))
>>>>> + return -EINVAL;
>>>>> + }
>>>>> +
>>>>
>>>> I think just remove the else part here - addr3 is valid now that
>>>> IORING_SEND_ZC_NOTIF_USER_DATA is supported, and if you mess it up in
>>>> your applications, you'll find this via development anyway. Since addr3
>>>> == 0 is a valid value, it doesn't make much sense to check for it being
>>>> non-zero.
>>>
>>> Gating it on a separate flag but not checking when not set makes
>>> it only more confusing in terms of why would you do a flag in
>>> the first place.
>>>
>>> It's not like a flags field where any value set would be an
>>>> -EINVAL case. Doesn't even exclude having another flag for using addr3
>>>> for something else anyway.
>>>
>>> You can override the behaviour with another flag in either case,
>>> but realistically it's better to avoid as it's always messy,
>>> unless the features are clearly exclusive.
>>>
>>> I know there is no way to convince you, but v2 already degraded
>>> the uapi as per requested, can we have that one? The "else" branch
>>> doesn't make the api worse, on the opposite.
>>
>> The else ties into all of it though, as it perpetuates the "user_data is
>> zero is not valid" part. The reason we have the addr3 check in the first
>
> How come? If the flag is not set, it shouldn't be user_data and
> the kernel shouldn't care whether the user assumes otherwise. It's
> a general unused value check, and it is unused exactly because the
> field doesn't have any meaning put into it when the flag is not
> set. Same way nobody argues that sqe->__pad2[0] shouldn't be zero
> checked because a user might mess up the ABI by accident and put
> some user_data there. It'd be nice to be able to check if it's
> "set", but the best I can do is the typical "0 == not set".
Check if it's set == IORING_SEND_ZC_NOTIF_USER_DATA is set. That's why
the flag is there.
>> place is to have a way of saying "this field isn't used for this opcode,
>> may be used in the future". Now it is used/supported, and I don't think
>
> addr3 is not used nor supported in the else branch. You'd allow
> users to set addr3 without the flag, but what behaviour does it
> supposed to achieve? Just ignoring doing nothing?
It's not supposed to achieve anything, that is the point. The only
downside I can see is that you now always need a flag to say 'use addr3'
for this. But I think that's better than if we had used it for something
else first, and then you'd need a flag later on anyway to say "no it's
not X, it's Y if this flag is set".
>> we should be checking it. If we end up with future flags that also need
>> addr3 and 0 is valid, then it'll end up with more checking for that,
>> based on which flags are set and which are not.
>
> Or it can be used without any new flags when the user_data flag
> is not set, assuming it can tolerate the default 0.
So you'd have addr3 be user_data is IORING_SEND_ZC_NOTIF_USER_DATA is
set, and potentially something else if NO flag is set? That seems pretty
confusing.
>> The patch should just be removing that addr3 -EINVAL case, and adding
>> the two lines that check IORING_SEND_ZC_NOTIF_USER_DATA, and if set, assign
>> notif->cqe.user_data from addr3.
>>
>> But I object to saying this is a "degraded" uapi, to me it's very much a
>> better one as it allows all values of user_data, rather than have some
>> magic 0 value that's not valid for no other reason than force policy.
>
> Well, we clearly disagree on that one.
In the spirit of making progress and not wasting anymore time on this
fairly fruitless discussion, I'm fine with adding the else branch, and
hence v2 as-is.
--
Jens Axboe
prev parent reply other threads:[~2026-02-17 15:15 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-16 11:48 [PATCH 1/1] io_uring/zctx: separate notification user_data Pavel Begunkov
2026-02-16 15:10 ` Jens Axboe
2026-02-16 15:48 ` Pavel Begunkov
2026-02-16 15:52 ` Jens Axboe
2026-02-16 15:53 ` Pavel Begunkov
2026-02-16 15:55 ` Jens Axboe
2026-02-16 17:20 ` Pavel Begunkov
2026-02-16 17:27 ` Jens Axboe
2026-02-17 11:15 ` Pavel Begunkov
2026-02-17 13:12 ` Jens Axboe
2026-02-17 15:03 ` Pavel Begunkov
2026-02-17 15:15 ` Jens Axboe [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56a3e17b-8ad1-4623-bc8c-e8f4e9f4e265@kernel.dk \
--to=axboe@kernel.dk \
--cc=asml.silence@gmail.com \
--cc=dyudaken@gmail.com \
--cc=io-uring@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox