public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed
From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, io-uring@vger.kernel.org
Cc: Dylan Yudaken <dyudaken@gmail.com>
Subject: Re: [PATCH 1/1] io_uring/zctx: separate notification user_data
Date: Tue, 17 Feb 2026 15:03:28 +0000	[thread overview]
Message-ID: <5eeb233d-74e4-453c-ad18-f30382dc44e7@gmail.com> (raw)
In-Reply-To: <fd6ac244-40ee-48e1-b41b-d4d78839fe72@kernel.dk>

On 2/17/26 13:12, Jens Axboe wrote:
> On 2/17/26 4:15 AM, Pavel Begunkov wrote:
>> On 2/16/26 17:27, Jens Axboe wrote:
>> ...
>>>> There are already 6, it'll be 7th. I also have one or two more in mind,
>>>> that's already over the half. The same was probably thought about
>>>> sqe->flags, and even though it's twice as many bits for net, those
>>>> are taken faster as potential cost of redesign is lower.
>>>>
>>>> Fwiw, the code is nastier as well, more branchy and away from
>>>> other notification init because of dependency on reading the
>>>> flags.
>>>>
>>>> @@ -1331,7 +1333,7 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>>>          zc->done_io = 0;
>>>>    -    if (unlikely(READ_ONCE(sqe->__pad2[0]) || READ_ONCE(sqe->addr3)))
>>>> +    if (unlikely(READ_ONCE(sqe->__pad2[0])))
>>>>            return -EINVAL;
>>>>        /* we don't support IOSQE_CQE_SKIP_SUCCESS just yet */
>>>>        if (req->flags & REQ_F_CQE_SKIP)
>>>> @@ -1358,6 +1360,13 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>>>            }
>>>>        }
>>>>    +    if (zc->flags & IORING_SEND_ZC_NOTIF_USER_DATA) {
>>>> +        notif->cqe.user_data = READ_ONCE(sqe->addr3);
>>>> +    } else {
>>>> +        if (READ_ONCE(sqe->addr3))
>>>> +            return -EINVAL;
>>>> +    }
>>>> +
>>>
>>> I think just remove the else part here - addr3 is valid now that
>>> IORING_SEND_ZC_NOTIF_USER_DATA is supported, and if you mess it up in
>>> your applications, you'll find this via development anyway. Since addr3
>>> == 0 is a valid value, it doesn't make much sense to check for it being
>>> non-zero.
>>
>> Gating it on a separate flag but not checking when not set makes
>> it only more confusing in terms of why would you do a flag in
>> the first place.
>>
>> It's not like a flags field where any value set would be an
>>> -EINVAL case. Doesn't even exclude having another flag for using addr3
>>> for something else anyway.
>>
>> You can override the behaviour with another flag in either case,
>> but realistically it's better to avoid as it's always messy,
>> unless the features are clearly exclusive.
>>
>> I know there is no way to convince you, but v2 already degraded
>> the uapi as per requested, can we have that one? The "else" branch
>> doesn't make the api worse, on the opposite.
> 
> The else ties into all of it though, as it perpetuates the "user_data is
> zero is not valid" part. The reason we have the addr3 check in the first

How come? If the flag is not set, it shouldn't be user_data and
the kernel shouldn't care whether the user assumes otherwise. It's
a general unused value check, and it is unused exactly because the
field doesn't have any meaning put into it when the flag is not
set. Same way nobody argues that sqe->__pad2[0] shouldn't be zero
checked because a user might mess up the ABI by accident and put
some user_data there. It'd be nice to be able to check if it's
"set", but the best I can do is the typical "0 == not set".

> place is to have a way of saying "this field isn't used for this opcode,
> may be used in the future". Now it is used/supported, and I don't think

addr3 is not used nor supported in the else branch. You'd allow
users to set addr3 without the flag, but what behaviour does it
supposed to achieve? Just ignoring doing nothing?

> we should be checking it. If we end up with future flags that also need
> addr3 and 0 is valid, then it'll end up with more checking for that,
> based on which flags are set and which are not.

Or it can be used without any new flags when the user_data flag
is not set, assuming it can tolerate the default 0.

> The patch should just be removing that addr3 -EINVAL case, and adding
> the two lines that check IORING_SEND_ZC_NOTIF_USER_DATA, and if set, assign
> notif->cqe.user_data from addr3.
> 
> But I object to saying this is a "degraded" uapi, to me it's very much a
> better one as it allows all values of user_data, rather than have some
> magic 0 value that's not valid for no other reason than force policy.

Well, we clearly disagree on that one.

-- 
Pavel Begunkov


  reply	other threads:[~2026-02-17 15:03 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-16 11:48 [PATCH 1/1] io_uring/zctx: separate notification user_data Pavel Begunkov
2026-02-16 15:10 ` Jens Axboe
2026-02-16 15:48   ` Pavel Begunkov
2026-02-16 15:52     ` Jens Axboe
2026-02-16 15:53       ` Pavel Begunkov
2026-02-16 15:55         ` Jens Axboe
2026-02-16 17:20           ` Pavel Begunkov
2026-02-16 17:27             ` Jens Axboe
2026-02-17 11:15               ` Pavel Begunkov
2026-02-17 13:12                 ` Jens Axboe
2026-02-17 15:03                   ` Pavel Begunkov [this message]
2026-02-17 15:15                     ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5eeb233d-74e4-453c-ad18-f30382dc44e7@gmail.com \
    --to=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=dyudaken@gmail.com \
    --cc=io-uring@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox