From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, io-uring@vger.kernel.org
Cc: Dylan Yudaken <dyudaken@gmail.com>
Subject: Re: [PATCH 1/1] io_uring/zctx: separate notification user_data
Date: Mon, 16 Feb 2026 17:20:15 +0000
Message-ID: <e59d8887-d908-463b-ad31-3bf10d977de4@gmail.com>
In-Reply-To: <fc217246-2397-4ae4-8354-7ed0c498d23c@kernel.dk>

On 2/16/26 15:55, Jens Axboe wrote:
> On 2/16/26 8:53 AM, Pavel Begunkov wrote:
>> On 2/16/26 15:52, Jens Axboe wrote:
>>> On 2/16/26 8:48 AM, Pavel Begunkov wrote:
>>>> On 2/16/26 15:10, Jens Axboe wrote:
>>>>> On 2/16/26 4:48 AM, Pavel Begunkov wrote:
>>>>>> People previously asked for the notification CQE to have a different
>>>>>> user_data value from the main request completion. It's useful to
>>>>>> separate buffer and request handling logic and avoid separately
>>>>>> refcounting the request.
>>>>>>
>>>>>> Let the user pass the notification user_data in sqe->addr3. If zero,
>>>>>> it'll inherit sqe->user_data as before. It doesn't change the rules for
>>>>>> when the user can expect a notification CQE, and it should still check
>>>>>> the IORING_CQE_F_MORE flag.
>>>>>
>>>>> This should use an sqe->ioprio flag to manage it, otherwise you're
>>>>> excluding 0. Which may not be important in and of itself, but the
>>>>> flag approach is the expected way to do this.
>>>>
>>>> What's the benefit? It's not unreasonable to exclude zero, it won't
>>>> limit any use cases, and it's not new either (i.e. buffer tags).
>>>> On the other hand, the user will now have to modify two fields
>>>> instead of one, and modifying one is cleaner. And you're taking one
>>>> extra bit out of the 16-bit ->ioprio, which is not critical if it's
>>>> all going to be flags, but it wouldn't be an outrageous idea to take
>>>> 8 bits out of it for some index, for example.
>>>
>>> The benefit is that it's weird to exclude a given user_data value, just
>>> so it can get used as both a key and a flag. IMHO much cleaner to have a
>>> flag for it which explicitly says "use the user_data I provide". Also
>>> easier to explain in docs, set this flag and then the value in X will be
>>> the user_data for the completion.
>>
>> Ok, I'll respin, let's go with wasting bits for nothing.
>
> It's not like they are a scarce resource, and if we need more than 16
> bits to modify send/recv behavior, then arguably we have bigger
> problems.

There are already 6 in use, this will be the 7th. I also have one or
two more in mind, which already puts us over half. The same was probably
thought about sqe->flags, and even though net has twice as many bits,
they get used up faster because the potential cost of a redesign is
lower.

Fwiw, the code is nastier as well: more branchy, and it sits away from
the rest of the notification init because it depends on reading the
flags first.

@@ -1331,7 +1333,7 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	zc->done_io = 0;
-	if (unlikely(READ_ONCE(sqe->__pad2[0]) || READ_ONCE(sqe->addr3)))
+	if (unlikely(READ_ONCE(sqe->__pad2[0])))
 		return -EINVAL;
 	/* we don't support IOSQE_CQE_SKIP_SUCCESS just yet */
 	if (req->flags & REQ_F_CQE_SKIP)
@@ -1358,6 +1360,13 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		}
 	}
+	if (zc->flags & IORING_SEND_ZC_NOTIF_USER_DATA) {
+		notif->cqe.user_data = READ_ONCE(sqe->addr3);
+	} else {
+		if (READ_ONCE(sqe->addr3))
+			return -EINVAL;
+	}
+
 	zc->len = READ_ONCE(sqe->len);
 	zc->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL | MSG_ZEROCOPY;
 	req->buf_index = READ_ONCE(sqe->buf_index);
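
For comparison, here is a rough userspace model of the two variants discussed in this thread: the original "zero means inherit" rule and the flag-gated rule from the hunk above. It is only a sketch. struct mock_sqe, the MOCK_SEND_ZC_NOTIF_USER_DATA bit value, and both helper names are made up for illustration and are not the kernel's types, uapi values, or functions.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the SQE fields discussed in this thread. */
struct mock_sqe {
	uint64_t user_data;	/* user_data of the main request CQE */
	uint64_t addr3;		/* proposed notification user_data */
	uint16_t ioprio;	/* send/recv flags live in this field */
};

/* Assumed bit position, for illustration only; not the real uapi value. */
#define MOCK_SEND_ZC_NOTIF_USER_DATA	(1U << 6)

/* Sentinel variant: addr3 == 0 means "inherit sqe->user_data". */
static int notif_user_data_sentinel(const struct mock_sqe *sqe, uint64_t *out)
{
	*out = sqe->addr3 ? sqe->addr3 : sqe->user_data;
	return 0;
}

/*
 * Flag variant: addr3 is used only when the flag is set, so 0 becomes a
 * valid notification user_data. Without the flag, addr3 must be zero.
 */
static int notif_user_data_flag(const struct mock_sqe *sqe, uint64_t *out)
{
	if (sqe->ioprio & MOCK_SEND_ZC_NOTIF_USER_DATA) {
		*out = sqe->addr3;
		return 0;
	}
	if (sqe->addr3)
		return -EINVAL;
	*out = sqe->user_data;
	return 0;
}
```

The sentinel helper has a single branch and touches one field, while the flag helper needs the flags read first and an extra rejection path, which is the trade-off argued about above.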
--
Pavel Begunkov
Thread overview: 8+ messages
2026-02-16 11:48 [PATCH 1/1] io_uring/zctx: separate notification user_data Pavel Begunkov
2026-02-16 15:10 ` Jens Axboe
2026-02-16 15:48 ` Pavel Begunkov
2026-02-16 15:52 ` Jens Axboe
2026-02-16 15:53 ` Pavel Begunkov
2026-02-16 15:55 ` Jens Axboe
2026-02-16 17:20 ` Pavel Begunkov [this message]
2026-02-16 17:27 ` Jens Axboe