From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>,
Dylan Yudaken <[email protected]>
Cc: [email protected]
Subject: Re: [PATCH 8/8] io_uring/net: set MSG_MORE if we're doing multishot send and have more
Date: Mon, 26 Feb 2024 07:52:46 -0700 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 2/26/24 7:24 AM, Pavel Begunkov wrote:
> On 2/26/24 13:42, Jens Axboe wrote:
>> On 2/26/24 3:59 AM, Dylan Yudaken wrote:
>>> On Sun, Feb 25, 2024 at 12:46?AM Jens Axboe <[email protected]> wrote:
>>>>
>>>> If we have more data pending, we know we're going to do one more loop.
>>>> If that's the case, then set MSG_MORE to inform the networking stack
>>>> that there's more data coming shortly for this socket.
>>>>
>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>> ---
>>>> io_uring/net.c | 10 +++++++---
>>>> 1 file changed, 7 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/io_uring/net.c b/io_uring/net.c
>>>> index 240b8eff1a78..07307dd5a077 100644
>>>> --- a/io_uring/net.c
>>>> +++ b/io_uring/net.c
>>>> @@ -519,6 +519,10 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
>>>> if (!io_check_multishot(req, issue_flags))
>>>> return io_setup_async_msg(req, kmsg, issue_flags);
>>>>
>>>> + flags = sr->msg_flags;
>>>> + if (issue_flags & IO_URING_F_NONBLOCK)
>>>> + flags |= MSG_DONTWAIT;
>>>> +
>>>> retry_multishot:
>>>> if (io_do_buffer_select(req)) {
>>>> void __user *buf;
>>>> @@ -528,12 +532,12 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
>>>> if (!buf)
>>>> return -ENOBUFS;
>>>>
>>>> + if ((req->flags & (REQ_F_BL_EMPTY|REQ_F_APOLL_MULTISHOT)) ==
>>>> + REQ_F_APOLL_MULTISHOT)
>>>> + flags |= MSG_MORE;
>>>> iov_iter_ubuf(&kmsg->msg.msg_iter, ITER_SOURCE, buf, len);
>>>> }
>>>
>>> This feels racy. I don't have an exact sequence in mind, but I believe
>>> there are cases where between
>>> the two calls to __sys_sendmsg_sock, another submission could be
>>> issued and drain the buffer list.
>>> I guess the result would be that the packet is never sent out, but I
>>> have not followed the codepaths of MSG_MORE.
>>
>> This is true, but that race always exists depending on how gets to go
>> first (the adding of the buffer, or the send itself). The way I see it,
>> when the send is issued we're making the guarantee that we're going to
>> at least deplete the queue as it looks when entered. If more is added
>> while it's being processed, we _may_ see it.
>>
>> Outside of that, we don't want it to potentially run in perpetuity. It
>> may actually be a good idea to make the rule of "just issue what was
>> there when first seen/issued" a hard one, though I don't think it's
>> really worth doing. But making any guarantees on buffers added in
>> parallel will be impossible. If you do that, then you have to deal with
>> figuring out what's left in the queue once you get a completion withou
>> CQE_F_MORE.
>>
>>> The obvious other way to trigger this codepath is if the user messes
>>> with the ring by decrementing
>>> the buffer counter. I do not believe there are any nefarious outcomes
>>> - but just to point out that
>>> REQ_F_BL_EMPTY is essentially user controlled.
>>
>> The user may certainly shoot himself in the foot. As long as that
>> doesn't lead to a nefarious outcome, then that's not a concern. For this
>> case, the head is kernel local, user can only write to the tail. So we
>> could have a case of user fiddling with the tail and when we grab the
>> next buffer (and the previous one did not have REQ_F_BL_EMPTY set), the
>> ring will indeed appear to be empty. At that point you get an -ENOBUFS
>> without CQE_F_MORE set.
>
> A side note, don't forget that there are other protocols apart
> from TCP. AFAIK UDP corking will pack it into a single datagram,
> which is not the same as two separate sends.
Yeah, should really have labeled this one as a test/rfc kind of patch. I
wasn't even convinced we want to do this uncondtionally for TCP. I'll
just leave it at the end for now, it's a separate kind of discussion
imho and this is why it was left as a separate patch rather than being
bundled with the multishot send in general.
--
Jens Axboe
prev parent reply other threads:[~2024-02-26 14:52 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-25 0:35 [PATCHSET v3 0/8] Support for provided buffers for send Jens Axboe
2024-02-25 0:35 ` [PATCH 1/8] io_uring/net: unify how recvmsg and sendmsg copy in the msghdr Jens Axboe
2024-02-26 14:41 ` Pavel Begunkov
2024-02-26 15:03 ` Jens Axboe
2024-02-25 0:35 ` [PATCH 2/8] net: remove {revc,send}msg_copy_msghdr() from exports Jens Axboe
2024-02-25 0:35 ` [PATCH 3/8] io_uring/net: add provided buffer support for IORING_OP_SEND Jens Axboe
2024-02-25 0:35 ` [PATCH 4/8] io_uring/net: add provided buffer support for IORING_OP_SENDMSG Jens Axboe
2024-02-25 0:35 ` [PATCH 5/8] io_uring/kbuf: flag request if buffer pool is empty after buffer pick Jens Axboe
2024-02-25 0:35 ` [PATCH 6/8] io_uring/net: support multishot for send Jens Axboe
2024-02-26 10:47 ` Dylan Yudaken
2024-02-26 13:38 ` Jens Axboe
2024-02-26 14:02 ` Dylan Yudaken
2024-02-26 14:27 ` Jens Axboe
2024-02-26 14:36 ` Pavel Begunkov
2024-02-26 15:16 ` Jens Axboe
2024-02-26 15:41 ` Pavel Begunkov
2024-02-26 19:11 ` Jens Axboe
2024-02-26 19:21 ` Pavel Begunkov
2024-02-26 20:12 ` Jens Axboe
2024-02-26 20:51 ` Pavel Begunkov
2024-02-26 21:27 ` Jens Axboe
2024-02-28 12:39 ` Pavel Begunkov
2024-02-28 17:28 ` Jens Axboe
2024-02-28 23:49 ` Jens Axboe
2024-02-29 1:46 ` Jens Axboe
2024-02-29 15:42 ` Jens Axboe
2024-02-26 19:31 ` Dylan Yudaken
2024-02-26 19:49 ` Jens Axboe
2024-02-25 0:35 ` [PATCH 7/8] io_uring/net: support multishot for sendmsg Jens Axboe
2024-02-25 0:35 ` [PATCH 8/8] io_uring/net: set MSG_MORE if we're doing multishot send and have more Jens Axboe
2024-02-26 10:59 ` Dylan Yudaken
2024-02-26 13:42 ` Jens Axboe
2024-02-26 14:24 ` Pavel Begunkov
2024-02-26 14:52 ` Jens Axboe [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox