From: Jens Axboe <[email protected]>
To: Constantine Gavrilov <[email protected]>
Cc: [email protected]
Subject: Re: Short sends returned in IORING
Date: Wed, 4 May 2022 09:55:23 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 5/4/22 9:28 AM, Jens Axboe wrote:
> On 5/4/22 9:21 AM, Constantine Gavrilov wrote:
>> On Wed, May 4, 2022 at 4:54 PM Jens Axboe <[email protected]> wrote:
>>>
>>> On 5/3/22 5:05 PM, Constantine Gavrilov wrote:
>>>> Jens:
>>>>
>>>> This is related to the previous thread "Fix MSG_WAITALL for
>>>> IORING_OP_RECV/RECVMSG".
>>>>
>>>> We have a similar issue with TCP socket sends. I see short sends
>>>> regardless of the method (I tried the write, writev, send, and
>>>> sendmsg opcodes, using MSG_WAITALL for send and sendmsg). It makes
>>>> no difference.
>>>>
>>>> Most of the time, sends are not short, and I never saw short sends
>>>> with loopback and my app. But on real network media, I see short
>>>> sends.
>>>>
>>>> This is a real problem: because of it, a queue depth greater than 1
>>>> cannot be used on a TCP socket, which limits the benefit of
>>>> io_uring. When a short send occurs, the next send in the queue
>>>> "corrupts" the stream.
>>>>
>>>> Can we have the send transfer all of its data before it completes,
>>>> unless the socket is disconnected?
>>>
>>> I'm guessing that this happens because we get a task_work item queued
>>> after we've processed some of the send, but not all. What kernel are you
>>> using?
>>>
>>> This:
>>>
>>> https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.19/io_uring&id=4c3c09439c08b03d9503df0ca4c7619c5842892e
>>>
>>> is queued up for 5.19, would be worth trying.
>>>
>>> --
>>> Jens Axboe
>>>
>>
>> Jens:
>>
>> Thank you for your reply.
>>
>> The kernel is 5.17.4-200.fc35.x86_64. I have looked at the patch. With
>> the solution in place, I am wondering whether it will be possible to
>> use multiple uring send IOs on the same socket. I expect that Linux
>> TCP will serialize multiple send operations on the same socket. I am
>> not sure that happens with uring (meaning that the socket is blocked
>> from processing a new IO until the pending IO completes). Do I need
>> IOSQE_IO_DRAIN / IOSQE_IO_LINK for this to work? That would not be
>> optimal because of multiple different sockets in the same uring. While
>> I already have a workaround in the form of a "software" queue for
>> streaming data on TCP sockets, I would rather have the kernel do
>> "native" queueing in the sockets layer, and have extra CPU cycles
>> available to the application.
>
> The patch above will potentially mess with ordering. If the cause is
> what I suspect, task_work causing it to think it has been signaled,
> then the better approach may indeed be to just flush that work and
> retry without re-queueing the current request. I can write a patch
> against 5.18 if you are willing and able to test?
You can try something like this, if you run my for-5.19/io_uring branch.
I'd be curious to know if this solves the short send issue for you.
diff --git a/fs/io_uring.c b/fs/io_uring.c
index f6b6db216478..b835e80be1fa 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5684,6 +5684,7 @@ static int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
if (flags & MSG_WAITALL)
min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+retry:
ret = __sys_sendmsg_sock(sock, &kmsg->msg, flags);
if (ret < min_ret) {
@@ -5694,6 +5695,8 @@ static int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
if (ret > 0 && io_net_retry(sock, flags)) {
sr->done_io += ret;
req->flags |= REQ_F_PARTIAL_IO;
+ if (io_run_task_work())
+ goto retry;
return io_setup_async_msg(req, kmsg);
}
req_set_fail(req);
@@ -5744,6 +5747,7 @@ static int io_send(struct io_kiocb *req, unsigned int issue_flags)
min_ret = iov_iter_count(&msg.msg_iter);
msg.msg_flags = flags;
+retry:
ret = sock_sendmsg(sock, &msg);
if (ret < min_ret) {
if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
@@ -5755,6 +5759,8 @@ static int io_send(struct io_kiocb *req, unsigned int issue_flags)
sr->buf += ret;
sr->done_io += ret;
req->flags |= REQ_F_PARTIAL_IO;
+ if (io_run_task_work())
+ goto retry;
return -EAGAIN;
}
req_set_fail(req);
--
Jens Axboe
Thread overview: 13+ messages
2022-05-03 23:05 Short sends returned in IORING Constantine Gavrilov
2022-05-04 13:54 ` Jens Axboe
2022-05-04 15:21 ` Constantine Gavrilov
2022-05-04 15:28 ` Jens Axboe
2022-05-04 15:55 ` Jens Axboe [this message]
2022-05-11 14:56 ` Constantine Gavrilov
2022-05-11 15:11 ` Constantine Gavrilov
2022-05-11 15:33 ` Constantine Gavrilov
2022-05-11 19:30 ` Constantine Gavrilov
2022-05-12 16:28 ` Jens Axboe
2022-05-15 13:36 ` Constantine Gavrilov
2022-05-16 12:50 ` Constantine Gavrilov
2022-05-04 16:18 ` Constantine Gavrilov