From: Jens Axboe <[email protected]>
To: Ming Lei <[email protected]>, Stefan Metzmacher <[email protected]>
Cc: [email protected], Pavel Begunkov <[email protected]>,
David Ahern <[email protected]>
Subject: Re: IOSQE_IO_LINK vs. short send of SOCK_STREAM
Date: Fri, 13 Jan 2023 10:51:20 -0700
Message-ID: <[email protected]>
In-Reply-To: <Y8EuhoodlKFGh/55@T590>
On 1/13/23 3:12 AM, Ming Lei wrote:
> Hello,
>
> On Thu, Jan 12, 2023 at 08:35:36AM +0100, Stefan Metzmacher wrote:
>> On 12.01.23 at 04:40, Jens Axboe wrote:
>>> On 1/11/23 8:27 PM, Ming Lei wrote:
>>>> Hi Stefan and Jens,
>>>>
>>>> Thanks for the help.
>>>>
>>>> BTW, the issue is observed when I write ublk-nbd:
>>>>
>>>> https://github.com/ming1/ubdsrv/commits/nbd
>>>>
>>>> and it isn't complete yet (multiple send sqe chains are not serialized
>>>> yet); the issue is triggered when writing a big chunk of data to ublk-nbd.
>>>
>>> Gotcha
>>>
>>>> On Wed, Jan 11, 2023 at 05:32:00PM +0100, Stefan Metzmacher wrote:
>>>>> Hi Ming,
>>>>>
>>>>>> Per my understanding, a short send on SOCK_STREAM should terminate the
>>>>>> remainder of the SQE chain built by IOSQE_IO_LINK.
>>>>>>
>>>>>> But from my observation, this point isn't true when using io_sendmsg or
>>>>>> io_sendmsg_zc on TCP socket, and the other remainder of the chain still
>>>>>> can be completed after one short send is found. MSG_WAITALL is off.
>>>>>
>>>>> This is due to legacy reasons, you need to pass MSG_WAITALL explicitly
>>>>> in order to get a retry or an error on a short write...
>>>>> It should work for send, sendmsg, sendmsg_zc, recv and recvmsg.
>>>>
>>>> Turns out there is another application bug in which recv sqe may cut in the
>>>> send sqe chain.
>>>>
>>>> After the issue is fixed, if MSG_WAITALL is set, short send can't be
>>>> observed any more. But if MSG_WAITALL isn't set, short send can be
>>>> observed and the send io chain still won't be terminated.
>>>
>>> Right, if MSG_WAITALL is set, then the whole thing will be written. If
>>> we get a short send, it's retried appropriately. Unless an error occurs,
>>> it should send the whole thing.
>>>
>>>> So if MSG_WAITALL is set, will io_uring be responsible for retrying in case
>>>> of a short send, so the application needn't take care of it?
>>
>> With new kernels yes, but the application should be prepared to have retry
>> logic in order to be compatible with older kernels.
>
> Now ublk-nbd can be played, mkfs/mount and fio starts to work.
>
> But short send still can be observed sometimes when sending nbd write
> request, which is done by sendmsg(), and the message includes two vectors,
> (the 1st is the nbd_request, another one is the data to be written to disk).
>
> Short send is reported by cqe in which cqe->res is always 28, which is
> the size of 'struct nbd_request' and also the length of the 1st io vec. And
> no send cqe failure message is seen.
>
> And MSG_WAITALL is set for all ublk-nbd io actually.
>
> Steps to reproduce:
>
> 1) install liburing 2.0+
>
> 2) build ublk & reproduce the issue:
>
> - git clone https://github.com/ming1/ubdsrv.git -b nbd
>
> - cd ubdsrv
>
> - vim build_with_liburing_src && set LIBURING_DIR to your liburing dir
>
> - ./build_with_liburing_src && make -j4
>
> 3) run the nbd test
> - cd ubdsrv
> - make test T=nbd
>
> Sometimes the test hangs, and the following log can be observed
> in syslog:
>
> nbd_send_req_done: short send/receive tag 2 op 1 8000000000800002, len 524316 written 28 cqe flags 0
> ...
>
I can reproduce this, but it's a SEND that ends up being triggered,
not a SENDMSG. Should the payload carrying op not be a SENDMSG? I'm
assuming two vecs for that one.
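
For reference, a minimal sketch of the bookkeeping an application needs if
it has to retry a short send itself, which is the fallback Stefan mentions
for older kernels. It assumes a two-vector message like the one described
above: a 28-byte nbd_request followed by the data, which is consistent with
the logged len 524316 being 28 + 524288. iov_advance() is a hypothetical
helper name, not part of liburing or ubdsrv; it only adjusts the iovec
array so that the unsent remainder can be resubmitted.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not a liburing or ubdsrv API): skip `done` bytes
 * that were already sent from an iovec array.
 *
 * On return, *iovp points at the first iovec that still has unsent data
 * (trimmed if it was partially sent), and the return value is the number
 * of iovecs left to submit. A return of 0 means everything was sent.
 */
static int iov_advance(struct iovec **iovp, int iovcnt, size_t done)
{
	struct iovec *iov = *iovp;

	/* Drop every vector that was sent in full. */
	while (iovcnt > 0 && done >= iov->iov_len) {
		done -= iov->iov_len;
		iov++;
		iovcnt--;
	}

	/* Trim the vector that was only partially sent, if any. */
	if (iovcnt > 0 && done > 0) {
		iov->iov_base = (char *)iov->iov_base + done;
		iov->iov_len -= done;
	}

	*iovp = iov;
	return iovcnt;
}
```

With the numbers from the log, calling it with done = 28 (the cqe->res
value) on the two-vector message leaves a single iovec of 524288 bytes to
resubmit.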
--
Jens Axboe