public inbox for [email protected]
From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>,
	Stefan Metzmacher <[email protected]>
Cc: io-uring <[email protected]>,
	Samba Technical <[email protected]>,
	Jeremy Allison <[email protected]>
Subject: Re: Data Corruption bug with Samba's vfs_iouring and Linux 5.6.7/5.7rc3
Date: Thu, 7 May 2020 10:43:17 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 5/6/20 9:42 AM, Pavel Begunkov wrote:
> On 06/05/2020 18:20, Stefan Metzmacher wrote:
>> Am 06.05.20 um 14:55 schrieb Pavel Begunkov:
>>> On 05/05/2020 23:19, Stefan Metzmacher wrote:
>>> AFAIK, it can. io_uring first tries to submit a request with IOCB_NOWAIT,
>>> in short for performance reasons. And it has been doing so from the beginning
>>> or so. The same is true for writes.
>>
>> See the other mails in the thread. The test I wrote shows the
> 
> Cool you resolved the issue!
> 
>> implicit IOCB_NOWAIT was not exposed to the caller in  (at least in 5.3
>> and 5.4).
>>
> 
> # git show remotes/origin/for-5.3/io_uring:fs/io_uring.c | \
> #     grep "kiocb->ki_flags |= IOCB_NOWAIT" -A 5 -B 5
> 
> if (force_nonblock)
>         kiocb->ki_flags |= IOCB_NOWAIT;
> 
> And it has been there since 5.2 or even earlier. I don't know, your results
> could be because of different policy in block layer, something unexpected in
> io_uring, etc., but it's how it was intended to be.
> 
> 
>> I think the typical user doesn't want it to be exposed!
>> I'm not sure for blocking reads on a socket, but for files
>> below EOF it's really not what's expected.
> 
> Hard to say, but even read(2) without any NONBLOCK doesn't guarantee that.
> Hopefully, BPF will help us with that in the future.

Replying here, as I missed the storm yesterday... The reason why it's
different is that later kernels no longer attempt to prevent short
reads. They happen when you get overlapping buffered IO: one sqe
will find that X of the Y range is already in cache, and return that.
We don't retry the remainder in blocking mode. We previously did, but
there were a few issues with it:

- You're redoing the whole IO, which means more copying

- It's not always safe to retry; that depends on the file type. For
  sockets, pipes, etc. we obviously cannot. This is the real reason it
  got disabled, as it was broken there.

Just like for regular system calls, applications must be able to deal
with short IO.
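
As a rough illustration of what "dealing with short IO" can look like for
a regular file, here is a minimal sketch using plain pread(2). The helper
name read_full() is hypothetical and not part of any library. It assumes a
seekable fd, which is exactly why resuming at an explicit offset is safe
here and would not be for a socket or pipe. With io_uring the idea should
carry over: if the cqe result is smaller than the requested length, submit
a new sqe for the remaining range.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * read_full() is a hypothetical helper, not an existing API.
 * It keeps calling pread(2) until `count` bytes have arrived, EOF is
 * hit, or a real error occurs. Every call passes an explicit offset,
 * so resuming after a short read only makes sense for seekable files;
 * this is not a pattern for sockets or pipes.
 */
static ssize_t read_full(int fd, void *buf, size_t count, off_t offset)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = pread(fd, (char *)buf + done, count - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;          /* real error, errno was set by pread */
        }
        if (n == 0)
            break;              /* EOF: report what we have so far */
        done += (size_t)n;      /* short read: ask only for the rest */
    }
    return (ssize_t)done;
}
```

A return value smaller than `count` then means EOF was reached, not that
the data is still on its way.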

-- 
Jens Axboe

