public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jeremy Allison <[email protected]>
To: Stefan Metzmacher <[email protected]>
Cc: Anoop C S <[email protected]>,
	Samba Technical <[email protected]>,
	io-uring <[email protected]>, Jens Axboe <[email protected]>,
	[email protected]
Subject: Re: Data Corruption bug with Samba's vfs_iouring and Linux 5.6.7/5.7rc3
Date: Wed, 6 May 2020 10:03:44 -0700	[thread overview]
Message-ID: <20200506170344.GA32399@jeremy-acer> (raw)
In-Reply-To: <[email protected]>

On Wed, May 06, 2020 at 04:08:03PM +0200, Stefan Metzmacher wrote:
> Am 06.05.20 um 14:41 schrieb Anoop C S:
> > On Wed, 2020-05-06 at 12:33 +0200, Stefan Metzmacher wrote:
> >> Hi Anoop,
> >>
> >>> I could reproduce the difference in SHA256 checksum after copying a
> >>> directory with 100 copies of test file(provided by reporter) from
> >>> io_uring VFS module enabled share using Windows explorer(right-
> >>> click-
> >>>> copy/paste). Only 5 out of 100 files had correct checksum after
> >>>> copy
> >>> operation :-/
> >>
> >> Great! Can you please try to collect level 1 log files with
> >> the patch https://bugzilla.samba.org/attachment.cgi?id=15955
> >> applied?
> > 
> > I have attached three log files.
> > log.io_uring.smbd -- Copy using Windows explorer
> > log.io_uring-mget.smd -- Copy using smbclient
> > log.io_uring-powershell.smd -- Copy using `Copy-Item`
> 
> Thanks! All of them show short reads like:
> 
> > [2020/05/06 17:27:28.130248,  1] ../../source3/modules/vfs_io_uring.c:103(vfs_io_uring_finish_req)
> >   vfs_io_uring_finish_req: pread ofs=0 (0x0) len=32768 (0x8000) nread=32768 (0x32768) eof=10000000 (0x989680) blks=4096 blocks=19536 dir/1.bin fnum 1607026405
> > [2020/05/06 17:27:28.131049,  1] ../../source3/modules/vfs_io_uring.c:103(vfs_io_uring_finish_req)
> >   vfs_io_uring_finish_req: pread ofs=9969664 (0x982000) len=30336 (0x7680) nread=30336 (0x30336) eof=10000000 (0x989680) blks=4096 blocks=19536 dir/1.bin fnum 1607026405
> > [2020/05/06 17:27:28.133679,  1] ../../source3/modules/vfs_io_uring.c:103(vfs_io_uring_finish_req)
> >   vfs_io_uring_finish_req: pread ofs=61440 (0xf000) len=32768 (0x8000) nread=32768 (0x32768) eof=10000000 (0x989680) blks=4096 blocks=19536 dir/1.bin fnum 1607026405
> > [2020/05/06 17:27:28.140184,  0] ../../source3/modules/vfs_io_uring.c:88(vfs_io_uring_finish_req)
> >   vfs_io_uring_finish_req: Invalid pread ofs=0 (0x0) len=1048576 (0x100000) nread=32768 (0x32768) eof=10000000 (0x989680) blks=4096 blocks=19536 dir/1.bin fnum 1607026405
> 
> It seems the first request is at ofs=0 len=32768 (0x8000) and we get
> 32768 back.
> A follow up request with ofs=0 len=1048576 (0x100000) only gets the
> first 32768 bytes which are already in the page cache.
> 
> I can easily reproduce this with the Ubuntu 5.4 kernel once I add
> state->ur.sqe.rw_flags |= RWF_NOWAIT; to vfs_io_uring_pread_send()
> and use this.
> 
> echo 1 > /proc/sys/vm/drop_caches
> head -c 1024 /root/samba-test/ff1.dat | hexdump -C
> 00000000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
> |................|
> *
> 00000400
> smbclient //172.31.9.167/uringff -Uroot%test -c "get ff1.dat"
> 
> results in this log entries:
> > [2020/05/06 06:51:57.069990,  0] ../../source3/modules/vfs_io_uring.c:89(vfs_io_uring_finish_req)
> >   vfs_io_uring_finish_req: Invalid pread ofs=0 (0x0) len=8388608 (0x800000) nread=16384 (0x4000) eof=8388608 (0x800000) blks=4096 blocks=16384 ff1.dat fnum 840153065
> > [2020/05/06 06:51:57.076882,  1] ../../source3/modules/vfs_io_uring.c:104(vfs_io_uring_finish_req)
> >   vfs_io_uring_finish_req: pread ofs=16384 (0x4000) len=8372224 (0x7fc000) nread=8372224 (0x7fc000) eof=8388608 (0x800000) blks=4096 blocks=16384 ff1.dat fnum 840153065
> 
> smbclient is just smart enough to recover itself from the short read.
> But the windows client isn't.

Well we pay attention to the amount of data returned
and only increment the next read request by the amount
actually returned.

I'm amazed that the Windows client doesn't seem to
check this !

> The attached test against liburing (git://git.kernel.dk/liburing) should
> be able to demonstrate the problem. It can also be found in
> https://github.com/metze-samba/liburing/tree/implicit-rwf-nowaithttps://github.com/metze-samba/liburing/commit/eb06dcfde747e46bd08bedf9def2e6cb536c39e3
> 
> 
> I added the sqe->rw_flags = RWF_NOWAIT; line in order to demonstrate it
> against the Ubuntu 5.3 and 5.4 kernels. They both seem to have the bug.
> 
> Can someone run the unmodified test/implicit-rwf_nowait against
> a newer kernel?

Aha. I wondered about the short read issue when this
was first reported but I could never catch it in the
act.

If the Windows client doesn't check and the kernel
returns short reads I guess we'll have to add logic
similar to tstream_readv_send()/tstream_writev_send()
that ensure all bytes requested/send actually go through
the interface and from/into the kernel unless a read
returns 0 (EOF) or a write returns an error.

What a pain though :-(. SMB2+ server implementors
really need to take note that Windows clients will corrupt
files if they get a short read/write return.

The fact that early kernels don't return short
reads on io_uring but later kernels do makes it
even worse :-(.

There's even an SMB2 protocol field in SMB2_READ:

"MinimumCount (4 bytes): The minimum number of bytes to be read for this operation to be
successful. If fewer than the minimum number of bytes are read by the server, the server
MUST return an error rather than the bytes read."

We correctly return EOF if the amount read from
the kernel is less than SMB2_READ.MinimumCount
so I'm guessing they're not using it or looking
at it (or setting it to zero).

MinimumCount is supposed to allow the client to cope with
this. Anoop, do you have wireshark traces so we can
see what the Windows clients are setting here ?

Jeremy.

  parent reply	other threads:[~2020-05-06 17:03 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-05 10:04 Data Corruption bug with Samba's vfs_iouring and Linux 5.6.7/5.7rc3 Stefan Metzmacher
2020-05-05 14:41 ` Jens Axboe
2020-05-05 15:44   ` Jens Axboe
2020-05-05 16:53     ` Jens Axboe
2020-05-05 17:39       ` Jens Axboe
2020-05-05 17:48         ` Jeremy Allison
2020-05-05 17:50           ` Jens Axboe
     [not found]           ` <[email protected]>
2020-05-06 10:33             ` Stefan Metzmacher
2020-05-06 10:41               ` Stefan Metzmacher
     [not found]               ` <[email protected]>
2020-05-06 14:08                 ` Stefan Metzmacher
2020-05-06 14:43                   ` Andreas Schneider
2020-05-06 14:46                   ` Andreas Schneider
2020-05-06 15:06                     ` Stefan Metzmacher
2020-05-06 17:03                   ` Jeremy Allison [this message]
2020-05-06 17:13                     ` Jeremy Allison
2020-05-06 18:01                     ` Jeremy Allison
2020-05-05 20:19       ` Stefan Metzmacher
2020-05-06 12:55         ` Pavel Begunkov
2020-05-06 15:20           ` Stefan Metzmacher
2020-05-06 15:42             ` Pavel Begunkov
2020-05-07 16:43               ` Jens Axboe
2020-05-07 16:48                 ` Jeremy Allison
2020-05-07 16:50                   ` Jens Axboe
2020-05-07 18:31                     ` Jeremy Allison
2020-05-07 18:35                       ` Jens Axboe
2020-05-07 18:55                         ` Jeremy Allison
2020-05-07 18:58                           ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200506170344.GA32399@jeremy-acer \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox