public inbox for [email protected]
 help / color / mirror / Atom feed
From: Luis Chamberlain <[email protected]>
To: John Garry <[email protected]>,
	Dan Helmick <[email protected]>
Cc: "Matthew Wilcox" <[email protected]>,
	"Pankaj Raghav" <[email protected]>,
	"Daniel Gomez" <[email protected]>,
	"Javier González" <[email protected]>,
	[email protected], [email protected], [email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected]
Subject: Re: [PATCH v6 00/10] block atomic writes
Date: Fri, 12 Apr 2024 11:28:39 -0700	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

+ Dan,

On Fri, Apr 12, 2024 at 09:15:57AM +0100, John Garry wrote:
> On 11/04/2024 20:07, Luis Chamberlain wrote:
> > > So if you
> > > have a 4K PBS and 512B LBS, then WRITE_ATOMIC_16 would be required to write
> > > 16KB atomically.
> > Ugh. Why does SCSI requires a special command for this?
> 
> The actual question from others is why does NVMe not have a dedicated
> command for this, like:
> https://lore.kernel.org/linux-nvme/[email protected]/

Because we don't really need it for the hardware that supports it if the
host does the respective topology checks. For instance the respective
checks for NVMe are that atomics respect AWUN as the cap as the drive
already can go up to AWUN, and the limit for power-fail is implicit by
checking for AWUPF / NAWUPF. The alignment constraints can be dealt with
by the host software.

> It's a data integrity feature, and we want to know if it works properly.

For drives which already support this integrity is ensured already for
you. An NVMe specific atomic write command could be useful for for
existing drives for other reasons or future uses but its not a requirement
with the existing use cases if the NVMe alignment / atomic are respected by
the host.

> > Now we know what would be needed to bump the physical block size, it is
> > certainly a different feature, however I think it would be good to
> > evaluate that world too. For NVMe we don't have such special write
> > requirements.
> > 
> > I put together this kludge with the last patches series of LBS + the
> > bdev cache aops stuff (which as I said before needs an alternative
> > solution) and just the scsi atomics topology + physical block size
> > change to easily experiment to see what would break:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20240408-lbs-scsi-kludge
> > 
> > Using a larger sector size works but it does not use the special scsi
> > atomic write.
> 
> If you are using scsi_debug driver, then you can just pass the desired
> physblk_exp and sector_size args - they both default to 512B. Then you don't
> need bother with sd.c atomic stuff, which I think is what you want.
> 
> > 
> > > > > To me, O_ATOMIC would be required for buffered atomic writes IO, as we want
> > > > > a fixed-sized IO, so that would mean no mixing of atomic and non-atomic IO.
> > > > Would using the same min and max order for the inode work instead?
> > > Maybe, I would need to check further.
> > I'd be happy to help review too.
> 
> Yeah, I'm starting to think that min and max inode would make life easier,
> as we don't need to deal with the scenario of an atomic write to a folio >
> atomic write size.

And aligments constraints could be dealt with as well.

  Luis

  reply	other threads:[~2024-04-12 18:28 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-26 13:38 [PATCH v6 00/10] block atomic writes John Garry
2024-03-26 13:38 ` [PATCH v6 01/10] block: Pass blk_queue_get_max_sectors() a request pointer John Garry
2024-04-10 22:58   ` Luis Chamberlain
2024-03-26 13:38 ` [PATCH v6 02/10] block: Call blkdev_dio_unaligned() from blkdev_direct_IO() John Garry
2024-04-10 22:53   ` Luis Chamberlain
2024-04-11  8:06     ` John Garry
2024-03-26 13:38 ` [PATCH v6 03/10] fs: Initial atomic write support John Garry
2024-03-26 13:38 ` [PATCH v6 04/10] fs: Add initial atomic write support info to statx John Garry
2024-03-26 13:38 ` [PATCH v6 05/10] block: Add core atomic write support John Garry
2024-03-26 17:11   ` Randy Dunlap
2024-04-10 23:34     ` Luis Chamberlain
2024-04-11  8:15       ` John Garry
2024-03-26 13:38 ` [PATCH v6 06/10] block: Add atomic write support for statx John Garry
2024-03-26 13:38 ` [PATCH v6 07/10] block: Add fops atomic write support John Garry
2024-03-26 13:38 ` [PATCH v6 08/10] scsi: sd: Atomic " John Garry
2024-03-26 13:38 ` [PATCH v6 09/10] scsi: scsi_debug: " John Garry
2024-03-26 13:38 ` [PATCH v6 10/10] nvme: " John Garry
2024-04-11  0:29   ` Luis Chamberlain
2024-04-11  8:59     ` John Garry
2024-04-11 16:22       ` Luis Chamberlain
2024-04-11 23:32         ` Dan Helmick
2024-03-27  3:50 ` [PATCH v6 00/10] block atomic writes Matthew Wilcox
2024-03-27 13:37   ` John Garry
2024-04-04 16:48     ` Matthew Wilcox
2024-04-05 10:06       ` John Garry
2024-04-08 17:50         ` Luis Chamberlain
2024-04-10  4:05           ` Matthew Wilcox
2024-04-10  6:20             ` Hannes Reinecke
2024-04-11  0:38               ` Luis Chamberlain
2024-04-14 20:50             ` Luis Chamberlain
2024-04-15 21:18               ` Matthew Wilcox
2024-04-16 21:11                 ` Luis Chamberlain
2024-04-10  8:34           ` John Garry
2024-04-11 19:07             ` Luis Chamberlain
2024-04-12  8:15               ` John Garry
2024-04-12 18:28                 ` Luis Chamberlain [this message]
2024-03-27 20:31   ` Dave Chinner
2024-04-05 10:20     ` Kent Overstreet
2024-04-05 10:55       ` John Garry
2024-04-05  6:14   ` Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox