From: John Garry <[email protected]>
To: Luis Chamberlain <[email protected]>,
Dan Helmick <[email protected]>
Cc: [email protected], [email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
Alan Adamson <[email protected]>
Subject: Re: [PATCH v6 10/10] nvme: Atomic write support
Date: Thu, 11 Apr 2024 09:59:57 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 11/04/2024 01:29, Luis Chamberlain wrote:
> On Tue, Mar 26, 2024 at 01:38:13PM +0000, John Garry wrote:
>> From: Alan Adamson <[email protected]>
>>
>> Add support to set block layer request_queue atomic write limits. The
>> limits will be derived from either the namespace or controller atomic
>> parameters.
>>
>> NVMe atomic-related parameters are grouped into "normal" and "power-fail"
>> (or PF) classes of parameter. For atomic write support, only PF parameters
>> are of interest. The "normal" parameters are concerned with racing reads
>> and writes (which also applies to PF). See NVM Command Set Specification
>> Revision 1.0d section 2.1.4 for reference.
>>
>> Whether to use per namespace or controller atomic parameters is decided by
>> NSFEAT bit 1 - see Figure 97: Identify – Identify Namespace Data
>> Structure, NVM Command Set.
>>
>> NVMe namespaces may define an atomic boundary, whereby no atomic guarantees
>> are provided for a write which straddles this per-lba space boundary. The
>> block layer merging policy is such that no merges may occur in which the
>> resultant request would straddle such a boundary.
>>
>> Unlike SCSI, NVMe specifies no granularity or alignment rules, apart from
>> the atomic boundary rule.
>
> A larger IU drives a larger alignment *preference*, and it can be a multiple
> of the LBA format; it's called Namespace Preferred Write Granularity (NPWG)
> and the NVMe driver already parses it. So say you have a 4k LBA format
> but a 16k NPWG. I suspect this means we'd want atomic writes to align to 16k
> but I can let Dan confirm.

If atomic writes need to be aligned to NPWG, then the min atomic write unit
would also need to be NPWG. The spec does not define any relation between
NPWG and atomic writes, AFAICS.

We simply use the LBA data size as the min atomic unit in this patch.

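To make the unit selection concrete, here is a minimal user-space sketch, not the patch's actual code. The struct and function names are hypothetical; the field semantics (0's based values in units of LBAs, NSFEAT bit 1 selecting namespace over controller parameters, falling back to AWUPF when NAWUPF is 0) follow my reading of the existing driver logic and should be checked against the spec.

```c
#include <stdint.h>

/* Hypothetical container for the Identify fields discussed above. */
struct ns_atomic_params {
	uint32_t lba_size;	/* LBA data size, in bytes */
	uint8_t  nsfeat;	/* Identify Namespace NSFEAT */
	uint16_t nawupf;	/* per-namespace PF atomic size, 0's based, in LBAs */
	uint16_t awupf;		/* controller PF atomic size, 0's based, in LBAs */
	uint16_t npwg;		/* preferred write granularity, 0's based, in LBAs */
};

#define NSFEAT_ATOMICS	(1u << 1)	/* NSFEAT bit 1 */

/* Max atomic write size: namespace value if bit 1 is set, else controller. */
static inline uint32_t atomic_unit_max_bytes(const struct ns_atomic_params *p)
{
	if ((p->nsfeat & NSFEAT_ATOMICS) && p->nawupf)
		return (1u + p->nawupf) * p->lba_size;
	return (1u + p->awupf) * p->lba_size;
}

/* What this patch does: min atomic unit is one LBA. */
static inline uint32_t atomic_unit_min_bytes(const struct ns_atomic_params *p)
{
	return p->lba_size;
}

/* Hypothetical alternative: min atomic unit if NPWG alignment were required. */
static inline uint32_t atomic_unit_min_bytes_npwg(const struct ns_atomic_params *p)
{
	return (1u + p->npwg) * p->lba_size;
}
```

For a 4k LBA format with a 16k NPWG and a 16k NAWUPF, the first min helper gives 4k while the NPWG-aligned variant gives 16k, which would leave a single permitted atomic write size.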
>
>> Note on NABSPF:
>> There seems to be some vagueness in the spec as to whether NABSPF applies
>> for NSFEAT bit 1 being unset. Figure 97 does not explicitly mention NABSPF
>> and how it is affected by bit 1. However Figure 4 does say to check Figure
>> 97 for info about per-namespace parameters, which NABSPF is, so it is
>> implied. However currently nvme_update_disk_info() does check namespace
>> parameter NABO regardless of this bit.
>
> Yeah, that is quirky.
>
> Also today we set the physical block size to min(npwg, atomic) and that
> means for today's average 4k IU drive, if it gets 16k atomic, the
> physical block size would still be 4k. As the physical block size in
> practice can also lift the sector size filesystems use, it would seem
> odd that only a larger npwg could lift it.

It seems to me that if you want to provide atomic guarantees for this
large "physical block size", then it needs to be based on (N)AWUPF and NPWG.

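As an illustration of the min(npwg, atomic) behaviour under discussion, here is a small user-space sketch. The function names are hypothetical, the inputs are assumed to be 0's based counts of LBAs, and the sketch ignores the NSFEAT check that decides whether NPWG is reported at all.

```c
#include <stdint.h>

static inline uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Physical block size as min(NPWG, atomic), both converted to bytes.
 * npwg and awupf_eff are 0's based and in units of LBAs; awupf_eff stands
 * for whichever of NAWUPF/AWUPF applies to the namespace.
 */
static inline uint32_t phys_block_size(uint32_t lba_size, uint16_t npwg,
				       uint16_t awupf_eff)
{
	uint32_t npwg_bytes = (1u + npwg) * lba_size;
	uint32_t atomic_bytes = (1u + awupf_eff) * lba_size;

	return min_u32(npwg_bytes, atomic_bytes);
}
```

With a 4k LBA size, phys_block_size(4096, 0, 3) models the 4k IU drive with 16k atomic and yields 4096, while phys_block_size(4096, 3, 3) yields 16384. Only when both NPWG and the atomic size grow does the result grow.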
> So we may want to revisit this
> eventually, especially if we have an API to do atomics properly across the
> block layer.
>
Thanks,
John