public inbox for [email protected]
From: Keith Busch <[email protected]>
To: "Darrick J. Wong" <[email protected]>
Cc: Christoph Hellwig <[email protected]>,
	Dave Chinner <[email protected]>,
	Pierre Labat <[email protected]>,
	Kanchan Joshi <[email protected]>,
	Keith Busch <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>
Subject: Re: [EXT] Re: [PATCHv11 0/9] write hints with nvme fdp and scsi streams
Date: Wed, 20 Nov 2024 11:11:12 -0700
Message-ID: <Zz4mQGrlKMiPa8NH@kbusch-mbp>
In-Reply-To: <20241120172158.GP9425@frogsfrogsfrogs>

On Wed, Nov 20, 2024 at 09:21:58AM -0800, Darrick J. Wong wrote:
> 
> How do filesystem users pick a write stream?  I get a pretty strong
> sense that you're aiming to provide the ability for application software
> to group together a bunch of (potentially arbitrary) files in a cohort.
> Then (maybe?) you can say "This cohort of files are all expected to have
> data blocks related to each other in some fashion, so put them together
> so that the storage doesn't have to work so hard".
> 
> Part of my comprehension problem here (and probably why few fs people
> commented on this thread) is that I have no idea what FDP is, or what
> the write lifetime hints in scsi were/are, or what the current "hinting"
> scheme is.

FDP is just the "new" version of NVMe's streams. Support for its
predecessor was added in commit f5d118406247acf ("nvme: add support for
streams"):

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f5d118406247acfc4fc481e441e01ea4d6318fdc

Various applications were written to that interface and showed initial
promise, but production quality hardware never materialized. Some of
these applications are still setting the write hints today, and the
filesystems are all passing them through the block stack, but there's
currently just no nvme driver listening on the other side.

In contrast to the older nvme streams, capable hardware subscribing to
this newer FDP scheme has been developed, and so people want to use
those same applications with those same hints in the exact same way the
interface was originally designed. Enabling them could just be a simple
driver patch like the above without bothering the filesystem people :)
 
> Is this what we're arguing about?
> 
> enum rw_hint {
> 	WRITE_LIFE_NOT_SET	= RWH_WRITE_LIFE_NOT_SET,
> 	WRITE_LIFE_NONE		= RWH_WRITE_LIFE_NONE,
> 	WRITE_LIFE_SHORT	= RWH_WRITE_LIFE_SHORT,
> 	WRITE_LIFE_MEDIUM	= RWH_WRITE_LIFE_MEDIUM,
> 	WRITE_LIFE_LONG		= RWH_WRITE_LIFE_LONG,
> 	WRITE_LIFE_EXTREME	= RWH_WRITE_LIFE_EXTREME,
> } __packed;
> 
> (What happens if you have two disjoint sets of files, both of which are
> MEDIUM, but they shouldn't be intertwined?)

It's not going to perform as well. You'd be advised against
oversubscribing the hint value among applications with different
relative expectations, but it generally (but not always) should be no
worse than if you hadn't given any hints at all.
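
To make "oversubscribing" concrete, here is a purely illustrative
sketch. It assumes an application that spreads its file cohorts across
a fixed number of hint values; cohort_to_hint(), cohorts_share_hint()
and the convention that 0 means "no hint" are all hypothetical and not
taken from the patch series.

```c
#include <stdint.h>

/*
 * Hypothetical policy: map a cohort index onto one of nr_hints values
 * (1..nr_hints). With more cohorts than hint values, several cohorts
 * land on the same value, which is the oversubscribed case.
 */
uint16_t cohort_to_hint(unsigned int cohort, uint16_t nr_hints)
{
	if (nr_hints == 0)
		return 0;	/* in this sketch, 0 means "no hint" */
	return (uint16_t)(1 + cohort % nr_hints);
}

/* Nonzero if two cohorts end up sharing a hint value. */
int cohorts_share_hint(unsigned int a, unsigned int b, uint16_t nr_hints)
{
	return cohort_to_hint(a, nr_hints) == cohort_to_hint(b, nr_hints);
}
```

With four hint values, cohorts 0 and 4 share a value and so may end up
intertwined, while cohorts 0 and 1 stay apart.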
 
> Or are these new fdp hint things an overload of the existing write hint
> fields in the iocb/inode/bio?  With a totally different meaning from
> anticipated lifetime of the data blocks?

The meaning assigned to an FDP stream is whatever the user wants it to
mean. It's not strictly a lifetime hint, but that is certainly a valid
way to use them. The contract on the device's side is that writes to
one stream won't create media interference or contention with writes to
other streams. This is the same as nvme's original streams, which for
some reason did not carry any of this controversy.


Thread overview: 28+ messages
2024-11-08 19:36 [PATCHv11 0/9] write hints with nvme fdp and scsi streams Keith Busch
2024-11-08 19:36 ` [PATCHv11 1/9] block: use generic u16 for write hints Keith Busch
2024-11-08 19:36 ` [PATCHv11 2/9] block: introduce max_write_hints queue limit Keith Busch
2024-11-08 19:36 ` [PATCHv11 3/9] statx: add write hint information Keith Busch
2024-11-08 19:36 ` [PATCHv11 4/9] block: allow ability to limit partition write hints Keith Busch
2024-11-08 19:36 ` [PATCHv11 5/9] block, fs: add write hint to kiocb Keith Busch
2024-11-08 19:36 ` [PATCHv11 6/9] io_uring: enable per-io hinting capability Keith Busch
2024-11-08 19:36 ` [PATCHv11 7/9] block: export placement hint feature Keith Busch
2024-11-11 10:29 ` [PATCHv11 0/9] write hints with nvme fdp and scsi streams Christoph Hellwig
2024-11-11 16:27   ` Keith Busch
2024-11-11 16:34     ` Christoph Hellwig
2024-11-12 13:26   ` Kanchan Joshi
2024-11-12 13:34     ` Christoph Hellwig
2024-11-12 14:25       ` Keith Busch
2024-11-12 16:50         ` Christoph Hellwig
2024-11-12 17:19           ` Christoph Hellwig
2024-11-12 18:18         ` [EXT] " Pierre Labat
2024-11-13  4:47           ` Christoph Hellwig
2024-11-13 23:51             ` Dave Chinner
2024-11-14  3:09               ` Martin K. Petersen
2024-11-14  6:07               ` Christoph Hellwig
2024-11-15 16:28                 ` Keith Busch
2024-11-15 16:53                   ` Christoph Hellwig
2024-11-18 23:37                     ` Keith Busch
2024-11-19  7:15                       ` Christoph Hellwig
2024-11-20 17:21                         ` Darrick J. Wong
2024-11-20 18:11                           ` Keith Busch [this message]
2024-11-21  7:17                             ` Christoph Hellwig
