From: Christoph Hellwig <[email protected]>
To: Keith Busch <[email protected]>
Cc: Christoph Hellwig <[email protected]>, Keith Busch <[email protected]>,
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	Hannes Reinecke <[email protected]>
Subject: Re: [PATCHv10 9/9] scsi: set permanent stream count in block limits
Date: Wed, 30 Oct 2024 05:55:26 +0100
Message-ID: <[email protected]>
In-Reply-To: <ZyEL4FOBMr4H8DGM@kbusch-mbp>

On Tue, Oct 29, 2024 at 10:22:56AM -0600, Keith Busch wrote:
> On Tue, Oct 29, 2024 at 04:53:30PM +0100, Christoph Hellwig wrote:
> > On Tue, Oct 29, 2024 at 09:38:44AM -0600, Keith Busch wrote:
> > > They're not exposed as write streams. Patch 7/9 sets the feature if it
> > > is a placement id or not, and only nvme sets it, so scsi's attributes
> > > are not claiming to be a write stream.
> > 
> > So it shows up in sysfs, but:
> > 
> >  - queue_max_write_hints (which really should be queue_max_write_streams)
> >    still picks it up, and from there the statx interface
> > 
> >  - per-inode fcntl hints that encode a temperature still magically
> >    get dumped into the write streams if they are set.
> > 
> > In other words it's a really leaky half-baked abstraction.
> 
> Exactly why I asked last time: "who uses it and how do you want them to
> use it" :)

For the temperature hints the only public user I know of is rocksdb, and
that only started working when Hans fixed a brown paper bag bug in the
rocksdb code a while ago.  Given that f2fs interprets the hints I suspect
something in the Android world does as well, maybe Bart knows more.

For the separate write streams the usage I want for them is poor man's
zones - e.g. write N LBAs sequentially into a separate write stream
and then eventually discard them together.  This will fit nicely into
f2fs and the pending xfs work as well as quite a few userspace storage
systems.  For that the file system or application needs to query
the number of available write streams (and in the bitmap world their
numbers if they are discontiguous) and the size you can fit into the
"reclaim unit" in FDP terms.  I've not been bothering you much with
the latter as it is an easy retrofit once the I/O path bits land.
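
To make the "poor man's zones" usage concrete, here is a minimal userspace
model of it.  Everything in it is hypothetical: the StreamDevice and
WriteStream names, the reclaim_unit_lbas parameter and the methods are
made up for illustration and are not an existing kernel or library
interface.  It only shows the two things a caller would query (the
possibly discontiguous stream numbers and the reclaim unit size) plus the
write-sequentially-then-discard-together pattern.

```python
from dataclasses import dataclass, field


@dataclass
class WriteStream:
    """One write stream, modelled as an append-only region (hypothetical)."""
    stream_id: int
    capacity_lbas: int  # how many LBAs fit into one "reclaim unit"
    lbas: list = field(default_factory=list)

    def append(self, lba: int) -> bool:
        """Record a write; return False once the reclaim unit is full."""
        if len(self.lbas) >= self.capacity_lbas:
            return False
        self.lbas.append(lba)
        return True


class StreamDevice:
    """Toy device exposing the values a file system would need to query."""

    def __init__(self, stream_ids, reclaim_unit_lbas):
        # stream_ids may be discontiguous, as in the bitmap case.
        self.reclaim_unit_lbas = reclaim_unit_lbas
        self.streams = {
            sid: WriteStream(sid, reclaim_unit_lbas) for sid in stream_ids
        }

    def available_streams(self):
        """Return the usable stream numbers in ascending order."""
        return sorted(self.streams)

    def write(self, stream_id, lba):
        """Append one LBA to the given stream; False means the unit is full."""
        return self.streams[stream_id].append(lba)

    def discard_stream(self, stream_id):
        """Discard everything written into the stream in one go."""
        stream = self.streams[stream_id]
        freed = list(stream.lbas)
        stream.lbas.clear()
        return freed
```

A caller would fill one stream up to reclaim_unit_lbas, move on to the
next stream once write() returns False, and later call discard_stream()
to drop the whole unit at once instead of invalidating LBAs piecemeal.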

> > Let's brainstorm how it could be done better:
> > 
> >  - the max_write_streams value is only set by block devices that actually
> >    do support write streams, and not the fire and forget temperature
> >    hints.  The way this is queried is by having a non-zero value
> >    there, no need for an extra flag.
> 
> So we need a completely different attribute for SCSI's permanent write
> streams? You'd mentioned earlier you were okay with having SCSI be able
> to utilize per-io raw block write hints. Having multiple things to
> check for what are all just write classifiers seems unnecessarily
> complicated.

I don't think the multiple write streams interface applies to SCSI's
write streams, as they enforce a relative temperature, and they don't
have the concept of how much you can write into a "reclaim unit".

OTOH there isn't much you need to query for them anyway, as the
temperature hints have always been defined as pure hints with all
the upsides and downsides of that.

> No need to create a new fcntl. The people already testing this are
> successfully using FDP with the existing fcntl hints. Their applications
> leverage FDP as a way to separate files based on expected lifetime. It is
> how they want to use it and it is working above expectations. 

FYI, I think it's always fine and easy to map the temperature hints to
write streams if that's all the driver offers.  It loses a lot of the
capabilities, but as long as it doesn't enforce a lower-level interface
that never exposes more, that's fine.
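
As a rough sketch of such a mapping: the RWH_WRITE_LIFE_* values below
are the existing lifetime hint constants from <linux/fcntl.h>, but the
hint_to_stream() function and its clamping policy are assumptions made
up for illustration, not what the series or any driver implements.

```python
# Values of the existing per-inode lifetime hints from <linux/fcntl.h>.
RWH_WRITE_LIFE_NOT_SET = 0
RWH_WRITE_LIFE_NONE = 1
RWH_WRITE_LIFE_SHORT = 2
RWH_WRITE_LIFE_MEDIUM = 3
RWH_WRITE_LIFE_LONG = 4
RWH_WRITE_LIFE_EXTREME = 5


def hint_to_stream(hint: int, max_write_streams: int) -> int:
    """Map a temperature hint to a 1-based stream number, 0 for "no stream".

    Hypothetical policy: hints that carry no lifetime get no stream, and
    hints beyond the available stream count share the last stream.
    """
    if max_write_streams <= 0:
        return 0  # device offers no write streams at all
    if hint in (RWH_WRITE_LIFE_NOT_SET, RWH_WRITE_LIFE_NONE):
        return 0
    if not RWH_WRITE_LIFE_SHORT <= hint <= RWH_WRITE_LIFE_EXTREME:
        raise ValueError(f"unknown write hint {hint}")
    # SHORT -> 1, MEDIUM -> 2, LONG -> 3, EXTREME -> 4, clamped to the limit.
    return min(hint - RWH_WRITE_LIFE_SHORT + 1, max_write_streams)
```

The mapping only goes one way: the caller never learns the stream count
or the reclaim unit size, which is the loss of capabilities mentioned
above.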

