From: Pierre Labat <[email protected]>
To: Keith Busch <[email protected]>, Christoph Hellwig <[email protected]>
Cc: Kanchan Joshi <[email protected]>,
	Keith Busch <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>
Subject: RE: [EXT] Re: [PATCHv11 0/9] write hints with nvme fdp and scsi streams
Date: Tue, 12 Nov 2024 18:18:21 +0000
Message-ID: <DS0PR08MB854131CDA4CDDF2451CEB71DAB592@DS0PR08MB8541.namprd08.prod.outlook.com>
In-Reply-To: <[email protected]>

My 2 cents.

Overall, it seems to me that the difficulty here comes from 2 things:
1)  The write hints may have different semantics (temperature, FDP placement, and whatever will come next).
2) Different software layers may want to use the hints, and if several do that at the same time on the same storage that may result in a mess.

About 1)
It seems to me that having a different interface for each semantic is overkill: extra code to maintain, and extra work when a new semantic comes along.
To keep things simple, keep one set of interfaces (per-IO interface, per-file interface) for all write hint semantics, and carry the difference in semantics in the hint itself.
For example, with 32-bit hints, store the semantic in 8 bits and use the remaining 24 bits in the context of that semantic.
The storage transport driver (the nvme driver, for example) uses the 8-bit semantic in the write hint to translate the hint appropriately for the storage device.
The storage driver can support several translations, one for each supported semantic. Linux doesn't need to yank out a translation to replace it with a new one.
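
A minimal sketch of the encoding described above, assuming the 8-bit semantic sits in the top byte of a 32-bit hint. The semantic IDs, the helper names (hint_pack, drv_translate_hint) and the two placeholder mappings are hypothetical and exist only for illustration; they are not part of the patch series or of any kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical semantic IDs; names and values are illustrative only. */
enum hint_semantic {
	HINT_SEM_NONE        = 0,
	HINT_SEM_TEMPERATURE = 1,	/* data lifetime ("temperature") hints */
	HINT_SEM_FDP         = 2,	/* FDP placement hints */
};

#define HINT_SEM_SHIFT		24
#define HINT_PAYLOAD_MASK	0x00ffffffu

/* Pack an 8-bit semantic and a 24-bit, semantic-specific payload. */
static inline uint32_t hint_pack(enum hint_semantic sem, uint32_t payload)
{
	assert(payload <= HINT_PAYLOAD_MASK);
	return ((uint32_t)sem << HINT_SEM_SHIFT) | payload;
}

/* Extract the semantic stored in the top 8 bits. */
static inline enum hint_semantic hint_semantic_of(uint32_t hint)
{
	return (enum hint_semantic)(hint >> HINT_SEM_SHIFT);
}

/* Extract the low 24 bits, to be interpreted per semantic. */
static inline uint32_t hint_payload_of(uint32_t hint)
{
	return hint & HINT_PAYLOAD_MASK;
}

/*
 * Driver-side translation sketch: one case per supported semantic.
 * Returns a device-specific id, or -1 when the hint cannot be
 * translated (the caller would then send the write without a hint).
 * Both mappings below are placeholders, not real device behavior.
 */
static int32_t drv_translate_hint(uint32_t hint, uint32_t nr_dev_ids)
{
	uint32_t payload = hint_payload_of(hint);

	switch (hint_semantic_of(hint)) {
	case HINT_SEM_TEMPERATURE:
		/* fold the temperature class onto the available ids */
		return nr_dev_ids ? (int32_t)(payload % nr_dev_ids) : -1;
	case HINT_SEM_FDP:
		/* pass the placement id through if the device has it */
		return payload < nr_dev_ids ? (int32_t)payload : -1;
	default:
		return -1;	/* unknown semantic */
	}
}
```

With this layout, supporting a new semantic would mean adding one enum value and one case to the driver's switch, while the interfaces carrying the 32-bit hint stay unchanged.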

About 2)
Provide a simple way for the user to decide which layer generates write hints.
As an example, as some of you pointed out, what if the filesystem wants to generate write hints to optimize its [own] data handling by the storage, and at the same time the application using the FS understands the storage and also wants to optimize using write hints?
Both use cases are legit, I think.
To handle that in a simple way, why not have a filesystem mount parameter enabling/disabling the use of write hints by the FS?
In the case of an application not needing/wanting to use write hints on its own, the user would mount the filesystem with generation of write hints enabled. That could be the default.
Conversely, if the user decides it is best for an application to generate write hints directly to get the best performance, then mount the filesystem with generation of write hints by the FS disabled. The FS then acts as a passthrough for write hints.
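
A rough sketch of that per-mount choice, under stated assumptions: the struct, its fs_write_hints field and fs_select_write_hint() are hypothetical names, not an existing mount option or kernel function. It also assumes that when the FS generates hints, an application-supplied hint is simply not forwarded, which is a detail the proposal above leaves open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-mount setting; not an existing mount option. */
struct fs_mount_opts {
	bool fs_write_hints;	/* true: the FS generates write hints */
};

/*
 * Pick the hint sent down to the block layer for one write.
 *   app_hint: hint supplied by the application (0 means none)
 *   fs_hint:  hint the filesystem would choose on its own
 */
static uint32_t fs_select_write_hint(const struct fs_mount_opts *opts,
				     uint32_t app_hint, uint32_t fs_hint)
{
	assert(opts);

	if (opts->fs_write_hints)
		return fs_hint;	/* FS in control (assumed: app hint dropped) */

	return app_hint;	/* FS acts as a passthrough */
}
```

Only one layer owns the hint for a given mount, so the two sources never mix on the same storage.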

Regards,

Pierre
> -----Original Message-----
> From: Keith Busch <[email protected]>
> Sent: Tuesday, November 12, 2024 6:26 AM
> To: Christoph Hellwig <[email protected]>
> Cc: Kanchan Joshi <[email protected]>; Keith Busch
> <[email protected]>; [email protected]; linux-
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]
> Subject: [EXT] Re: [PATCHv11 0/9] write hints with nvme fdp and scsi streams
> 
> On Tue, Nov 12, 2024 at 02:34:39PM +0100, Christoph Hellwig wrote:
> > On Tue, Nov 12, 2024 at 06:56:25PM +0530, Kanchan Joshi wrote:
> > > IMO, passthrough propagation of hints/streams should continue to
> > > remain the default behavior as it applies on multiple filesystems.
> > > And more active placement by FS should rather be enabled by some opt
> > > in (e.g., mount option). Such opt in will anyway be needed for other
> > > reasons (like regression avoidance on a broken device).
> >
> > I feel like banging my head against the wall.  No, passing through
> > write streams is simply not acceptable without the file system being
> > in control.  I've said and explained this in detail about a dozen
> > times and the file system actually needing to do data separation for
> > its own purpose doesn't go away by ignoring it.
> 
> But that's just an ideological decision that doesn't jibe with how people use
> these. The applications know how they use their data better than the
> filesystem, so putting the filesystem in the way to force streams to look like zones
> is just an unnecessary layer of indirection getting in the way.

