* [PATCHv12 00/12] block write streams with nvme fdp
@ 2024-12-06 22:17 Keith Busch
  2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
                   ` (12 more replies)
  0 siblings, 13 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Keith Busch <[email protected]>

changes from v11:

 - Place the write hint in an unused io_uring SQE field
   - Obviates the need to modify the external "attributes" stuff that
     supports PI
   - Make it a u8 to match the type the block layer supports
   - And it's just easier to use for the user

 - Fix the sparse warnings from FDP definitions
   - Just use the patches that Christoph posted a few weeks ago, since
     they already define them in a way that makes sparse happy; I just
     made some minor changes to field names to match what the spec
     calls them

 - Actually include the first patch in this series

Christoph Hellwig (7):
  fs: add a write stream field to the kiocb
  block: add a bi_write_stream field
  block: introduce a write_stream_granularity queue limit
  block: expose write streams for block device nodes
  nvme: add a nvme_get_log_lsi helper
  nvme: pass a void pointer to nvme_get/set_features for the result
  nvme.h: add FDP definitions

Keith Busch (5):
  fs: add write stream information to statx
  block: introduce max_write_streams queue limit
  io_uring: enable per-io write streams
  nvme: register fdp parameters with the block layer
  nvme: use fdp streams if write stream is provided

 Documentation/ABI/stable/sysfs-block |  15 +++
 block/bdev.c                         |   6 +
 block/bio.c                          |   2 +
 block/blk-crypto-fallback.c          |   1 +
 block/blk-merge.c                    |   4 +
 block/blk-sysfs.c                    |   6 +
 block/bounce.c                       |   1 +
 block/fops.c                         |  23 ++++
 drivers/nvme/host/core.c             | 160 ++++++++++++++++++++++++++-
 drivers/nvme/host/nvme.h             |   9 +-
 fs/stat.c                            |   2 +
 include/linux/blk_types.h            |   1 +
 include/linux/blkdev.h               |  16 +++
 include/linux/fs.h                   |   1 +
 include/linux/nvme.h                 |  77 +++++++++++++
 include/linux/stat.h                 |   2 +
 include/uapi/linux/io_uring.h        |   4 +
 include/uapi/linux/stat.h            |   7 +-
 io_uring/io_uring.c                  |   2 +
 io_uring/rw.c                        |   1 +
 20 files changed, 332 insertions(+), 8 deletions(-)

-- 
2.43.5



* [PATCHv12 01/12] fs: add write stream information to statx
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:25   ` Hannes Reinecke
       [not found]   ` <CGME20241209115219epcas5p4cfc217e25d977cd87025a4284ba0436c@epcas5p4.samsung.com>
  2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Keith Busch <[email protected]>

Add new statx fields to report the maximum number of write streams
supported and their granularity.

Signed-off-by: Keith Busch <[email protected]>
[hch: rename hint to stream, add granularity]
Signed-off-by: Christoph Hellwig <[email protected]>
---
 fs/stat.c                 | 2 ++
 include/linux/stat.h      | 2 ++
 include/uapi/linux/stat.h | 7 +++++--
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/stat.c b/fs/stat.c
index 0870e969a8a0b..00e4598b1ff25 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -729,6 +729,8 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
 	tmp.stx_atomic_write_unit_min = stat->atomic_write_unit_min;
 	tmp.stx_atomic_write_unit_max = stat->atomic_write_unit_max;
 	tmp.stx_atomic_write_segments_max = stat->atomic_write_segments_max;
+	tmp.stx_write_stream_granularity = stat->write_stream_granularity;
+	tmp.stx_write_stream_max = stat->write_stream_max;
 
 	return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
 }
diff --git a/include/linux/stat.h b/include/linux/stat.h
index 3d900c86981c5..36d4dfb291abd 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -57,6 +57,8 @@ struct kstat {
 	u32		atomic_write_unit_min;
 	u32		atomic_write_unit_max;
 	u32		atomic_write_segments_max;
+	u32		write_stream_granularity;
+	u16		write_stream_max;
 };
 
 /* These definitions are internal to the kernel for now. Mainly used by nfsd. */
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index 887a252864416..547c62a1a3a7c 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -132,9 +132,11 @@ struct statx {
 	__u32	stx_atomic_write_unit_max;	/* Max atomic write unit in bytes */
 	/* 0xb0 */
 	__u32   stx_atomic_write_segments_max;	/* Max atomic write segment count */
-	__u32   __spare1[1];
+	__u32   stx_write_stream_granularity;
 	/* 0xb8 */
-	__u64	__spare3[9];	/* Spare space for future expansion */
+	__u16   stx_write_stream_max;
+	__u16	__spare2[3];
+	__u64	__spare3[8];	/* Spare space for future expansion */
 	/* 0x100 */
 };
 
@@ -164,6 +166,7 @@ struct statx {
 #define STATX_MNT_ID_UNIQUE	0x00004000U	/* Want/got extended stx_mount_id */
 #define STATX_SUBVOL		0x00008000U	/* Want/got stx_subvol */
 #define STATX_WRITE_ATOMIC	0x00010000U	/* Want/got atomic_write_* fields */
+#define STATX_WRITE_STREAM	0x00020000U	/* Want/got write_stream_* */
 
 #define STATX__RESERVED		0x80000000U	/* Reserved for future struct statx expansion */
 
-- 
2.43.5
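
The statx layout change in the patch above can be sanity-checked from
userspace. What follows is a minimal sketch and not part of the patch:
struct statx_tail_sketch is a hypothetical mirror of only the end of
struct statx, with everything before offset 0xb0 collapsed into an opaque
pad. It assumes the offsets given by the 0xb0, 0xb8 and 0x100 comments in
the diff, and it defines the new mask bit locally because installed
headers may not carry it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mirror of the tail of struct statx after this patch.
 * Only the fields from offset 0xb0 onward are spelled out.
 */
struct statx_tail_sketch {
	uint8_t  head[0xb0];                    /* fields before 0xb0, opaque here */
	uint32_t stx_atomic_write_segments_max; /* 0xb0 */
	uint32_t stx_write_stream_granularity;  /* 0xb4, takes over __spare1[1] */
	uint16_t stx_write_stream_max;          /* 0xb8 */
	uint16_t spare2[3];                     /* pads the new u16 out to 8 bytes */
	uint64_t spare3[8];                     /* one u64 fewer than before */
};

static_assert(offsetof(struct statx_tail_sketch,
		       stx_write_stream_granularity) == 0xb4,
	      "granularity reuses the old 32-bit spare slot");
static_assert(offsetof(struct statx_tail_sketch,
		       stx_write_stream_max) == 0xb8,
	      "stream count starts the old spare u64 array");
static_assert(sizeof(struct statx_tail_sketch) == 0x100,
	      "overall structure size must not change");

/* Mask bit value taken from the patch; the _SKETCH name is local. */
#define STATX_WRITE_STREAM_SKETCH 0x00020000U

/* Returns nonzero if stx_mask says the write stream fields are valid. */
static int statx_has_write_stream(uint32_t stx_mask)
{
	return (stx_mask & STATX_WRITE_STREAM_SKETCH) != 0;
}
```

The static assertions express the ABI constraint the patch relies on: the
two new fields consume 4 + 2 bytes of former spare space, 6 bytes remain
as padding, and the structure still ends at 0x100.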



* [PATCHv12 02/12] fs: add a write stream field to the kiocb
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
  2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:25   ` Hannes Reinecke
                     ` (2 more replies)
  2024-12-06 22:17 ` [PATCHv12 03/12] block: add a bi_write_stream field Keith Busch
                   ` (10 subsequent siblings)
  12 siblings, 3 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Christoph Hellwig <[email protected]>

Prepare for io_uring passthrough of write streams. The write stream
field fits into an existing 2-byte hole in the kiocb structure, so the
size of the structure does not change.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
 include/linux/fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2cc3d45da7b01..26940c451f319 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -373,6 +373,7 @@ struct kiocb {
 	void			*private;
 	int			ki_flags;
 	u16			ki_ioprio; /* See linux/ioprio.h */
+	u8			ki_write_stream;
 	union {
 		/*
 		 * Only used for async buffered reads, where it denotes the
-- 
2.43.5



* [PATCHv12 03/12] block: add a bi_write_stream field
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
  2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
  2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:26   ` Hannes Reinecke
       [not found]   ` <CGME20241210074213epcas5p22330d197c3e7058e9c2226f28fdb1475@epcas5p2.samsung.com>
  2024-12-06 22:17 ` [PATCHv12 04/12] block: introduce max_write_streams queue limit Keith Busch
                   ` (9 subsequent siblings)
  12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Christoph Hellwig <[email protected]>

Add the ability to pass a write stream for placement control in the bio.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
 block/bio.c                 | 2 ++
 block/blk-crypto-fallback.c | 1 +
 block/blk-merge.c           | 4 ++++
 block/bounce.c              | 1 +
 include/linux/blk_types.h   | 1 +
 5 files changed, 9 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index 699a78c85c756..2aa86edc7cd6f 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -251,6 +251,7 @@ void bio_init(struct bio *bio, struct block_device *bdev, struct bio_vec *table,
 	bio->bi_flags = 0;
 	bio->bi_ioprio = 0;
 	bio->bi_write_hint = 0;
+	bio->bi_write_stream = 0;
 	bio->bi_status = 0;
 	bio->bi_iter.bi_sector = 0;
 	bio->bi_iter.bi_size = 0;
@@ -827,6 +828,7 @@ static int __bio_clone(struct bio *bio, struct bio *bio_src, gfp_t gfp)
 	bio_set_flag(bio, BIO_CLONED);
 	bio->bi_ioprio = bio_src->bi_ioprio;
 	bio->bi_write_hint = bio_src->bi_write_hint;
+	bio->bi_write_stream = bio_src->bi_write_stream;
 	bio->bi_iter = bio_src->bi_iter;
 
 	if (bio->bi_bdev) {
diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
index 29a205482617c..66762243a886b 100644
--- a/block/blk-crypto-fallback.c
+++ b/block/blk-crypto-fallback.c
@@ -173,6 +173,7 @@ static struct bio *blk_crypto_fallback_clone_bio(struct bio *bio_src)
 		bio_set_flag(bio, BIO_REMAPPED);
 	bio->bi_ioprio		= bio_src->bi_ioprio;
 	bio->bi_write_hint	= bio_src->bi_write_hint;
+	bio->bi_write_stream	= bio_src->bi_write_stream;
 	bio->bi_iter.bi_sector	= bio_src->bi_iter.bi_sector;
 	bio->bi_iter.bi_size	= bio_src->bi_iter.bi_size;
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index e01383c6e534b..1e5327fb6c45b 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -866,6 +866,8 @@ static struct request *attempt_merge(struct request_queue *q,
 
 	if (req->bio->bi_write_hint != next->bio->bi_write_hint)
 		return NULL;
+	if (req->bio->bi_write_stream != next->bio->bi_write_stream)
+		return NULL;
 	if (req->bio->bi_ioprio != next->bio->bi_ioprio)
 		return NULL;
 	if (!blk_atomic_write_mergeable_rqs(req, next))
@@ -987,6 +989,8 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
 		return false;
 	if (rq->bio->bi_write_hint != bio->bi_write_hint)
 		return false;
+	if (rq->bio->bi_write_stream != bio->bi_write_stream)
+		return false;
 	if (rq->bio->bi_ioprio != bio->bi_ioprio)
 		return false;
 	if (blk_atomic_write_mergeable_rq_bio(rq, bio) == false)
diff --git a/block/bounce.c b/block/bounce.c
index 0d898cd5ec497..fb8f60f114d7d 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -170,6 +170,7 @@ static struct bio *bounce_clone_bio(struct bio *bio_src)
 		bio_set_flag(bio, BIO_REMAPPED);
 	bio->bi_ioprio		= bio_src->bi_ioprio;
 	bio->bi_write_hint	= bio_src->bi_write_hint;
+	bio->bi_write_stream	= bio_src->bi_write_stream;
 	bio->bi_iter.bi_sector	= bio_src->bi_iter.bi_sector;
 	bio->bi_iter.bi_size	= bio_src->bi_iter.bi_size;
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index dce7615c35e7e..4ca3449ce9c95 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -220,6 +220,7 @@ struct bio {
 	unsigned short		bi_flags;	/* BIO_* below */
 	unsigned short		bi_ioprio;
 	enum rw_hint		bi_write_hint;
+	u8			bi_write_stream;
 	blk_status_t		bi_status;
 	atomic_t		__bi_remaining;
 
-- 
2.43.5
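
The blk-merge.c hunks in the patch above mean that two I/Os are only merge
candidates when their write streams match, in addition to the existing
write hint and priority checks. A minimal userspace model of that rule
might look like the following. struct bio_sketch and placement_mergeable()
are hypothetical names for illustration and cover only the three fields
compared in the diff; the real functions check considerably more.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the placement-related fields of struct bio. */
struct bio_sketch {
	uint16_t ioprio;
	uint8_t  write_hint;
	uint8_t  write_stream;
};

/*
 * Mirrors the order of the comparisons in blk_rq_merge_ok() after the
 * patch: any mismatch in hint, stream or priority blocks the merge.
 */
static bool placement_mergeable(const struct bio_sketch *a,
				const struct bio_sketch *b)
{
	if (a->write_hint != b->write_hint)
		return false;
	if (a->write_stream != b->write_stream)
		return false;
	if (a->ioprio != b->ioprio)
		return false;
	return true;
}
```

Refusing the merge keeps a single request from carrying data destined for
two different streams, which the device could not place separately.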



* [PATCHv12 04/12] block: introduce max_write_streams queue limit
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (2 preceding siblings ...)
  2024-12-06 22:17 ` [PATCHv12 03/12] block: add a bi_write_stream field Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:27   ` Hannes Reinecke
       [not found]   ` <CGME20241210074628epcas5p3e36c7615cf2a5160d7fe169774fd30db@epcas5p3.samsung.com>
  2024-12-06 22:17 ` [PATCHv12 05/12] block: introduce a write_stream_granularity " Keith Busch
                   ` (8 subsequent siblings)
  12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Keith Busch <[email protected]>

Drivers for hardware that supports write streams need a way to export
how many are available so applications can query this generically.

Signed-off-by: Keith Busch <[email protected]>
[hch: renamed hints to streams, removed stacking]
Signed-off-by: Christoph Hellwig <[email protected]>
---
 Documentation/ABI/stable/sysfs-block | 7 +++++++
 block/blk-sysfs.c                    | 3 +++
 include/linux/blkdev.h               | 9 +++++++++
 3 files changed, 19 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index 0cceb2badc836..f67139b8b8eff 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -506,6 +506,13 @@ Description:
 		[RO] Maximum size in bytes of a single element in a DMA
 		scatter/gather list.
 
+What:		/sys/block/<disk>/queue/max_write_streams
+Date:		November 2024
+Contact:	[email protected]
+Description:
+		[RO] Maximum number of write streams supported, 0 if not
+		supported. If supported, valid values are 1 through
+		max_write_streams, inclusive.
 
 What:		/sys/block/<disk>/queue/max_segments
 Date:		March 2010
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 4241aea84161c..c514c0cb5e93c 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -104,6 +104,7 @@ QUEUE_SYSFS_LIMIT_SHOW(max_segments)
 QUEUE_SYSFS_LIMIT_SHOW(max_discard_segments)
 QUEUE_SYSFS_LIMIT_SHOW(max_integrity_segments)
 QUEUE_SYSFS_LIMIT_SHOW(max_segment_size)
+QUEUE_SYSFS_LIMIT_SHOW(max_write_streams)
 QUEUE_SYSFS_LIMIT_SHOW(logical_block_size)
 QUEUE_SYSFS_LIMIT_SHOW(physical_block_size)
 QUEUE_SYSFS_LIMIT_SHOW(chunk_sectors)
@@ -446,6 +447,7 @@ QUEUE_RO_ENTRY(queue_max_hw_sectors, "max_hw_sectors_kb");
 QUEUE_RO_ENTRY(queue_max_segments, "max_segments");
 QUEUE_RO_ENTRY(queue_max_integrity_segments, "max_integrity_segments");
 QUEUE_RO_ENTRY(queue_max_segment_size, "max_segment_size");
+QUEUE_RO_ENTRY(queue_max_write_streams, "max_write_streams");
 QUEUE_RW_LOAD_MODULE_ENTRY(elv_iosched, "scheduler");
 
 QUEUE_RO_ENTRY(queue_logical_block_size, "logical_block_size");
@@ -580,6 +582,7 @@ static struct attribute *queue_attrs[] = {
 	&queue_max_discard_segments_entry.attr,
 	&queue_max_integrity_segments_entry.attr,
 	&queue_max_segment_size_entry.attr,
+	&queue_max_write_streams_entry.attr,
 	&queue_hw_sector_size_entry.attr,
 	&queue_logical_block_size_entry.attr,
 	&queue_physical_block_size_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 08a727b408164..ce2c3ddda2411 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -399,6 +399,8 @@ struct queue_limits {
 	unsigned short		max_integrity_segments;
 	unsigned short		max_discard_segments;
 
+	unsigned short		max_write_streams;
+
 	unsigned int		max_open_zones;
 	unsigned int		max_active_zones;
 
@@ -1240,6 +1242,13 @@ static inline unsigned int bdev_max_segments(struct block_device *bdev)
 	return queue_max_segments(bdev_get_queue(bdev));
 }
 
+static inline unsigned short bdev_max_write_streams(struct block_device *bdev)
+{
+	if (bdev_is_partition(bdev))
+		return 0;
+	return bdev_limits(bdev)->max_write_streams;
+}
+
 static inline unsigned queue_logical_block_size(const struct request_queue *q)
 {
 	return q->limits.logical_block_size;
-- 
2.43.5



* [PATCHv12 05/12] block: introduce a write_stream_granularity queue limit
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (3 preceding siblings ...)
  2024-12-06 22:17 ` [PATCHv12 04/12] block: introduce max_write_streams queue limit Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:29   ` Hannes Reinecke
       [not found]   ` <CGME20241210075259epcas5p23bbb79cdb18ddbfad337d764d4fe75da@epcas5p2.samsung.com>
  2024-12-06 22:17 ` [PATCHv12 06/12] block: expose write streams for block device nodes Keith Busch
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Christoph Hellwig <[email protected]>

Export the granularity at which write streams should be discarded, as
it is essential for making good use of them.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
 Documentation/ABI/stable/sysfs-block | 8 ++++++++
 block/blk-sysfs.c                    | 3 +++
 include/linux/blkdev.h               | 7 +++++++
 3 files changed, 18 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index f67139b8b8eff..c454c68b68fe6 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -514,6 +514,14 @@ Description:
 		supported. If supported, valid values are 1 through
 		max_write_streams, inclusive.
 
+What:		/sys/block/<disk>/queue/write_stream_granularity
+Date:		November 2024
+Contact:	[email protected]
+Description:
+		[RO] Granularity of a write stream in bytes.  The granularity
+		of a write stream is the size that should be discarded or
+		overwritten together to avoid write amplification in the device.
+
 What:		/sys/block/<disk>/queue/max_segments
 Date:		March 2010
 Contact:	[email protected]
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index c514c0cb5e93c..525f4fa132cd3 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -105,6 +105,7 @@ QUEUE_SYSFS_LIMIT_SHOW(max_discard_segments)
 QUEUE_SYSFS_LIMIT_SHOW(max_integrity_segments)
 QUEUE_SYSFS_LIMIT_SHOW(max_segment_size)
 QUEUE_SYSFS_LIMIT_SHOW(max_write_streams)
+QUEUE_SYSFS_LIMIT_SHOW(write_stream_granularity)
 QUEUE_SYSFS_LIMIT_SHOW(logical_block_size)
 QUEUE_SYSFS_LIMIT_SHOW(physical_block_size)
 QUEUE_SYSFS_LIMIT_SHOW(chunk_sectors)
@@ -448,6 +449,7 @@ QUEUE_RO_ENTRY(queue_max_segments, "max_segments");
 QUEUE_RO_ENTRY(queue_max_integrity_segments, "max_integrity_segments");
 QUEUE_RO_ENTRY(queue_max_segment_size, "max_segment_size");
 QUEUE_RO_ENTRY(queue_max_write_streams, "max_write_streams");
+QUEUE_RO_ENTRY(queue_write_stream_granularity, "write_stream_granularity");
 QUEUE_RW_LOAD_MODULE_ENTRY(elv_iosched, "scheduler");
 
 QUEUE_RO_ENTRY(queue_logical_block_size, "logical_block_size");
@@ -583,6 +585,7 @@ static struct attribute *queue_attrs[] = {
 	&queue_max_integrity_segments_entry.attr,
 	&queue_max_segment_size_entry.attr,
 	&queue_max_write_streams_entry.attr,
+	&queue_write_stream_granularity_entry.attr,
 	&queue_hw_sector_size_entry.attr,
 	&queue_logical_block_size_entry.attr,
 	&queue_physical_block_size_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ce2c3ddda2411..7be8cc57561a1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -400,6 +400,7 @@ struct queue_limits {
 	unsigned short		max_discard_segments;
 
 	unsigned short		max_write_streams;
+	unsigned int		write_stream_granularity;
 
 	unsigned int		max_open_zones;
 	unsigned int		max_active_zones;
@@ -1249,6 +1250,12 @@ static inline unsigned short bdev_max_write_streams(struct block_device *bdev)
 	return bdev_limits(bdev)->max_write_streams;
 }
 
+static inline unsigned int
+bdev_write_stream_granularity(struct block_device *bdev)
+{
+	return bdev_limits(bdev)->write_stream_granularity;
+}
+
 static inline unsigned queue_logical_block_size(const struct request_queue *q)
 {
 	return q->limits.logical_block_size;
-- 
2.43.5
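
The sysfs text in the patch above describes the granularity as the size
that should be discarded or overwritten together. One way an application
might apply that is to shrink a byte range to the whole granules it
contains before discarding it. The following is only a sketch under that
reading: whole_granules() is a hypothetical helper, it does not assume the
granularity is a power of two, and it ignores overflow of start + len.

```c
#include <stdint.h>

/*
 * Shrink [start, start + len) to the largest sub-range made of whole
 * granules. Returns the aligned length in bytes, or 0 if no whole
 * granule fits or granularity is 0 (write streams unsupported). On a
 * nonzero return, *aligned_start holds the start of the aligned range.
 */
static uint64_t whole_granules(uint64_t start, uint64_t len,
			       uint32_t granularity, uint64_t *aligned_start)
{
	uint64_t first, last;

	*aligned_start = start;
	if (granularity == 0)
		return 0;

	/* Round the start up and the end down to granule boundaries. */
	first = (start + granularity - 1) / granularity * granularity;
	last = (start + len) / granularity * granularity;
	if (last <= first)
		return 0;

	*aligned_start = first;
	return last - first;
}
```

With a granularity of 256 bytes, the range starting at 100 with length
1000 shrinks to the three whole granules covering bytes 256 through 1023.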



* [PATCHv12 06/12] block: expose write streams for block device nodes
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (4 preceding siblings ...)
  2024-12-06 22:17 ` [PATCHv12 05/12] block: introduce a write_stream_granularity " Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:30   ` Hannes Reinecke
       [not found]   ` <CGME20241209110649epcas5p41df7db0f7ea58f250da647106d25134b@epcas5p4.samsung.com>
  2024-12-06 22:17 ` [PATCHv12 07/12] io_uring: enable per-io write streams Keith Busch
                   ` (6 subsequent siblings)
  12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Christoph Hellwig <[email protected]>

Export statx information about the number and granularity of write
streams, use the per-kiocb write stream, and map temperature hints to
write streams when no stream is given (which is a bit questionable, but
this shows how it is done).

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
 block/bdev.c |  6 ++++++
 block/fops.c | 23 +++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index 738e3c8457e7f..c23245f1fdfe3 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1296,6 +1296,12 @@ void bdev_statx(struct path *path, struct kstat *stat,
 		stat->result_mask |= STATX_DIOALIGN;
 	}
 
+	if ((request_mask & STATX_WRITE_STREAM) &&
+	    bdev_max_write_streams(bdev)) {
+		stat->write_stream_max = bdev_max_write_streams(bdev);
+		stat->result_mask |= STATX_WRITE_STREAM;
+	}
+
 	if (request_mask & STATX_WRITE_ATOMIC && bdev_can_atomic_write(bdev)) {
 		struct request_queue *bd_queue = bdev->bd_queue;
 
diff --git a/block/fops.c b/block/fops.c
index 6d5c4fc5a2168..f16aa39bf5bad 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -73,6 +73,7 @@ static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
 	}
 	bio.bi_iter.bi_sector = pos >> SECTOR_SHIFT;
 	bio.bi_write_hint = file_inode(iocb->ki_filp)->i_write_hint;
+	bio.bi_write_stream = iocb->ki_write_stream;
 	bio.bi_ioprio = iocb->ki_ioprio;
 	if (iocb->ki_flags & IOCB_ATOMIC)
 		bio.bi_opf |= REQ_ATOMIC;
@@ -206,6 +207,7 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 	for (;;) {
 		bio->bi_iter.bi_sector = pos >> SECTOR_SHIFT;
 		bio->bi_write_hint = file_inode(iocb->ki_filp)->i_write_hint;
+		bio->bi_write_stream = iocb->ki_write_stream;
 		bio->bi_private = dio;
 		bio->bi_end_io = blkdev_bio_end_io;
 		bio->bi_ioprio = iocb->ki_ioprio;
@@ -333,6 +335,7 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
 	dio->iocb = iocb;
 	bio->bi_iter.bi_sector = pos >> SECTOR_SHIFT;
 	bio->bi_write_hint = file_inode(iocb->ki_filp)->i_write_hint;
+	bio->bi_write_stream = iocb->ki_write_stream;
 	bio->bi_end_io = blkdev_bio_end_io_async;
 	bio->bi_ioprio = iocb->ki_ioprio;
 
@@ -398,6 +401,26 @@ static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 	if (blkdev_dio_invalid(bdev, iocb, iter))
 		return -EINVAL;
 
+	if (iov_iter_rw(iter) == WRITE) {
+		u16 max_write_streams = bdev_max_write_streams(bdev);
+
+		if (iocb->ki_write_stream) {
+			if (iocb->ki_write_stream > max_write_streams)
+				return -EINVAL;
+		} else if (max_write_streams) {
+			enum rw_hint write_hint =
+				file_inode(iocb->ki_filp)->i_write_hint;
+
+			/*
+			 * Just use the write hint as write stream for block
+			 * device writes.  This assumes no file system is
+			 * mounted that would use the streams differently.
+			 */
+			if (write_hint <= max_write_streams)
+				iocb->ki_write_stream = write_hint;
+		}
+	}
+
 	nr_pages = bio_iov_vecs_to_alloc(iter, BIO_MAX_VECS + 1);
 	if (likely(nr_pages <= BIO_MAX_VECS)) {
 		if (is_sync_kiocb(iocb))
-- 
2.43.5
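
The stream selection added to blkdev_direct_IO() in the patch above can be
modelled in a few lines of userspace C. This is a sketch of the decision
logic only: pick_write_stream() is a hypothetical name, and the inode
write hint is passed in as a plain integer rather than enum rw_hint.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Returns the write stream to use (0 means none) or -EINVAL.
 *
 * requested:   stream from the kiocb, 0 if the caller set none
 * max_streams: device limit, 0 if write streams are unsupported
 * write_hint:  inode temperature hint, used only as a fallback
 */
static int pick_write_stream(uint8_t requested, uint16_t max_streams,
			     unsigned int write_hint)
{
	if (requested) {
		/* An explicit stream must be within 1..max_streams. */
		if (requested > max_streams)
			return -EINVAL;
		return requested;
	}

	/* No explicit stream: reuse the hint if it is a valid stream. */
	if (max_streams && write_hint <= max_streams)
		return (int)write_hint;

	return 0;
}
```

Note that an explicit stream on a device without write streams fails with
-EINVAL, because any nonzero value exceeds a limit of 0, while an
out-of-range write hint is silently dropped.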



* [PATCHv12 07/12] io_uring: enable per-io write streams
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (5 preceding siblings ...)
  2024-12-06 22:17 ` [PATCHv12 06/12] block: expose write streams for block device nodes Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:31   ` Hannes Reinecke
  2024-12-06 22:17 ` [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper Keith Busch
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Keith Busch <[email protected]>

Allow userspace to pass a per-I/O write stream in the SQE:

      __u8 write_stream;

The __u8 type matches the size the filesystems and block layer support.

Applications can query the supported values from the statx
stx_write_stream_max field. Unsupported values are ignored by file
operations that do not support write streams or rejected with an error
by those that support them.

Signed-off-by: Keith Busch <[email protected]>
---
 include/uapi/linux/io_uring.h | 4 ++++
 io_uring/io_uring.c           | 2 ++
 io_uring/rw.c                 | 1 +
 3 files changed, 7 insertions(+)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 38f0d6b10eaf7..986a480e3b9c2 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -92,6 +92,10 @@ struct io_uring_sqe {
 			__u16	addr_len;
 			__u16	__pad3[1];
 		};
+		struct {
+			__u8	write_stream;
+			__u8	__pad4[3];
+		};
 	};
 	union {
 		struct {
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index a8cbe674e5d63..978d0617d7af8 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3868,6 +3868,8 @@ static int __init io_uring_init(void)
 	BUILD_BUG_SQE_ELEM(44, __s32,  splice_fd_in);
 	BUILD_BUG_SQE_ELEM(44, __u32,  file_index);
 	BUILD_BUG_SQE_ELEM(44, __u16,  addr_len);
+	BUILD_BUG_SQE_ELEM(44, __u8,   write_stream);
+	BUILD_BUG_SQE_ELEM(45, __u8,   __pad4[0]);
 	BUILD_BUG_SQE_ELEM(46, __u16,  __pad3[0]);
 	BUILD_BUG_SQE_ELEM(48, __u64,  addr3);
 	BUILD_BUG_SQE_ELEM_SIZE(48, 0, cmd);
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 04e4467ab0ee8..b8aa2dfcbf48c 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -322,6 +322,7 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	}
 	rw->kiocb.dio_complete = NULL;
 	rw->kiocb.ki_flags = 0;
+	rw->kiocb.ki_write_stream = READ_ONCE(sqe->write_stream);
 
 	rw->addr = READ_ONCE(sqe->addr);
 	rw->len = READ_ONCE(sqe->len);
-- 
2.43.5
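
The SQE change in the patch above places the stream in the first byte of
the 4-byte union at offset 44, as the BUILD_BUG_SQE_ELEM lines assert. A
userspace sketch of that layout follows. struct sqe_sketch is a
hypothetical reduced mirror, not the real struct io_uring_sqe: the first
44 bytes and the last 16 bytes are opaque pads, and the 64-byte total is
the size of the standard SQE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced mirror of struct io_uring_sqe around offset 44. */
struct sqe_sketch {
	uint8_t head[44];               /* opcode, fd, addr, len, ... */
	union {
		int32_t  splice_fd_in;
		uint32_t file_index;
		struct {
			uint16_t addr_len;
			uint16_t pad3[1];
		};
		struct {
			uint8_t write_stream; /* new in this patch */
			uint8_t pad4[3];
		};
	};
	uint8_t tail[16];               /* remaining fields, opaque here */
};

static_assert(offsetof(struct sqe_sketch, write_stream) == 44,
	      "write_stream aliases the start of the union");
static_assert(offsetof(struct sqe_sketch, pad4) == 45, "pad4 follows");
static_assert(offsetof(struct sqe_sketch, pad3) == 46, "pad3 unchanged");
static_assert(sizeof(struct sqe_sketch) == 64, "SQE size unchanged");

/* Set the stream and clear the padding bytes that share the union. */
static void sqe_set_write_stream(struct sqe_sketch *sqe, uint8_t stream)
{
	sqe->write_stream = stream;
	sqe->pad4[0] = 0;
	sqe->pad4[1] = 0;
	sqe->pad4[2] = 0;
}
```

Because the field shares storage with splice_fd_in, file_index and
addr_len, it is only meaningful for opcodes that do not use those members.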



* [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (6 preceding siblings ...)
  2024-12-06 22:17 ` [PATCHv12 07/12] io_uring: enable per-io write streams Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:31   ` Hannes Reinecke
       [not found]   ` <CGME20241210121958epcas5p27d14abfca66757a2c42ec71895b008b1@epcas5p2.samsung.com>
  2024-12-06 22:17 ` [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result Keith Busch
                   ` (4 subsequent siblings)
  12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Christoph Hellwig <[email protected]>

Add a helper for log pages that need to pass in an LSI value, without
touching all the existing nvme_get_log callers.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
 drivers/nvme/host/core.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 571d4106d256d..36c44be98e38c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -150,6 +150,8 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
 					   unsigned nsid);
 static void nvme_update_keep_alive(struct nvme_ctrl *ctrl,
 				   struct nvme_command *cmd);
+static int nvme_get_log_lsi(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page,
+		u8 lsp, u8 csi, void *log, size_t size, u64 offset, u16 lsi);
 
 void nvme_queue_scan(struct nvme_ctrl *ctrl)
 {
@@ -3074,8 +3076,8 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
 	return ret;
 }
 
-int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp, u8 csi,
-		void *log, size_t size, u64 offset)
+static int nvme_get_log_lsi(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page,
+		u8 lsp, u8 csi, void *log, size_t size, u64 offset, u16 lsi)
 {
 	struct nvme_command c = { };
 	u32 dwlen = nvme_bytes_to_numd(size);
@@ -3089,10 +3091,18 @@ int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp, u8 csi,
 	c.get_log_page.lpol = cpu_to_le32(lower_32_bits(offset));
 	c.get_log_page.lpou = cpu_to_le32(upper_32_bits(offset));
 	c.get_log_page.csi = csi;
+	c.get_log_page.lsi = cpu_to_le16(lsi);
 
 	return nvme_submit_sync_cmd(ctrl->admin_q, &c, log, size);
 }
 
+int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp, u8 csi,
+		void *log, size_t size, u64 offset)
+{
+	return nvme_get_log_lsi(ctrl, nsid, log_page, lsp, csi, log, size,
+			offset, 0);
+}
+
 static int nvme_get_effects_log(struct nvme_ctrl *ctrl, u8 csi,
 				struct nvme_effects_log **log)
 {
-- 
2.43.5
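
The wrapper pattern in the patch above, where the old entry point forwards
to the new helper with an LSI of 0, can be illustrated with a small
userspace model of how the Get Log Page fields land in command dwords 10
and 11. This is a sketch under assumptions: the names are hypothetical,
the bit positions follow the field order of the kernel's get_log_page
command structure as I understand it (lid, lsp, numdl, then numdu, lsi),
and the zero-based dword count uses (size >> 2) - 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the two command dwords of interest. */
struct get_log_dwords_sketch {
	uint32_t cdw10; /* lid [7:0], lsp [15:8], numdl [31:16] */
	uint32_t cdw11; /* numdu [15:0], lsi [31:16] */
};

/* Pack the fields the way the extended helper fills the command. */
static struct get_log_dwords_sketch
pack_get_log_lsi(uint8_t lid, uint8_t lsp, size_t size, uint16_t lsi)
{
	/* Zero-based number of dwords to transfer; size must be >= 4. */
	uint32_t numd = (uint32_t)(size >> 2) - 1;
	struct get_log_dwords_sketch d;

	d.cdw10 = (uint32_t)lid | ((uint32_t)lsp << 8) |
		  ((numd & 0xffff) << 16);
	d.cdw11 = (numd >> 16) | ((uint32_t)lsi << 16);
	return d;
}

/* Existing callers keep the old signature and get an LSI of 0. */
static struct get_log_dwords_sketch
pack_get_log(uint8_t lid, uint8_t lsp, size_t size)
{
	return pack_get_log_lsi(lid, lsp, size, 0);
}
```

Keeping the old function as a thin wrapper is what lets the patch add the
extra parameter without changing any existing call site.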



* [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (7 preceding siblings ...)
  2024-12-06 22:17 ` [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:32   ` Hannes Reinecke
       [not found]   ` <CGME20241210122137epcas5p2e373baa1c99b78341928cc7bf0fe3bdf@epcas5p2.samsung.com>
  2024-12-06 22:17 ` [PATCHv12 10/12] nvme.h: add FDP definitions Keith Busch
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Christoph Hellwig <[email protected]>

This allows passing in structures instead of the u32 result, and thus
reduces the amount of bit shifting and masking required to parse the
result.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
 drivers/nvme/host/core.c | 4 ++--
 drivers/nvme/host/nvme.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 36c44be98e38c..c2a3585a3fa59 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1678,7 +1678,7 @@ static int nvme_features(struct nvme_ctrl *dev, u8 op, unsigned int fid,
 
 int nvme_set_features(struct nvme_ctrl *dev, unsigned int fid,
 		      unsigned int dword11, void *buffer, size_t buflen,
-		      u32 *result)
+		      void *result)
 {
 	return nvme_features(dev, nvme_admin_set_features, fid, dword11, buffer,
 			     buflen, result);
@@ -1687,7 +1687,7 @@ EXPORT_SYMBOL_GPL(nvme_set_features);
 
 int nvme_get_features(struct nvme_ctrl *dev, unsigned int fid,
 		      unsigned int dword11, void *buffer, size_t buflen,
-		      u32 *result)
+		      void *result)
 {
 	return nvme_features(dev, nvme_admin_get_features, fid, dword11, buffer,
 			     buflen, result);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 611b02c8a8b37..c1995d89ffdb8 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -890,10 +890,10 @@ int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 		int qid, nvme_submit_flags_t flags);
 int nvme_set_features(struct nvme_ctrl *dev, unsigned int fid,
 		      unsigned int dword11, void *buffer, size_t buflen,
-		      u32 *result);
+		      void *result);
 int nvme_get_features(struct nvme_ctrl *dev, unsigned int fid,
 		      unsigned int dword11, void *buffer, size_t buflen,
-		      u32 *result);
+		      void *result);
 int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count);
 void nvme_stop_keep_alive(struct nvme_ctrl *ctrl);
 int nvme_reset_ctrl(struct nvme_ctrl *ctrl);
-- 
2.43.5
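
For illustration only, here is a minimal userspace sketch (not kernel
code) of the difference the void pointer makes for a caller. The
structure mirrors the layout of struct nvme_fdp_config from patch 10/12;
the names fdp_config, fdpcidx_by_mask() and fdp_config_from_bytes() are
hypothetical, and the sketch assumes the four result bytes arrive in
little-endian order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FDPCFG_FDPE (1U << 0)

/* Same field layout as struct nvme_fdp_config in patch 10/12. */
struct fdp_config {
	uint8_t  flags;
	uint8_t  fdpcidx;
	uint16_t reserved;
};

_Static_assert(sizeof(struct fdp_config) == sizeof(uint32_t),
	       "result structure must overlay exactly one dword");

/* Old style: the caller gets a raw u32 and decodes it by hand. */
static unsigned int fdpcidx_by_mask(uint32_t result)
{
	return (result >> 8) & 0xff;
}

/*
 * New style: the result bytes are copied into a structure and the
 * fields are read by name. Assumes little-endian byte order.
 */
static struct fdp_config fdp_config_from_bytes(const uint8_t bytes[4])
{
	struct fdp_config cfg;

	assert(bytes != NULL);
	memcpy(&cfg, bytes, sizeof(cfg));
	return cfg;
}
```

With the structure form, a check such as "cfg.flags & FDPCFG_FDPE" reads
directly against the field, with no shift constants in the caller.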


* [PATCHv12 10/12] nvme.h: add FDP definitions
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (8 preceding siblings ...)
  2024-12-06 22:17 ` [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
  2024-12-09  8:33   ` Hannes Reinecke
       [not found]   ` <CGME20241210122702epcas5p4fe3ed43ad714c6b467a35d16135d07c5@epcas5p4.samsung.com>
  2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
                   ` (2 subsequent siblings)
  12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Christoph Hellwig <[email protected]>

Add the config feature result, config log page, and management receive
commands needed for FDP.

Partially based on a patch from Kanchan Joshi <[email protected]>.

Signed-off-by: Christoph Hellwig <[email protected]>
[kbusch: renamed some fields to match spec]
Signed-off-by: Keith Busch <[email protected]>
---
 include/linux/nvme.h | 77 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 13377dde4527b..7680078fa67fd 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -275,6 +275,7 @@ enum nvme_ctrl_attr {
 	NVME_CTRL_ATTR_HID_128_BIT	= (1 << 0),
 	NVME_CTRL_ATTR_TBKAS		= (1 << 6),
 	NVME_CTRL_ATTR_ELBAS		= (1 << 15),
+	NVME_CTRL_ATTR_FDPS		= (1 << 19),
 };
 
 struct nvme_id_ctrl {
@@ -661,6 +662,44 @@ struct nvme_rotational_media_log {
 	__u8	rsvd24[488];
 };
 
+struct nvme_fdp_config {
+	__u8			flags;
+#define FDPCFG_FDPE	(1U << 0)
+	__u8			fdpcidx;
+	__le16			reserved;
+};
+
+struct nvme_fdp_ruh_desc {
+	__u8			ruht;
+	__u8			reserved[3];
+};
+
+struct nvme_fdp_config_desc {
+	__le16			dsze;
+	__u8			fdpa;
+	__u8			vss;
+	__le32			nrg;
+	__le16			nruh;
+	__le16			maxpids;
+	__le32			nns;
+	__le64			runs;
+	__le32			erutl;
+	__u8			rsvd28[36];
+	struct nvme_fdp_ruh_desc ruhs[];
+};
+
+struct nvme_fdp_config_log {
+	__le16			numfdpc;
+	__u8			ver;
+	__u8			rsvd3;
+	__le32			sze;
+	__u8			rsvd8[8];
+	/*
+	 * This is followed by variable number of nvme_fdp_config_desc
+	 * structures, but sparse doesn't like nested variable sized arrays.
+	 */
+};
+
 struct nvme_smart_log {
 	__u8			critical_warning;
 	__u8			temperature[2];
@@ -887,6 +926,7 @@ enum nvme_opcode {
 	nvme_cmd_resv_register	= 0x0d,
 	nvme_cmd_resv_report	= 0x0e,
 	nvme_cmd_resv_acquire	= 0x11,
+	nvme_cmd_io_mgmt_recv	= 0x12,
 	nvme_cmd_resv_release	= 0x15,
 	nvme_cmd_zone_mgmt_send	= 0x79,
 	nvme_cmd_zone_mgmt_recv	= 0x7a,
@@ -908,6 +948,7 @@ enum nvme_opcode {
 		nvme_opcode_name(nvme_cmd_resv_register),	\
 		nvme_opcode_name(nvme_cmd_resv_report),		\
 		nvme_opcode_name(nvme_cmd_resv_acquire),	\
+		nvme_opcode_name(nvme_cmd_io_mgmt_recv),	\
 		nvme_opcode_name(nvme_cmd_resv_release),	\
 		nvme_opcode_name(nvme_cmd_zone_mgmt_send),	\
 		nvme_opcode_name(nvme_cmd_zone_mgmt_recv),	\
@@ -1059,6 +1100,7 @@ enum {
 	NVME_RW_PRINFO_PRCHK_GUARD	= 1 << 12,
 	NVME_RW_PRINFO_PRACT		= 1 << 13,
 	NVME_RW_DTYPE_STREAMS		= 1 << 4,
+	NVME_RW_DTYPE_DPLCMT		= 2 << 4,
 	NVME_WZ_DEAC			= 1 << 9,
 };
 
@@ -1146,6 +1188,38 @@ struct nvme_zone_mgmt_recv_cmd {
 	__le32			cdw14[2];
 };
 
+struct nvme_io_mgmt_recv_cmd {
+	__u8			opcode;
+	__u8			flags;
+	__u16			command_id;
+	__le32			nsid;
+	__le64			rsvd2[2];
+	union nvme_data_ptr	dptr;
+	__u8			mo;
+	__u8			rsvd11;
+	__u16			mos;
+	__le32			numd;
+	__le32			cdw12[4];
+};
+
+enum {
+	NVME_IO_MGMT_RECV_MO_RUHS	= 1,
+};
+
+struct nvme_fdp_ruh_status_desc {
+	__le16			pid;
+	__le16			ruhid;
+	__le32			earutr;
+	__le64			ruamw;
+	__u8			reserved[16];
+};
+
+struct nvme_fdp_ruh_status {
+	__u8			rsvd0[14];
+	__le16			nruhsd;
+	struct nvme_fdp_ruh_status_desc ruhsd[];
+};
+
 enum {
 	NVME_ZRA_ZONE_REPORT		= 0,
 	NVME_ZRASF_ZONE_REPORT_ALL	= 0,
@@ -1281,6 +1355,7 @@ enum {
 	NVME_FEAT_PLM_WINDOW	= 0x14,
 	NVME_FEAT_HOST_BEHAVIOR	= 0x16,
 	NVME_FEAT_SANITIZE	= 0x17,
+	NVME_FEAT_FDP		= 0x1d,
 	NVME_FEAT_SW_PROGRESS	= 0x80,
 	NVME_FEAT_HOST_ID	= 0x81,
 	NVME_FEAT_RESV_MASK	= 0x82,
@@ -1301,6 +1376,7 @@ enum {
 	NVME_LOG_ANA		= 0x0c,
 	NVME_LOG_FEATURES	= 0x12,
 	NVME_LOG_RMI		= 0x16,
+	NVME_LOG_FDP_CONFIGS	= 0x20,
 	NVME_LOG_DISC		= 0x70,
 	NVME_LOG_RESERVATION	= 0x80,
 	NVME_FWACT_REPL		= (0 << 3),
@@ -1888,6 +1964,7 @@ struct nvme_command {
 		struct nvmf_auth_receive_command auth_receive;
 		struct nvme_dbbuf dbbuf;
 		struct nvme_directive_cmd directive;
+		struct nvme_io_mgmt_recv_cmd imr;
 	};
 };
 
-- 
2.43.5


* [PATCHv12 11/12] nvme: register fdp parameters with the block layer
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (9 preceding siblings ...)
  2024-12-06 22:17 ` [PATCHv12 10/12] nvme.h: add FDP definitions Keith Busch
@ 2024-12-06 22:18 ` Keith Busch
  2024-12-09  4:05   ` kernel test robot
                     ` (3 more replies)
  2024-12-06 22:18 ` [PATCHv12 12/12] nvme: use fdp streams if write stream is provided Keith Busch
  2024-12-09 12:55 ` [PATCHv12 00/12] block write streams with nvme fdp Christoph Hellwig
  12 siblings, 4 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:18 UTC (permalink / raw)
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Keith Busch <[email protected]>

Register the device data placement limits if supported. This patch only
registers the limits with the block layer; nothing beyond reporting
these attributes happens here.

Signed-off-by: Keith Busch <[email protected]>
---
 drivers/nvme/host/core.c | 112 +++++++++++++++++++++++++++++++++++++++
 drivers/nvme/host/nvme.h |   4 ++
 2 files changed, 116 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c2a3585a3fa59..5f802e243736a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -38,6 +38,8 @@ struct nvme_ns_info {
 	u32 nsid;
 	__le32 anagrpid;
 	u8 pi_offset;
+	u16 endgid;
+	u64 runs;
 	bool is_shared;
 	bool is_readonly;
 	bool is_ready;
@@ -1613,6 +1615,7 @@ static int nvme_ns_info_from_identify(struct nvme_ctrl *ctrl,
 	info->is_shared = id->nmic & NVME_NS_NMIC_SHARED;
 	info->is_readonly = id->nsattr & NVME_NS_ATTR_RO;
 	info->is_ready = true;
+	info->endgid = le16_to_cpu(id->endgid);
 	if (ctrl->quirks & NVME_QUIRK_BOGUS_NID) {
 		dev_info(ctrl->device,
 			 "Ignoring bogus Namespace Identifiers\n");
@@ -1653,6 +1656,7 @@ static int nvme_ns_info_from_id_cs_indep(struct nvme_ctrl *ctrl,
 		info->is_ready = id->nstat & NVME_NSTAT_NRDY;
 		info->is_rotational = id->nsfeat & NVME_NS_ROTATIONAL;
 		info->no_vwc = id->nsfeat & NVME_NS_VWC_NOT_PRESENT;
+		info->endgid = le16_to_cpu(id->endgid);
 	}
 	kfree(id);
 	return ret;
@@ -2147,6 +2151,97 @@ static int nvme_update_ns_info_generic(struct nvme_ns *ns,
 	return ret;
 }
 
+static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
+			  u8 fdp_idx)
+{
+	struct nvme_fdp_config_log hdr, *h;
+	struct nvme_fdp_config_desc *desc;
+	size_t size = sizeof(hdr);
+	int i, n, ret;
+	void *log;
+
+	info->runs = 0;
+	ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
+			   (void *)&hdr, size, 0, info->endgid);
+	if (ret)
+		return ret;
+
+	size = le32_to_cpu(hdr.sze);
+	h = kzalloc(size, GFP_KERNEL);
+	if (!h)
+		return 0;
+
+	ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
+			   h, size, 0, info->endgid);
+	if (ret)
+		goto out;
+
+	n = le16_to_cpu(h->numfdpc) + 1;
+	if (fdp_idx > n)
+		goto out;
+
+	log = h + 1;
+	do {
+		desc = log;
+		log += le16_to_cpu(desc->dsze);
+	} while (i++ < fdp_idx);
+
+	info->runs = le64_to_cpu(desc->runs);
+out:
+	kfree(h);
+	return ret;
+}
+
+static int nvme_query_fdp_info(struct nvme_ns *ns, struct nvme_ns_info *info)
+{
+	struct nvme_ns_head *head = ns->head;
+	struct nvme_fdp_ruh_status *ruhs;
+	struct nvme_fdp_config fdp;
+	struct nvme_command c = {};
+	int size, ret;
+
+	ret = nvme_get_features(ns->ctrl, NVME_FEAT_FDP, info->endgid, NULL, 0,
+				&fdp);
+	if (ret)
+		goto err;
+
+	if (!(fdp.flags & FDPCFG_FDPE))
+		goto err;
+
+	ret = nvme_check_fdp(ns, info, fdp.fdpcidx);
+	if (ret || !info->runs)
+		goto err;
+
+	size = struct_size(ruhs, ruhsd, NVME_MAX_PLIDS);
+	ruhs = kzalloc(size, GFP_KERNEL);
+	if (!ruhs) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	c.imr.opcode = nvme_cmd_io_mgmt_recv;
+	c.imr.nsid = cpu_to_le32(head->ns_id);
+	c.imr.mo = NVME_IO_MGMT_RECV_MO_RUHS;
+	c.imr.numd = cpu_to_le32(nvme_bytes_to_numd(size));
+	ret = nvme_submit_sync_cmd(ns->queue, &c, ruhs, size);
+	if (ret)
+		goto free;
+
+	head->nr_plids = le16_to_cpu(ruhs->nruhsd);
+	if (!head->nr_plids)
+		goto free;
+
+	kfree(ruhs);
+	return 0;
+
+free:
+	kfree(ruhs);
+err:
+	head->nr_plids = 0;
+	info->runs = 0;
+	return ret;
+}
+
 static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		struct nvme_ns_info *info)
 {
@@ -2183,6 +2278,15 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 			goto out;
 	}
 
+	if (ns->ctrl->ctratt & NVME_CTRL_ATTR_FDPS) {
+		ret = nvme_query_fdp_info(ns, info);
+		if (ret)
+			dev_warn(ns->ctrl->device,
+				"FDP failure status:0x%x\n", ret);
+		if (ret < 0)
+			goto out;
+	}
+
 	blk_mq_freeze_queue(ns->disk->queue);
 	ns->head->lba_shift = id->lbaf[lbaf].ds;
 	ns->head->nuse = le64_to_cpu(id->nuse);
@@ -2216,6 +2320,12 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	if (!nvme_init_integrity(ns->head, &lim, info))
 		capacity = 0;
 
+	lim.max_write_streams = ns->head->nr_plids;
+	if (lim.max_write_streams)
+		lim.write_stream_granularity = info->runs;
+	else
+		lim.write_stream_granularity = 0;
+
 	ret = queue_limits_commit_update(ns->disk->queue, &lim);
 	if (ret) {
 		blk_mq_unfreeze_queue(ns->disk->queue);
@@ -2318,6 +2428,8 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
 			ns->head->disk->flags |= GENHD_FL_HIDDEN;
 		else
 			nvme_init_integrity(ns->head, &lim, info);
+		lim.max_write_streams = ns_lim->max_write_streams;
+		lim.write_stream_granularity = ns_lim->write_stream_granularity;
 		ret = queue_limits_commit_update(ns->head->disk->queue, &lim);
 
 		set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index c1995d89ffdb8..914cc93e91f6d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -454,6 +454,8 @@ struct nvme_ns_ids {
 	u8	csi;
 };
 
+#define NVME_MAX_PLIDS   (S8_MAX - 1)
+
 /*
  * Anchor structure for namespaces.  There is one for each namespace in a
  * NVMe subsystem that any of our controllers can see, and the namespace
@@ -491,6 +493,8 @@ struct nvme_ns_head {
 	struct device		cdev_device;
 
 	struct gendisk		*disk;
+
+	u16			nr_plids;
 #ifdef CONFIG_NVME_MULTIPATH
 	struct bio_list		requeue_list;
 	spinlock_t		requeue_lock;
-- 
2.43.5


* [PATCHv12 12/12] nvme: use fdp streams if write stream is provided
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (10 preceding siblings ...)
  2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
@ 2024-12-06 22:18 ` Keith Busch
  2024-12-09  8:34   ` Hannes Reinecke
       [not found]   ` <CGME20241210073523epcas5p149482220b87ff3926fb8864ff1660e0c@epcas5p1.samsung.com>
  2024-12-09 12:55 ` [PATCHv12 00/12] block write streams with nvme fdp Christoph Hellwig
  12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:18 UTC (permalink / raw)
  To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

From: Keith Busch <[email protected]>

Map a user-requested write stream to an FDP placement ID if possible.

Signed-off-by: Keith Busch <[email protected]>
---
 drivers/nvme/host/core.c | 32 +++++++++++++++++++++++++++++++-
 drivers/nvme/host/nvme.h |  1 +
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5f802e243736a..63c8a117b3b4a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -997,6 +997,18 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 	if (req->cmd_flags & REQ_RAHEAD)
 		dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
 
+	if (op == nvme_cmd_write && ns->head->nr_plids) {
+		u16 write_stream = req->bio->bi_write_stream;
+
+		if (WARN_ON_ONCE(write_stream > ns->head->nr_plids))
+			return BLK_STS_INVAL;
+
+		if (write_stream) {
+			dsmgmt |= ns->head->plids[write_stream - 1] << 16;
+			control |= NVME_RW_DTYPE_DPLCMT;
+		}
+	}
+
 	if (req->cmd_flags & REQ_ATOMIC && !nvme_valid_atomic_write(req))
 		return BLK_STS_INVAL;
 
@@ -2194,11 +2206,12 @@ static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
 
 static int nvme_query_fdp_info(struct nvme_ns *ns, struct nvme_ns_info *info)
 {
+	struct nvme_fdp_ruh_status_desc *ruhsd;
 	struct nvme_ns_head *head = ns->head;
 	struct nvme_fdp_ruh_status *ruhs;
 	struct nvme_fdp_config fdp;
 	struct nvme_command c = {};
-	int size, ret;
+	int size, ret, i;
 
 	ret = nvme_get_features(ns->ctrl, NVME_FEAT_FDP, info->endgid, NULL, 0,
 				&fdp);
@@ -2231,6 +2244,19 @@ static int nvme_query_fdp_info(struct nvme_ns *ns, struct nvme_ns_info *info)
 	if (!head->nr_plids)
 		goto free;
 
+	head->nr_plids = min(head->nr_plids, NVME_MAX_PLIDS);
+	head->plids = kcalloc(head->nr_plids, sizeof(head->plids),
+			      GFP_KERNEL);
+	if (!head->plids) {
+		ret = -ENOMEM;
+		goto free;
+	}
+
+	for (i = 0; i < head->nr_plids; i++) {
+		ruhsd = &ruhs->ruhsd[i];
+		head->plids[i] = le16_to_cpu(ruhsd->pid);
+	}
+
 	kfree(ruhs);
 	return 0;
 
@@ -2285,6 +2311,10 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 				"FDP failure status:0x%x\n", ret);
 		if (ret < 0)
 			goto out;
+	} else {
+		ns->head->nr_plids = 0;
+		kfree(ns->head->plids);
+		ns->head->plids = NULL;
 	}
 
 	blk_mq_freeze_queue(ns->disk->queue);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 914cc93e91f6d..49b234bfb42c4 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -495,6 +495,7 @@ struct nvme_ns_head {
 	struct gendisk		*disk;
 
 	u16			nr_plids;
+	u16			*plids;
 #ifdef CONFIG_NVME_MULTIPATH
 	struct bio_list		requeue_list;
 	spinlock_t		requeue_lock;
-- 
2.43.5
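
For illustration only, the stream-to-placement-ID mapping in the
nvme_setup_rw() hunk can be sketched in userspace C as below. The
struct rw_fields type and the apply_write_stream() helper are
hypothetical stand-ins for the command fields and the driver logic;
only NVME_RW_DTYPE_DPLCMT and the "stream N uses plids[N - 1]" rule are
taken from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVME_RW_DTYPE_DPLCMT (2 << 4)	/* value from patch 10/12 */

/* Hypothetical stand-in for the two command fields being filled in. */
struct rw_fields {
	uint16_t control;
	uint32_t dsmgmt;
};

/*
 * Returns 0 on success, -1 if write_stream is out of range.
 * Stream 0 means "no stream requested"; stream N uses plids[N - 1].
 */
static int apply_write_stream(struct rw_fields *rw, const uint16_t *plids,
			      uint16_t nr_plids, uint16_t write_stream)
{
	assert(rw != NULL);

	if (write_stream > nr_plids)
		return -1;

	if (write_stream) {
		/* Placement ID goes in the upper 16 bits of dsmgmt. */
		rw->dsmgmt |= (uint32_t)plids[write_stream - 1] << 16;
		rw->control |= NVME_RW_DTYPE_DPLCMT;
	}
	return 0;
}
```

The cast to uint32_t in this sketch keeps the shift in unsigned
arithmetic, so placement IDs with the top bit set do not depend on
signed integer promotion.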


* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
  2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
@ 2024-12-09  4:05   ` kernel test robot
  2024-12-09 12:44     ` Christoph Hellwig
  2024-12-09  8:34   ` Hannes Reinecke
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 46+ messages in thread
From: kernel test robot @ 2024-12-09  4:05 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: llvm, oe-kbuild-all, sagi, asml.silence, anuj20.g, joshi.k,
	Keith Busch

Hi Keith,

kernel test robot noticed the following build warnings:

[auto build test WARNING on axboe-block/for-next]
[also build test WARNING on next-20241206]
[cannot apply to brauner-vfs/vfs.all hch-configfs/for-next linus/master v6.13-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Keith-Busch/fs-add-write-stream-information-to-statx/20241207-063826
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20241206221801.790690-12-kbusch%40meta.com
patch subject: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
config: i386-buildonly-randconfig-001-20241207 (https://download.01.org/0day-ci/archive/20241207/[email protected]/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241207/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All warnings (new ones prefixed by >>):

   In file included from drivers/nvme/host/core.c:8:
   In file included from include/linux/blkdev.h:9:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:8:
   In file included from include/linux/cacheflush.h:5:
   In file included from arch/x86/include/asm/cacheflush.h:5:
   In file included from include/linux/mm.h:2223:
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
>> drivers/nvme/host/core.c:2187:11: warning: variable 'i' is uninitialized when used here [-Wuninitialized]
    2187 |         } while (i++ < fdp_idx);
         |                  ^
   drivers/nvme/host/core.c:2160:7: note: initialize the variable 'i' to silence this warning
    2160 |         int i, n, ret;
         |              ^
         |               = 0
   2 warnings generated.


vim +/i +2187 drivers/nvme/host/core.c

  2153	
  2154	static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
  2155				  u8 fdp_idx)
  2156	{
  2157		struct nvme_fdp_config_log hdr, *h;
  2158		struct nvme_fdp_config_desc *desc;
  2159		size_t size = sizeof(hdr);
  2160		int i, n, ret;
  2161		void *log;
  2162	
  2163		info->runs = 0;
  2164		ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
  2165				   (void *)&hdr, size, 0, info->endgid);
  2166		if (ret)
  2167			return ret;
  2168	
  2169		size = le32_to_cpu(hdr.sze);
  2170		h = kzalloc(size, GFP_KERNEL);
  2171		if (!h)
  2172			return 0;
  2173	
  2174		ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
  2175				   h, size, 0, info->endgid);
  2176		if (ret)
  2177			goto out;
  2178	
  2179		n = le16_to_cpu(h->numfdpc) + 1;
  2180		if (fdp_idx > n)
  2181			goto out;
  2182	
  2183		log = h + 1;
  2184		do {
  2185			desc = log;
  2186			log += le16_to_cpu(desc->dsze);
> 2187		} while (i++ < fdp_idx);
  2188	
  2189		info->runs = le64_to_cpu(desc->runs);
  2190	out:
  2191		kfree(h);
  2192		return ret;
  2193	}
  2194	
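
For illustration only, a bounds-checked walk over variable-size
descriptors with an explicitly initialized index could look like the
userspace sketch below. It is not the driver code: fdp_desc_runs(),
get_le16(), get_le64() and the two offset macros are hypothetical, with
the offsets derived from the structure layouts in patch 10/12, and
numfdpc is treated as zero-based to match the "+ 1" in the listing
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_HDR_LEN	16	/* size of struct nvme_fdp_config_log */
#define DESC_RUNS_OFF	16	/* offset of 'runs' in nvme_fdp_config_desc */

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint64_t get_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Return the 'runs' field of descriptor number idx, or 0 if the log is
 * too short, malformed, or has no such descriptor.
 */
static uint64_t fdp_desc_runs(const uint8_t *log, size_t len, unsigned int idx)
{
	size_t off = LOG_HDR_LEN;
	unsigned int i;

	if (len < LOG_HDR_LEN)
		return 0;
	/* numfdpc is zero-based here: a value of N means N + 1 descriptors. */
	if (idx > get_le16(log))
		return 0;

	for (i = 0; ; i++) {
		uint16_t dsze;

		assert(off <= len);
		if (len - off < DESC_RUNS_OFF + 8)
			return 0;
		dsze = get_le16(log + off);
		if (dsze < DESC_RUNS_OFF + 8 || dsze > len - off)
			return 0;
		if (i == idx)
			return get_le64(log + off + DESC_RUNS_OFF);
		off += dsze;
	}
}
```

Every iteration either returns or advances off by at least 24 bytes, so
the loop terminates even when a descriptor reports a bogus size.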

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCHv12 01/12] fs: add write stream information to statx
  2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
@ 2024-12-09  8:25   ` Hannes Reinecke
       [not found]   ` <CGME20241209115219epcas5p4cfc217e25d977cd87025a4284ba0436c@epcas5p4.samsung.com>
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:25 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Keith Busch <[email protected]>
> 
> Add new statx field to report the maximum number of write streams
> supported and the granularity for them.
> 
> Signed-off-by: Keith Busch <[email protected]>
> [hch: rename hint to stream, add granularity]
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
>   fs/stat.c                 | 2 ++
>   include/linux/stat.h      | 2 ++
>   include/uapi/linux/stat.h | 7 +++++--
>   3 files changed, 9 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: [PATCHv12 02/12] fs: add a write stream field to the kiocb
  2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
@ 2024-12-09  8:25   ` Hannes Reinecke
  2024-12-09 12:47   ` [PATCHv12 01/12] fs: add write stream information to statx Christian Brauner
       [not found]   ` <CGME20241210073225epcas5p4b2ed325714e6d17fae9e3e45b8e963f6@epcas5p4.samsung.com>
  2 siblings, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:25 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
> 
> Prepare for io_uring passthrough of write streams. The write stream
> field in the kiocb structure fits into an existing 2-byte hole, so its
> size is not changed.
> 
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   include/linux/fs.h | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 2cc3d45da7b01..26940c451f319 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -373,6 +373,7 @@ struct kiocb {
>   	void			*private;
>   	int			ki_flags;
>   	u16			ki_ioprio; /* See linux/ioprio.h */
> +	u8			ki_write_stream;
>   	union {
>   		/*
>   		 * Only used for async buffered reads, where it denotes the

Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: [PATCHv12 03/12] block: add a bi_write_stream field
  2024-12-06 22:17 ` [PATCHv12 03/12] block: add a bi_write_stream field Keith Busch
@ 2024-12-09  8:26   ` Hannes Reinecke
       [not found]   ` <CGME20241210074213epcas5p22330d197c3e7058e9c2226f28fdb1475@epcas5p2.samsung.com>
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:26 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
> 
> Add the ability to pass a write stream for placement control in the bio.
> 
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   block/bio.c                 | 2 ++
>   block/blk-crypto-fallback.c | 1 +
>   block/blk-merge.c           | 4 ++++
>   block/bounce.c              | 1 +
>   include/linux/blk_types.h   | 1 +
>   5 files changed, 9 insertions(+)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: [PATCHv12 04/12] block: introduce max_write_streams queue limit
  2024-12-06 22:17 ` [PATCHv12 04/12] block: introduce max_write_streams queue limit Keith Busch
@ 2024-12-09  8:27   ` Hannes Reinecke
       [not found]   ` <CGME20241210074628epcas5p3e36c7615cf2a5160d7fe169774fd30db@epcas5p3.samsung.com>
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:27 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Keith Busch <[email protected]>
> 
> Drivers with hardware that support write streams need a way to export how
> many are available so applications can generically query this.
> 
> Signed-off-by: Keith Busch <[email protected]>
> [hch: renamed hints to streams, removed stacking]
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
>   Documentation/ABI/stable/sysfs-block | 7 +++++++
>   block/blk-sysfs.c                    | 3 +++
>   include/linux/blkdev.h               | 9 +++++++++
>   3 files changed, 19 insertions(+)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: [PATCHv12 05/12] block: introduce a write_stream_granularity queue limit
  2024-12-06 22:17 ` [PATCHv12 05/12] block: introduce a write_stream_granularity " Keith Busch
@ 2024-12-09  8:29   ` Hannes Reinecke
       [not found]   ` <CGME20241210075259epcas5p23bbb79cdb18ddbfad337d764d4fe75da@epcas5p2.samsung.com>
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:29 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
> 
> Export the granularity that write streams should be discarded with,
> as it is essential for making good use of them.
> 
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   Documentation/ABI/stable/sysfs-block | 8 ++++++++
>   block/blk-sysfs.c                    | 3 +++
>   include/linux/blkdev.h               | 7 +++++++
>   3 files changed, 18 insertions(+)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: [PATCHv12 06/12] block: expose write streams for block device nodes
  2024-12-06 22:17 ` [PATCHv12 06/12] block: expose write streams for block device nodes Keith Busch
@ 2024-12-09  8:30   ` Hannes Reinecke
       [not found]   ` <CGME20241209110649epcas5p41df7db0f7ea58f250da647106d25134b@epcas5p4.samsung.com>
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:30 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
> 
> Export statx information about the number and granularity of write
> streams, use the per-kiocb write hint and map temperature hints
> to write streams (which is a bit questionable, but this shows how it is
> done).
> 
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   block/bdev.c |  6 ++++++
>   block/fops.c | 23 +++++++++++++++++++++++
>   2 files changed, 29 insertions(+)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: [PATCHv12 07/12] io_uring: enable per-io write streams
  2024-12-06 22:17 ` [PATCHv12 07/12] io_uring: enable per-io write streams Keith Busch
@ 2024-12-09  8:31   ` Hannes Reinecke
  0 siblings, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:31 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Keith Busch <[email protected]>
> 
> Allow userspace to pass a per-I/O write stream in the SQE:
> 
>        __u8 write_stream;
> 
> The __u8 type matches the size the filesystems and block layer support.
> 
> Application can query the supported values from the statx
> max_write_streams field. Unsupported values are ignored by file
> operations that do not support write streams or rejected with an error
> by those that support them.
> 
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   include/uapi/linux/io_uring.h | 4 ++++
>   io_uring/io_uring.c           | 2 ++
>   io_uring/rw.c                 | 1 +
>   3 files changed, 7 insertions(+)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper
  2024-12-06 22:17 ` [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper Keith Busch
@ 2024-12-09  8:31   ` Hannes Reinecke
       [not found]   ` <CGME20241210121958epcas5p27d14abfca66757a2c42ec71895b008b1@epcas5p2.samsung.com>
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:31 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
> 
> For log pages that need to pass in a LSI value, while at the same time
> not touching all the existing nvme_get_log callers.
> 
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   drivers/nvme/host/core.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result
  2024-12-06 22:17 ` [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result Keith Busch
@ 2024-12-09  8:32   ` Hannes Reinecke
       [not found]   ` <CGME20241210122137epcas5p2e373baa1c99b78341928cc7bf0fe3bdf@epcas5p2.samsung.com>
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:32 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
> 
> That allows passing in structures instead of the u32 result, and thus
> reduce the amount of bit shifting and masking required to parse the
> result.
> 
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   drivers/nvme/host/core.c | 4 ++--
>   drivers/nvme/host/nvme.h | 4 ++--
>   2 files changed, 4 insertions(+), 4 deletions(-)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 10/12] nvme.h: add FDP definitions
  2024-12-06 22:17 ` [PATCHv12 10/12] nvme.h: add FDP definitions Keith Busch
@ 2024-12-09  8:33   ` Hannes Reinecke
       [not found]   ` <CGME20241210122702epcas5p4fe3ed43ad714c6b467a35d16135d07c5@epcas5p4.samsung.com>
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:33 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
> 
> Add the config feature result, config log page, and management receive
> commands needed for FDP.
> 
> Partially based on a patch from Kanchan Joshi <[email protected]>.
> 
> Signed-off-by: Christoph Hellwig <[email protected]>
> [kbusch: renamed some fields to match spec]
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   include/linux/nvme.h | 77 ++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 77 insertions(+)
> 

Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
  2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
  2024-12-09  4:05   ` kernel test robot
@ 2024-12-09  8:34   ` Hannes Reinecke
  2024-12-09 13:18   ` Christoph Hellwig
  2024-12-10  8:45   ` Dan Carpenter
  3 siblings, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:34 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:18, Keith Busch wrote:
> From: Keith Busch <[email protected]>
> 
> Register the device data placement limits if supported. This is just
> registering the limits with the block layer. Nothing beyond reporting
> these attributes is happening in this patch.
> 
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   drivers/nvme/host/core.c | 112 +++++++++++++++++++++++++++++++++++++++
>   drivers/nvme/host/nvme.h |   4 ++
>   2 files changed, 116 insertions(+)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 12/12] nvme: use fdp streams if write stream is provided
  2024-12-06 22:18 ` [PATCHv12 12/12] nvme: use fdp streams if write stream is provided Keith Busch
@ 2024-12-09  8:34   ` Hannes Reinecke
       [not found]   ` <CGME20241210073523epcas5p149482220b87ff3926fb8864ff1660e0c@epcas5p1.samsung.com>
  1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09  8:34 UTC (permalink / raw)
  To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring
  Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

On 12/6/24 23:18, Keith Busch wrote:
> From: Keith Busch <[email protected]>
> 
> Maps a user requested write stream to an FDP placement ID if possible.
> 
> Signed-off-by: Keith Busch <[email protected]>
> ---
>   drivers/nvme/host/core.c | 32 +++++++++++++++++++++++++++++++-
>   drivers/nvme/host/nvme.h |  1 +
>   2 files changed, 32 insertions(+), 1 deletion(-)
> 
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
[email protected]                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 46+ messages in thread
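
A rough userspace model of the "map a write stream to a placement ID if
possible" step from 12/12 might look like the following. The function name
stream_to_dspec() is hypothetical, and the 1-based stream numbering and the
placement of the ID in the upper 16 bits of the dsmgmt dword are assumptions
based on my reading of the NVMe command layout, not taken from the patch text
quoted here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Translate a 1-based write stream into a placement ID and merge it into
 * the upper 16 bits of a dsmgmt-style dword (assumed layout).
 * Returns false and leaves *dsmgmt untouched when no mapping is possible,
 * i.e. stream 0 ("no stream") or a stream beyond the known placement IDs.
 */
static bool stream_to_dspec(const uint16_t *plids, size_t nr_plids,
			    uint8_t write_stream, uint32_t *dsmgmt)
{
	if (write_stream == 0 || write_stream > nr_plids)
		return false;
	*dsmgmt |= (uint32_t)plids[write_stream - 1] << 16;
	return true;
}
```

A real driver would presumably also have to mark the command as using the
data placement directive; that part is left out of this sketch.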

* Re: [PATCHv12 06/12] block: expose write streams for block device nodes
       [not found]   ` <CGME20241209110649epcas5p41df7db0f7ea58f250da647106d25134b@epcas5p4.samsung.com>
@ 2024-12-09 10:58     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-09 10:58 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 1975 bytes --]

On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Export statx information about the number and granularity of write
>streams, use the per-kiocb write hint and map temperature hints
>to write streams (which is a bit questionable, but this shows how it is
>done).
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---
> block/bdev.c |  6 ++++++
> block/fops.c | 23 +++++++++++++++++++++++
> 2 files changed, 29 insertions(+)
>
>diff --git a/block/bdev.c b/block/bdev.c
>index 738e3c8457e7f..c23245f1fdfe3 100644
>--- a/block/bdev.c
>+++ b/block/bdev.c
>@@ -1296,6 +1296,12 @@ void bdev_statx(struct path *path, struct kstat *stat,
> 		stat->result_mask |= STATX_DIOALIGN;
> 	}
>
>+	if ((request_mask & STATX_WRITE_STREAM) &&
We may not reach this point if the user application sets neither
STATX_DIOALIGN nor STATX_WRITE_ATOMIC.

>+	    bdev_max_write_streams(bdev)) {
>+		stat->write_stream_max = bdev_max_write_streams(bdev);
>+		stat->result_mask |= STATX_WRITE_STREAM;
statx will show a value of 0 for write_stream_granularity.

Below is a fix that might help:

diff --git a/block/bdev.c b/block/bdev.c
index c23245f1fdfe..290577e20457 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1275,7 +1275,8 @@ void bdev_statx(struct path *path, struct kstat *stat,
  	struct inode *backing_inode;
  	struct block_device *bdev;
  
-	if (!(request_mask & (STATX_DIOALIGN | STATX_WRITE_ATOMIC)))
+	if (!(request_mask & (STATX_DIOALIGN | STATX_WRITE_ATOMIC |
+		STATX_WRITE_STREAM)))
  		return;
  
  	backing_inode = d_backing_inode(path->dentry);
@@ -1299,6 +1300,7 @@ void bdev_statx(struct path *path, struct kstat *stat,
  	if ((request_mask & STATX_WRITE_STREAM) &&
  	    bdev_max_write_streams(bdev)) {
  		stat->write_stream_max = bdev_max_write_streams(bdev);
+		stat->write_stream_granularity = bdev_write_stream_granularity(bdev);
  		stat->result_mask |= STATX_WRITE_STREAM;
  	}



^ permalink raw reply related	[flat|nested] 46+ messages in thread
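
The two problems pointed out in the 06/12 review above (the early-return gate
and the missing granularity assignment) can be reproduced in a small
userspace model. Everything here is a sketch: struct stream_stat and
fill_stream_stat() are hypothetical, and the STATX_WRITE_STREAM value is the
one proposed in patch 01/12 of this series, which may not match any released
header.

```c
#include <stdint.h>

#define STATX_DIOALIGN      0x00002000U
#define STATX_WRITE_ATOMIC  0x00010000U
#define STATX_WRITE_STREAM  0x00020000U /* value proposed in this series */

/* Gate as posted in v12, and gate with the suggested fix applied. */
#define GATE_V12   (STATX_DIOALIGN | STATX_WRITE_ATOMIC)
#define GATE_FIXED (STATX_DIOALIGN | STATX_WRITE_ATOMIC | STATX_WRITE_STREAM)

/* Hypothetical stand-in for the relevant struct kstat fields. */
struct stream_stat {
	uint32_t result_mask;
	uint32_t write_stream_granularity;
	uint16_t write_stream_max;
};

/*
 * Model of the stream part of bdev_statx(): bail out early unless the
 * request intersects the gate, then report both stream fields together.
 */
static void fill_stream_stat(uint32_t request_mask, uint32_t gate_mask,
			     uint16_t dev_max_streams, uint32_t dev_granularity,
			     struct stream_stat *st)
{
	if (!(request_mask & gate_mask))
		return;

	if ((request_mask & STATX_WRITE_STREAM) && dev_max_streams) {
		st->write_stream_max = dev_max_streams;
		st->write_stream_granularity = dev_granularity;
		st->result_mask |= STATX_WRITE_STREAM;
	}
}
```

With GATE_V12, a caller that asks only for STATX_WRITE_STREAM gets nothing
back; with GATE_FIXED both fields are filled in.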

* Re: [PATCHv12 01/12] fs: add write stream information to statx
       [not found]   ` <CGME20241209115219epcas5p4cfc217e25d977cd87025a4284ba0436c@epcas5p4.samsung.com>
@ 2024-12-09 11:44     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-09 11:44 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 552 bytes --]

On 06/12/24 02:17PM, Keith Busch wrote:
>From: Keith Busch <[email protected]>
>
>Add new statx field to report the maximum number of write streams
>supported and the granularity for them.
>
>Signed-off-by: Keith Busch <[email protected]>
>[hch: rename hint to stream, add granularity]
>Signed-off-by: Christoph Hellwig <[email protected]>
>---
> fs/stat.c                 | 2 ++
> include/linux/stat.h      | 2 ++
> include/uapi/linux/stat.h | 7 +++++--
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
Reviewed-by: Nitesh Shetty <[email protected]>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
  2024-12-09  4:05   ` kernel test robot
@ 2024-12-09 12:44     ` Christoph Hellwig
  0 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2024-12-09 12:44 UTC (permalink / raw)
  To: kernel test robot
  Cc: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
	io-uring, llvm, oe-kbuild-all, sagi, asml.silence, anuj20.g,
	joshi.k, Keith Busch

On Mon, Dec 09, 2024 at 12:05:27PM +0800, kernel test robot wrote:
> >> drivers/nvme/host/core.c:2187:11: warning: variable 'i' is uninitialized when used here [-Wuninitialized]
>     2187 |         } while (i++ < fdp_idx);
>          |                  ^
>    drivers/nvme/host/core.c:2160:7: note: initialize the variable 'i' to silence this warning
>     2160 |         int i, n, ret;
>          |              ^
>          |               = 0
>    2 warnings generated.

Yeah, looks like this is uninitialized.  Did I mention I hate these
variable length log entries in nvme?  They've already been a major
pain in ANA before..


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 01/12] fs: add write stream information to statx
  2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
  2024-12-09  8:25   ` Hannes Reinecke
@ 2024-12-09 12:47   ` Christian Brauner
       [not found]   ` <CGME20241210073225epcas5p4b2ed325714e6d17fae9e3e45b8e963f6@epcas5p4.samsung.com>
  2 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2024-12-09 12:47 UTC (permalink / raw)
  To: Keith Busch, axboe
  Cc: hch, linux-block, linux-nvme, linux-fsdevel, io-uring, sagi,
	asml.silence, anuj20.g, joshi.k, Keith Busch

On Fri, Dec 06, 2024 at 02:17:50PM -0800, Keith Busch wrote:
> From: Keith Busch <[email protected]>
> 
> Add new statx field to report the maximum number of write streams
> supported and the granularity for them.
> 
> Signed-off-by: Keith Busch <[email protected]>
> [hch: rename hint to stream, add granularity]
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
>  fs/stat.c                 | 2 ++
>  include/linux/stat.h      | 2 ++
>  include/uapi/linux/stat.h | 7 +++++--
>  3 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/stat.c b/fs/stat.c
> index 0870e969a8a0b..00e4598b1ff25 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -729,6 +729,8 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
>  	tmp.stx_atomic_write_unit_min = stat->atomic_write_unit_min;
>  	tmp.stx_atomic_write_unit_max = stat->atomic_write_unit_max;
>  	tmp.stx_atomic_write_segments_max = stat->atomic_write_segments_max;
> +	tmp.stx_write_stream_granularity = stat->write_stream_granularity;
> +	tmp.stx_write_stream_max = stat->write_stream_max;
>  
>  	return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
>  }
> diff --git a/include/linux/stat.h b/include/linux/stat.h
> index 3d900c86981c5..36d4dfb291abd 100644
> --- a/include/linux/stat.h
> +++ b/include/linux/stat.h
> @@ -57,6 +57,8 @@ struct kstat {
>  	u32		atomic_write_unit_min;
>  	u32		atomic_write_unit_max;
>  	u32		atomic_write_segments_max;
> +	u32		write_stream_granularity;
> +	u16		write_stream_max;
>  };
>  
>  /* These definitions are internal to the kernel for now. Mainly used by nfsd. */
> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> index 887a252864416..547c62a1a3a7c 100644
> --- a/include/uapi/linux/stat.h
> +++ b/include/uapi/linux/stat.h
> @@ -132,9 +132,11 @@ struct statx {
>  	__u32	stx_atomic_write_unit_max;	/* Max atomic write unit in bytes */
>  	/* 0xb0 */
>  	__u32   stx_atomic_write_segments_max;	/* Max atomic write segment count */
> -	__u32   __spare1[1];
> +	__u32   stx_write_stream_granularity;
>  	/* 0xb8 */
> -	__u64	__spare3[9];	/* Spare space for future expansion */
> +	__u16   stx_write_stream_max;
> +	__u16	__sparse2[3];
> +	__u64	__spare3[8];	/* Spare space for future expansion */
>  	/* 0x100 */
>  };

Once you're ready to merge, let me know so I can give you a stable
branch with the fs changes.

>  
> @@ -164,6 +166,7 @@ struct statx {
>  #define STATX_MNT_ID_UNIQUE	0x00004000U	/* Want/got extended stx_mount_id */
>  #define STATX_SUBVOL		0x00008000U	/* Want/got stx_subvol */
>  #define STATX_WRITE_ATOMIC	0x00010000U	/* Want/got atomic_write_* fields */
> +#define STATX_WRITE_STREAM	0x00020000U	/* Want/got write_stream_* */
>  
>  #define STATX__RESERVED		0x80000000U	/* Reserved for future struct statx expansion */
>  
> -- 
> 2.43.5
> 

On Fri, Dec 06, 2024 at 02:17:51PM -0800, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
> 
> Prepare for io_uring passthrough of write streams. The write stream
> field in the kiocb structure fits into an existing 2-byte hole, so its
> size is not changed.
> 
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
>  include/linux/fs.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 2cc3d45da7b01..26940c451f319 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -373,6 +373,7 @@ struct kiocb {
>  	void			*private;
>  	int			ki_flags;
>  	u16			ki_ioprio; /* See linux/ioprio.h */
> +	u8			ki_write_stream;
>  	union {
>  		/*
>  		 * Only used for async buffered reads, where it denotes the
> -- 
> 2.43.5
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread
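
Both quoted hunks above claim that the new fields reuse existing spare space
or padding. A compile-time check along these lines can confirm the
arithmetic. The structs below are simplified stand-ins that I made up for
illustration (only the tail of struct statx from offset 0xb0, and only the
kiocb members around the hole), not the real kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Tail of struct statx before the patch, starting at offset 0xb0. */
struct statx_tail_old {
	uint32_t stx_atomic_write_segments_max;
	uint32_t spare1[1];
	uint64_t spare3[9];
};

/* Same region after the patch: spare space is carved up, not extended. */
struct statx_tail_new {
	uint32_t stx_atomic_write_segments_max;
	uint32_t stx_write_stream_granularity;
	uint16_t stx_write_stream_max;
	uint16_t spare2[3];
	uint64_t spare3[8];
};

/* 0x100 - 0xb0 = 0x50 bytes in both layouts. */
_Static_assert(sizeof(struct statx_tail_old) == 0x50, "old tail size");
_Static_assert(sizeof(struct statx_tail_new) == 0x50, "new tail size");
/* stx_write_stream_max lands at 0xb8, i.e. 8 bytes into the tail. */
_Static_assert(offsetof(struct statx_tail_new, stx_write_stream_max) == 8,
	       "stream max offset");

/* Simplified kiocb neighbourhood without the new field. */
struct kiocb_model_old {
	void *private_data;
	int ki_flags;
	uint16_t ki_ioprio;
	union { void *ptr; } u;
};

/* With the u8 added after ki_ioprio. */
struct kiocb_model_new {
	void *private_data;
	int ki_flags;
	uint16_t ki_ioprio;
	uint8_t ki_write_stream;
	union { void *ptr; } u;
};

/* The u8 sits in padding before the pointer-aligned union. */
_Static_assert(sizeof(struct kiocb_model_old) ==
	       sizeof(struct kiocb_model_new), "kiocb size unchanged");
_Static_assert(offsetof(struct kiocb_model_old, u) ==
	       offsetof(struct kiocb_model_new, u), "union offset unchanged");
```

If any assertion fails, the build stops, which is the point: layout
regressions show up at compile time rather than as ABI breakage.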

* Re: [PATCHv12 00/12] block write streams with nvme fdp
  2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
                   ` (11 preceding siblings ...)
  2024-12-06 22:18 ` [PATCHv12 12/12] nvme: use fdp streams if write stream is provided Keith Busch
@ 2024-12-09 12:55 ` Christoph Hellwig
  2024-12-09 16:07   ` Keith Busch
  12 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2024-12-09 12:55 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

I just compared this to a crude rebase of what I last sent out, and
AFAICS the differences are:

 1) basically all new io_uring handling due to the integrity stuff that
   went in
 2) fixes for the NVMe FDP log page parsing
 3) drop the support for the remapping of per-partition streams
 
Conceptually this all looks fine to me.  I'll throw in a few nitpicks
on the nvme bits, and I'd need to get up to speed a bit more on the
io_uring bits before commenting usefully.

One thing I was pondering for a new version is whether statx
really is the right vehicle for this, as it is very common fast-path
information.  If we had a separate streaminfo ioctl or fcntl it might
be easier to leave a bit of spare space for extensibility.  I can try to
prototype that or we can leave it as-is because everyone is tired of
the series.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
  2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
  2024-12-09  4:05   ` kernel test robot
  2024-12-09  8:34   ` Hannes Reinecke
@ 2024-12-09 13:18   ` Christoph Hellwig
  2024-12-09 16:29     ` Keith Busch
  2024-12-10  8:45   ` Dan Carpenter
  3 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2024-12-09 13:18 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

> +static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
> +			  u8 fdp_idx)

Maybe name this nvme_query_fdp_runs, or something else that makes it
clear this is trying to find the runs field; that would be a little
bit more descriptive.



> +{
> +	struct nvme_fdp_config_log hdr, *h;
> +	struct nvme_fdp_config_desc *desc;
> +	size_t size = sizeof(hdr);
> +	int i, n, ret;
> +	void *log;
> +
> +	info->runs = 0;
> +	ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,

Overly long line here, and same for the second call below.

> +			   (void *)&hdr, size, 0, info->endgid);

And this cast isn't actually needed.

> +	n = le16_to_cpu(h->numfdpc) + 1;
> +	if (fdp_idx > n)
> +		goto out;
> +
> +	log = h + 1;
> +	do {
> +		desc = log;
> +		log += le16_to_cpu(desc->dsze);
> +	} while (i++ < fdp_idx);

Maybe a for loop makes it easier to avoid the uninitialized variable,
e.g.

	for (i = 0; i < fdp_idx; i++) {
		..

> +	if (ns->ctrl->ctratt & NVME_CTRL_ATTR_FDPS) {
> +		ret = nvme_query_fdp_info(ns, info);
> +		if (ret)
> +			dev_warn(ns->ctrl->device,
> +				"FDP failure status:0x%x\n", ret);
> +		if (ret < 0)
> +			goto out;
> +	}

Looking at the full series with the next patch applied I'm a bit
confused about the handling when rescanning.  AFAIK the code now always
goes into nvme_query_fdp_info when NVME_CTRL_ATTR_FDPS is set, even if
head->plids/head->nr_plids are already set, and that will then simply
override them.

Also the old freeing of head->plids in nvme_free_ns_head seems gone in
this version.

Last but not least, "FDP failure" is probably not a very helpful message
when it could come from about half a dozen different commands sent to
the device.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 00/12] block write streams with nvme fdp
  2024-12-09 12:55 ` [PATCHv12 00/12] block write streams with nvme fdp Christoph Hellwig
@ 2024-12-09 16:07   ` Keith Busch
  2024-12-10  1:49     ` Martin K. Petersen
  2024-12-10  7:19     ` Christoph Hellwig
  0 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-09 16:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, axboe, linux-block, linux-nvme, linux-fsdevel,
	io-uring, sagi, asml.silence, anuj20.g, joshi.k

On Mon, Dec 09, 2024 at 01:55:11PM +0100, Christoph Hellwig wrote:
> I just compared this to a crude rebase of what I last sent out, and
> AFAICS the differences are:
> 
>  1) basically all new io_uring handling due to the integrity stuff that
>    went in
>  2) fixes for the NVMe FDP log page parsing
>  3) drop the support for the remapping of per-partition streams

Yep, pretty much. I will revisit the partition mapping. I just haven't
heard any use cases for divvying the streams up this way, so it's not
clear to me what the interface needs to provide.

> One thing I was pondering for a new version is whether statx
> really is the right vehicle for this, as it is very common fast-path
> information.  If we had a separate streaminfo ioctl or fcntl it might
> be easier to leave a bit of spare space for extensibility.  I can try to
> prototype that or we can leave it as-is because everyone is tired of
> the series.

Oh sure. I can live without the statx parts from this series if you
prefer we take additional time to consider other approaches. We have the
sysfs block attributes reporting the same information, and that is okay
for now.
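
For anyone relying on the sysfs attributes in the meantime, reading them from
C is short. This is a sketch with a hypothetical helper name,
read_ulong_attr(); the attribute paths mentioned below are assumptions
derived from the queue limit names in this series.

```c
#include <stdio.h>

/*
 * Read a single unsigned decimal value from a sysfs-style attribute file.
 * Returns 0 on success and -1 if the file cannot be opened or parsed.
 */
static int read_ulong_attr(const char *path, unsigned long *val)
{
	FILE *f = fopen(path, "r");
	int parsed;

	if (!f)
		return -1;
	parsed = (fscanf(f, "%lu", val) == 1);
	fclose(f);
	return parsed ? 0 : -1;
}
```

Assuming the attributes end up named after the queue limits, a caller would
pass something like /sys/block/<disk>/queue/max_write_streams or
/sys/block/<disk>/queue/write_stream_granularity, and treat a failure as
"write streams not reported by this kernel".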

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
  2024-12-09 13:18   ` Christoph Hellwig
@ 2024-12-09 16:29     ` Keith Busch
  0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-09 16:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, axboe, linux-block, linux-nvme, linux-fsdevel,
	io-uring, sagi, asml.silence, anuj20.g, joshi.k

On Mon, Dec 09, 2024 at 02:18:19PM +0100, Christoph Hellwig wrote:
> > +	n = le16_to_cpu(h->numfdpc) + 1;
> > +	if (fdp_idx > n)
> > +		goto out;
> > +
> > +	log = h + 1;
> > +	do {
> > +		desc = log;
> > +		log += le16_to_cpu(desc->dsze);
> > +	} while (i++ < fdp_idx);
> 
> Maybe a for loop makes it easier to avoid the uninitialized variable,
> e.g.
> 
> 	for (i = 0; i < fdp_idx; i++) {
> 		..

Yeah, okay. I was just trying to cleverly have a single place where the
descriptor is set. A for-loop needs to set it both within and after the
loop.
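
One way to get a for loop that still resolves the descriptor in a single
place is to track a byte offset and only derive the pointer after the loop.
The sketch below is a userspace model, not the driver code: find_desc() and
read_le16() are hypothetical helpers, and it assumes each descriptor starts
with a little-endian 16-bit size that covers the whole descriptor. It also
adds bounds checks that are not part of the quoted hunk.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 16-bit value from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Return a pointer to descriptor number `idx` (0-based) in `buf`, or NULL
 * if the buffer ends early or a descriptor reports an impossible size.
 * Assumption: every descriptor begins with its own total size in bytes.
 */
static const uint8_t *find_desc(const uint8_t *buf, size_t len, unsigned int idx)
{
	size_t off = 0;
	unsigned int i;

	/* Skip over the first idx descriptors. */
	for (i = 0; i < idx; i++) {
		uint16_t dsze;

		if (len - off < 2)
			return NULL;
		dsze = read_le16(buf + off);
		if (dsze < 2 || dsze > len - off)
			return NULL;
		off += dsze;
	}

	/* The descriptor pointer is derived in exactly one place. */
	if (len - off < 2)
		return NULL;
	return buf + off;
}
```

Because `i` is initialized in the loop header, the uninitialized-variable
warning from the test robot cannot occur with this shape.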

> > +	if (ns->ctrl->ctratt & NVME_CTRL_ATTR_FDPS) {
> > +		ret = nvme_query_fdp_info(ns, info);
> > +		if (ret)
> > +			dev_warn(ns->ctrl->device,
> > +				"FDP failure status:0x%x\n", ret);
> > +		if (ret < 0)
> > +			goto out;
> > +	}
> 
> Looking at the full series with the next patch applied I'm a bit
> confused about the handling when rescanning.  AFAIK the code now always
> goes into nvme_query_fdp_info when NVME_CTRL_ATTR_FDPS is set, even if
> head->plids/head->nr_plids are already set, and that will then simply
> override them.

I thought you could change the FDP configuration on a live namespace
with the Set Feature command, so I needed to account for that. But the
spec really does restrict that feature to endurance groups without
namespaces, so I was mistaken and we can skip re-validating FDP state
after the first scan.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 00/12] block write streams with nvme fdp
  2024-12-09 16:07   ` Keith Busch
@ 2024-12-10  1:49     ` Martin K. Petersen
  2024-12-10  7:19     ` Christoph Hellwig
  1 sibling, 0 replies; 46+ messages in thread
From: Martin K. Petersen @ 2024-12-10  1:49 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, Keith Busch, axboe, linux-block, linux-nvme,
	linux-fsdevel, io-uring, sagi, asml.silence, anuj20.g, joshi.k


Hi Keith!

>>  3) drop the support for the remapping of per-partition streams
>
> Yep, pretty much. I will revisit the partition mapping. I just haven't
> heard any use cases for divvying the streams up this way, so it's not
> clear to me what the interface needs to provide.

Since the streams are a (very) scarce hardware resource, it does seem to
me like we should have an explicit interface for an entity (whether
app-on-bdev or a filesystem) to allocate them.

While there certainly are cases where there is a 1:1 app-to-device
mapping, as soon as you add virtualization or enterprise apps to the
mix, that assumption quickly falls apart...

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 00/12] block write streams with nvme fdp
  2024-12-09 16:07   ` Keith Busch
  2024-12-10  1:49     ` Martin K. Petersen
@ 2024-12-10  7:19     ` Christoph Hellwig
  1 sibling, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2024-12-10  7:19 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, Keith Busch, axboe, linux-block, linux-nvme,
	linux-fsdevel, io-uring, sagi, asml.silence, anuj20.g, joshi.k

On Mon, Dec 09, 2024 at 09:07:35AM -0700, Keith Busch wrote:
> Yep, pretty much. I will revisit the partition mapping. I just haven't
> heard any use cases for divvying the streams up this way, so it's not
> clear to me what the interface needs to provide.

Yes, it would be good to understand use cases first.  I just threw the
patch in as a POC to show we can do it.

> > One thing I was pondering for a new version is whether statx
> > really is the right vehicle for this, as it is very common fast-path
> > information.  If we had a separate streaminfo ioctl or fcntl it might
> > be easier to leave a bit of spare space for extensibility.  I can try to
> > prototype that or we can leave it as-is because everyone is tired of
> > the series.
> 
> Oh sure. I can live without the statx parts from this series if you
> prefer we take additional time to consider other approaches. We have the
> sysfs block attributes reporting the same information, and that is okay
> for now.

I'll try to find some time this afternoon for an interface, but if it
doesn't arrive in time we can probably drop it for the next submission.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 02/12] fs: add a write stream field to the kiocb
       [not found]   ` <CGME20241210073225epcas5p4b2ed325714e6d17fae9e3e45b8e963f6@epcas5p4.samsung.com>
@ 2024-12-10  7:24     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10  7:24 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 397 bytes --]

On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Prepare for io_uring passthrough of write streams. The write stream
>field in the kiocb structure fits into an existing 2-byte hole, so its
>size is not changed.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---

Reviewed-by: Nitesh Shetty <[email protected]>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 12/12] nvme: use fdp streams if write stream is provided
       [not found]   ` <CGME20241210073523epcas5p149482220b87ff3926fb8864ff1660e0c@epcas5p1.samsung.com>
@ 2024-12-10  7:27     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10  7:27 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 255 bytes --]

On 06/12/24 02:18PM, Keith Busch wrote:
>From: Keith Busch <[email protected]>
>
>Maps a user requested write stream to an FDP placement ID if possible.
>
>Signed-off-by: Keith Busch <[email protected]>

Reviewed-by: Nitesh Shetty <[email protected]>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 03/12] block: add a bi_write_stream field
       [not found]   ` <CGME20241210074213epcas5p22330d197c3e7058e9c2226f28fdb1475@epcas5p2.samsung.com>
@ 2024-12-10  7:34     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10  7:34 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 307 bytes --]

On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Add the ability to pass a write stream for placement control in the bio.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---
Reviewed-by: Nitesh Shetty <[email protected]>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 04/12] block: introduce max_write_streams queue limit
       [not found]   ` <CGME20241210074628epcas5p3e36c7615cf2a5160d7fe169774fd30db@epcas5p3.samsung.com>
@ 2024-12-10  7:38     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10  7:38 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 424 bytes --]

On 06/12/24 02:17PM, Keith Busch wrote:
>From: Keith Busch <[email protected]>
>
>Drivers with hardware that support write streams need a way to export how
>many are available so applications can generically query this.
>
>Signed-off-by: Keith Busch <[email protected]>
>[hch: renamed hints to streams, removed stacking]
>Signed-off-by: Christoph Hellwig <[email protected]>
>---
Reviewed-by: Nitesh Shetty <[email protected]>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 05/12] block: introduce a write_stream_granularity queue limit
       [not found]   ` <CGME20241210075259epcas5p23bbb79cdb18ddbfad337d764d4fe75da@epcas5p2.samsung.com>
@ 2024-12-10  7:45     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10  7:45 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 352 bytes --]

On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Export the granularity that write streams should be discarded with,
>as it is essential for making good use of them.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---

Reviewed-by: Nitesh Shetty <[email protected]>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
  2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
                     ` (2 preceding siblings ...)
  2024-12-09 13:18   ` Christoph Hellwig
@ 2024-12-10  8:45   ` Dan Carpenter
  2024-12-10 15:23     ` Keith Busch
  3 siblings, 1 reply; 46+ messages in thread
From: Dan Carpenter @ 2024-12-10  8:45 UTC (permalink / raw)
  To: oe-kbuild, Keith Busch, axboe, hch, linux-block, linux-nvme,
	linux-fsdevel, io-uring
  Cc: lkp, oe-kbuild-all, sagi, asml.silence, anuj20.g, joshi.k,
	Keith Busch

Hi Keith,

kernel test robot noticed the following build warnings:

url:    https://github.com/intel-lab-lkp/linux/commits/Keith-Busch/fs-add-write-stream-information-to-statx/20241207-063826
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20241206221801.790690-12-kbusch%40meta.com
patch subject: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
config: csky-randconfig-r072-20241209 (https://download.01.org/0day-ci/archive/20241210/[email protected]/config)
compiler: csky-linux-gcc (GCC) 14.2.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Reported-by: Dan Carpenter <[email protected]>
| Closes: https://lore.kernel.org/r/[email protected]/

New smatch warnings:
drivers/nvme/host/core.c:2187 nvme_check_fdp() error: uninitialized symbol 'i'.
drivers/nvme/host/core.c:2232 nvme_query_fdp_info() warn: missing error code 'ret'

vim +/i +2187 drivers/nvme/host/core.c

04ca0849938146 Keith Busch   2024-12-06  2154  static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
04ca0849938146 Keith Busch   2024-12-06  2155  			  u8 fdp_idx)
04ca0849938146 Keith Busch   2024-12-06  2156  {
04ca0849938146 Keith Busch   2024-12-06  2157  	struct nvme_fdp_config_log hdr, *h;
04ca0849938146 Keith Busch   2024-12-06  2158  	struct nvme_fdp_config_desc *desc;
04ca0849938146 Keith Busch   2024-12-06  2159  	size_t size = sizeof(hdr);
04ca0849938146 Keith Busch   2024-12-06  2160  	int i, n, ret;
04ca0849938146 Keith Busch   2024-12-06  2161  	void *log;
04ca0849938146 Keith Busch   2024-12-06  2162  
04ca0849938146 Keith Busch   2024-12-06  2163  	info->runs = 0;
04ca0849938146 Keith Busch   2024-12-06  2164  	ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
04ca0849938146 Keith Busch   2024-12-06  2165  			   (void *)&hdr, size, 0, info->endgid);
04ca0849938146 Keith Busch   2024-12-06  2166  	if (ret)
04ca0849938146 Keith Busch   2024-12-06  2167  		return ret;
04ca0849938146 Keith Busch   2024-12-06  2168  
04ca0849938146 Keith Busch   2024-12-06  2169  	size = le32_to_cpu(hdr.sze);
04ca0849938146 Keith Busch   2024-12-06  2170  	h = kzalloc(size, GFP_KERNEL);
04ca0849938146 Keith Busch   2024-12-06  2171  	if (!h)
04ca0849938146 Keith Busch   2024-12-06  2172  		return 0;
04ca0849938146 Keith Busch   2024-12-06  2173  
04ca0849938146 Keith Busch   2024-12-06  2174  	ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
04ca0849938146 Keith Busch   2024-12-06  2175  			   h, size, 0, info->endgid);
04ca0849938146 Keith Busch   2024-12-06  2176  	if (ret)
04ca0849938146 Keith Busch   2024-12-06  2177  		goto out;
04ca0849938146 Keith Busch   2024-12-06  2178  
04ca0849938146 Keith Busch   2024-12-06  2179  	n = le16_to_cpu(h->numfdpc) + 1;
04ca0849938146 Keith Busch   2024-12-06  2180  	if (fdp_idx > n)
04ca0849938146 Keith Busch   2024-12-06  2181  		goto out;
04ca0849938146 Keith Busch   2024-12-06  2182  
04ca0849938146 Keith Busch   2024-12-06  2183  	log = h + 1;
04ca0849938146 Keith Busch   2024-12-06  2184  	do {
04ca0849938146 Keith Busch   2024-12-06  2185  		desc = log;
04ca0849938146 Keith Busch   2024-12-06  2186  		log += le16_to_cpu(desc->dsze);
04ca0849938146 Keith Busch   2024-12-06 @2187  	} while (i++ < fdp_idx);
                                                         ^
i needs to be initialized to zero at the start.

04ca0849938146 Keith Busch   2024-12-06  2188  
04ca0849938146 Keith Busch   2024-12-06  2189  	info->runs = le64_to_cpu(desc->runs);
04ca0849938146 Keith Busch   2024-12-06  2190  out:
04ca0849938146 Keith Busch   2024-12-06  2191  	kfree(h);
04ca0849938146 Keith Busch   2024-12-06  2192  	return ret;
04ca0849938146 Keith Busch   2024-12-06  2193  }
04ca0849938146 Keith Busch   2024-12-06  2194  
04ca0849938146 Keith Busch   2024-12-06  2195  static int nvme_query_fdp_info(struct nvme_ns *ns, struct nvme_ns_info *info)
04ca0849938146 Keith Busch   2024-12-06  2196  {
04ca0849938146 Keith Busch   2024-12-06  2197  	struct nvme_ns_head *head = ns->head;
04ca0849938146 Keith Busch   2024-12-06  2198  	struct nvme_fdp_ruh_status *ruhs;
04ca0849938146 Keith Busch   2024-12-06  2199  	struct nvme_fdp_config fdp;
04ca0849938146 Keith Busch   2024-12-06  2200  	struct nvme_command c = {};
04ca0849938146 Keith Busch   2024-12-06  2201  	int size, ret;
04ca0849938146 Keith Busch   2024-12-06  2202  
04ca0849938146 Keith Busch   2024-12-06  2203  	ret = nvme_get_features(ns->ctrl, NVME_FEAT_FDP, info->endgid, NULL, 0,
04ca0849938146 Keith Busch   2024-12-06  2204  				&fdp);
04ca0849938146 Keith Busch   2024-12-06  2205  	if (ret)
04ca0849938146 Keith Busch   2024-12-06  2206  		goto err;
04ca0849938146 Keith Busch   2024-12-06  2207  
04ca0849938146 Keith Busch   2024-12-06  2208  	if (!(fdp.flags & FDPCFG_FDPE))
04ca0849938146 Keith Busch   2024-12-06  2209  		goto err;
04ca0849938146 Keith Busch   2024-12-06  2210  
04ca0849938146 Keith Busch   2024-12-06  2211  	ret = nvme_check_fdp(ns, info, fdp.fdpcidx);
04ca0849938146 Keith Busch   2024-12-06  2212  	if (ret || !info->runs)
04ca0849938146 Keith Busch   2024-12-06  2213  		goto err;
04ca0849938146 Keith Busch   2024-12-06  2214  
04ca0849938146 Keith Busch   2024-12-06  2215  	size = struct_size(ruhs, ruhsd, NVME_MAX_PLIDS);
04ca0849938146 Keith Busch   2024-12-06  2216  	ruhs = kzalloc(size, GFP_KERNEL);
04ca0849938146 Keith Busch   2024-12-06  2217  	if (!ruhs) {
04ca0849938146 Keith Busch   2024-12-06  2218  		ret = -ENOMEM;
04ca0849938146 Keith Busch   2024-12-06  2219  		goto err;
04ca0849938146 Keith Busch   2024-12-06  2220  	}
04ca0849938146 Keith Busch   2024-12-06  2221  
04ca0849938146 Keith Busch   2024-12-06  2222  	c.imr.opcode = nvme_cmd_io_mgmt_recv;
04ca0849938146 Keith Busch   2024-12-06  2223  	c.imr.nsid = cpu_to_le32(head->ns_id);
04ca0849938146 Keith Busch   2024-12-06  2224  	c.imr.mo = NVME_IO_MGMT_RECV_MO_RUHS;
04ca0849938146 Keith Busch   2024-12-06  2225  	c.imr.numd = cpu_to_le32(nvme_bytes_to_numd(size));
04ca0849938146 Keith Busch   2024-12-06  2226  	ret = nvme_submit_sync_cmd(ns->queue, &c, ruhs, size);
04ca0849938146 Keith Busch   2024-12-06  2227  	if (ret)
04ca0849938146 Keith Busch   2024-12-06  2228  		goto free;
04ca0849938146 Keith Busch   2024-12-06  2229  
04ca0849938146 Keith Busch   2024-12-06  2230  	head->nr_plids = le16_to_cpu(ruhs->nruhsd);
04ca0849938146 Keith Busch   2024-12-06  2231  	if (!head->nr_plids)
04ca0849938146 Keith Busch   2024-12-06 @2232  		goto free;

ret = -EINVAL?

04ca0849938146 Keith Busch   2024-12-06  2233  
04ca0849938146 Keith Busch   2024-12-06  2234  	kfree(ruhs);
04ca0849938146 Keith Busch   2024-12-06  2235  	return 0;
04ca0849938146 Keith Busch   2024-12-06  2236  
04ca0849938146 Keith Busch   2024-12-06  2237  free:
04ca0849938146 Keith Busch   2024-12-06  2238  	kfree(ruhs);
04ca0849938146 Keith Busch   2024-12-06  2239  err:
04ca0849938146 Keith Busch   2024-12-06  2240  	head->nr_plids = 0;
04ca0849938146 Keith Busch   2024-12-06  2241  	info->runs = 0;
04ca0849938146 Keith Busch   2024-12-06  2242  	return ret;
04ca0849938146 Keith Busch   2024-12-06  2243  }
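
A minimal userspace sketch of the kind of descriptor walk the first smatch
warning is about, with the loop counter initialized before its first use.
Everything named demo_* is hypothetical and is not part of the NVMe driver;
only the idea of advancing by each descriptor's own size field (dsze) comes
from the listing above. The bounds checks are an addition for the sketch and
are not present in the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variable-size descriptor: dsze is its total size in bytes. */
struct demo_desc {
	uint16_t dsze;
	uint16_t tag;	/* illustrative payload only */
};

/*
 * Copy the descriptor at position idx (0-based) out of a packed log buffer.
 * Returns 0 on success, -1 if idx is out of range or the buffer is malformed.
 */
static int demo_find_desc(const void *log, size_t len, unsigned int n,
			  unsigned int idx, struct demo_desc *out)
{
	const unsigned char *p = log;
	size_t left = len;
	unsigned int i;

	assert(log != NULL && out != NULL);
	if (idx >= n)
		return -1;

	/* i starts at 0 here, so the number of skipped descriptors is well defined */
	for (i = 0; i < idx; i++) {
		struct demo_desc d;

		if (left < sizeof(d))
			return -1;
		memcpy(&d, p, sizeof(d));
		if (d.dsze < sizeof(d) || d.dsze > left)
			return -1;
		p += d.dsze;	/* step by the size the descriptor reports */
		left -= d.dsze;
	}

	if (left < sizeof(*out))
		return -1;
	memcpy(out, p, sizeof(*out));
	return 0;
}
```

The sketch uses a for loop rather than the do/while from the patch, which
keeps the initialization, the test, and the increment on one line.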

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper
       [not found]   ` <CGME20241210121958epcas5p27d14abfca66757a2c42ec71895b008b1@epcas5p2.samsung.com>
@ 2024-12-10 12:12     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 12:12 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 359 bytes --]

On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>For log pages that need to pass in a LSI value, while at the same time
>not touching all the existing nvme_get_log callers.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---

Reviewed-by: Nitesh Shetty <[email protected]>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result
       [not found]   ` <CGME20241210122137epcas5p2e373baa1c99b78341928cc7bf0fe3bdf@epcas5p2.samsung.com>
@ 2024-12-10 12:13     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 12:13 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 383 bytes --]

On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>That allows passing in structures instead of the u32 result, and thus
>reduce the amount of bit shifting and masking required to parse the
>result.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---

Reviewed-by: Nitesh Shetty <[email protected]>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 10/12] nvme.h: add FDP definitions
       [not found]   ` <CGME20241210122702epcas5p4fe3ed43ad714c6b467a35d16135d07c5@epcas5p4.samsung.com>
@ 2024-12-10 12:19     ` Nitesh Shetty
  0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 12:19 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
	sagi, asml.silence, anuj20.g, joshi.k, Keith Busch

[-- Attachment #1: Type: text/plain, Size: 449 bytes --]

On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Add the config feature result, config log page, and management receive
>commands needed for FDP.
>
>Partially based on a patch from Kanchan Joshi <[email protected]>.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>[kbusch: renamed some fields to match spec]
>Signed-off-by: Keith Busch <[email protected]>
>---

Reviewed-by: Nitesh Shetty <[email protected]>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
  2024-12-10  8:45   ` Dan Carpenter
@ 2024-12-10 15:23     ` Keith Busch
  0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-10 15:23 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: oe-kbuild, Keith Busch, axboe, hch, linux-block, linux-nvme,
	linux-fsdevel, io-uring, lkp, oe-kbuild-all, sagi, asml.silence,
	anuj20.g, joshi.k

On Tue, Dec 10, 2024 at 11:45:43AM +0300, Dan Carpenter wrote:
> 04ca0849938146 Keith Busch   2024-12-06  2226  	ret = nvme_submit_sync_cmd(ns->queue, &c, ruhs, size);
> 04ca0849938146 Keith Busch   2024-12-06  2227  	if (ret)
> 04ca0849938146 Keith Busch   2024-12-06  2228  		goto free;
> 04ca0849938146 Keith Busch   2024-12-06  2229  
> 04ca0849938146 Keith Busch   2024-12-06  2230  	head->nr_plids = le16_to_cpu(ruhs->nruhsd);
> 04ca0849938146 Keith Busch   2024-12-06  2231  	if (!head->nr_plids)
> 04ca0849938146 Keith Busch   2024-12-06 @2232  		goto free;
> 
> ret = -EINVAL?

It's very much on purpose to return "0" here. Returning a negative error
has the driver fail the namespace disk creation. Seeing a stream
configuration the driver doesn't support just means you don't get to use
the block layer's write stream features. You should still be able to use
your namespace the same as before the driver started checking these
configs, otherwise it's a regression since such namespaces are usable
today.

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2024-12-10 15:23 UTC | newest]

Thread overview: 46+ messages
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
2024-12-09  8:25   ` Hannes Reinecke
     [not found]   ` <CGME20241209115219epcas5p4cfc217e25d977cd87025a4284ba0436c@epcas5p4.samsung.com>
2024-12-09 11:44     ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
2024-12-09  8:25   ` Hannes Reinecke
2024-12-09 12:47   ` [PATCHv12 01/12] fs: add write stream information to statx Christian Brauner
     [not found]   ` <CGME20241210073225epcas5p4b2ed325714e6d17fae9e3e45b8e963f6@epcas5p4.samsung.com>
2024-12-10  7:24     ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 03/12] block: add a bi_write_stream field Keith Busch
2024-12-09  8:26   ` Hannes Reinecke
     [not found]   ` <CGME20241210074213epcas5p22330d197c3e7058e9c2226f28fdb1475@epcas5p2.samsung.com>
2024-12-10  7:34     ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 04/12] block: introduce max_write_streams queue limit Keith Busch
2024-12-09  8:27   ` Hannes Reinecke
     [not found]   ` <CGME20241210074628epcas5p3e36c7615cf2a5160d7fe169774fd30db@epcas5p3.samsung.com>
2024-12-10  7:38     ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 05/12] block: introduce a write_stream_granularity " Keith Busch
2024-12-09  8:29   ` Hannes Reinecke
     [not found]   ` <CGME20241210075259epcas5p23bbb79cdb18ddbfad337d764d4fe75da@epcas5p2.samsung.com>
2024-12-10  7:45     ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 06/12] block: expose write streams for block device nodes Keith Busch
2024-12-09  8:30   ` Hannes Reinecke
     [not found]   ` <CGME20241209110649epcas5p41df7db0f7ea58f250da647106d25134b@epcas5p4.samsung.com>
2024-12-09 10:58     ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 07/12] io_uring: enable per-io write streams Keith Busch
2024-12-09  8:31   ` Hannes Reinecke
2024-12-06 22:17 ` [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper Keith Busch
2024-12-09  8:31   ` Hannes Reinecke
     [not found]   ` <CGME20241210121958epcas5p27d14abfca66757a2c42ec71895b008b1@epcas5p2.samsung.com>
2024-12-10 12:12     ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result Keith Busch
2024-12-09  8:32   ` Hannes Reinecke
     [not found]   ` <CGME20241210122137epcas5p2e373baa1c99b78341928cc7bf0fe3bdf@epcas5p2.samsung.com>
2024-12-10 12:13     ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 10/12] nvme.h: add FDP definitions Keith Busch
2024-12-09  8:33   ` Hannes Reinecke
     [not found]   ` <CGME20241210122702epcas5p4fe3ed43ad714c6b467a35d16135d07c5@epcas5p4.samsung.com>
2024-12-10 12:19     ` Nitesh Shetty
2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
2024-12-09  4:05   ` kernel test robot
2024-12-09 12:44     ` Christoph Hellwig
2024-12-09  8:34   ` Hannes Reinecke
2024-12-09 13:18   ` Christoph Hellwig
2024-12-09 16:29     ` Keith Busch
2024-12-10  8:45   ` Dan Carpenter
2024-12-10 15:23     ` Keith Busch
2024-12-06 22:18 ` [PATCHv12 12/12] nvme: use fdp streams if write stream is provided Keith Busch
2024-12-09  8:34   ` Hannes Reinecke
     [not found]   ` <CGME20241210073523epcas5p149482220b87ff3926fb8864ff1660e0c@epcas5p1.samsung.com>
2024-12-10  7:27     ` Nitesh Shetty
2024-12-09 12:55 ` [PATCHv12 00/12] block write streams with nvme fdp Christoph Hellwig
2024-12-09 16:07   ` Keith Busch
2024-12-10  1:49     ` Martin K. Petersen
2024-12-10  7:19     ` Christoph Hellwig
