* [PATCHv12 00/12] block write streams with nvme fdp
@ 2024-12-06 22:17 Keith Busch
2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
` (12 more replies)
0 siblings, 13 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Keith Busch <[email protected]>
changes from v11:

 - Place the write hint in an unused io_uring SQE field
   - Obviates the need to modify the external "attributes" stuff that
     supports PI
   - Make it a u8 to match the type the block layer supports
   - And it's just easier for the user to use

 - Fix the sparse warnings from FDP definitions
   - Just use the patches that Christoph posted a few weeks ago since
     they already define things in a way that makes sparse happy; I
     just made some minor changes to field names to match what the
     spec calls them

 - Actually include the first patch in this series

Christoph Hellwig (7):
fs: add a write stream field to the kiocb
block: add a bi_write_stream field
block: introduce a write_stream_granularity queue limit
block: expose write streams for block device nodes
nvme: add a nvme_get_log_lsi helper
nvme: pass a void pointer to nvme_get/set_features for the result
nvme.h: add FDP definitions
Keith Busch (5):
fs: add write stream information to statx
block: introduce max_write_streams queue limit
io_uring: enable per-io write streams
nvme: register fdp parameters with the block layer
nvme: use fdp streams if write stream is provided
Documentation/ABI/stable/sysfs-block | 15 +++
block/bdev.c | 6 +
block/bio.c | 2 +
block/blk-crypto-fallback.c | 1 +
block/blk-merge.c | 4 +
block/blk-sysfs.c | 6 +
block/bounce.c | 1 +
block/fops.c | 23 ++++
drivers/nvme/host/core.c | 160 ++++++++++++++++++++++++++-
drivers/nvme/host/nvme.h | 9 +-
fs/stat.c | 2 +
include/linux/blk_types.h | 1 +
include/linux/blkdev.h | 16 +++
include/linux/fs.h | 1 +
include/linux/nvme.h | 77 +++++++++++++
include/linux/stat.h | 2 +
include/uapi/linux/io_uring.h | 4 +
include/uapi/linux/stat.h | 7 +-
io_uring/io_uring.c | 2 +
io_uring/rw.c | 1 +
20 files changed, 332 insertions(+), 8 deletions(-)
--
2.43.5
* [PATCHv12 01/12] fs: add write stream information to statx
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:25 ` Hannes Reinecke
[not found] ` <CGME20241209115219epcas5p4cfc217e25d977cd87025a4284ba0436c@epcas5p4.samsung.com>
2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
` (11 subsequent siblings)
12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Keith Busch <[email protected]>
Add new statx field to report the maximum number of write streams
supported and the granularity for them.
Signed-off-by: Keith Busch <[email protected]>
[hch: rename hint to stream, add granularity]
Signed-off-by: Christoph Hellwig <[email protected]>
---
fs/stat.c | 2 ++
include/linux/stat.h | 2 ++
include/uapi/linux/stat.h | 7 +++++--
3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/stat.c b/fs/stat.c
index 0870e969a8a0b..00e4598b1ff25 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -729,6 +729,8 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
tmp.stx_atomic_write_unit_min = stat->atomic_write_unit_min;
tmp.stx_atomic_write_unit_max = stat->atomic_write_unit_max;
tmp.stx_atomic_write_segments_max = stat->atomic_write_segments_max;
+ tmp.stx_write_stream_granularity = stat->write_stream_granularity;
+ tmp.stx_write_stream_max = stat->write_stream_max;
return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
}
diff --git a/include/linux/stat.h b/include/linux/stat.h
index 3d900c86981c5..36d4dfb291abd 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -57,6 +57,8 @@ struct kstat {
u32 atomic_write_unit_min;
u32 atomic_write_unit_max;
u32 atomic_write_segments_max;
+ u32 write_stream_granularity;
+ u16 write_stream_max;
};
/* These definitions are internal to the kernel for now. Mainly used by nfsd. */
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index 887a252864416..547c62a1a3a7c 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -132,9 +132,11 @@ struct statx {
__u32 stx_atomic_write_unit_max; /* Max atomic write unit in bytes */
/* 0xb0 */
__u32 stx_atomic_write_segments_max; /* Max atomic write segment count */
- __u32 __spare1[1];
+ __u32 stx_write_stream_granularity;
/* 0xb8 */
- __u64 __spare3[9]; /* Spare space for future expansion */
+ __u16 stx_write_stream_max;
+ __u16 __spare2[3];
+ __u64 __spare3[8]; /* Spare space for future expansion */
/* 0x100 */
};
@@ -164,6 +166,7 @@ struct statx {
#define STATX_MNT_ID_UNIQUE 0x00004000U /* Want/got extended stx_mount_id */
#define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
#define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
+#define STATX_WRITE_STREAM 0x00020000U /* Want/got write_stream_* */
#define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
--
2.43.5
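
The statx hunk above carves the two new fields out of existing spare space, so the structure should stay at 0x100 bytes. A minimal sketch of that arithmetic follows. The structs statx_tail_old and statx_tail_new are hypothetical mirrors of the region starting at offset 0xb0, with the spare fields renamed to avoid reserved identifiers; they are not the real uapi definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of stx_atomic_write_segments_max, per the comments in the hunk. */
#define STATX_TAIL_BASE 0xb0

/* Hypothetical mirror of the tail of struct statx before the patch. */
struct statx_tail_old {
	uint32_t stx_atomic_write_segments_max;	/* 0xb0 */
	uint32_t spare1[1];			/* 0xb4 */
	uint64_t spare3[9];			/* 0xb8 */
};

/* Hypothetical mirror of the same region after the patch. */
struct statx_tail_new {
	uint32_t stx_atomic_write_segments_max;	/* 0xb0 */
	uint32_t stx_write_stream_granularity;	/* 0xb4 */
	uint16_t stx_write_stream_max;		/* 0xb8 */
	uint16_t spare2[3];			/* 0xba */
	uint64_t spare3[8];			/* 0xc0 */
};

/* The new fields must neither grow nor shrink the structure. */
static_assert(sizeof(struct statx_tail_old) == sizeof(struct statx_tail_new),
	      "statx tail size changed");

/* Absolute offset of a new-tail member within the full struct statx. */
#define STATX_NEW_OFFSET(member) \
	(STATX_TAIL_BASE + offsetof(struct statx_tail_new, member))

/* Total size of struct statx implied by the mirrored tail. */
static size_t statx_total_size(void)
{
	return STATX_TAIL_BASE + sizeof(struct statx_tail_new);
}
```

The three u16 spare entries are what keep the following u64 array 8-byte aligned at 0xc0.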
* [PATCHv12 02/12] fs: add a write stream field to the kiocb
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:25 ` Hannes Reinecke
` (2 more replies)
2024-12-06 22:17 ` [PATCHv12 03/12] block: add a bi_write_stream field Keith Busch
` (10 subsequent siblings)
12 siblings, 3 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Christoph Hellwig <[email protected]>
Prepare for io_uring passthrough of write streams. The write stream
field fits into an existing 2-byte hole in the kiocb structure, so the
size of struct kiocb does not change.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
include/linux/fs.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2cc3d45da7b01..26940c451f319 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -373,6 +373,7 @@ struct kiocb {
void *private;
int ki_flags;
u16 ki_ioprio; /* See linux/ioprio.h */
+ u8 ki_write_stream;
union {
/*
* Only used for async buffered reads, where it denotes the
--
2.43.5
* [PATCHv12 03/12] block: add a bi_write_stream field
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:26 ` Hannes Reinecke
[not found] ` <CGME20241210074213epcas5p22330d197c3e7058e9c2226f28fdb1475@epcas5p2.samsung.com>
2024-12-06 22:17 ` [PATCHv12 04/12] block: introduce max_write_streams queue limit Keith Busch
` (9 subsequent siblings)
12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Christoph Hellwig <[email protected]>
Add the ability to pass a write stream for placement control in the bio.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
block/bio.c | 2 ++
block/blk-crypto-fallback.c | 1 +
block/blk-merge.c | 4 ++++
block/bounce.c | 1 +
include/linux/blk_types.h | 1 +
5 files changed, 9 insertions(+)
diff --git a/block/bio.c b/block/bio.c
index 699a78c85c756..2aa86edc7cd6f 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -251,6 +251,7 @@ void bio_init(struct bio *bio, struct block_device *bdev, struct bio_vec *table,
bio->bi_flags = 0;
bio->bi_ioprio = 0;
bio->bi_write_hint = 0;
+ bio->bi_write_stream = 0;
bio->bi_status = 0;
bio->bi_iter.bi_sector = 0;
bio->bi_iter.bi_size = 0;
@@ -827,6 +828,7 @@ static int __bio_clone(struct bio *bio, struct bio *bio_src, gfp_t gfp)
bio_set_flag(bio, BIO_CLONED);
bio->bi_ioprio = bio_src->bi_ioprio;
bio->bi_write_hint = bio_src->bi_write_hint;
+ bio->bi_write_stream = bio_src->bi_write_stream;
bio->bi_iter = bio_src->bi_iter;
if (bio->bi_bdev) {
diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
index 29a205482617c..66762243a886b 100644
--- a/block/blk-crypto-fallback.c
+++ b/block/blk-crypto-fallback.c
@@ -173,6 +173,7 @@ static struct bio *blk_crypto_fallback_clone_bio(struct bio *bio_src)
bio_set_flag(bio, BIO_REMAPPED);
bio->bi_ioprio = bio_src->bi_ioprio;
bio->bi_write_hint = bio_src->bi_write_hint;
+ bio->bi_write_stream = bio_src->bi_write_stream;
bio->bi_iter.bi_sector = bio_src->bi_iter.bi_sector;
bio->bi_iter.bi_size = bio_src->bi_iter.bi_size;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index e01383c6e534b..1e5327fb6c45b 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -866,6 +866,8 @@ static struct request *attempt_merge(struct request_queue *q,
if (req->bio->bi_write_hint != next->bio->bi_write_hint)
return NULL;
+ if (req->bio->bi_write_stream != next->bio->bi_write_stream)
+ return NULL;
if (req->bio->bi_ioprio != next->bio->bi_ioprio)
return NULL;
if (!blk_atomic_write_mergeable_rqs(req, next))
@@ -987,6 +989,8 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
return false;
if (rq->bio->bi_write_hint != bio->bi_write_hint)
return false;
+ if (rq->bio->bi_write_stream != bio->bi_write_stream)
+ return false;
if (rq->bio->bi_ioprio != bio->bi_ioprio)
return false;
if (blk_atomic_write_mergeable_rq_bio(rq, bio) == false)
diff --git a/block/bounce.c b/block/bounce.c
index 0d898cd5ec497..fb8f60f114d7d 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -170,6 +170,7 @@ static struct bio *bounce_clone_bio(struct bio *bio_src)
bio_set_flag(bio, BIO_REMAPPED);
bio->bi_ioprio = bio_src->bi_ioprio;
bio->bi_write_hint = bio_src->bi_write_hint;
+ bio->bi_write_stream = bio_src->bi_write_stream;
bio->bi_iter.bi_sector = bio_src->bi_iter.bi_sector;
bio->bi_iter.bi_size = bio_src->bi_iter.bi_size;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index dce7615c35e7e..4ca3449ce9c95 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -220,6 +220,7 @@ struct bio {
unsigned short bi_flags; /* BIO_* below */
unsigned short bi_ioprio;
enum rw_hint bi_write_hint;
+ u8 bi_write_stream;
blk_status_t bi_status;
atomic_t __bi_remaining;
--
2.43.5
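
The blk-merge.c hunks above make the write stream one more attribute that must match before two I/Os are merged. A small sketch of that rule, using a hypothetical toy_bio struct that carries only the three fields the checks read (the real struct bio and struct request are much larger):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct bio fields the merge checks read. */
struct toy_bio {
	uint8_t  write_hint;
	uint8_t  write_stream;
	uint16_t ioprio;
};

/*
 * Same order of checks as the patched blk_rq_merge_ok(): any mismatch
 * in hint, stream, or priority blocks the merge.
 */
static bool toy_merge_ok(const struct toy_bio *rq, const struct toy_bio *bio)
{
	if (rq->write_hint != bio->write_hint)
		return false;
	if (rq->write_stream != bio->write_stream)
		return false;
	if (rq->ioprio != bio->ioprio)
		return false;
	return true;
}
```

Without the stream comparison, a merged request could carry data for two streams while only one stream value reaches the driver.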
* [PATCHv12 04/12] block: introduce max_write_streams queue limit
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (2 preceding siblings ...)
2024-12-06 22:17 ` [PATCHv12 03/12] block: add a bi_write_stream field Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:27 ` Hannes Reinecke
[not found] ` <CGME20241210074628epcas5p3e36c7615cf2a5160d7fe169774fd30db@epcas5p3.samsung.com>
2024-12-06 22:17 ` [PATCHv12 05/12] block: introduce a write_stream_granularity " Keith Busch
` (8 subsequent siblings)
12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Keith Busch <[email protected]>
Drivers with hardware that supports write streams need a way to export how
many are available so applications can generically query this.
Signed-off-by: Keith Busch <[email protected]>
[hch: renamed hints to streams, removed stacking]
Signed-off-by: Christoph Hellwig <[email protected]>
---
Documentation/ABI/stable/sysfs-block | 7 +++++++
block/blk-sysfs.c | 3 +++
include/linux/blkdev.h | 9 +++++++++
3 files changed, 19 insertions(+)
diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index 0cceb2badc836..f67139b8b8eff 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -506,6 +506,13 @@ Description:
[RO] Maximum size in bytes of a single element in a DMA
scatter/gather list.
+What: /sys/block/<disk>/queue/max_write_streams
+Date: November 2024
+Contact: [email protected]
+Description:
+ [RO] Maximum number of write streams supported, 0 if not
+ supported. If supported, valid values are 1 through
+ max_write_streams, inclusive.
What: /sys/block/<disk>/queue/max_segments
Date: March 2010
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 4241aea84161c..c514c0cb5e93c 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -104,6 +104,7 @@ QUEUE_SYSFS_LIMIT_SHOW(max_segments)
QUEUE_SYSFS_LIMIT_SHOW(max_discard_segments)
QUEUE_SYSFS_LIMIT_SHOW(max_integrity_segments)
QUEUE_SYSFS_LIMIT_SHOW(max_segment_size)
+QUEUE_SYSFS_LIMIT_SHOW(max_write_streams)
QUEUE_SYSFS_LIMIT_SHOW(logical_block_size)
QUEUE_SYSFS_LIMIT_SHOW(physical_block_size)
QUEUE_SYSFS_LIMIT_SHOW(chunk_sectors)
@@ -446,6 +447,7 @@ QUEUE_RO_ENTRY(queue_max_hw_sectors, "max_hw_sectors_kb");
QUEUE_RO_ENTRY(queue_max_segments, "max_segments");
QUEUE_RO_ENTRY(queue_max_integrity_segments, "max_integrity_segments");
QUEUE_RO_ENTRY(queue_max_segment_size, "max_segment_size");
+QUEUE_RO_ENTRY(queue_max_write_streams, "max_write_streams");
QUEUE_RW_LOAD_MODULE_ENTRY(elv_iosched, "scheduler");
QUEUE_RO_ENTRY(queue_logical_block_size, "logical_block_size");
@@ -580,6 +582,7 @@ static struct attribute *queue_attrs[] = {
&queue_max_discard_segments_entry.attr,
&queue_max_integrity_segments_entry.attr,
&queue_max_segment_size_entry.attr,
+ &queue_max_write_streams_entry.attr,
&queue_hw_sector_size_entry.attr,
&queue_logical_block_size_entry.attr,
&queue_physical_block_size_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 08a727b408164..ce2c3ddda2411 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -399,6 +399,8 @@ struct queue_limits {
unsigned short max_integrity_segments;
unsigned short max_discard_segments;
+ unsigned short max_write_streams;
+
unsigned int max_open_zones;
unsigned int max_active_zones;
@@ -1240,6 +1242,13 @@ static inline unsigned int bdev_max_segments(struct block_device *bdev)
return queue_max_segments(bdev_get_queue(bdev));
}
+static inline unsigned short bdev_max_write_streams(struct block_device *bdev)
+{
+ if (bdev_is_partition(bdev))
+ return 0;
+ return bdev_limits(bdev)->max_write_streams;
+}
+
static inline unsigned queue_logical_block_size(const struct request_queue *q)
{
return q->limits.logical_block_size;
--
2.43.5
* [PATCHv12 05/12] block: introduce a write_stream_granularity queue limit
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (3 preceding siblings ...)
2024-12-06 22:17 ` [PATCHv12 04/12] block: introduce max_write_streams queue limit Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:29 ` Hannes Reinecke
[not found] ` <CGME20241210075259epcas5p23bbb79cdb18ddbfad337d764d4fe75da@epcas5p2.samsung.com>
2024-12-06 22:17 ` [PATCHv12 06/12] block: expose write streams for block device nodes Keith Busch
` (7 subsequent siblings)
12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Christoph Hellwig <[email protected]>
Export the granularity that write streams should be discarded with,
as it is essential for making good use of them.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
Documentation/ABI/stable/sysfs-block | 8 ++++++++
block/blk-sysfs.c | 3 +++
include/linux/blkdev.h | 7 +++++++
3 files changed, 18 insertions(+)
diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index f67139b8b8eff..c454c68b68fe6 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -514,6 +514,14 @@ Description:
supported. If supported, valid values are 1 through
max_write_streams, inclusive.
+What: /sys/block/<disk>/queue/write_stream_granularity
+Date: November 2024
+Contact: [email protected]
+Description:
+ [RO] Granularity of a write stream in bytes. The granularity
+ of a write stream is the size that should be discarded or
+ overwritten together to avoid write amplification in the device.
+
What: /sys/block/<disk>/queue/max_segments
Date: March 2010
Contact: [email protected]
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index c514c0cb5e93c..525f4fa132cd3 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -105,6 +105,7 @@ QUEUE_SYSFS_LIMIT_SHOW(max_discard_segments)
QUEUE_SYSFS_LIMIT_SHOW(max_integrity_segments)
QUEUE_SYSFS_LIMIT_SHOW(max_segment_size)
QUEUE_SYSFS_LIMIT_SHOW(max_write_streams)
+QUEUE_SYSFS_LIMIT_SHOW(write_stream_granularity)
QUEUE_SYSFS_LIMIT_SHOW(logical_block_size)
QUEUE_SYSFS_LIMIT_SHOW(physical_block_size)
QUEUE_SYSFS_LIMIT_SHOW(chunk_sectors)
@@ -448,6 +449,7 @@ QUEUE_RO_ENTRY(queue_max_segments, "max_segments");
QUEUE_RO_ENTRY(queue_max_integrity_segments, "max_integrity_segments");
QUEUE_RO_ENTRY(queue_max_segment_size, "max_segment_size");
QUEUE_RO_ENTRY(queue_max_write_streams, "max_write_streams");
+QUEUE_RO_ENTRY(queue_write_stream_granularity, "write_stream_granularity");
QUEUE_RW_LOAD_MODULE_ENTRY(elv_iosched, "scheduler");
QUEUE_RO_ENTRY(queue_logical_block_size, "logical_block_size");
@@ -583,6 +585,7 @@ static struct attribute *queue_attrs[] = {
&queue_max_integrity_segments_entry.attr,
&queue_max_segment_size_entry.attr,
&queue_max_write_streams_entry.attr,
+ &queue_write_stream_granularity_entry.attr,
&queue_hw_sector_size_entry.attr,
&queue_logical_block_size_entry.attr,
&queue_physical_block_size_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ce2c3ddda2411..7be8cc57561a1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -400,6 +400,7 @@ struct queue_limits {
unsigned short max_discard_segments;
unsigned short max_write_streams;
+ unsigned int write_stream_granularity;
unsigned int max_open_zones;
unsigned int max_active_zones;
@@ -1249,6 +1250,12 @@ static inline unsigned short bdev_max_write_streams(struct block_device *bdev)
return bdev_limits(bdev)->max_write_streams;
}
+static inline unsigned int
+bdev_write_stream_granularity(struct block_device *bdev)
+{
+ return bdev_limits(bdev)->write_stream_granularity;
+}
+
static inline unsigned queue_logical_block_size(const struct request_queue *q)
{
return q->limits.logical_block_size;
--
2.43.5
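
One way an application might use write_stream_granularity is to keep a per-stream byte counter and group data that will be discarded together into granularity-sized units. The helper below is only a sketch of that bookkeeping. The name stream_unit_remaining is hypothetical, and it assumes the application started writing the stream at a unit boundary, which the patch itself does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes that still fit in the current granularity-sized unit of a
 * stream, given how many bytes the application has written to that
 * stream so far. Returns 0 when no granularity is reported.
 */
static uint64_t stream_unit_remaining(uint64_t bytes_written,
				      uint32_t granularity)
{
	if (granularity == 0)
		return 0;
	return granularity - (bytes_written % granularity);
}
```

When the result equals the granularity, the previous unit has just been filled and the next write starts a new one.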
* [PATCHv12 06/12] block: expose write streams for block device nodes
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (4 preceding siblings ...)
2024-12-06 22:17 ` [PATCHv12 05/12] block: introduce a write_stream_granularity " Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:30 ` Hannes Reinecke
[not found] ` <CGME20241209110649epcas5p41df7db0f7ea58f250da647106d25134b@epcas5p4.samsung.com>
2024-12-06 22:17 ` [PATCHv12 07/12] io_uring: enable per-io write streams Keith Busch
` (6 subsequent siblings)
12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Christoph Hellwig <[email protected]>
Export statx information about the number and granularity of write
streams, use the per-kiocb write stream when one is set, and otherwise
map temperature hints to write streams (which is a bit questionable, but
this shows how it is done).
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
block/bdev.c | 6 ++++++
block/fops.c | 23 +++++++++++++++++++++++
2 files changed, 29 insertions(+)
diff --git a/block/bdev.c b/block/bdev.c
index 738e3c8457e7f..c23245f1fdfe3 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1296,6 +1296,12 @@ void bdev_statx(struct path *path, struct kstat *stat,
stat->result_mask |= STATX_DIOALIGN;
}
+ if ((request_mask & STATX_WRITE_STREAM) &&
+ bdev_max_write_streams(bdev)) {
+ stat->write_stream_max = bdev_max_write_streams(bdev);
+ stat->result_mask |= STATX_WRITE_STREAM;
+ }
+
if (request_mask & STATX_WRITE_ATOMIC && bdev_can_atomic_write(bdev)) {
struct request_queue *bd_queue = bdev->bd_queue;
diff --git a/block/fops.c b/block/fops.c
index 6d5c4fc5a2168..f16aa39bf5bad 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -73,6 +73,7 @@ static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
}
bio.bi_iter.bi_sector = pos >> SECTOR_SHIFT;
bio.bi_write_hint = file_inode(iocb->ki_filp)->i_write_hint;
+ bio.bi_write_stream = iocb->ki_write_stream;
bio.bi_ioprio = iocb->ki_ioprio;
if (iocb->ki_flags & IOCB_ATOMIC)
bio.bi_opf |= REQ_ATOMIC;
@@ -206,6 +207,7 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
for (;;) {
bio->bi_iter.bi_sector = pos >> SECTOR_SHIFT;
bio->bi_write_hint = file_inode(iocb->ki_filp)->i_write_hint;
+ bio->bi_write_stream = iocb->ki_write_stream;
bio->bi_private = dio;
bio->bi_end_io = blkdev_bio_end_io;
bio->bi_ioprio = iocb->ki_ioprio;
@@ -333,6 +335,7 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
dio->iocb = iocb;
bio->bi_iter.bi_sector = pos >> SECTOR_SHIFT;
bio->bi_write_hint = file_inode(iocb->ki_filp)->i_write_hint;
+ bio->bi_write_stream = iocb->ki_write_stream;
bio->bi_end_io = blkdev_bio_end_io_async;
bio->bi_ioprio = iocb->ki_ioprio;
@@ -398,6 +401,26 @@ static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
if (blkdev_dio_invalid(bdev, iocb, iter))
return -EINVAL;
+ if (iov_iter_rw(iter) == WRITE) {
+ u16 max_write_streams = bdev_max_write_streams(bdev);
+
+ if (iocb->ki_write_stream) {
+ if (iocb->ki_write_stream > max_write_streams)
+ return -EINVAL;
+ } else if (max_write_streams) {
+ enum rw_hint write_hint =
+ file_inode(iocb->ki_filp)->i_write_hint;
+
+ /*
+ * Just use the write hint as write stream for block
+ * device writes. This assumes no file system is
+ * mounted that would use the streams differently.
+ */
+ if (write_hint <= max_write_streams)
+ iocb->ki_write_stream = write_hint;
+ }
+ }
+
nr_pages = bio_iov_vecs_to_alloc(iter, BIO_MAX_VECS + 1);
if (likely(nr_pages <= BIO_MAX_VECS)) {
if (is_sync_kiocb(iocb))
--
2.43.5
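
The stream selection added to blkdev_direct_IO() can be restated as a pure function, which makes the three outcomes easier to see. This is a userspace sketch with a hypothetical name (pick_write_stream) and plain integers in place of the kiocb, inode, and enum rw_hint; it is not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Returns the stream to use for a write (0 means "no stream"), or
 * -EINVAL when the caller asked for a stream the device does not have.
 *
 * requested:         per-I/O stream from the kiocb, 0 if unset
 * write_hint:        numeric value of the inode's write hint
 * max_write_streams: queue limit, 0 if streams are unsupported
 */
static int pick_write_stream(uint8_t requested, unsigned int write_hint,
			     uint16_t max_write_streams)
{
	if (requested) {
		/* An explicit stream must be within 1..max_write_streams. */
		if (requested > max_write_streams)
			return -EINVAL;
		return requested;
	}

	/* No explicit stream: fall back to the hint if it is in range. */
	if (max_write_streams && write_hint <= max_write_streams)
		return (int)write_hint;

	return 0;
}
```

Note that an explicit stream on a device without stream support is rejected, while an out-of-range hint is silently dropped.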
* [PATCHv12 07/12] io_uring: enable per-io write streams
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (5 preceding siblings ...)
2024-12-06 22:17 ` [PATCHv12 06/12] block: expose write streams for block device nodes Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:31 ` Hannes Reinecke
2024-12-06 22:17 ` [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper Keith Busch
` (5 subsequent siblings)
12 siblings, 1 reply; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Keith Busch <[email protected]>
Allow userspace to pass a per-I/O write stream in the SQE:
    __u8 write_stream;
The __u8 type matches the size the filesystems and block layer support.
Applications can query the supported values from the statx
stx_write_stream_max field. Unsupported values are ignored by file
operations that do not support write streams or rejected with an error
by those that support them.
Signed-off-by: Keith Busch <[email protected]>
---
include/uapi/linux/io_uring.h | 4 ++++
io_uring/io_uring.c | 2 ++
io_uring/rw.c | 1 +
3 files changed, 7 insertions(+)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 38f0d6b10eaf7..986a480e3b9c2 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -92,6 +92,10 @@ struct io_uring_sqe {
__u16 addr_len;
__u16 __pad3[1];
};
+ struct {
+ __u8 write_stream;
+ __u8 __pad4[3];
+ };
};
union {
struct {
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index a8cbe674e5d63..978d0617d7af8 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3868,6 +3868,8 @@ static int __init io_uring_init(void)
BUILD_BUG_SQE_ELEM(44, __s32, splice_fd_in);
BUILD_BUG_SQE_ELEM(44, __u32, file_index);
BUILD_BUG_SQE_ELEM(44, __u16, addr_len);
+ BUILD_BUG_SQE_ELEM(44, __u8, write_stream);
+ BUILD_BUG_SQE_ELEM(45, __u8, __pad4[0]);
BUILD_BUG_SQE_ELEM(46, __u16, __pad3[0]);
BUILD_BUG_SQE_ELEM(48, __u64, addr3);
BUILD_BUG_SQE_ELEM_SIZE(48, 0, cmd);
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 04e4467ab0ee8..b8aa2dfcbf48c 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -322,6 +322,7 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
}
rw->kiocb.dio_complete = NULL;
rw->kiocb.ki_flags = 0;
+ rw->kiocb.ki_write_stream = READ_ONCE(sqe->write_stream);
rw->addr = READ_ONCE(sqe->addr);
rw->len = READ_ONCE(sqe->len);
--
2.43.5
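
The new SQE field shares the 4-byte union at offset 44 with splice_fd_in, file_index and addr_len, which is what the BUILD_BUG_SQE_ELEM lines above check. The sketch below reproduces that layout with a hypothetical, trimmed mirror struct (toy_sqe) so the offsets can be verified in userspace. It is not the uapi header, and toy_prep_write_stream is an illustrative helper rather than a liburing call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed mirror of struct io_uring_sqe; only offsets matter. */
struct toy_sqe {
	uint8_t  opcode;	/* 0 */
	uint8_t  flags;		/* 1 */
	uint16_t ioprio;	/* 2 */
	int32_t  fd;		/* 4 */
	uint64_t off;		/* 8 */
	uint64_t addr;		/* 16 */
	uint32_t len;		/* 24 */
	uint32_t rw_flags;	/* 28 */
	uint64_t user_data;	/* 32 */
	uint16_t buf_index;	/* 40 */
	uint16_t personality;	/* 42 */
	union {			/* 44 */
		int32_t  splice_fd_in;
		uint32_t file_index;
		struct {
			uint16_t addr_len;
			uint16_t pad3[1];
		};
		struct {
			uint8_t write_stream;
			uint8_t pad4[3];
		};
	};
	uint64_t addr3;		/* 48 */
	uint64_t pad2[1];	/* 56 */
};

static_assert(sizeof(struct toy_sqe) == 64, "SQE must stay 64 bytes");
static_assert(offsetof(struct toy_sqe, write_stream) == 44, "bad offset");
static_assert(offsetof(struct toy_sqe, pad4) == 45, "bad pad offset");

/* Fill the fields a streamed write needs; the opcode is left to the caller. */
static void toy_prep_write_stream(struct toy_sqe *sqe, int fd, const void *buf,
				  uint32_t len, uint64_t offset, uint8_t stream)
{
	memset(sqe, 0, sizeof(*sqe));
	sqe->fd = fd;
	sqe->addr = (uint64_t)(uintptr_t)buf;
	sqe->len = len;
	sqe->off = offset;
	sqe->write_stream = stream;
}
```

Because the field overlays other union members, a request can carry a write stream or a splice/fixed-file value at offset 44, not both.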
* [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (6 preceding siblings ...)
2024-12-06 22:17 ` [PATCHv12 07/12] io_uring: enable per-io write streams Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:31 ` Hannes Reinecke
[not found] ` <CGME20241210121958epcas5p27d14abfca66757a2c42ec71895b008b1@epcas5p2.samsung.com>
2024-12-06 22:17 ` [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result Keith Busch
` (4 subsequent siblings)
12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Christoph Hellwig <[email protected]>
Add a helper for log pages that need to pass in an LSI value, without
touching all the existing nvme_get_log callers.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
drivers/nvme/host/core.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 571d4106d256d..36c44be98e38c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -150,6 +150,8 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
unsigned nsid);
static void nvme_update_keep_alive(struct nvme_ctrl *ctrl,
struct nvme_command *cmd);
+static int nvme_get_log_lsi(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page,
+ u8 lsp, u8 csi, void *log, size_t size, u64 offset, u16 lsi);
void nvme_queue_scan(struct nvme_ctrl *ctrl)
{
@@ -3074,8 +3076,8 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
return ret;
}
-int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp, u8 csi,
- void *log, size_t size, u64 offset)
+static int nvme_get_log_lsi(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page,
+ u8 lsp, u8 csi, void *log, size_t size, u64 offset, u16 lsi)
{
struct nvme_command c = { };
u32 dwlen = nvme_bytes_to_numd(size);
@@ -3089,10 +3091,18 @@ int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp, u8 csi,
c.get_log_page.lpol = cpu_to_le32(lower_32_bits(offset));
c.get_log_page.lpou = cpu_to_le32(upper_32_bits(offset));
c.get_log_page.csi = csi;
+ c.get_log_page.lsi = cpu_to_le16(lsi);
return nvme_submit_sync_cmd(ctrl->admin_q, &c, log, size);
}
+int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp, u8 csi,
+ void *log, size_t size, u64 offset)
+{
+ return nvme_get_log_lsi(ctrl, nsid, log_page, lsp, csi, log, size,
+ offset, 0);
+}
+
static int nvme_get_effects_log(struct nvme_ctrl *ctrl, u8 csi,
struct nvme_effects_log **log)
{
--
2.43.5
* [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (7 preceding siblings ...)
2024-12-06 22:17 ` [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:32 ` Hannes Reinecke
[not found] ` <CGME20241210122137epcas5p2e373baa1c99b78341928cc7bf0fe3bdf@epcas5p2.samsung.com>
2024-12-06 22:17 ` [PATCHv12 10/12] nvme.h: add FDP definitions Keith Busch
` (3 subsequent siblings)
12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Christoph Hellwig <[email protected]>
That allows passing in structures instead of the u32 result, and thus
reduces the amount of bit shifting and masking required to parse the
result.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
---
drivers/nvme/host/core.c | 4 ++--
drivers/nvme/host/nvme.h | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 36c44be98e38c..c2a3585a3fa59 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1678,7 +1678,7 @@ static int nvme_features(struct nvme_ctrl *dev, u8 op, unsigned int fid,
int nvme_set_features(struct nvme_ctrl *dev, unsigned int fid,
unsigned int dword11, void *buffer, size_t buflen,
- u32 *result)
+ void *result)
{
return nvme_features(dev, nvme_admin_set_features, fid, dword11, buffer,
buflen, result);
@@ -1687,7 +1687,7 @@ EXPORT_SYMBOL_GPL(nvme_set_features);
int nvme_get_features(struct nvme_ctrl *dev, unsigned int fid,
unsigned int dword11, void *buffer, size_t buflen,
- u32 *result)
+ void *result)
{
return nvme_features(dev, nvme_admin_get_features, fid, dword11, buffer,
buflen, result);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 611b02c8a8b37..c1995d89ffdb8 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -890,10 +890,10 @@ int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
int qid, nvme_submit_flags_t flags);
int nvme_set_features(struct nvme_ctrl *dev, unsigned int fid,
unsigned int dword11, void *buffer, size_t buflen,
- u32 *result);
+ void *result);
int nvme_get_features(struct nvme_ctrl *dev, unsigned int fid,
unsigned int dword11, void *buffer, size_t buflen,
- u32 *result);
+ void *result);
int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count);
void nvme_stop_keep_alive(struct nvme_ctrl *ctrl);
int nvme_reset_ctrl(struct nvme_ctrl *ctrl);
--
2.43.5
* [PATCHv12 10/12] nvme.h: add FDP definitions
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (8 preceding siblings ...)
2024-12-06 22:17 ` [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result Keith Busch
@ 2024-12-06 22:17 ` Keith Busch
2024-12-09 8:33 ` Hannes Reinecke
[not found] ` <CGME20241210122702epcas5p4fe3ed43ad714c6b467a35d16135d07c5@epcas5p4.samsung.com>
2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
` (2 subsequent siblings)
12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:17 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Christoph Hellwig <[email protected]>
Add the config feature result, config log page, and management receive
commands needed for FDP.
Partially based on a patch from Kanchan Joshi <[email protected]>.
Signed-off-by: Christoph Hellwig <[email protected]>
[kbusch: renamed some fields to match spec]
Signed-off-by: Keith Busch <[email protected]>
---
include/linux/nvme.h | 77 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 13377dde4527b..7680078fa67fd 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -275,6 +275,7 @@ enum nvme_ctrl_attr {
NVME_CTRL_ATTR_HID_128_BIT = (1 << 0),
NVME_CTRL_ATTR_TBKAS = (1 << 6),
NVME_CTRL_ATTR_ELBAS = (1 << 15),
+ NVME_CTRL_ATTR_FDPS = (1 << 19),
};
struct nvme_id_ctrl {
@@ -661,6 +662,44 @@ struct nvme_rotational_media_log {
__u8 rsvd24[488];
};
+struct nvme_fdp_config {
+ __u8 flags;
+#define FDPCFG_FDPE (1U << 0)
+ __u8 fdpcidx;
+ __le16 reserved;
+};
+
+struct nvme_fdp_ruh_desc {
+ __u8 ruht;
+ __u8 reserved[3];
+};
+
+struct nvme_fdp_config_desc {
+ __le16 dsze;
+ __u8 fdpa;
+ __u8 vss;
+ __le32 nrg;
+ __le16 nruh;
+ __le16 maxpids;
+ __le32 nns;
+ __le64 runs;
+ __le32 erutl;
+ __u8 rsvd28[36];
+ struct nvme_fdp_ruh_desc ruhs[];
+};
+
+struct nvme_fdp_config_log {
+ __le16 numfdpc;
+ __u8 ver;
+ __u8 rsvd3;
+ __le32 sze;
+ __u8 rsvd8[8];
+ /*
+ * This is followed by variable number of nvme_fdp_config_desc
+ * structures, but sparse doesn't like nested variable sized arrays.
+ */
+};
+
struct nvme_smart_log {
__u8 critical_warning;
__u8 temperature[2];
@@ -887,6 +926,7 @@ enum nvme_opcode {
nvme_cmd_resv_register = 0x0d,
nvme_cmd_resv_report = 0x0e,
nvme_cmd_resv_acquire = 0x11,
+ nvme_cmd_io_mgmt_recv = 0x12,
nvme_cmd_resv_release = 0x15,
nvme_cmd_zone_mgmt_send = 0x79,
nvme_cmd_zone_mgmt_recv = 0x7a,
@@ -908,6 +948,7 @@ enum nvme_opcode {
nvme_opcode_name(nvme_cmd_resv_register), \
nvme_opcode_name(nvme_cmd_resv_report), \
nvme_opcode_name(nvme_cmd_resv_acquire), \
+ nvme_opcode_name(nvme_cmd_io_mgmt_recv), \
nvme_opcode_name(nvme_cmd_resv_release), \
nvme_opcode_name(nvme_cmd_zone_mgmt_send), \
nvme_opcode_name(nvme_cmd_zone_mgmt_recv), \
@@ -1059,6 +1100,7 @@ enum {
NVME_RW_PRINFO_PRCHK_GUARD = 1 << 12,
NVME_RW_PRINFO_PRACT = 1 << 13,
NVME_RW_DTYPE_STREAMS = 1 << 4,
+ NVME_RW_DTYPE_DPLCMT = 2 << 4,
NVME_WZ_DEAC = 1 << 9,
};
@@ -1146,6 +1188,38 @@ struct nvme_zone_mgmt_recv_cmd {
__le32 cdw14[2];
};
+struct nvme_io_mgmt_recv_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 nsid;
+ __le64 rsvd2[2];
+ union nvme_data_ptr dptr;
+ __u8 mo;
+ __u8 rsvd11;
+ __u16 mos;
+ __le32 numd;
+ __le32 cdw12[4];
+};
+
+enum {
+ NVME_IO_MGMT_RECV_MO_RUHS = 1,
+};
+
+struct nvme_fdp_ruh_status_desc {
+ __le16 pid;
+ __le16 ruhid;
+ __le32 earutr;
+ __le64 ruamw;
+ __u8 reserved[16];
+};
+
+struct nvme_fdp_ruh_status {
+ __u8 rsvd0[14];
+ __le16 nruhsd;
+ struct nvme_fdp_ruh_status_desc ruhsd[];
+};
+
enum {
NVME_ZRA_ZONE_REPORT = 0,
NVME_ZRASF_ZONE_REPORT_ALL = 0,
@@ -1281,6 +1355,7 @@ enum {
NVME_FEAT_PLM_WINDOW = 0x14,
NVME_FEAT_HOST_BEHAVIOR = 0x16,
NVME_FEAT_SANITIZE = 0x17,
+ NVME_FEAT_FDP = 0x1d,
NVME_FEAT_SW_PROGRESS = 0x80,
NVME_FEAT_HOST_ID = 0x81,
NVME_FEAT_RESV_MASK = 0x82,
@@ -1301,6 +1376,7 @@ enum {
NVME_LOG_ANA = 0x0c,
NVME_LOG_FEATURES = 0x12,
NVME_LOG_RMI = 0x16,
+ NVME_LOG_FDP_CONFIGS = 0x20,
NVME_LOG_DISC = 0x70,
NVME_LOG_RESERVATION = 0x80,
NVME_FWACT_REPL = (0 << 3),
@@ -1888,6 +1964,7 @@ struct nvme_command {
struct nvmf_auth_receive_command auth_receive;
struct nvme_dbbuf dbbuf;
struct nvme_directive_cmd directive;
+ struct nvme_io_mgmt_recv_cmd imr;
};
};
--
2.43.5
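For readers following the layout, here is a minimal user-space sketch (not part of the patch) that mirrors the two RUH status structures above with fixed-width types and checks their sizes at compile time. The `sketch_` names and the `sketch_ruh_status_size()` helper are made up for illustration; the kernel code in this series uses `struct_size()` for the same computation. Fields are kept as plain integers, so little-endian conversion is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of struct nvme_fdp_ruh_status_desc (hypothetical name). */
struct sketch_ruh_status_desc {
	uint16_t pid;		/* placement identifier */
	uint16_t ruhid;		/* reclaim unit handle identifier */
	uint32_t earutr;
	uint64_t ruamw;
	uint8_t  reserved[16];
};

/* User-space mirror of struct nvme_fdp_ruh_status (hypothetical name). */
struct sketch_ruh_status {
	uint8_t  rsvd0[14];
	uint16_t nruhsd;	/* number of descriptors that follow */
	struct sketch_ruh_status_desc ruhsd[];
};

/* The descriptor is 32 bytes and the header before the array is 16 bytes. */
static_assert(sizeof(struct sketch_ruh_status_desc) == 32,
	      "RUH status descriptor must be 32 bytes");
static_assert(offsetof(struct sketch_ruh_status, ruhsd) == 16,
	      "RUH status header must be 16 bytes");

/* Bytes needed for a header followed by nr_desc descriptors. */
static size_t sketch_ruh_status_size(size_t nr_desc)
{
	return sizeof(struct sketch_ruh_status) +
	       nr_desc * sizeof(struct sketch_ruh_status_desc);
}
```

With these sizes, a buffer for 126 descriptors (the NVME_MAX_PLIDS value introduced in patch 11) comes to 16 + 126 * 32 = 4048 bytes.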
* [PATCHv12 11/12] nvme: register fdp parameters with the block layer
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (9 preceding siblings ...)
2024-12-06 22:17 ` [PATCHv12 10/12] nvme.h: add FDP definitions Keith Busch
@ 2024-12-06 22:18 ` Keith Busch
2024-12-09 4:05 ` kernel test robot
` (3 more replies)
2024-12-06 22:18 ` [PATCHv12 12/12] nvme: use fdp streams if write stream is provided Keith Busch
2024-12-09 12:55 ` [PATCHv12 00/12] block write streams with nvme fdp Christoph Hellwig
12 siblings, 4 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:18 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Keith Busch <[email protected]>
Register the device data placement limits if supported. This is just
registering the limits with the block layer. Nothing beyond reporting
these attributes is happening in this patch.
Signed-off-by: Keith Busch <[email protected]>
---
drivers/nvme/host/core.c | 112 +++++++++++++++++++++++++++++++++++++++
drivers/nvme/host/nvme.h | 4 ++
2 files changed, 116 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c2a3585a3fa59..5f802e243736a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -38,6 +38,8 @@ struct nvme_ns_info {
u32 nsid;
__le32 anagrpid;
u8 pi_offset;
+ u16 endgid;
+ u64 runs;
bool is_shared;
bool is_readonly;
bool is_ready;
@@ -1613,6 +1615,7 @@ static int nvme_ns_info_from_identify(struct nvme_ctrl *ctrl,
info->is_shared = id->nmic & NVME_NS_NMIC_SHARED;
info->is_readonly = id->nsattr & NVME_NS_ATTR_RO;
info->is_ready = true;
+ info->endgid = le16_to_cpu(id->endgid);
if (ctrl->quirks & NVME_QUIRK_BOGUS_NID) {
dev_info(ctrl->device,
"Ignoring bogus Namespace Identifiers\n");
@@ -1653,6 +1656,7 @@ static int nvme_ns_info_from_id_cs_indep(struct nvme_ctrl *ctrl,
info->is_ready = id->nstat & NVME_NSTAT_NRDY;
info->is_rotational = id->nsfeat & NVME_NS_ROTATIONAL;
info->no_vwc = id->nsfeat & NVME_NS_VWC_NOT_PRESENT;
+ info->endgid = le16_to_cpu(id->endgid);
}
kfree(id);
return ret;
@@ -2147,6 +2151,97 @@ static int nvme_update_ns_info_generic(struct nvme_ns *ns,
return ret;
}
+static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
+ u8 fdp_idx)
+{
+ struct nvme_fdp_config_log hdr, *h;
+ struct nvme_fdp_config_desc *desc;
+ size_t size = sizeof(hdr);
+ int i, n, ret;
+ void *log;
+
+ info->runs = 0;
+ ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
+ (void *)&hdr, size, 0, info->endgid);
+ if (ret)
+ return ret;
+
+ size = le32_to_cpu(hdr.sze);
+ h = kzalloc(size, GFP_KERNEL);
+ if (!h)
+ return 0;
+
+ ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
+ h, size, 0, info->endgid);
+ if (ret)
+ goto out;
+
+ n = le16_to_cpu(h->numfdpc) + 1;
+ if (fdp_idx > n)
+ goto out;
+
+ log = h + 1;
+ do {
+ desc = log;
+ log += le16_to_cpu(desc->dsze);
+ } while (i++ < fdp_idx);
+
+ info->runs = le64_to_cpu(desc->runs);
+out:
+ kfree(h);
+ return ret;
+}
+
+static int nvme_query_fdp_info(struct nvme_ns *ns, struct nvme_ns_info *info)
+{
+ struct nvme_ns_head *head = ns->head;
+ struct nvme_fdp_ruh_status *ruhs;
+ struct nvme_fdp_config fdp;
+ struct nvme_command c = {};
+ int size, ret;
+
+ ret = nvme_get_features(ns->ctrl, NVME_FEAT_FDP, info->endgid, NULL, 0,
+ &fdp);
+ if (ret)
+ goto err;
+
+ if (!(fdp.flags & FDPCFG_FDPE))
+ goto err;
+
+ ret = nvme_check_fdp(ns, info, fdp.fdpcidx);
+ if (ret || !info->runs)
+ goto err;
+
+ size = struct_size(ruhs, ruhsd, NVME_MAX_PLIDS);
+ ruhs = kzalloc(size, GFP_KERNEL);
+ if (!ruhs) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ c.imr.opcode = nvme_cmd_io_mgmt_recv;
+ c.imr.nsid = cpu_to_le32(head->ns_id);
+ c.imr.mo = NVME_IO_MGMT_RECV_MO_RUHS;
+ c.imr.numd = cpu_to_le32(nvme_bytes_to_numd(size));
+ ret = nvme_submit_sync_cmd(ns->queue, &c, ruhs, size);
+ if (ret)
+ goto free;
+
+ head->nr_plids = le16_to_cpu(ruhs->nruhsd);
+ if (!head->nr_plids)
+ goto free;
+
+ kfree(ruhs);
+ return 0;
+
+free:
+ kfree(ruhs);
+err:
+ head->nr_plids = 0;
+ info->runs = 0;
+ return ret;
+}
+
static int nvme_update_ns_info_block(struct nvme_ns *ns,
struct nvme_ns_info *info)
{
@@ -2183,6 +2278,15 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
goto out;
}
+ if (ns->ctrl->ctratt & NVME_CTRL_ATTR_FDPS) {
+ ret = nvme_query_fdp_info(ns, info);
+ if (ret)
+ dev_warn(ns->ctrl->device,
+ "FDP failure status:0x%x\n", ret);
+ if (ret < 0)
+ goto out;
+ }
+
blk_mq_freeze_queue(ns->disk->queue);
ns->head->lba_shift = id->lbaf[lbaf].ds;
ns->head->nuse = le64_to_cpu(id->nuse);
@@ -2216,6 +2320,12 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
if (!nvme_init_integrity(ns->head, &lim, info))
capacity = 0;
+ lim.max_write_streams = ns->head->nr_plids;
+ if (lim.max_write_streams)
+ lim.write_stream_granularity = info->runs;
+ else
+ lim.write_stream_granularity = 0;
+
ret = queue_limits_commit_update(ns->disk->queue, &lim);
if (ret) {
blk_mq_unfreeze_queue(ns->disk->queue);
@@ -2318,6 +2428,8 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
ns->head->disk->flags |= GENHD_FL_HIDDEN;
else
nvme_init_integrity(ns->head, &lim, info);
+ lim.max_write_streams = ns_lim->max_write_streams;
+ lim.write_stream_granularity = ns_lim->write_stream_granularity;
ret = queue_limits_commit_update(ns->head->disk->queue, &lim);
set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index c1995d89ffdb8..914cc93e91f6d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -454,6 +454,8 @@ struct nvme_ns_ids {
u8 csi;
};
+#define NVME_MAX_PLIDS (S8_MAX - 1)
+
/*
* Anchor structure for namespaces. There is one for each namespace in a
* NVMe subsystem that any of our controllers can see, and the namespace
@@ -491,6 +493,8 @@ struct nvme_ns_head {
struct device cdev_device;
struct gendisk *disk;
+
+ u16 nr_plids;
#ifdef CONFIG_NVME_MULTIPATH
struct bio_list requeue_list;
spinlock_t requeue_lock;
--
2.43.5
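The arithmetic in this patch can be sketched in user space as follows. This is an illustration only, not the driver code: the `sketch_` names and the `struct sketch_limits` type are hypothetical, the field widths in that struct are assumptions, and `sketch_bytes_to_numd()` mirrors what `nvme_bytes_to_numd()` is understood to compute (a zero-based dword count).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PLIDS	(INT8_MAX - 1)	/* NVME_MAX_PLIDS: S8_MAX - 1 */
#define SKETCH_RUHS_HDR_LEN	16		/* struct nvme_fdp_ruh_status header */
#define SKETCH_RUHSD_LEN	32		/* struct nvme_fdp_ruh_status_desc */

static_assert(SKETCH_MAX_PLIDS == 126, "S8_MAX - 1 is 126");

/* Hypothetical stand-in for the two queue limits this patch fills in. */
struct sketch_limits {
	uint16_t max_write_streams;
	uint64_t write_stream_granularity;
};

/* Size of the I/O management receive buffer for the maximum descriptor count. */
static size_t sketch_ruhs_buf_len(void)
{
	return SKETCH_RUHS_HDR_LEN + (size_t)SKETCH_MAX_PLIDS * SKETCH_RUHSD_LEN;
}

/* Zero-based number of dwords, as carried in the command's numd field. */
static uint32_t sketch_bytes_to_numd(size_t len)
{
	return (uint32_t)(len >> 2) - 1;
}

/* The granularity is only reported when at least one stream is available. */
static struct sketch_limits sketch_fdp_limits(uint16_t nr_plids, uint64_t runs)
{
	struct sketch_limits lim = { 0, 0 };

	lim.max_write_streams = nr_plids;
	if (lim.max_write_streams)
		lim.write_stream_granularity = runs;
	return lim;
}
```

For the maximum of 126 descriptors the buffer is 4048 bytes, which corresponds to a numd value of 1011.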
* [PATCHv12 12/12] nvme: use fdp streams if write stream is provided
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (10 preceding siblings ...)
2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
@ 2024-12-06 22:18 ` Keith Busch
2024-12-09 8:34 ` Hannes Reinecke
[not found] ` <CGME20241210073523epcas5p149482220b87ff3926fb8864ff1660e0c@epcas5p1.samsung.com>
2024-12-09 12:55 ` [PATCHv12 00/12] block write streams with nvme fdp Christoph Hellwig
12 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-06 22:18 UTC (permalink / raw)
To: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
From: Keith Busch <[email protected]>
Map a user-requested write stream to an FDP placement ID if possible.
Signed-off-by: Keith Busch <[email protected]>
---
drivers/nvme/host/core.c | 32 +++++++++++++++++++++++++++++++-
drivers/nvme/host/nvme.h | 1 +
2 files changed, 32 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5f802e243736a..63c8a117b3b4a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -997,6 +997,18 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
if (req->cmd_flags & REQ_RAHEAD)
dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
+ if (op == nvme_cmd_write && ns->head->nr_plids) {
+ u16 write_stream = req->bio->bi_write_stream;
+
+ if (WARN_ON_ONCE(write_stream > ns->head->nr_plids))
+ return BLK_STS_INVAL;
+
+ if (write_stream) {
+ dsmgmt |= ns->head->plids[write_stream - 1] << 16;
+ control |= NVME_RW_DTYPE_DPLCMT;
+ }
+ }
+
if (req->cmd_flags & REQ_ATOMIC && !nvme_valid_atomic_write(req))
return BLK_STS_INVAL;
@@ -2194,11 +2206,12 @@ static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
static int nvme_query_fdp_info(struct nvme_ns *ns, struct nvme_ns_info *info)
{
+ struct nvme_fdp_ruh_status_desc *ruhsd;
struct nvme_ns_head *head = ns->head;
struct nvme_fdp_ruh_status *ruhs;
struct nvme_fdp_config fdp;
struct nvme_command c = {};
- int size, ret;
+ int size, ret, i;
ret = nvme_get_features(ns->ctrl, NVME_FEAT_FDP, info->endgid, NULL, 0,
&fdp);
@@ -2231,6 +2244,19 @@ static int nvme_query_fdp_info(struct nvme_ns *ns, struct nvme_ns_info *info)
if (!head->nr_plids)
goto free;
+ head->nr_plids = min(head->nr_plids, NVME_MAX_PLIDS);
+ head->plids = kcalloc(head->nr_plids, sizeof(head->plids),
+ GFP_KERNEL);
+ if (!head->plids) {
+ ret = -ENOMEM;
+ goto free;
+ }
+
+ for (i = 0; i < head->nr_plids; i++) {
+ ruhsd = &ruhs->ruhsd[i];
+ head->plids[i] = le16_to_cpu(ruhsd->pid);
+ }
+
kfree(ruhs);
return 0;
@@ -2285,6 +2311,10 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
"FDP failure status:0x%x\n", ret);
if (ret < 0)
goto out;
+ } else {
+ ns->head->nr_plids = 0;
+ kfree(ns->head->plids);
+ ns->head->plids = NULL;
}
blk_mq_freeze_queue(ns->disk->queue);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 914cc93e91f6d..49b234bfb42c4 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -495,6 +495,7 @@ struct nvme_ns_head {
struct gendisk *disk;
u16 nr_plids;
+ u16 *plids;
#ifdef CONFIG_NVME_MULTIPATH
struct bio_list requeue_list;
spinlock_t requeue_lock;
--
2.43.5
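The mapping done in nvme_setup_rw() above can be sketched in user space like this. It is a simplified illustration under stated assumptions, not the driver code: `struct sketch_rw`, the `sketch_` functions, and the sample placement IDs are all hypothetical, and the cast to a 32-bit type before the shift is a choice of this sketch. Stream 0 means "no stream", streams 1..nr_plids index the placement ID table, and anything larger is rejected.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RW_DTYPE_DPLCMT	(2 << 4)	/* NVME_RW_DTYPE_DPLCMT from patch 10 */

static_assert(SKETCH_RW_DTYPE_DPLCMT == 0x20, "directive type 2 in bits 7:4");

/* Hypothetical stand-in for the two command fields the patch updates. */
struct sketch_rw {
	uint16_t control;
	uint32_t dsmgmt;
};

/* Returns 0 on success, -1 if write_stream is out of range. */
static int sketch_apply_write_stream(struct sketch_rw *rw, const uint16_t *plids,
				     uint16_t nr_plids, uint8_t write_stream)
{
	if (write_stream > nr_plids)
		return -1;

	if (write_stream) {
		/* The placement ID goes into the upper 16 bits of dsmgmt. */
		rw->dsmgmt |= (uint32_t)plids[write_stream - 1] << 16;
		rw->control |= SKETCH_RW_DTYPE_DPLCMT;
	}
	return 0;
}

/* Sample table of three placement IDs, made up for the demo below. */
static const uint16_t sketch_plids[] = { 0x0000, 0x0001, 0x0102 };

/* Returns the resulting dsmgmt, or UINT32_MAX if the stream was rejected. */
static uint32_t sketch_demo_dsmgmt(uint8_t write_stream)
{
	struct sketch_rw rw = { 0, 0 };

	if (sketch_apply_write_stream(&rw, sketch_plids, 3, write_stream))
		return UINT32_MAX;
	return rw.dsmgmt;
}

/* Returns the resulting control bits, or UINT16_MAX if rejected. */
static uint16_t sketch_demo_control(uint8_t write_stream)
{
	struct sketch_rw rw = { 0, 0 };

	if (sketch_apply_write_stream(&rw, sketch_plids, 3, write_stream))
		return UINT16_MAX;
	return rw.control;
}
```

Note that a placement ID of zero is valid, so the directive type bit in the control field is what tells the device that a placement ID is present.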
* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
@ 2024-12-09 4:05 ` kernel test robot
2024-12-09 12:44 ` Christoph Hellwig
2024-12-09 8:34 ` Hannes Reinecke
` (2 subsequent siblings)
3 siblings, 1 reply; 46+ messages in thread
From: kernel test robot @ 2024-12-09 4:05 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: llvm, oe-kbuild-all, sagi, asml.silence, anuj20.g, joshi.k,
Keith Busch
Hi Keith,
kernel test robot noticed the following build warnings:
[auto build test WARNING on axboe-block/for-next]
[also build test WARNING on next-20241206]
[cannot apply to brauner-vfs/vfs.all hch-configfs/for-next linus/master v6.13-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Keith-Busch/fs-add-write-stream-information-to-statx/20241207-063826
base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/r/20241206221801.790690-12-kbusch%40meta.com
patch subject: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
config: i386-buildonly-randconfig-001-20241207 (https://download.01.org/0day-ci/archive/20241207/[email protected]/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241207/[email protected]/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
All warnings (new ones prefixed by >>):
In file included from drivers/nvme/host/core.c:8:
In file included from include/linux/blkdev.h:9:
In file included from include/linux/blk_types.h:10:
In file included from include/linux/bvec.h:10:
In file included from include/linux/highmem.h:8:
In file included from include/linux/cacheflush.h:5:
In file included from arch/x86/include/asm/cacheflush.h:5:
In file included from include/linux/mm.h:2223:
include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
518 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
| ~~~~~~~~~~~ ^ ~~~
>> drivers/nvme/host/core.c:2187:11: warning: variable 'i' is uninitialized when used here [-Wuninitialized]
2187 | } while (i++ < fdp_idx);
| ^
drivers/nvme/host/core.c:2160:7: note: initialize the variable 'i' to silence this warning
2160 | int i, n, ret;
| ^
| = 0
2 warnings generated.
vim +/i +2187 drivers/nvme/host/core.c
2153
2154 static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
2155 u8 fdp_idx)
2156 {
2157 struct nvme_fdp_config_log hdr, *h;
2158 struct nvme_fdp_config_desc *desc;
2159 size_t size = sizeof(hdr);
2160 int i, n, ret;
2161 void *log;
2162
2163 info->runs = 0;
2164 ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
2165 (void *)&hdr, size, 0, info->endgid);
2166 if (ret)
2167 return ret;
2168
2169 size = le32_to_cpu(hdr.sze);
2170 h = kzalloc(size, GFP_KERNEL);
2171 if (!h)
2172 return 0;
2173
2174 ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
2175 h, size, 0, info->endgid);
2176 if (ret)
2177 goto out;
2178
2179 n = le16_to_cpu(h->numfdpc) + 1;
2180 if (fdp_idx > n)
2181 goto out;
2182
2183 log = h + 1;
2184 do {
2185 desc = log;
2186 log += le16_to_cpu(desc->dsze);
> 2187 } while (i++ < fdp_idx);
2188
2189 info->runs = le64_to_cpu(desc->runs);
2190 out:
2191 kfree(h);
2192 return ret;
2193 }
2194
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
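To make the warning concrete, the descriptor walk can be sketched in user space with the index starting at zero. This is an illustration under stated assumptions, not the fix that was applied: the `sketch_` names are hypothetical, the log is treated as a raw little-endian byte buffer, `numfdpc` is taken to be zero-based as the "+ 1" in the patch implies, and the bounds checks on `dsze` and the `idx >= n` test are additions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_LOG_HDR_LEN	16	/* sizeof(struct nvme_fdp_config_log) */
#define SKETCH_DESC_MIN_LEN	64	/* fixed part of struct nvme_fdp_config_desc */
#define SKETCH_DESC_RUNS_OFF	16	/* offset of 'runs' within a descriptor */

static uint16_t sketch_get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint64_t sketch_get_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Returns 0 and stores 'runs' of descriptor idx, or -1 if it is not there. */
static int sketch_find_runs(const uint8_t *log, size_t len, unsigned int idx,
			    uint64_t *runs)
{
	size_t off = SKETCH_LOG_HDR_LEN;
	unsigned int i, n;

	if (len < SKETCH_LOG_HDR_LEN)
		return -1;

	n = sketch_get_le16(log) + 1u;	/* numfdpc is zero-based (assumption) */
	if (idx >= n)
		return -1;

	for (i = 0; ; i++) {		/* the index is initialized here */
		size_t dsze;

		if (len - off < SKETCH_DESC_MIN_LEN)
			return -1;
		dsze = sketch_get_le16(log + off);
		if (dsze < SKETCH_DESC_MIN_LEN || dsze > len - off)
			return -1;
		if (i == idx)
			break;
		off += dsze;		/* descriptors are variable sized */
	}

	*runs = sketch_get_le64(log + off + SKETCH_DESC_RUNS_OFF);
	return 0;
}

/* Demo log: two descriptors of 68 and 72 bytes; returns 0 if idx is absent. */
static uint64_t sketch_demo_runs(unsigned int idx)
{
	uint8_t log[SKETCH_LOG_HDR_LEN + 68 + 72];
	uint64_t runs = 0;

	memset(log, 0, sizeof(log));
	log[0] = 1;			/* numfdpc = 1, i.e. two configurations */
	log[16] = 68;			/* first descriptor: dsze = 68 */
	log[16 + 16 + 1] = 0x10;	/* first descriptor: runs = 0x1000 */
	log[16 + 68] = 72;		/* second descriptor: dsze = 72 */
	log[16 + 68 + 16 + 1] = 0x20;	/* second descriptor: runs = 0x2000 */

	if (sketch_find_runs(log, sizeof(log), idx, &runs))
		return 0;
	return runs;
}
```

In the demo log, index 0 yields a reclaim unit size of 4096, index 1 yields 8192, and index 2 is rejected because only two configurations are present.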
* Re: [PATCHv12 01/12] fs: add write stream information to statx
2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
@ 2024-12-09 8:25 ` Hannes Reinecke
[not found] ` <CGME20241209115219epcas5p4cfc217e25d977cd87025a4284ba0436c@epcas5p4.samsung.com>
1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:25 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Keith Busch <[email protected]>
>
> Add new statx field to report the maximum number of write streams
> supported and the granularity for them.
>
> Signed-off-by: Keith Busch <[email protected]>
> [hch: rename hint to stream, add granularity]
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> fs/stat.c | 2 ++
> include/linux/stat.h | 2 ++
> include/uapi/linux/stat.h | 7 +++++--
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
* Re: [PATCHv12 02/12] fs: add a write stream field to the kiocb
2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
@ 2024-12-09 8:25 ` Hannes Reinecke
2024-12-09 12:47 ` [PATCHv12 01/12] fs: add write stream information to statx Christian Brauner
[not found] ` <CGME20241210073225epcas5p4b2ed325714e6d17fae9e3e45b8e963f6@epcas5p4.samsung.com>
2 siblings, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:25 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
>
> Prepare for io_uring passthrough of write streams. The write stream
> field in the kiocb structure fits into an existing 2-byte hole, so its
> size is not changed.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> include/linux/fs.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 2cc3d45da7b01..26940c451f319 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -373,6 +373,7 @@ struct kiocb {
> void *private;
> int ki_flags;
> u16 ki_ioprio; /* See linux/ioprio.h */
> + u8 ki_write_stream;
> union {
> /*
> * Only used for async buffered reads, where it denotes the
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
* Re: [PATCHv12 03/12] block: add a bi_write_stream field
2024-12-06 22:17 ` [PATCHv12 03/12] block: add a bi_write_stream field Keith Busch
@ 2024-12-09 8:26 ` Hannes Reinecke
[not found] ` <CGME20241210074213epcas5p22330d197c3e7058e9c2226f28fdb1475@epcas5p2.samsung.com>
1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:26 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
>
> Add the ability to pass a write stream for placement control in the bio.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> block/bio.c | 2 ++
> block/blk-crypto-fallback.c | 1 +
> block/blk-merge.c | 4 ++++
> block/bounce.c | 1 +
> include/linux/blk_types.h | 1 +
> 5 files changed, 9 insertions(+)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
* Re: [PATCHv12 04/12] block: introduce max_write_streams queue limit
2024-12-06 22:17 ` [PATCHv12 04/12] block: introduce max_write_streams queue limit Keith Busch
@ 2024-12-09 8:27 ` Hannes Reinecke
[not found] ` <CGME20241210074628epcas5p3e36c7615cf2a5160d7fe169774fd30db@epcas5p3.samsung.com>
1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:27 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Keith Busch <[email protected]>
>
> Drivers with hardware that supports write streams need a way to export how
> many are available so applications can generically query this.
>
> Signed-off-by: Keith Busch <[email protected]>
> [hch: renamed hints to streams, removed stacking]
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> Documentation/ABI/stable/sysfs-block | 7 +++++++
> block/blk-sysfs.c | 3 +++
> include/linux/blkdev.h | 9 +++++++++
> 3 files changed, 19 insertions(+)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
* Re: [PATCHv12 05/12] block: introduce a write_stream_granularity queue limit
2024-12-06 22:17 ` [PATCHv12 05/12] block: introduce a write_stream_granularity " Keith Busch
@ 2024-12-09 8:29 ` Hannes Reinecke
[not found] ` <CGME20241210075259epcas5p23bbb79cdb18ddbfad337d764d4fe75da@epcas5p2.samsung.com>
1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:29 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
>
> Export the granularity that write streams should be discarded with,
> as it is essential for making good use of them.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> Documentation/ABI/stable/sysfs-block | 8 ++++++++
> block/blk-sysfs.c | 3 +++
> include/linux/blkdev.h | 7 +++++++
> 3 files changed, 18 insertions(+)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
* Re: [PATCHv12 06/12] block: expose write streams for block device nodes
2024-12-06 22:17 ` [PATCHv12 06/12] block: expose write streams for block device nodes Keith Busch
@ 2024-12-09 8:30 ` Hannes Reinecke
[not found] ` <CGME20241209110649epcas5p41df7db0f7ea58f250da647106d25134b@epcas5p4.samsung.com>
1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:30 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
>
> Export statx information about the number and granularity of write
> streams, use the per-kiocb write hint and map temperature hints
> to write streams (which is a bit questionable, but this shows how it is
> done).
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> block/bdev.c | 6 ++++++
> block/fops.c | 23 +++++++++++++++++++++++
> 2 files changed, 29 insertions(+)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
* Re: [PATCHv12 07/12] io_uring: enable per-io write streams
2024-12-06 22:17 ` [PATCHv12 07/12] io_uring: enable per-io write streams Keith Busch
@ 2024-12-09 8:31 ` Hannes Reinecke
0 siblings, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:31 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Keith Busch <[email protected]>
>
> Allow userspace to pass a per-I/O write stream in the SQE:
>
> __u8 write_stream;
>
> The __u8 type matches the size the filesystems and block layer support.
>
> Applications can query the supported values from the statx
> max_write_streams field. Unsupported values are ignored by file
> operations that do not support write streams or rejected with an error
> by those that support them.
>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> include/uapi/linux/io_uring.h | 4 ++++
> io_uring/io_uring.c | 2 ++
> io_uring/rw.c | 1 +
> 3 files changed, 7 insertions(+)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
* Re: [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper
2024-12-06 22:17 ` [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper Keith Busch
@ 2024-12-09 8:31 ` Hannes Reinecke
[not found] ` <CGME20241210121958epcas5p27d14abfca66757a2c42ec71895b008b1@epcas5p2.samsung.com>
1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:31 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
>
> Add a helper for log pages that need to pass in an LSI value, without
> touching all the existing nvme_get_log callers.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> drivers/nvme/host/core.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
* Re: [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result
2024-12-06 22:17 ` [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result Keith Busch
@ 2024-12-09 8:32 ` Hannes Reinecke
[not found] ` <CGME20241210122137epcas5p2e373baa1c99b78341928cc7bf0fe3bdf@epcas5p2.samsung.com>
1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:32 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
>
> That allows passing in structures instead of the u32 result, and thus
> reduces the amount of bit shifting and masking required to parse the
> result.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> drivers/nvme/host/core.c | 4 ++--
> drivers/nvme/host/nvme.h | 4 ++--
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
* Re: [PATCHv12 10/12] nvme.h: add FDP definitions
2024-12-06 22:17 ` [PATCHv12 10/12] nvme.h: add FDP definitions Keith Busch
@ 2024-12-09 8:33 ` Hannes Reinecke
[not found] ` <CGME20241210122702epcas5p4fe3ed43ad714c6b467a35d16135d07c5@epcas5p4.samsung.com>
1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:33 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:17, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
>
> Add the config feature result, config log page, and management receive
> commands needed for FDP.
>
> Partially based on a patch from Kanchan Joshi <[email protected]>.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> [kbusch: renamed some fields to match spec]
> Signed-off-by: Keith Busch <[email protected]>
> ---
> include/linux/nvme.h | 77 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 77 insertions(+)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
2024-12-09 4:05 ` kernel test robot
@ 2024-12-09 8:34 ` Hannes Reinecke
2024-12-09 13:18 ` Christoph Hellwig
2024-12-10 8:45 ` Dan Carpenter
3 siblings, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:34 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:18, Keith Busch wrote:
> From: Keith Busch <[email protected]>
>
> Register the device data placement limits if supported. This is just
> registering the limits with the block layer. Nothing beyond reporting
> these attributes is happening in this patch.
>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> drivers/nvme/host/core.c | 112 +++++++++++++++++++++++++++++++++++++++
> drivers/nvme/host/nvme.h | 4 ++
> 2 files changed, 116 insertions(+)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 12/12] nvme: use fdp streams if write stream is provided
2024-12-06 22:18 ` [PATCHv12 12/12] nvme: use fdp streams if write stream is provided Keith Busch
@ 2024-12-09 8:34 ` Hannes Reinecke
[not found] ` <CGME20241210073523epcas5p149482220b87ff3926fb8864ff1660e0c@epcas5p1.samsung.com>
1 sibling, 0 replies; 46+ messages in thread
From: Hannes Reinecke @ 2024-12-09 8:34 UTC (permalink / raw)
To: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring
Cc: sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 12/6/24 23:18, Keith Busch wrote:
> From: Keith Busch <[email protected]>
>
> Maps a user requested write stream to an FDP placement ID if possible.
>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> drivers/nvme/host/core.c | 32 +++++++++++++++++++++++++++++++-
> drivers/nvme/host/nvme.h | 1 +
> 2 files changed, 32 insertions(+), 1 deletion(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 06/12] block: expose write streams for block device nodes
[not found] ` <CGME20241209110649epcas5p41df7db0f7ea58f250da647106d25134b@epcas5p4.samsung.com>
@ 2024-12-09 10:58 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-09 10:58 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
[-- Attachment #1: Type: text/plain, Size: 1975 bytes --]
On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Export statx information about the number and granularity of write
>streams, use the per-kiocb write hint and map temperature hints
>to write streams (which is a bit questionable, but this shows how it is
>done).
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---
> block/bdev.c | 6 ++++++
> block/fops.c | 23 +++++++++++++++++++++++
> 2 files changed, 29 insertions(+)
>
>diff --git a/block/bdev.c b/block/bdev.c
>index 738e3c8457e7f..c23245f1fdfe3 100644
>--- a/block/bdev.c
>+++ b/block/bdev.c
>@@ -1296,6 +1296,12 @@ void bdev_statx(struct path *path, struct kstat *stat,
> stat->result_mask |= STATX_DIOALIGN;
> }
>
>+ if ((request_mask & STATX_WRITE_STREAM) &&
We may not reach this point if the user application sets neither
STATX_DIOALIGN nor STATX_WRITE_ATOMIC.
>+ bdev_max_write_streams(bdev)) {
>+ stat->write_stream_max = bdev_max_write_streams(bdev);
>+ stat->result_mask |= STATX_WRITE_STREAM;
statx will show a value of 0 for write_stream_granularity.
Below is a fix that might help:
diff --git a/block/bdev.c b/block/bdev.c
index c23245f1fdfe..290577e20457 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1275,7 +1275,8 @@ void bdev_statx(struct path *path, struct kstat *stat,
struct inode *backing_inode;
struct block_device *bdev;
- if (!(request_mask & (STATX_DIOALIGN | STATX_WRITE_ATOMIC)))
+ if (!(request_mask & (STATX_DIOALIGN | STATX_WRITE_ATOMIC |
+ STATX_WRITE_STREAM)))
return;
backing_inode = d_backing_inode(path->dentry);
@@ -1299,6 +1300,7 @@ void bdev_statx(struct path *path, struct kstat *stat,
if ((request_mask & STATX_WRITE_STREAM) &&
bdev_max_write_streams(bdev)) {
stat->write_stream_max = bdev_max_write_streams(bdev);
+ stat->write_stream_granularity = bdev_write_stream_granularity(bdev);
stat->result_mask |= STATX_WRITE_STREAM;
}
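The effect of the two hunks above can be sketched as a small user-space
model. This is only an illustration, not kernel code: struct model_kstat,
model_bdev_statx() and the "fixed" switch are hypothetical names, and the
STATX_WRITE_STREAM value is the one proposed in patch 01/12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The first two bits exist in the uapi header today; the third is the
 * value proposed in patch 01/12 of this series. */
#define STATX_DIOALIGN      0x00002000U
#define STATX_WRITE_ATOMIC  0x00010000U
#define STATX_WRITE_STREAM  0x00020000U

/* Hypothetical stand-in for the kstat fields touched here. */
struct model_kstat {
	uint32_t result_mask;
	uint32_t write_stream_granularity;
	uint16_t write_stream_max;
};

/*
 * Model of the gate at the top of bdev_statx(). With fixed == false the
 * early return ignores STATX_WRITE_STREAM, as in v12. With fixed == true
 * it follows the suggested change and also reports the granularity.
 */
static void model_bdev_statx(struct model_kstat *stat, uint32_t request_mask,
			     uint16_t max_streams, uint32_t granularity,
			     bool fixed)
{
	uint32_t gate = STATX_DIOALIGN | STATX_WRITE_ATOMIC;

	assert(stat != NULL);

	if (fixed)
		gate |= STATX_WRITE_STREAM;
	if (!(request_mask & gate))
		return;

	if ((request_mask & STATX_WRITE_STREAM) && max_streams) {
		stat->write_stream_max = max_streams;
		if (fixed)
			stat->write_stream_granularity = granularity;
		stat->result_mask |= STATX_WRITE_STREAM;
	}
}
```

Calling the model with only STATX_WRITE_STREAM requested leaves
result_mask untouched when fixed is false, which is the behaviour
described above.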
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply related [flat|nested] 46+ messages in thread
* Re: [PATCHv12 01/12] fs: add write stream information to statx
[not found] ` <CGME20241209115219epcas5p4cfc217e25d977cd87025a4284ba0436c@epcas5p4.samsung.com>
@ 2024-12-09 11:44 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-09 11:44 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
[-- Attachment #1: Type: text/plain, Size: 552 bytes --]
On 06/12/24 02:17PM, Keith Busch wrote:
>From: Keith Busch <[email protected]>
>
>Add new statx field to report the maximum number of write streams
>supported and the granularity for them.
>
>Signed-off-by: Keith Busch <[email protected]>
>[hch: rename hint to stream, add granularity]
>Signed-off-by: Christoph Hellwig <[email protected]>
>---
> fs/stat.c | 2 ++
> include/linux/stat.h | 2 ++
> include/uapi/linux/stat.h | 7 +++++--
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
Reviewed-by: Nitesh Shetty <[email protected]>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
2024-12-09 4:05 ` kernel test robot
@ 2024-12-09 12:44 ` Christoph Hellwig
0 siblings, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2024-12-09 12:44 UTC (permalink / raw)
To: kernel test robot
Cc: Keith Busch, axboe, hch, linux-block, linux-nvme, linux-fsdevel,
io-uring, llvm, oe-kbuild-all, sagi, asml.silence, anuj20.g,
joshi.k, Keith Busch
On Mon, Dec 09, 2024 at 12:05:27PM +0800, kernel test robot wrote:
> >> drivers/nvme/host/core.c:2187:11: warning: variable 'i' is uninitialized when used here [-Wuninitialized]
> 2187 | } while (i++ < fdp_idx);
> | ^
> drivers/nvme/host/core.c:2160:7: note: initialize the variable 'i' to silence this warning
> 2160 | int i, n, ret;
> | ^
> | = 0
> 2 warnings generated.
Yeah, looks like this is uninitialized. Did I mention I hate these
variable-length log entries in nvme? They've already been a major
pain in ANA before.
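For illustration, walking variable-length entries with an explicitly
initialized index might look like the sketch below. It is a user-space
model under stated assumptions: struct model_desc and model_find_desc()
are hypothetical, fields are read in host byte order rather than
little-endian, and rejecting an index at or beyond the descriptor count
is this sketch's own choice, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor header: each entry starts with its own size. */
struct model_desc {
	uint16_t dsze;   /* total size of this descriptor in bytes */
	uint16_t pad;
	uint32_t value;  /* stand-in for the field being looked up */
};

/*
 * Store the value of descriptor idx in *out and return 0, or return -1
 * if idx is out of range or a descriptor would run past the buffer.
 */
static int model_find_desc(const uint8_t *log, size_t len, unsigned int count,
			   unsigned int idx, uint32_t *out)
{
	struct model_desc desc;
	size_t off = 0;
	unsigned int i;

	assert(log != NULL && out != NULL);

	if (idx >= count)
		return -1;

	/* The descriptor is loaded in exactly one place inside the loop. */
	for (i = 0; ; i++) {
		if (len - off < sizeof(desc))	/* off <= len always holds */
			return -1;
		memcpy(&desc, log + off, sizeof(desc));
		if (desc.dsze < sizeof(desc) || desc.dsze > len - off)
			return -1;
		if (i == idx)
			break;
		off += desc.dsze;
	}

	*out = desc.value;
	return 0;
}
```

The size checks are there because each entry's length comes from the
device, so a short or corrupt log page should not be able to move the
cursor outside the buffer.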
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 01/12] fs: add write stream information to statx
2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
2024-12-09 8:25 ` Hannes Reinecke
@ 2024-12-09 12:47 ` Christian Brauner
[not found] ` <CGME20241210073225epcas5p4b2ed325714e6d17fae9e3e45b8e963f6@epcas5p4.samsung.com>
2 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2024-12-09 12:47 UTC (permalink / raw)
To: Keith Busch, axboe
Cc: hch, linux-block, linux-nvme, linux-fsdevel, io-uring, sagi,
asml.silence, anuj20.g, joshi.k, Keith Busch
On Fri, Dec 06, 2024 at 02:17:50PM -0800, Keith Busch wrote:
> From: Keith Busch <[email protected]>
>
> Add new statx field to report the maximum number of write streams
> supported and the granularity for them.
>
> Signed-off-by: Keith Busch <[email protected]>
> [hch: rename hint to stream, add granularity]
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> fs/stat.c | 2 ++
> include/linux/stat.h | 2 ++
> include/uapi/linux/stat.h | 7 +++++--
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/stat.c b/fs/stat.c
> index 0870e969a8a0b..00e4598b1ff25 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -729,6 +729,8 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
> tmp.stx_atomic_write_unit_min = stat->atomic_write_unit_min;
> tmp.stx_atomic_write_unit_max = stat->atomic_write_unit_max;
> tmp.stx_atomic_write_segments_max = stat->atomic_write_segments_max;
> + tmp.stx_write_stream_granularity = stat->write_stream_granularity;
> + tmp.stx_write_stream_max = stat->write_stream_max;
>
> return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
> }
> diff --git a/include/linux/stat.h b/include/linux/stat.h
> index 3d900c86981c5..36d4dfb291abd 100644
> --- a/include/linux/stat.h
> +++ b/include/linux/stat.h
> @@ -57,6 +57,8 @@ struct kstat {
> u32 atomic_write_unit_min;
> u32 atomic_write_unit_max;
> u32 atomic_write_segments_max;
> + u32 write_stream_granularity;
> + u16 write_stream_max;
> };
>
> /* These definitions are internal to the kernel for now. Mainly used by nfsd. */
> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> index 887a252864416..547c62a1a3a7c 100644
> --- a/include/uapi/linux/stat.h
> +++ b/include/uapi/linux/stat.h
> @@ -132,9 +132,11 @@ struct statx {
> __u32 stx_atomic_write_unit_max; /* Max atomic write unit in bytes */
> /* 0xb0 */
> __u32 stx_atomic_write_segments_max; /* Max atomic write segment count */
> - __u32 __spare1[1];
> + __u32 stx_write_stream_granularity;
> /* 0xb8 */
> - __u64 __spare3[9]; /* Spare space for future expansion */
> + __u16 stx_write_stream_max;
> + __u16 __spare2[3];
> + __u64 __spare3[8]; /* Spare space for future expansion */
> /* 0x100 */
> };
Once you're ready to merge, let me know so I can give you a stable
branch with the fs changes.
>
> @@ -164,6 +166,7 @@ struct statx {
> #define STATX_MNT_ID_UNIQUE 0x00004000U /* Want/got extended stx_mount_id */
> #define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
> #define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
> +#define STATX_WRITE_STREAM 0x00020000U /* Want/got write_stream_* */
>
> #define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
>
> --
> 2.43.5
>
On Fri, Dec 06, 2024 at 02:17:51PM -0800, Keith Busch wrote:
> From: Christoph Hellwig <[email protected]>
>
> Prepare for io_uring passthrough of write streams. The write stream
> field in the kiocb structure fits into an existing 2-byte hole, so its
> size is not changed.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Keith Busch <[email protected]>
> ---
> include/linux/fs.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 2cc3d45da7b01..26940c451f319 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -373,6 +373,7 @@ struct kiocb {
> void *private;
> int ki_flags;
> u16 ki_ioprio; /* See linux/ioprio.h */
> + u8 ki_write_stream;
> union {
> /*
> * Only used for async buffered reads, where it denotes the
> --
> 2.43.5
>
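The statx hunk quoted above replaces spare fields with the new ones
without growing the structure. A compile-time check of that accounting
can be sketched as follows. Only the tail from offset 0xb0 is modelled;
the struct names and statx_tail_growth() are hypothetical, and the spare
members are renamed without leading underscores to stay clear of
reserved identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tail of struct statx from offset 0xb0, before the patch. */
struct statx_tail_old {
	uint32_t stx_atomic_write_segments_max;	/* 0xb0 */
	uint32_t spare1[1];			/* 0xb4 */
	uint64_t spare3[9];			/* 0xb8 */
};

/* The same region with the two write stream fields carved out. */
struct statx_tail_new {
	uint32_t stx_atomic_write_segments_max;	/* 0xb0 */
	uint32_t stx_write_stream_granularity;	/* 0xb4 */
	uint16_t stx_write_stream_max;		/* 0xb8 */
	uint16_t spare2[3];			/* 0xba */
	uint64_t spare3[8];			/* 0xc0 */
};

static_assert(sizeof(struct statx_tail_old) == 0x100 - 0xb0,
	      "old tail must end at 0x100");
static_assert(sizeof(struct statx_tail_new) == sizeof(struct statx_tail_old),
	      "new fields must fit in the old spare space");
static_assert(offsetof(struct statx_tail_new, stx_write_stream_max) ==
	      0xb8 - 0xb0, "stream max sits where spare3 used to start");
static_assert(offsetof(struct statx_tail_new, spare3) == 0xc0 - 0xb0,
	      "remaining spare space stays 8-byte aligned");

/* Number of bytes the tail grew by; expected to be zero. */
static size_t statx_tail_growth(void)
{
	return sizeof(struct statx_tail_new) - sizeof(struct statx_tail_old);
}
```

If any of the assertions failed, the build would stop, which is a cheap
way to catch an accidental UAPI size change.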
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 00/12] block write streams with nvme fdp
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
` (11 preceding siblings ...)
2024-12-06 22:18 ` [PATCHv12 12/12] nvme: use fdp streams if write stream is provided Keith Busch
@ 2024-12-09 12:55 ` Christoph Hellwig
2024-12-09 16:07 ` Keith Busch
12 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2024-12-09 12:55 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
I just compared this to a crude rebase of what I last sent out, and
AFAICS the differences are:
1) basically all new io_uring handling due to the integrity stuff that
went in
2) fixes for the NVMe FDP log page parsing
3) drop the support for the remapping of per-partition streams
Conceptually this all looks fine to me. I'll throw in a few nitpicks
on the nvme bits, and I'd need to get up to speed a bit more on the
io_uring bits before commenting usefully.
One thing I was pondering for a new version is whether statx really is
the right vehicle for this, as it is very common fast-path
information. If we had a separate streaminfo ioctl or fcntl it might
be easier to leave a bit of spare space for extensibility. I can try to
prototype that or we can leave it as-is because everyone is tired of
the series.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
2024-12-09 4:05 ` kernel test robot
2024-12-09 8:34 ` Hannes Reinecke
@ 2024-12-09 13:18 ` Christoph Hellwig
2024-12-09 16:29 ` Keith Busch
2024-12-10 8:45 ` Dan Carpenter
3 siblings, 1 reply; 46+ messages in thread
From: Christoph Hellwig @ 2024-12-09 13:18 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
> +static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
> + u8 fdp_idx)
Maybe name this nvme_query_fdp_runs, or something else that makes it
clear this is trying to find the runs field, so the name is a little
bit more descriptive.
> +{
> + struct nvme_fdp_config_log hdr, *h;
> + struct nvme_fdp_config_desc *desc;
> + size_t size = sizeof(hdr);
> + int i, n, ret;
> + void *log;
> +
> + info->runs = 0;
> + ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
Overly long line here, and same for the second call below.
> + (void *)&hdr, size, 0, info->endgid);
And this cast isn't actually needed.
> + n = le16_to_cpu(h->numfdpc) + 1;
> + if (fdp_idx > n)
> + goto out;
> +
> + log = h + 1;
> + do {
> + desc = log;
> + log += le16_to_cpu(desc->dsze);
> + } while (i++ < fdp_idx);
Maybe a for loop makes it easier to avoid the uninitialized variable,
e.g.
for (i = 0; i < fdp_idx; i++) {
..
> + if (ns->ctrl->ctratt & NVME_CTRL_ATTR_FDPS) {
> + ret = nvme_query_fdp_info(ns, info);
> + if (ret)
> + dev_warn(ns->ctrl->device,
> + "FDP failure status:0x%x\n", ret);
> + if (ret < 0)
> + goto out;
> + }
Looking at the full series with the next patch applied I'm a bit
confused about the handling when rescanning. AFAIK the code now always
goes into nvme_query_fdp_info when NVME_CTRL_ATTR_FDPS is set, even if
head->plids/head->nr_plids are already populated, and will then simply
overwrite them.
Also the old freeing of head->plids in nvme_free_ns_head seems gone in
this version.
Last but not least, "FDP failure" is probably not a very helpful message
when it could come from about half a dozen different commands sent to
the device.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 00/12] block write streams with nvme fdp
2024-12-09 12:55 ` [PATCHv12 00/12] block write streams with nvme fdp Christoph Hellwig
@ 2024-12-09 16:07 ` Keith Busch
2024-12-10 1:49 ` Martin K. Petersen
2024-12-10 7:19 ` Christoph Hellwig
0 siblings, 2 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-09 16:07 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, axboe, linux-block, linux-nvme, linux-fsdevel,
io-uring, sagi, asml.silence, anuj20.g, joshi.k
On Mon, Dec 09, 2024 at 01:55:11PM +0100, Christoph Hellwig wrote:
> I just compared this to a crude rebase of what I last sent out, and
> AFAICS the differences are:
>
> 1) basically all new io_uring handling due to the integrity stuff that
> went in
> 2) fixes for the NVMe FDP log page parsing
> 3) drop the support for the remapping of per-partition streams
Yep, pretty much. I will revisit the partition mapping. I just haven't
heard any use cases for divvying the streams up this way, so it's not
clear to me what the interface needs to provide.
> One thing I was pondering for a new version is whether statx really is
> the right vehicle for this, as it is very common fast-path
> information. If we had a separate streaminfo ioctl or fcntl it might
> be easier to leave a bit of spare space for extensibility. I can try to
> prototype that or we can leave it as-is because everyone is tired of
> the series.
Oh sure. I can live without the statx parts from this series if you
prefer we take additional time to consider other approaches. We have the
sysfs block attributes reporting the same information, and that is okay
for now.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
2024-12-09 13:18 ` Christoph Hellwig
@ 2024-12-09 16:29 ` Keith Busch
0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-09 16:29 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, axboe, linux-block, linux-nvme, linux-fsdevel,
io-uring, sagi, asml.silence, anuj20.g, joshi.k
On Mon, Dec 09, 2024 at 02:18:19PM +0100, Christoph Hellwig wrote:
> > + n = le16_to_cpu(h->numfdpc) + 1;
> > + if (fdp_idx > n)
> > + goto out;
> > +
> > + log = h + 1;
> > + do {
> > + desc = log;
> > + log += le16_to_cpu(desc->dsze);
> > + } while (i++ < fdp_idx);
>
> Maybe a for loop makes it easier to avoid the uninitialized variable,
> e.g.
>
> for (i = 0; i < fdp_idx; i++) {
> ..
Yeah, okay. I was just trying to cleverly have a single place where the
descriptor is set. A for-loop needs to set it both within and after the
loop.
> > + if (ns->ctrl->ctratt & NVME_CTRL_ATTR_FDPS) {
> > + ret = nvme_query_fdp_info(ns, info);
> > + if (ret)
> > + dev_warn(ns->ctrl->device,
> > + "FDP failure status:0x%x\n", ret);
> > + if (ret < 0)
> > + goto out;
> > + }
>
> Looking at the full series with the next patch applied I'm a bit
> confused about the handling when rescanning. AFAIK the code now always
> goes into nvme_query_fdp_info when NVME_CTRL_ATTR_FDPS is set, even if
> head->plids/head->nr_plids are already populated, and will then simply
> overwrite them.
I thought you could change the FDP configuration on a live namespace
with the Set Features command, so I needed to account for that. But the
spec really does restrict that feature to endurance groups without
namespaces, so I was mistaken and we can skip re-validating FDP state
after the first scan.
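Skipping the query on rescan could be sketched as below. This is a
user-space model only: struct model_ns_head, its queries counter and
model_update_fdp() are hypothetical, and the device query is reduced to
a plain assignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-namespace-head FDP state. */
struct model_ns_head {
	uint16_t nr_plids;	/* 0 means "not populated yet" */
	unsigned int queries;	/* how many times the device was asked */
};

/*
 * Populate the placement ID count on the first scan only. Later scans
 * return early and keep whatever the first scan stored.
 */
static int model_update_fdp(struct model_ns_head *head, bool ctrl_has_fdp,
			    uint16_t device_plids)
{
	assert(head != NULL);

	if (!ctrl_has_fdp)
		return 0;
	if (head->nr_plids)	/* already set by an earlier scan */
		return 0;

	head->queries++;	/* stands in for the commands sent to the device */
	head->nr_plids = device_plids;
	return 0;
}
```

A second call with a different device_plids value leaves nr_plids
unchanged, which matches the idea that the configuration cannot change
while namespaces exist in the endurance group.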
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 00/12] block write streams with nvme fdp
2024-12-09 16:07 ` Keith Busch
@ 2024-12-10 1:49 ` Martin K. Petersen
2024-12-10 7:19 ` Christoph Hellwig
1 sibling, 0 replies; 46+ messages in thread
From: Martin K. Petersen @ 2024-12-10 1:49 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, Keith Busch, axboe, linux-block, linux-nvme,
linux-fsdevel, io-uring, sagi, asml.silence, anuj20.g, joshi.k
Hi Keith!
>> 3) drop the support for the remapping of per-partition streams
>
> Yep, pretty much. I will revisit the partition mapping. I just haven't
> heard any use cases for divvying the streams up this way, so it's not
> clear to me what the interface needs to provide.
Since the streams are a (very) scarce hardware resource, it does seem to
me like we should have an explicit interface for an entity (whether
app-on-bdev or a filesystem) to allocate them.
While there certainly are cases where there is a 1:1 app-to-device
mapping, as soon as you add virtualization or enterprise apps to the
mix, that assumption quickly falls apart...
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 00/12] block write streams with nvme fdp
2024-12-09 16:07 ` Keith Busch
2024-12-10 1:49 ` Martin K. Petersen
@ 2024-12-10 7:19 ` Christoph Hellwig
1 sibling, 0 replies; 46+ messages in thread
From: Christoph Hellwig @ 2024-12-10 7:19 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, Keith Busch, axboe, linux-block, linux-nvme,
linux-fsdevel, io-uring, sagi, asml.silence, anuj20.g, joshi.k
On Mon, Dec 09, 2024 at 09:07:35AM -0700, Keith Busch wrote:
> Yep, pretty much. I will revisit the partition mapping. I just haven't
> heard any use cases for divvying the streams up this way, so it's not
> clear to me what the interface needs to provide.
Yes, it would be good to understand use cases first. I just threw the
patch in as a POC to show we can do it.
> > One thing I was pondering for a new version is whether statx really is
> > the right vehicle for this, as it is very common fast-path
> > information. If we had a separate streaminfo ioctl or fcntl it might
> > be easier to leave a bit of spare space for extensibility. I can try to
> > prototype that or we can leave it as-is because everyone is tired of
> > the series.
>
> Oh sure. I can live without the statx parts from this series if you
> prefer we take additional time to consider other approaches. We have the
> sysfs block attributes reporting the same information, and that is okay
> for now.
I'll try to find some time this afternoon for an interface, but if it
doesn't arrive in time we can probably drop it for the next submission.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 02/12] fs: add a write stream field to the kiocb
[not found] ` <CGME20241210073225epcas5p4b2ed325714e6d17fae9e3e45b8e963f6@epcas5p4.samsung.com>
@ 2024-12-10 7:24 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 7:24 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
[-- Attachment #1: Type: text/plain, Size: 397 bytes --]
On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Prepare for io_uring passthrough of write streams. The write stream
>field in the kiocb structure fits into an existing 2-byte hole, so its
>size is not changed.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---
Reviewed-by: Nitesh Shetty <[email protected]>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 12/12] nvme: use fdp streams if write stream is provided
[not found] ` <CGME20241210073523epcas5p149482220b87ff3926fb8864ff1660e0c@epcas5p1.samsung.com>
@ 2024-12-10 7:27 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 7:27 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
[-- Attachment #1: Type: text/plain, Size: 255 bytes --]
On 06/12/24 02:18PM, Keith Busch wrote:
>From: Keith Busch <[email protected]>
>
>Maps a user requested write stream to an FDP placement ID if possible.
>
>Signed-off-by: Keith Busch <[email protected]>
Reviewed-by: Nitesh Shetty <[email protected]>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 03/12] block: add a bi_write_stream field
[not found] ` <CGME20241210074213epcas5p22330d197c3e7058e9c2226f28fdb1475@epcas5p2.samsung.com>
@ 2024-12-10 7:34 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 7:34 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
[-- Attachment #1: Type: text/plain, Size: 307 bytes --]
On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Add the ability to pass a write stream for placement control in the bio.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---
Reviewed-by: Nitesh Shetty <[email protected]>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 04/12] block: introduce max_write_streams queue limit
[not found] ` <CGME20241210074628epcas5p3e36c7615cf2a5160d7fe169774fd30db@epcas5p3.samsung.com>
@ 2024-12-10 7:38 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 7:38 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
[-- Attachment #1: Type: text/plain, Size: 424 bytes --]
On 06/12/24 02:17PM, Keith Busch wrote:
>From: Keith Busch <[email protected]>
>
>Drivers with hardware that support write streams need a way to export how
>many are available so applications can generically query this.
>
>Signed-off-by: Keith Busch <[email protected]>
>[hch: renamed hints to streams, removed stacking]
>Signed-off-by: Christoph Hellwig <[email protected]>
>---
Reviewed-by: Nitesh Shetty <[email protected]>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 05/12] block: introduce a write_stream_granularity queue limit
[not found] ` <CGME20241210075259epcas5p23bbb79cdb18ddbfad337d764d4fe75da@epcas5p2.samsung.com>
@ 2024-12-10 7:45 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 7:45 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
[-- Attachment #1: Type: text/plain, Size: 352 bytes --]
On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Export the granularity that write streams should be discarded with,
>as it is essential for making good use of them.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---
Reviewed-by: Nitesh Shetty <[email protected]>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
` (2 preceding siblings ...)
2024-12-09 13:18 ` Christoph Hellwig
@ 2024-12-10 8:45 ` Dan Carpenter
2024-12-10 15:23 ` Keith Busch
3 siblings, 1 reply; 46+ messages in thread
From: Dan Carpenter @ 2024-12-10 8:45 UTC (permalink / raw)
To: oe-kbuild, Keith Busch, axboe, hch, linux-block, linux-nvme,
linux-fsdevel, io-uring
Cc: lkp, oe-kbuild-all, sagi, asml.silence, anuj20.g, joshi.k,
Keith Busch
Hi Keith,
kernel test robot noticed the following build warnings:
url: https://github.com/intel-lab-lkp/linux/commits/Keith-Busch/fs-add-write-stream-information-to-statx/20241207-063826
base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/r/20241206221801.790690-12-kbusch%40meta.com
patch subject: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
config: csky-randconfig-r072-20241209 (https://download.01.org/0day-ci/archive/20241210/[email protected]/config)
compiler: csky-linux-gcc (GCC) 14.2.0
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <[email protected]>
| Reported-by: Dan Carpenter <[email protected]>
| Closes: https://lore.kernel.org/r/[email protected]/
New smatch warnings:
drivers/nvme/host/core.c:2187 nvme_check_fdp() error: uninitialized symbol 'i'.
drivers/nvme/host/core.c:2232 nvme_query_fdp_info() warn: missing error code 'ret'
vim +/i +2187 drivers/nvme/host/core.c
04ca0849938146 Keith Busch 2024-12-06 2154 static int nvme_check_fdp(struct nvme_ns *ns, struct nvme_ns_info *info,
04ca0849938146 Keith Busch 2024-12-06 2155 u8 fdp_idx)
04ca0849938146 Keith Busch 2024-12-06 2156 {
04ca0849938146 Keith Busch 2024-12-06 2157 struct nvme_fdp_config_log hdr, *h;
04ca0849938146 Keith Busch 2024-12-06 2158 struct nvme_fdp_config_desc *desc;
04ca0849938146 Keith Busch 2024-12-06 2159 size_t size = sizeof(hdr);
04ca0849938146 Keith Busch 2024-12-06 2160 int i, n, ret;
04ca0849938146 Keith Busch 2024-12-06 2161 void *log;
04ca0849938146 Keith Busch 2024-12-06 2162
04ca0849938146 Keith Busch 2024-12-06 2163 info->runs = 0;
04ca0849938146 Keith Busch 2024-12-06 2164 ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
04ca0849938146 Keith Busch 2024-12-06 2165 (void *)&hdr, size, 0, info->endgid);
04ca0849938146 Keith Busch 2024-12-06 2166 if (ret)
04ca0849938146 Keith Busch 2024-12-06 2167 return ret;
04ca0849938146 Keith Busch 2024-12-06 2168
04ca0849938146 Keith Busch 2024-12-06 2169 size = le32_to_cpu(hdr.sze);
04ca0849938146 Keith Busch 2024-12-06 2170 h = kzalloc(size, GFP_KERNEL);
04ca0849938146 Keith Busch 2024-12-06 2171 if (!h)
04ca0849938146 Keith Busch 2024-12-06 2172 return 0;
04ca0849938146 Keith Busch 2024-12-06 2173
04ca0849938146 Keith Busch 2024-12-06 2174 ret = nvme_get_log_lsi(ns->ctrl, 0, NVME_LOG_FDP_CONFIGS, 0, NVME_CSI_NVM,
04ca0849938146 Keith Busch 2024-12-06 2175 h, size, 0, info->endgid);
04ca0849938146 Keith Busch 2024-12-06 2176 if (ret)
04ca0849938146 Keith Busch 2024-12-06 2177 goto out;
04ca0849938146 Keith Busch 2024-12-06 2178
04ca0849938146 Keith Busch 2024-12-06 2179 n = le16_to_cpu(h->numfdpc) + 1;
04ca0849938146 Keith Busch 2024-12-06 2180 if (fdp_idx > n)
04ca0849938146 Keith Busch 2024-12-06 2181 goto out;
04ca0849938146 Keith Busch 2024-12-06 2182
04ca0849938146 Keith Busch 2024-12-06 2183 log = h + 1;
04ca0849938146 Keith Busch 2024-12-06 2184 do {
04ca0849938146 Keith Busch 2024-12-06 2185 desc = log;
04ca0849938146 Keith Busch 2024-12-06 2186 log += le16_to_cpu(desc->dsze);
04ca0849938146 Keith Busch 2024-12-06 @2187 } while (i++ < fdp_idx);
^
i needs to be initialized to zero at the start.
04ca0849938146 Keith Busch 2024-12-06 2188
04ca0849938146 Keith Busch 2024-12-06 2189 info->runs = le64_to_cpu(desc->runs);
04ca0849938146 Keith Busch 2024-12-06 2190 out:
04ca0849938146 Keith Busch 2024-12-06 2191 kfree(h);
04ca0849938146 Keith Busch 2024-12-06 2192 return ret;
04ca0849938146 Keith Busch 2024-12-06 2193 }
04ca0849938146 Keith Busch 2024-12-06 2194
04ca0849938146 Keith Busch 2024-12-06 2195 static int nvme_query_fdp_info(struct nvme_ns *ns, struct nvme_ns_info *info)
04ca0849938146 Keith Busch 2024-12-06 2196 {
04ca0849938146 Keith Busch 2024-12-06 2197 struct nvme_ns_head *head = ns->head;
04ca0849938146 Keith Busch 2024-12-06 2198 struct nvme_fdp_ruh_status *ruhs;
04ca0849938146 Keith Busch 2024-12-06 2199 struct nvme_fdp_config fdp;
04ca0849938146 Keith Busch 2024-12-06 2200 struct nvme_command c = {};
04ca0849938146 Keith Busch 2024-12-06 2201 int size, ret;
04ca0849938146 Keith Busch 2024-12-06 2202
04ca0849938146 Keith Busch 2024-12-06 2203 ret = nvme_get_features(ns->ctrl, NVME_FEAT_FDP, info->endgid, NULL, 0,
04ca0849938146 Keith Busch 2024-12-06 2204 &fdp);
04ca0849938146 Keith Busch 2024-12-06 2205 if (ret)
04ca0849938146 Keith Busch 2024-12-06 2206 goto err;
04ca0849938146 Keith Busch 2024-12-06 2207
04ca0849938146 Keith Busch 2024-12-06 2208 if (!(fdp.flags & FDPCFG_FDPE))
04ca0849938146 Keith Busch 2024-12-06 2209 goto err;
04ca0849938146 Keith Busch 2024-12-06 2210
04ca0849938146 Keith Busch 2024-12-06 2211 ret = nvme_check_fdp(ns, info, fdp.fdpcidx);
04ca0849938146 Keith Busch 2024-12-06 2212 if (ret || !info->runs)
04ca0849938146 Keith Busch 2024-12-06 2213 goto err;
04ca0849938146 Keith Busch 2024-12-06 2214
04ca0849938146 Keith Busch 2024-12-06 2215 size = struct_size(ruhs, ruhsd, NVME_MAX_PLIDS);
04ca0849938146 Keith Busch 2024-12-06 2216 ruhs = kzalloc(size, GFP_KERNEL);
04ca0849938146 Keith Busch 2024-12-06 2217 if (!ruhs) {
04ca0849938146 Keith Busch 2024-12-06 2218 ret = -ENOMEM;
04ca0849938146 Keith Busch 2024-12-06 2219 goto err;
04ca0849938146 Keith Busch 2024-12-06 2220 }
04ca0849938146 Keith Busch 2024-12-06 2221
04ca0849938146 Keith Busch 2024-12-06 2222 c.imr.opcode = nvme_cmd_io_mgmt_recv;
04ca0849938146 Keith Busch 2024-12-06 2223 c.imr.nsid = cpu_to_le32(head->ns_id);
04ca0849938146 Keith Busch 2024-12-06 2224 c.imr.mo = NVME_IO_MGMT_RECV_MO_RUHS;
04ca0849938146 Keith Busch 2024-12-06 2225 c.imr.numd = cpu_to_le32(nvme_bytes_to_numd(size));
04ca0849938146 Keith Busch 2024-12-06 2226 ret = nvme_submit_sync_cmd(ns->queue, &c, ruhs, size);
04ca0849938146 Keith Busch 2024-12-06 2227 if (ret)
04ca0849938146 Keith Busch 2024-12-06 2228 goto free;
04ca0849938146 Keith Busch 2024-12-06 2229
04ca0849938146 Keith Busch 2024-12-06 2230 head->nr_plids = le16_to_cpu(ruhs->nruhsd);
04ca0849938146 Keith Busch 2024-12-06 2231 if (!head->nr_plids)
04ca0849938146 Keith Busch 2024-12-06 @2232 goto free;
ret = -EINVAL?
04ca0849938146 Keith Busch 2024-12-06 2233
04ca0849938146 Keith Busch 2024-12-06 2234 kfree(ruhs);
04ca0849938146 Keith Busch 2024-12-06 2235 return 0;
04ca0849938146 Keith Busch 2024-12-06 2236
04ca0849938146 Keith Busch 2024-12-06 2237 free:
04ca0849938146 Keith Busch 2024-12-06 2238 kfree(ruhs);
04ca0849938146 Keith Busch 2024-12-06 2239 err:
04ca0849938146 Keith Busch 2024-12-06 2240 head->nr_plids = 0;
04ca0849938146 Keith Busch 2024-12-06 2241 info->runs = 0;
04ca0849938146 Keith Busch 2024-12-06 2242 return ret;
04ca0849938146 Keith Busch 2024-12-06 2243 }
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper
[not found] ` <CGME20241210121958epcas5p27d14abfca66757a2c42ec71895b008b1@epcas5p2.samsung.com>
@ 2024-12-10 12:12 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 12:12 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>For log pages that need to pass in a LSI value, while at the same time
>not touching all the existing nvme_get_log callers.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---
Reviewed-by: Nitesh Shetty <[email protected]>
* Re: [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result
[not found] ` <CGME20241210122137epcas5p2e373baa1c99b78341928cc7bf0fe3bdf@epcas5p2.samsung.com>
@ 2024-12-10 12:13 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 12:13 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>That allows passing in structures instead of the u32 result, and thus
>reduce the amount of bit shifting and masking required to parse the
>result.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>Signed-off-by: Keith Busch <[email protected]>
>---
Reviewed-by: Nitesh Shetty <[email protected]>
* Re: [PATCHv12 10/12] nvme.h: add FDP definitions
[not found] ` <CGME20241210122702epcas5p4fe3ed43ad714c6b467a35d16135d07c5@epcas5p4.samsung.com>
@ 2024-12-10 12:19 ` Nitesh Shetty
0 siblings, 0 replies; 46+ messages in thread
From: Nitesh Shetty @ 2024-12-10 12:19 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, linux-block, linux-nvme, linux-fsdevel, io-uring,
sagi, asml.silence, anuj20.g, joshi.k, Keith Busch
On 06/12/24 02:17PM, Keith Busch wrote:
>From: Christoph Hellwig <[email protected]>
>
>Add the config feature result, config log page, and management receive
>commands needed for FDP.
>
>Partially based on a patch from Kanchan Joshi <[email protected]>.
>
>Signed-off-by: Christoph Hellwig <[email protected]>
>[kbusch: renamed some fields to match spec]
>Signed-off-by: Keith Busch <[email protected]>
>---
Reviewed-by: Nitesh Shetty <[email protected]>
* Re: [PATCHv12 11/12] nvme: register fdp parameters with the block layer
2024-12-10 8:45 ` Dan Carpenter
@ 2024-12-10 15:23 ` Keith Busch
0 siblings, 0 replies; 46+ messages in thread
From: Keith Busch @ 2024-12-10 15:23 UTC (permalink / raw)
To: Dan Carpenter
Cc: oe-kbuild, Keith Busch, axboe, hch, linux-block, linux-nvme,
linux-fsdevel, io-uring, lkp, oe-kbuild-all, sagi, asml.silence,
anuj20.g, joshi.k
On Tue, Dec 10, 2024 at 11:45:43AM +0300, Dan Carpenter wrote:
> 04ca0849938146 Keith Busch 2024-12-06 2226 ret = nvme_submit_sync_cmd(ns->queue, &c, ruhs, size);
> 04ca0849938146 Keith Busch 2024-12-06 2227 if (ret)
> 04ca0849938146 Keith Busch 2024-12-06 2228 goto free;
> 04ca0849938146 Keith Busch 2024-12-06 2229
> 04ca0849938146 Keith Busch 2024-12-06 2230 head->nr_plids = le16_to_cpu(ruhs->nruhsd);
> 04ca0849938146 Keith Busch 2024-12-06 2231 if (!head->nr_plids)
> 04ca0849938146 Keith Busch 2024-12-06 @2232 goto free;
>
> ret = -EINVAL?
It's very much on purpose to return "0" here. Returning a negative error
makes the driver fail the namespace disk creation. Seeing a stream
configuration the driver doesn't support just means you don't get to use
the block layer's write stream features. You should still be able to use
your namespace the same as before the driver started checking these
configs. Otherwise it's a regression, since such namespaces are usable
today.
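
The convention described above can be sketched in a small userspace model.
This is only an illustration under stated assumptions, not the driver code:
`struct demo_ns`, `demo_query_streams()` and `demo_create_disk()` are
hypothetical names, `cmd_ret` stands in for the result of the management
receive command, and the caller is assumed to fail disk creation only on a
negative return.

```c
#include <errno.h>

/* Hypothetical stand-in for the namespace state; not the kernel struct. */
struct demo_ns {
	unsigned int nr_plids;	/* usable placement IDs, 0 = no write streams */
	int disk_created;	/* set once the (modelled) disk exists */
};

/*
 * Modelled probe: a failed command propagates its error, but a device that
 * reports zero placement handles is not an error. In that case ret is still
 * 0 when we jump to the cleanup label, so the caller carries on without
 * write streams.
 */
static int demo_query_streams(struct demo_ns *ns, int cmd_ret,
			      unsigned int nruhsd)
{
	int ret = cmd_ret;

	if (ret)
		goto err;

	ns->nr_plids = nruhsd;
	if (!ns->nr_plids)
		goto err;	/* unsupported config: ret == 0 on purpose */

	return 0;
err:
	ns->nr_plids = 0;
	return ret;
}

/* Assumed caller: only a negative return blocks disk creation. */
static int demo_create_disk(struct demo_ns *ns, int cmd_ret,
			    unsigned int nruhsd)
{
	int ret = demo_query_streams(ns, cmd_ret, nruhsd);

	if (ret < 0)
		return ret;

	ns->disk_created = 1;
	return 0;
}
```

With this shape, a namespace reporting zero handles still gets its disk and
merely ends up with nr_plids == 0, whereas changing that path to return
-EINVAL would make demo_create_disk() fail for a namespace that works today.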
Thread overview: 46+ messages
2024-12-06 22:17 [PATCHv12 00/12] block write streams with nvme fdp Keith Busch
2024-12-06 22:17 ` [PATCHv12 01/12] fs: add write stream information to statx Keith Busch
2024-12-09 8:25 ` Hannes Reinecke
[not found] ` <CGME20241209115219epcas5p4cfc217e25d977cd87025a4284ba0436c@epcas5p4.samsung.com>
2024-12-09 11:44 ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Keith Busch
2024-12-09 8:25 ` Hannes Reinecke
2024-12-09 12:47 ` [PATCHv12 01/12] fs: add write stream information to statx Christian Brauner
[not found] ` <CGME20241210073225epcas5p4b2ed325714e6d17fae9e3e45b8e963f6@epcas5p4.samsung.com>
2024-12-10 7:24 ` [PATCHv12 02/12] fs: add a write stream field to the kiocb Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 03/12] block: add a bi_write_stream field Keith Busch
2024-12-09 8:26 ` Hannes Reinecke
[not found] ` <CGME20241210074213epcas5p22330d197c3e7058e9c2226f28fdb1475@epcas5p2.samsung.com>
2024-12-10 7:34 ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 04/12] block: introduce max_write_streams queue limit Keith Busch
2024-12-09 8:27 ` Hannes Reinecke
[not found] ` <CGME20241210074628epcas5p3e36c7615cf2a5160d7fe169774fd30db@epcas5p3.samsung.com>
2024-12-10 7:38 ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 05/12] block: introduce a write_stream_granularity " Keith Busch
2024-12-09 8:29 ` Hannes Reinecke
[not found] ` <CGME20241210075259epcas5p23bbb79cdb18ddbfad337d764d4fe75da@epcas5p2.samsung.com>
2024-12-10 7:45 ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 06/12] block: expose write streams for block device nodes Keith Busch
2024-12-09 8:30 ` Hannes Reinecke
[not found] ` <CGME20241209110649epcas5p41df7db0f7ea58f250da647106d25134b@epcas5p4.samsung.com>
2024-12-09 10:58 ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 07/12] io_uring: enable per-io write streams Keith Busch
2024-12-09 8:31 ` Hannes Reinecke
2024-12-06 22:17 ` [PATCHv12 08/12] nvme: add a nvme_get_log_lsi helper Keith Busch
2024-12-09 8:31 ` Hannes Reinecke
[not found] ` <CGME20241210121958epcas5p27d14abfca66757a2c42ec71895b008b1@epcas5p2.samsung.com>
2024-12-10 12:12 ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 09/12] nvme: pass a void pointer to nvme_get/set_features for the result Keith Busch
2024-12-09 8:32 ` Hannes Reinecke
[not found] ` <CGME20241210122137epcas5p2e373baa1c99b78341928cc7bf0fe3bdf@epcas5p2.samsung.com>
2024-12-10 12:13 ` Nitesh Shetty
2024-12-06 22:17 ` [PATCHv12 10/12] nvme.h: add FDP definitions Keith Busch
2024-12-09 8:33 ` Hannes Reinecke
[not found] ` <CGME20241210122702epcas5p4fe3ed43ad714c6b467a35d16135d07c5@epcas5p4.samsung.com>
2024-12-10 12:19 ` Nitesh Shetty
2024-12-06 22:18 ` [PATCHv12 11/12] nvme: register fdp parameters with the block layer Keith Busch
2024-12-09 4:05 ` kernel test robot
2024-12-09 12:44 ` Christoph Hellwig
2024-12-09 8:34 ` Hannes Reinecke
2024-12-09 13:18 ` Christoph Hellwig
2024-12-09 16:29 ` Keith Busch
2024-12-10 8:45 ` Dan Carpenter
2024-12-10 15:23 ` Keith Busch
2024-12-06 22:18 ` [PATCHv12 12/12] nvme: use fdp streams if write stream is provided Keith Busch
2024-12-09 8:34 ` Hannes Reinecke
[not found] ` <CGME20241210073523epcas5p149482220b87ff3926fb8864ff1660e0c@epcas5p1.samsung.com>
2024-12-10 7:27 ` Nitesh Shetty
2024-12-09 12:55 ` [PATCHv12 00/12] block write streams with nvme fdp Christoph Hellwig
2024-12-09 16:07 ` Keith Busch
2024-12-10 1:49 ` Martin K. Petersen
2024-12-10 7:19 ` Christoph Hellwig