* [PATCH v7 00/10] Read/Write with meta/integrity
       [not found] <CGME20241104141427epcas5p2174ded627e2d785294ac4977b011a75b@epcas5p2.samsung.com>
@ 2024-11-04 14:05 ` Anuj Gupta
       [not found]   ` <CGME20241104141445epcas5p3fa11a5bebe88ac2bb3541850369591f7@epcas5p3.samsung.com>
                     ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:05 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Anuj Gupta

This adds a new io_uring interface to exchange additional integrity/pi
metadata with read/write.

The patchset is on top of block/for-next.

Interface:

A new meta_type field is introduced in the SQE, which describes the type of metadata.
Currently one type, "META_TYPE_PI", is supported. This meta type requires the
application to set up an SQE128 ring. The application can use the second SQE to
pass the following PI information:

* flags: three flags are exposed for integrity checks,
 namely IO_INTEGRITY_CHK_GUARD/APPTAG/REFTAG.
* len: length of the meta buffer
* addr: address of the meta buffer
* seed: seed value for reftag remapping
* app_tag: application-specific 16b value

The block path (direct IO), NVMe and SCSI drivers are modified to support
this.
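
As a rough illustration of the field list above, the sketch below fills the
PI fields for one request. It is only a sketch: struct pi_fields is a local
mirror of the rw_pi layout proposed in patch 6 (not a released UAPI header),
and fill_pi_fields() is a hypothetical helper name. The flag values are the
ones patch 4 adds to include/uapi/linux/fs.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Flag values as proposed in patch 4 of this series. */
#define IO_INTEGRITY_CHK_GUARD		(1U << 0)
#define IO_INTEGRITY_CHK_REFTAG		(1U << 1)
#define IO_INTEGRITY_CHK_APPTAG		(1U << 2)

/* Local mirror of the rw_pi fields from patch 6 (assumption, not UAPI). */
struct pi_fields {
	uint16_t flags;		/* IO_INTEGRITY_CHK_* */
	uint16_t app_tag;	/* application-specific 16b value */
	uint32_t len;		/* length of the meta buffer */
	uint64_t addr;		/* address of the meta buffer */
	uint64_t seed;		/* seed value for reftag remapping */
	uint64_t rsvd;		/* reserved, must be zero */
};

/* Hypothetical helper: describe one meta buffer for a read or write. */
static void fill_pi_fields(struct pi_fields *pi, void *meta_buf,
			   uint32_t meta_len, uint16_t app_tag, uint64_t seed)
{
	memset(pi, 0, sizeof(*pi));	/* keeps the reserved field zero */
	pi->flags = IO_INTEGRITY_CHK_GUARD | IO_INTEGRITY_CHK_REFTAG |
		    IO_INTEGRITY_CHK_APPTAG;
	pi->app_tag = app_tag;
	pi->len = meta_len;
	pi->addr = (uint64_t)(uintptr_t)meta_buf;
	pi->seed = seed;
}
```

With the series applied, the same values would be written through
sqe_ext->rw_pi on the second half of a 128-byte SQE, after setting
sqe->meta_type to META_TYPE_PI.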

Patch 1 is an enhancement patch.
Patch 2 is required to make the bounce buffer copy back work correctly.
Patches 3 to 5 are prep patches.
Patch 6 adds the io_uring support.
Patch 7 gives us a unified interface for user and kernel generated
integrity.
Patch 8 adds support in SCSI and patch 9 in NVMe.
Patch 10 adds the support for block direct IO.

Testing has been done by modifying fio to use this interface.
Example program for the interface is appended below [1].

Changes since v6:
https://lore.kernel.org/linux-block/[email protected]/

- io_uring changes (bring back meta_type, move PI to the end of SQE128)
- Fix robot warnings

Changes since v5:
https://lore.kernel.org/linux-block/[email protected]/

- remove meta_type field from SQE (hch, keith)
- remove __bitwise annotation (hch)
- remove BIP_CTRL_NOCHECK from scsi (hch)

Changes since v4:
https://lore.kernel.org/linux-block/[email protected]/

- better variable names to describe bounce buffer copy back (hch)
- move definition of flags in the same patch introducing uio_meta (hch)
- move uio_meta definition to include/linux/uio.h (hch)
- bump seed size in uio_meta to 8 bytes (martin)
- move flags definition to include/uapi/linux/fs.h (hch)
- s/meta/metadata in commit description of io-uring (hch)
- rearrange the meta fields in sqe for cleaner layout
- partial submission case is not applicable, as we are only plumbing for async case
- s/META_TYPE_INTEGRITY/META_TYPE_PI (hch, martin)
- remove unlikely branching (hch)
- Better formatting, misc cleanups, better commit descriptions, reordering commits (hch)

Changes since v3:
https://lore.kernel.org/linux-block/[email protected]/

- add reftag seed support (Martin)
- fix incorrect formatting in uio_meta (hch)
- s/IOCB_HAS_META/IOCB_HAS_METADATA (hch)
- move integrity check flags to block layer header (hch)
- add comments for BIP_CHECK_GUARD/REFTAG/APPTAG flags (hch)
- remove bio_integrity check during completion if IOCB_HAS_METADATA is set (hch)
- use goto label to get rid of duplicate error handling (hch)
- add warn_on if trying to do sync io with iocb_has_metadata flag (hch)
- remove check for disabling reftag remapping (hch)
- remove BIP_INTEGRITY_USER flag (hch)
- add comment for app_tag field introduced in bio_integrity_payload (hch)
- pass request to nvme_set_app_tag function (hch)
- right indentation at a place in scsi patch (hch)
- move IOCB_HAS_METADATA to a separate fs patch (hch)

Changes since v2:
https://lore.kernel.org/linux-block/[email protected]/
- io_uring error handling styling (Gabriel)
- add documented helper to get metadata bytes from data iter (hch)
- during clone specify "what flags to clone" rather than
"what not to clone" (hch)
- Move uio_meta definition to bio-integrity.h (hch)
- Rename apptag field to app_tag (hch)
- Change datatype of flags field in uio_meta to bitwise (hch)
- Don't introduce BIP_USER_CHK_FOO flags (hch, martin)
- Driver should rely on block layer flags instead of seeing if it is
user-passthrough (hch)
- update the scsi code for handling user-meta (hch, martin)

Changes since v1:
https://lore.kernel.org/linux-block/[email protected]/
- Do not use new opcode for meta, and also add the provision to introduce new
meta types beyond integrity (Pavel)
- Stuff IOCB_HAS_META check in need_complete_io (Jens)
- Split meta handling in NVMe into a separate handler (Keith)
- Add meta handling for __blkdev_direct_IO too (Keith)
- Don't inherit BIP_COPY_USER flag for cloned bio's (Christoph)
- Better commit descriptions (Christoph)

Changes since RFC:
- modify io_uring plumbing based on recent async handling state changes
- fixes/enhancements to correctly handle the split for meta buffer
- add flags to specify guard/reftag/apptag checks
- add support to send apptag

[1]

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <linux/fs.h>
#include <linux/io_uring.h>
#include <linux/types.h>
#include "liburing.h"

/*
 * write data/meta. read both. compare. send apptag too.
 * prerequisite:
 * protected xfer: format namespace with 4KB + 8b, pi_type = 1
 * For testing reftag remapping on device-mapper, create a
 * device-mapper and run this program. Device mapper creation:
 * # echo 0 80 linear /dev/nvme0n1 0 > /tmp/table
 * # echo 80 160 linear /dev/nvme0n1 200 >> /tmp/table
 * # dmsetup create two /tmp/table
 * # ./a.out /dev/dm-0
 */

#define DATA_LEN 4096
#define META_LEN 8

struct t10_pi_tuple {
        __be16  guard;
        __be16  apptag;
        __be32  reftag;
};

int main(int argc, char *argv[])
{
         struct io_uring ring;
         struct io_uring_sqe *sqe = NULL;
         struct io_uring_cqe *cqe = NULL;
         void *wdb,*rdb;
         char wmb[META_LEN], rmb[META_LEN];
         char *data_str = "data buffer";
         int fd, ret, blksize;
         struct stat fstat;
         unsigned long long offset = DATA_LEN * 10;
         struct t10_pi_tuple *pi;
         struct io_uring_sqe_ext *sqe_ext;

         if (argc != 2) {
                 fprintf(stderr, "Usage: %s <block-device>\n", argv[0]);
                 return 1;
         }

         if (stat(argv[1], &fstat) == 0) {
                 blksize = (int)fstat.st_blksize;
         } else {
                 perror("stat");
                 return 1;
         }

         if (posix_memalign(&wdb, blksize, DATA_LEN)) {
                 perror("posix_memalign failed");
                 return 1;
         }
         if (posix_memalign(&rdb, blksize, DATA_LEN)) {
                 perror("posix_memalign failed");
                 return 1;
         }

         memset(wdb, 0, DATA_LEN);

         fd = open(argv[1], O_RDWR | O_DIRECT);
         if (fd < 0) {
                 perror("open");
                 return 1;
         }

         ret = io_uring_queue_init(8, &ring, IORING_SETUP_SQE128);
         if (ret) {
                 fprintf(stderr, "ring setup failed: %d\n", ret);
                 return 1;
         }

         /* write data + meta-buffer to device */
         sqe = io_uring_get_sqe(&ring);
         if (!sqe) {
                 fprintf(stderr, "get sqe failed\n");
                 return 1;
         }

         io_uring_prep_write(sqe, fd, wdb, DATA_LEN, offset);

         sqe->meta_type = META_TYPE_PI;
         sqe_ext = (struct io_uring_sqe_ext *)(sqe + 1);
         sqe_ext->rw_pi.addr = (__u64)wmb;
         sqe_ext->rw_pi.len = META_LEN;
         /* flags to ask for guard/reftag/apptag checks */
         sqe_ext->rw_pi.flags = IO_INTEGRITY_CHK_GUARD |
                                IO_INTEGRITY_CHK_REFTAG |
                                IO_INTEGRITY_CHK_APPTAG;
         sqe_ext->rw_pi.app_tag = 0x1234;
         sqe_ext->rw_pi.seed = 10;

         pi = (struct t10_pi_tuple *)wmb;
         pi->guard = 0;
         pi->reftag = 0x0A000000;
         pi->apptag = 0x3412;

         ret = io_uring_submit(&ring);
         if (ret <= 0) {
                 fprintf(stderr, "sqe submit failed: %d\n", ret);
                 return 1;
         }

         ret = io_uring_wait_cqe(&ring, &cqe);
         if (ret < 0) {
                 fprintf(stderr, "wait cqe failed: %d\n", ret);
                 return 1;
         }
         if (cqe->res < 0) {
                 fprintf(stderr, "write cqe failure: %d\n", cqe->res);
                 return 1;
         }

         io_uring_cqe_seen(&ring, cqe);

         /* read data + meta-buffer back from device */
         sqe = io_uring_get_sqe(&ring);
         if (!sqe) {
                 fprintf(stderr, "get sqe failed\n");
                 return 1;
         }

         io_uring_prep_read(sqe, fd, rdb, DATA_LEN, offset);

         sqe->meta_type = META_TYPE_PI;
         sqe_ext = (struct io_uring_sqe_ext *)(sqe + 1);
         sqe_ext->rw_pi.addr = (__u64)rmb;
         sqe_ext->rw_pi.len = META_LEN;
         sqe_ext->rw_pi.flags = IO_INTEGRITY_CHK_GUARD | IO_INTEGRITY_CHK_REFTAG | IO_INTEGRITY_CHK_APPTAG;
         sqe_ext->rw_pi.app_tag = 0x1234;
         sqe_ext->rw_pi.seed = 10;

         ret = io_uring_submit(&ring);
         if (ret <= 0) {
                 fprintf(stderr, "sqe submit failed: %d\n", ret);
                 return 1;
         }

         ret = io_uring_wait_cqe(&ring, &cqe);
         if (ret < 0) {
                 fprintf(stderr, "wait cqe failed: %d\n", ret);
                 return 1;
         }

         if (cqe->res < 0) {
                 fprintf(stderr, "read cqe failure: %d\n", cqe->res);
                 return 1;
         }

         pi = (struct t10_pi_tuple *)rmb;
         if (pi->apptag != 0x3412)
                 printf("Failure: apptag mismatch!\n");
         if (pi->reftag != 0x0A000000)
                 printf("Failure: reftag mismatch!\n");

         io_uring_cqe_seen(&ring, cqe);

         /* both buffers hold binary data, so compare with memcmp */
         if (memcmp(wmb, rmb, META_LEN))
                 printf("Failure: meta mismatch!\n");

         if (memcmp(wdb, rdb, DATA_LEN))
                 printf("Failure: data mismatch!\n");

         io_uring_queue_exit(&ring);
         free(rdb);
         free(wdb);
         return 0;
}
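
The program above hard-codes 0x3412 and 0x0A000000, which are the big-endian
encodings of app tag 0x1234 and ref tag 10 as seen on a little-endian host.
A host-independent way to fill the tuple might look like the sketch below.
struct pi_tuple and fill_pi_tuple() are hypothetical names used only for
illustration; the guard is left at zero, as in the program.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Same 8-byte layout as t10_pi_tuple in the example program. */
struct pi_tuple {
	uint16_t guard;		/* big-endian on the wire */
	uint16_t apptag;	/* big-endian on the wire */
	uint32_t reftag;	/* big-endian on the wire */
};

/* Hypothetical helper: encode tags in big-endian order on any host. */
static void fill_pi_tuple(void *meta_buf, uint16_t app_tag, uint32_t ref_tag)
{
	struct pi_tuple t;

	t.guard = 0;			/* left at zero, as in the program */
	t.apptag = htons(app_tag);	/* host to big-endian, 16 bit */
	t.reftag = htonl(ref_tag);	/* host to big-endian, 32 bit */
	memcpy(meta_buf, &t, sizeof(t));
}
```

Calling fill_pi_tuple(wmb, 0x1234, 10) would produce the same bytes as the
hard-coded constants on a little-endian machine, and the read-side checks
could compare against htons(0x1234) and htonl(10) in the same way.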

Anuj Gupta (7):
  block: define set of integrity flags to be inherited by cloned bip
  block: modify bio_integrity_map_user to accept iov_iter as argument
  fs, iov_iter: define meta io descriptor
  fs: introduce IOCB_HAS_METADATA for metadata
  io_uring/rw: add support to send metadata along with read/write
  block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  scsi: add support for user-meta interface

Christoph Hellwig (1):
  block: copy back bounce buffer to user-space correctly in case of
    split

Kanchan Joshi (2):
  nvme: add support for passing on the application tag
  block: add support to pass user meta buffer

 block/bio-integrity.c         | 84 +++++++++++++++++++++++++++------
 block/blk-integrity.c         | 10 +++-
 block/fops.c                  | 42 +++++++++++++----
 drivers/nvme/host/core.c      | 21 +++++----
 drivers/scsi/sd.c             |  4 +-
 include/linux/bio-integrity.h | 25 +++++++---
 include/linux/fs.h            |  1 +
 include/linux/uio.h           |  9 ++++
 include/uapi/linux/fs.h       |  9 ++++
 include/uapi/linux/io_uring.h | 30 ++++++++++++
 io_uring/io_uring.c           |  8 ++++
 io_uring/rw.c                 | 88 ++++++++++++++++++++++++++++++++++-
 io_uring/rw.h                 | 14 +++++-
 13 files changed, 300 insertions(+), 45 deletions(-)

-- 
2.25.1



* [PATCH v7 01/10] block: define set of integrity flags to be inherited by cloned bip
       [not found]   ` <CGME20241104141445epcas5p3fa11a5bebe88ac2bb3541850369591f7@epcas5p3.samsung.com>
@ 2024-11-04 14:05     ` Anuj Gupta
  0 siblings, 0 replies; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:05 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Anuj Gupta

Introduce BIP_CLONE_FLAGS describing integrity flags that should be
inherited in the cloned bip from the parent.
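
To illustrate the difference between the old "what not to clone" mask and
the new "what to clone" mask, here is a small standalone sketch. The EX_*
bit values are made up for the example (the real definitions live in
include/linux/bio-integrity.h); only the masking logic is the point.

```c
#include <assert.h>

/* Illustrative bit values only, not the kernel's actual definitions. */
enum {
	EX_BLOCK_INTEGRITY	= 1 << 0,
	EX_MAPPED_INTEGRITY	= 1 << 1,
	EX_CTRL_NOCHECK		= 1 << 2,
	EX_DISK_NOCHECK		= 1 << 3,
	EX_IP_CHECKSUM		= 1 << 4,
	EX_COPY_USER		= 1 << 5,
};

#define EX_CLONE_FLAGS (EX_MAPPED_INTEGRITY | EX_CTRL_NOCHECK | \
			EX_DISK_NOCHECK | EX_IP_CHECKSUM)

/* Old behaviour: inherit everything except one named flag. */
static unsigned int clone_deny_list(unsigned int src)
{
	return src & ~EX_BLOCK_INTEGRITY;
}

/* New behaviour: inherit only the flags that are explicitly listed. */
static unsigned int clone_allow_list(unsigned int src)
{
	return src & EX_CLONE_FLAGS;
}
```

With the allow list, a flag that is not part of the mask (EX_COPY_USER in
this sketch) no longer leaks into the clone by default.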

Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
---
 block/bio-integrity.c         | 2 +-
 include/linux/bio-integrity.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 2a4bd6611692..a448a25d13de 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -559,7 +559,7 @@ int bio_integrity_clone(struct bio *bio, struct bio *bio_src,
 
 	bip->bip_vec = bip_src->bip_vec;
 	bip->bip_iter = bip_src->bip_iter;
-	bip->bip_flags = bip_src->bip_flags & ~BIP_BLOCK_INTEGRITY;
+	bip->bip_flags = bip_src->bip_flags & BIP_CLONE_FLAGS;
 
 	return 0;
 }
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index dbf0f74c1529..0f0cf10222e8 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -30,6 +30,9 @@ struct bio_integrity_payload {
 	struct bio_vec		bip_inline_vecs[];/* embedded bvec array */
 };
 
+#define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
+			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM)
+
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 
 #define bip_for_each_vec(bvl, bip, iter)				\
-- 
2.25.1



* [PATCH v7 02/10] block: copy back bounce buffer to user-space correctly in case of split
       [not found]   ` <CGME20241104141448epcas5p4179505e12f9cf45fd792dc6da6afce8e@epcas5p4.samsung.com>
@ 2024-11-04 14:05     ` Anuj Gupta
  2024-11-05 10:03       ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:05 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Anuj Gupta

From: Christoph Hellwig <[email protected]>

Copy back the bounce buffer to user-space in its entirety when the parent
bio completes. The existing code uses bip_iter.bi_size for sizing the
copy, but that value can be modified (for example when the bio is split).
So move away from that and fetch the size from the vector passed to the
block layer. While at it, switch to using better variable names.
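
A simplified userspace model of the problem, as a sketch only: the running
iterator size shrinks while the I/O is in flight, whereas the length
recorded for the bounce segment stays fixed. struct ex_payload and the ex_*
helpers are hypothetical and do not correspond to kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an integrity payload with a bounce buffer. */
struct ex_payload {
	size_t iter_size;	/* running size, changes while in flight */
	size_t bounce_len;	/* bounce segment length, fixed at map time */
	const char *bounce;	/* bounce buffer contents */
};

/* Models the iterator being advanced, e.g. after a split. */
static void ex_advance(struct ex_payload *p, size_t done)
{
	p->iter_size -= done;
}

/* Copy back using the fixed segment length, not the running size. */
static size_t ex_copy_back(const struct ex_payload *p, char *user_buf)
{
	memcpy(user_buf, p->bounce, p->bounce_len);
	return p->bounce_len;
}
```

Sizing the copy from iter_size after ex_advance() would return only part
of the buffer to the caller, which is the kind of truncation this patch
avoids.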

Fixes: 492c5d455969f ("block: bio-integrity: directly map user buffers")
Signed-off-by: Anuj Gupta <[email protected]>
[hch: better names for variables]
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
---
 block/bio-integrity.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index a448a25d13de..4341b0d4efa1 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -118,17 +118,18 @@ static void bio_integrity_unpin_bvec(struct bio_vec *bv, int nr_vecs,
 
 static void bio_integrity_uncopy_user(struct bio_integrity_payload *bip)
 {
-	unsigned short nr_vecs = bip->bip_max_vcnt - 1;
-	struct bio_vec *copy = &bip->bip_vec[1];
-	size_t bytes = bip->bip_iter.bi_size;
-	struct iov_iter iter;
+	unsigned short orig_nr_vecs = bip->bip_max_vcnt - 1;
+	struct bio_vec *orig_bvecs = &bip->bip_vec[1];
+	struct bio_vec *bounce_bvec = &bip->bip_vec[0];
+	size_t bytes = bounce_bvec->bv_len;
+	struct iov_iter orig_iter;
 	int ret;
 
-	iov_iter_bvec(&iter, ITER_DEST, copy, nr_vecs, bytes);
-	ret = copy_to_iter(bvec_virt(bip->bip_vec), bytes, &iter);
+	iov_iter_bvec(&orig_iter, ITER_DEST, orig_bvecs, orig_nr_vecs, bytes);
+	ret = copy_to_iter(bvec_virt(bounce_bvec), bytes, &orig_iter);
 	WARN_ON_ONCE(ret != bytes);
 
-	bio_integrity_unpin_bvec(copy, nr_vecs, true);
+	bio_integrity_unpin_bvec(orig_bvecs, orig_nr_vecs, true);
 }
 
 /**
-- 
2.25.1



* [PATCH v7 03/10] block: modify bio_integrity_map_user to accept iov_iter as argument
       [not found]   ` <CGME20241104141451epcas5p2aef1f93e905c27e34b3e16d89ff39245@epcas5p2.samsung.com>
@ 2024-11-04 14:05     ` Anuj Gupta
  0 siblings, 0 replies; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:05 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Anuj Gupta, Kanchan Joshi

This patch refactors bio_integrity_map_user to accept iov_iter as
argument. This is a prep patch.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
---
 block/bio-integrity.c         | 12 +++++-------
 block/blk-integrity.c         | 10 +++++++++-
 include/linux/bio-integrity.h |  5 ++---
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 4341b0d4efa1..f56d01cec689 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -302,16 +302,15 @@ static unsigned int bvec_from_pages(struct bio_vec *bvec, struct page **pages,
 	return nr_bvecs;
 }
 
-int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes)
+int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
 {
 	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
 	unsigned int align = blk_lim_dma_alignment_and_pad(&q->limits);
 	struct page *stack_pages[UIO_FASTIOV], **pages = stack_pages;
 	struct bio_vec stack_vec[UIO_FASTIOV], *bvec = stack_vec;
+	size_t offset, bytes = iter->count;
 	unsigned int direction, nr_bvecs;
-	struct iov_iter iter;
 	int ret, nr_vecs;
-	size_t offset;
 	bool copy;
 
 	if (bio_integrity(bio))
@@ -324,8 +323,7 @@ int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes)
 	else
 		direction = ITER_SOURCE;
 
-	iov_iter_ubuf(&iter, direction, ubuf, bytes);
-	nr_vecs = iov_iter_npages(&iter, BIO_MAX_VECS + 1);
+	nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS + 1);
 	if (nr_vecs > BIO_MAX_VECS)
 		return -E2BIG;
 	if (nr_vecs > UIO_FASTIOV) {
@@ -335,8 +333,8 @@ int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes)
 		pages = NULL;
 	}
 
-	copy = !iov_iter_is_aligned(&iter, align, align);
-	ret = iov_iter_extract_pages(&iter, &pages, bytes, nr_vecs, 0, &offset);
+	copy = !iov_iter_is_aligned(iter, align, align);
+	ret = iov_iter_extract_pages(iter, &pages, bytes, nr_vecs, 0, &offset);
 	if (unlikely(ret < 0))
 		goto free_bvec;
 
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index b180cac61a9d..4a29754f1bc2 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -115,8 +115,16 @@ EXPORT_SYMBOL(blk_rq_map_integrity_sg);
 int blk_rq_integrity_map_user(struct request *rq, void __user *ubuf,
 			      ssize_t bytes)
 {
-	int ret = bio_integrity_map_user(rq->bio, ubuf, bytes);
+	int ret;
+	struct iov_iter iter;
+	unsigned int direction;
 
+	if (op_is_write(req_op(rq)))
+		direction = ITER_SOURCE;
+	else
+		direction = ITER_DEST;
+	iov_iter_ubuf(&iter, direction, ubuf, bytes);
+	ret = bio_integrity_map_user(rq->bio, &iter);
 	if (ret)
 		return ret;
 
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index 0f0cf10222e8..58ff9988433a 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -75,7 +75,7 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio, gfp_t gfp,
 		unsigned int nr);
 int bio_integrity_add_page(struct bio *bio, struct page *page, unsigned int len,
 		unsigned int offset);
-int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t len);
+int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter);
 void bio_integrity_unmap_user(struct bio *bio);
 bool bio_integrity_prep(struct bio *bio);
 void bio_integrity_advance(struct bio *bio, unsigned int bytes_done);
@@ -101,8 +101,7 @@ static inline void bioset_integrity_free(struct bio_set *bs)
 {
 }
 
-static inline int bio_integrity_map_user(struct bio *bio, void __user *ubuf,
-					 ssize_t len)
+static inline int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
 {
 	return -EINVAL;
 }
-- 
2.25.1



* [PATCH v7 04/10] fs, iov_iter: define meta io descriptor
       [not found]   ` <CGME20241104141453epcas5p201e4aabfa7aa1f4af1cdf07228f8d4e7@epcas5p2.samsung.com>
@ 2024-11-04 14:05     ` Anuj Gupta
  2024-11-05  9:55       ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:05 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Anuj Gupta, Kanchan Joshi

Add flags to describe the checks for the integrity meta buffer. Also,
introduce a new 'uio_meta' structure that upper layers can use to pass the
meta/integrity information.
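
As an illustration of how the IO_INTEGRITY_VALID_FLAGS mask can be used,
the sketch below rejects any flag bit outside the three defined checks.
The flag values match the ones added by this patch; ex_validate_flags() is
a hypothetical helper and is not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Flag values as added to include/uapi/linux/fs.h by this patch. */
#define IO_INTEGRITY_CHK_GUARD		(1U << 0) /* enforce guard check */
#define IO_INTEGRITY_CHK_REFTAG		(1U << 1) /* enforce ref check */
#define IO_INTEGRITY_CHK_APPTAG		(1U << 2) /* enforce app check */

#define IO_INTEGRITY_VALID_FLAGS (IO_INTEGRITY_CHK_GUARD | \
				  IO_INTEGRITY_CHK_REFTAG | \
				  IO_INTEGRITY_CHK_APPTAG)

/* Hypothetical helper: fail on any bit outside the valid mask. */
static int ex_validate_flags(uint16_t flags)
{
	if (flags & ~IO_INTEGRITY_VALID_FLAGS)
		return -EINVAL;
	return 0;
}
```

Any combination of the three checks passes, including none at all, while
an undefined bit such as (1U << 3) is rejected.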

Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
---
 include/linux/uio.h     | 9 +++++++++
 include/uapi/linux/fs.h | 9 +++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 853f9de5aa05..8ada84e85447 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -82,6 +82,15 @@ struct iov_iter {
 	};
 };
 
+typedef __u16 uio_meta_flags_t;
+
+struct uio_meta {
+	uio_meta_flags_t	flags;
+	u16			app_tag;
+	u64			seed;
+	struct iov_iter		iter;
+};
+
 static inline const struct iovec *iter_iov(const struct iov_iter *iter)
 {
 	if (iter->iter_type == ITER_UBUF)
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 753971770733..9070ef19f0a3 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -40,6 +40,15 @@
 #define BLOCK_SIZE_BITS 10
 #define BLOCK_SIZE (1<<BLOCK_SIZE_BITS)
 
+/* flags for integrity meta */
+#define IO_INTEGRITY_CHK_GUARD		(1U << 0) /* enforce guard check */
+#define IO_INTEGRITY_CHK_REFTAG		(1U << 1) /* enforce ref check */
+#define IO_INTEGRITY_CHK_APPTAG		(1U << 2) /* enforce app check */
+
+#define IO_INTEGRITY_VALID_FLAGS (IO_INTEGRITY_CHK_GUARD | \
+				  IO_INTEGRITY_CHK_REFTAG | \
+				  IO_INTEGRITY_CHK_APPTAG)
+
 #define SEEK_SET	0	/* seek relative to beginning of file */
 #define SEEK_CUR	1	/* seek relative to current file position */
 #define SEEK_END	2	/* seek relative to end of file */
-- 
2.25.1



* [PATCH v7 05/10] fs: introduce IOCB_HAS_METADATA for metadata
       [not found]   ` <CGME20241104141456epcas5p38fef2ccde087de84ffc6f479f50e8071@epcas5p3.samsung.com>
@ 2024-11-04 14:05     ` Anuj Gupta
  0 siblings, 0 replies; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:05 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Anuj Gupta

Introduce an IOCB_HAS_METADATA flag for the kiocb struct, for handling
requests containing meta payload.

Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
 include/linux/fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4b5cad44a126..7f14675b02df 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -346,6 +346,7 @@ struct readahead_control;
 #define IOCB_DIO_CALLER_COMP	(1 << 22)
 /* kiocb is a read or write operation submitted by fs/aio.c. */
 #define IOCB_AIO_RW		(1 << 23)
+#define IOCB_HAS_METADATA	(1 << 24)
 
 /* for use in trace events */
 #define TRACE_IOCB_STRINGS \
-- 
2.25.1



* [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
       [not found]   ` <CGME20241104141459epcas5p27991e140158b1e7294b4d6c4e767373c@epcas5p2.samsung.com>
@ 2024-11-04 14:05     ` Anuj Gupta
  2024-11-05  9:56       ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:05 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Anuj Gupta, Kanchan Joshi

This patch adds the capability of passing integrity metadata along with
read/write. A new meta_type field is introduced in the SQE, which indicates
the type of metadata being passed. A new 'struct io_uring_sqe_ext'
represents the secondary SQE space for read/write. The last 32 bytes of
the secondary SQE are used to pass the following PI related information:

- flags: integrity check flags, namely
IO_INTEGRITY_CHK_{GUARD/APPTAG/REFTAG}
- len: length of the pi/metadata buffer
- addr: address of the metadata buffer
- seed: seed value for reftag remapping
- app_tag: application defined 16b value

The application sets up an SQE128 ring and prepares the PI information
within the second SQE. The patch processes this information to prepare a
uio_meta descriptor and passes it down using kiocb->private.

Meta exchange is supported only for direct IO.
Vectored read/write operations with meta are also not supported
currently.
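
The layout and the reserved-field rules described above can be checked from
userspace with a sketch like the following. struct sqe_ext_mirror is a local
copy of the io_uring_sqe_ext layout from this patch, and check_sqe_ext() is
a hypothetical helper that mimics the zero checks done at prep time; neither
is an existing API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct io_uring_sqe_ext as proposed in this patch. */
struct sqe_ext_mirror {
	uint64_t rsvd0[4];
	union {
		uint64_t rsvd1[4];
		struct {
			uint16_t flags;
			uint16_t app_tag;
			uint32_t len;
			uint64_t addr;
			uint64_t seed;
			uint64_t rsvd;
		} rw_pi;
	};
};

/* The second SQE is 64 bytes and the PI fields occupy its last 32 bytes. */
_Static_assert(sizeof(struct sqe_ext_mirror) == 64, "ext must be 64 bytes");
_Static_assert(offsetof(struct sqe_ext_mirror, rw_pi) == 32,
	       "PI fields start at byte 32");

/* Hypothetical check mirroring the reserved-field rules described above. */
static int check_sqe_ext(const struct sqe_ext_mirror *ext, int ring_is_sqe128)
{
	size_t i;

	if (!ring_is_sqe128)
		return -EINVAL;		/* PI needs an SQE128 ring */
	for (i = 0; i < 4; i++)
		if (ext->rsvd0[i])
			return -EINVAL;	/* first 32 bytes must be zero */
	if (ext->rw_pi.rsvd)
		return -EINVAL;		/* trailing reserved field too */
	return 0;
}
```

An application could run such a check before submission to catch stale
data left in the second half of a reused SQE.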

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 include/uapi/linux/io_uring.h | 30 ++++++++++++
 io_uring/io_uring.c           |  8 ++++
 io_uring/rw.c                 | 88 ++++++++++++++++++++++++++++++++++-
 io_uring/rw.h                 | 14 +++++-
 4 files changed, 137 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 024745283783..7f01124bedd5 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -92,6 +92,10 @@ struct io_uring_sqe {
 			__u16	addr_len;
 			__u16	__pad3[1];
 		};
+		struct {
+			__u16	meta_type;
+			__u16	__pad4[1];
+		};
 	};
 	union {
 		struct {
@@ -107,6 +111,32 @@ struct io_uring_sqe {
 	};
 };
 
+enum io_uring_sqe_meta_type_bits {
+	META_TYPE_PI_BIT,
+	/* not a real meta type; just to make sure that we don't overflow */
+	META_TYPE_LAST_BIT,
+};
+
+/* meta type flags */
+#define META_TYPE_PI	(1U << META_TYPE_PI_BIT)
+
+/* Second half of SQE128 for IORING_OP_READ/WRITE */
+struct io_uring_sqe_ext {
+	__u64	rsvd0[4];
+	/* if sqe->meta_type is META_TYPE_PI, last 32 bytes are for PI */
+	union {
+		__u64	rsvd1[4];
+		struct {
+			__u16	flags;
+			__u16	app_tag;
+			__u32	len;
+			__u64	addr;
+			__u64	seed;
+			__u64	rsvd;
+		} rw_pi;
+	};
+};
+
 /*
  * If sqe->file_index is set to this for opcodes that instantiate a new
  * direct descriptor (like openat/openat2/accept), then io_uring will allocate
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 44a772013c09..116c93022985 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3875,7 +3875,9 @@ static int __init io_uring_init(void)
 	BUILD_BUG_SQE_ELEM(44, __s32,  splice_fd_in);
 	BUILD_BUG_SQE_ELEM(44, __u32,  file_index);
 	BUILD_BUG_SQE_ELEM(44, __u16,  addr_len);
+	BUILD_BUG_SQE_ELEM(44, __u16,  meta_type);
 	BUILD_BUG_SQE_ELEM(46, __u16,  __pad3[0]);
+	BUILD_BUG_SQE_ELEM(46, __u16,  __pad4[0]);
 	BUILD_BUG_SQE_ELEM(48, __u64,  addr3);
 	BUILD_BUG_SQE_ELEM_SIZE(48, 0, cmd);
 	BUILD_BUG_SQE_ELEM(56, __u64,  __pad2);
@@ -3902,6 +3904,12 @@ static int __init io_uring_init(void)
 	/* top 8bits are for internal use */
 	BUILD_BUG_ON((IORING_URING_CMD_MASK & 0xff000000) != 0);
 
+	BUILD_BUG_ON(sizeof(struct io_uring_sqe_ext) !=
+		     sizeof(struct io_uring_sqe));
+
+	BUILD_BUG_ON(META_TYPE_LAST_BIT >
+		     8 * sizeof_field(struct io_uring_sqe, meta_type));
+
 	io_uring_optable_init();
 
 	/*
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 30448f343c7f..eb19b033df24 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -257,11 +257,64 @@ static int io_prep_rw_setup(struct io_kiocb *req, int ddir, bool do_import)
 	return 0;
 }
 
+static inline void io_meta_save_state(struct io_async_rw *io)
+{
+	io->meta_state.seed = io->meta.seed;
+	iov_iter_save_state(&io->meta.iter, &io->meta_state.iter_meta);
+}
+
+static inline void io_meta_restore(struct io_async_rw *io)
+{
+	io->meta.seed = io->meta_state.seed;
+	iov_iter_restore(&io->meta.iter, &io->meta_state.iter_meta);
+}
+
+static inline const void *io_uring_sqe_ext(const struct io_uring_sqe *sqe)
+{
+	return (sqe + 1);
+}
+
+static int io_prep_rw_pi(struct io_kiocb *req, const struct io_uring_sqe *sqe,
+			   struct io_rw *rw, int ddir)
+{
+	const struct io_uring_sqe_ext *sqe_ext;
+	const struct io_issue_def *def;
+	struct io_async_rw *io;
+	int ret;
+
+	if (!(req->ctx->flags & IORING_SETUP_SQE128))
+		return -EINVAL;
+
+	sqe_ext = io_uring_sqe_ext(sqe);
+	if (READ_ONCE(sqe_ext->rsvd0[0]) || READ_ONCE(sqe_ext->rsvd0[1])
+	    || READ_ONCE(sqe_ext->rsvd0[2]) || READ_ONCE(sqe_ext->rsvd0[3]))
+		return -EINVAL;
+	if (READ_ONCE(sqe_ext->rw_pi.rsvd))
+		return -EINVAL;
+
+	def = &io_issue_defs[req->opcode];
+	if (def->vectored)
+		return -EOPNOTSUPP;
+
+	io = req->async_data;
+	io->meta.flags = READ_ONCE(sqe_ext->rw_pi.flags);
+	io->meta.app_tag = READ_ONCE(sqe_ext->rw_pi.app_tag);
+	io->meta.seed = READ_ONCE(sqe_ext->rw_pi.seed);
+	ret = import_ubuf(ddir, u64_to_user_ptr(READ_ONCE(sqe_ext->rw_pi.addr)),
+			  READ_ONCE(sqe_ext->rw_pi.len), &io->meta.iter);
+	if (unlikely(ret < 0))
+		return ret;
+	rw->kiocb.ki_flags |= IOCB_HAS_METADATA;
+	io_meta_save_state(io);
+	return ret;
+}
+
 static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		      int ddir, bool do_import)
 {
 	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
 	unsigned ioprio;
+	u16 meta_type;
 	int ret;
 
 	rw->kiocb.ki_pos = READ_ONCE(sqe->off);
@@ -279,11 +332,23 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		rw->kiocb.ki_ioprio = get_current_ioprio();
 	}
 	rw->kiocb.dio_complete = NULL;
+	rw->kiocb.ki_flags = 0;
 
 	rw->addr = READ_ONCE(sqe->addr);
 	rw->len = READ_ONCE(sqe->len);
 	rw->flags = READ_ONCE(sqe->rw_flags);
-	return io_prep_rw_setup(req, ddir, do_import);
+	ret = io_prep_rw_setup(req, ddir, do_import);
+
+	if (unlikely(ret))
+		return ret;
+
+	meta_type = READ_ONCE(sqe->meta_type);
+	if (meta_type) {
+		if (READ_ONCE(sqe->__pad4[0]) || !(meta_type & META_TYPE_PI))
+			return -EINVAL;
+		ret = io_prep_rw_pi(req, sqe, rw, ddir);
+	}
+	return ret;
 }
 
 int io_prep_read(struct io_kiocb *req, const struct io_uring_sqe *sqe)
@@ -409,7 +474,10 @@ static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req)
 static void io_resubmit_prep(struct io_kiocb *req)
 {
 	struct io_async_rw *io = req->async_data;
+	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
 
+	if (rw->kiocb.ki_flags & IOCB_HAS_METADATA)
+		io_meta_restore(io);
 	iov_iter_restore(&io->iter, &io->iter_state);
 }
 
@@ -794,7 +862,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type)
 	if (!(req->flags & REQ_F_FIXED_FILE))
 		req->flags |= io_file_get_flags(file);
 
-	kiocb->ki_flags = file->f_iocb_flags;
+	kiocb->ki_flags |= file->f_iocb_flags;
 	ret = kiocb_set_rw_flags(kiocb, rw->flags, rw_type);
 	if (unlikely(ret))
 		return ret;
@@ -823,6 +891,18 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type)
 		kiocb->ki_complete = io_complete_rw;
 	}
 
+	if (kiocb->ki_flags & IOCB_HAS_METADATA) {
+		struct io_async_rw *io = req->async_data;
+
+		/*
+		 * We have a union of meta fields with wpq used for buffered-io
+		 * in io_async_rw, so fail it here.
+		 */
+		if (!(req->file->f_flags & O_DIRECT))
+			return -EOPNOTSUPP;
+		kiocb->private = &io->meta;
+	}
+
 	return 0;
 }
 
@@ -897,6 +977,8 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
 	 * manually if we need to.
 	 */
 	iov_iter_restore(&io->iter, &io->iter_state);
+	if (kiocb->ki_flags & IOCB_HAS_METADATA)
+		io_meta_restore(io);
 
 	do {
 		/*
@@ -1101,6 +1183,8 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	} else {
 ret_eagain:
 		iov_iter_restore(&io->iter, &io->iter_state);
+		if (kiocb->ki_flags & IOCB_HAS_METADATA)
+			io_meta_restore(io);
 		if (kiocb->ki_flags & IOCB_WRITE)
 			io_req_end_write(req);
 		return -EAGAIN;
diff --git a/io_uring/rw.h b/io_uring/rw.h
index 3f432dc75441..2d7656bd268d 100644
--- a/io_uring/rw.h
+++ b/io_uring/rw.h
@@ -2,6 +2,11 @@
 
 #include <linux/pagemap.h>
 
+struct io_meta_state {
+	u32			seed;
+	struct iov_iter_state	iter_meta;
+};
+
 struct io_async_rw {
 	size_t				bytes_done;
 	struct iov_iter			iter;
@@ -9,7 +14,14 @@ struct io_async_rw {
 	struct iovec			fast_iov;
 	struct iovec			*free_iovec;
 	int				free_iov_nr;
-	struct wait_page_queue		wpq;
+	/* wpq is for buffered io, while meta fields are used with direct io */
+	union {
+		struct wait_page_queue		wpq;
+		struct {
+			struct uio_meta			meta;
+			struct io_meta_state		meta_state;
+		};
+	};
 };
 
 int io_prep_read_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe);
-- 
2.25.1
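
The prep-time check at the end of io_prep_rw() above can be modelled in userspace roughly as follows. This is only a sketch: `struct sqe_meta_view` and `check_meta_type()` are hypothetical names, the `META_TYPE_PI` value is a stand-in for the UAPI definition added by this series, and the return value 1 is an invention of the sketch to mark "PI parsing follows".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in value; the real definition comes from the UAPI header of this series. */
#define META_TYPE_PI	(1U << 0)

/* Hypothetical, trimmed-down view of the SQE fields the check reads. */
struct sqe_meta_view {
	uint16_t meta_type;	/* bitmask of metadata types */
	uint16_t pad;		/* models sqe->__pad4[0], must be zero */
};

/*
 * Mirrors the tail of io_prep_rw(): a zero meta_type is a plain read/write;
 * otherwise the padding must be clear and the PI bit must be set.
 * Returns 0 for plain I/O, 1 when PI parsing should follow, -EINVAL on error.
 */
static int check_meta_type(const struct sqe_meta_view *sqe)
{
	assert(sqe != NULL);
	if (!sqe->meta_type)
		return 0;
	if (sqe->pad || !(sqe->meta_type & META_TYPE_PI))
		return -EINVAL;
	return 1;
}
```

As in the patch, a meta_type that carries the PI bit together with other bits still passes this check; only a value without the PI bit is rejected.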


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v7 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
       [not found]   ` <CGME20241104141501epcas5p38203d98ce0b2ac95cc45e02a142e84ef@epcas5p3.samsung.com>
@ 2024-11-04 14:05     ` Anuj Gupta
  0 siblings, 0 replies; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:05 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Anuj Gupta, Kanchan Joshi

This patch introduces the BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags, which
indicate how the hardware should check the integrity payload.
BIP_CHECK_GUARD/REFTAG are conversions of existing semantics, while
BIP_CHECK_APPTAG is a new flag. The driver can now rely on block layer
flags alone and doesn't need to know the integrity source. The submitter
of PI decides which tags to check. This also gives us a unified
interface for user and kernel generated integrity.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
---
 block/bio-integrity.c         |  5 +++++
 drivers/nvme/host/core.c      | 11 +++--------
 include/linux/bio-integrity.h |  6 +++++-
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index f56d01cec689..3bee43b87001 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -434,6 +434,11 @@ bool bio_integrity_prep(struct bio *bio)
 	if (bi->csum_type == BLK_INTEGRITY_CSUM_IP)
 		bip->bip_flags |= BIP_IP_CHECKSUM;
 
+	/* describe what tags to check in payload */
+	if (bi->csum_type)
+		bip->bip_flags |= BIP_CHECK_GUARD;
+	if (bi->flags & BLK_INTEGRITY_REF_TAG)
+		bip->bip_flags |= BIP_CHECK_REFTAG;
 	if (bio_integrity_add_page(bio, virt_to_page(buf), len,
 			offset_in_page(buf)) < len) {
 		printk(KERN_ERR "could not attach integrity payload\n");
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3de7555a7de7..79bd6b22e88d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1004,18 +1004,13 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 			control |= NVME_RW_PRINFO_PRACT;
 		}
 
-		switch (ns->head->pi_type) {
-		case NVME_NS_DPS_PI_TYPE3:
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_GUARD))
 			control |= NVME_RW_PRINFO_PRCHK_GUARD;
-			break;
-		case NVME_NS_DPS_PI_TYPE1:
-		case NVME_NS_DPS_PI_TYPE2:
-			control |= NVME_RW_PRINFO_PRCHK_GUARD |
-					NVME_RW_PRINFO_PRCHK_REF;
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_REFTAG)) {
+			control |= NVME_RW_PRINFO_PRCHK_REF;
 			if (op == nvme_cmd_zone_append)
 				control |= NVME_RW_APPEND_PIREMAP;
 			nvme_set_ref_tag(ns, cmnd, req);
-			break;
 		}
 	}
 
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index 58ff9988433a..fe2bfe122db2 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -11,6 +11,9 @@ enum bip_flags {
 	BIP_DISK_NOCHECK	= 1 << 3, /* disable disk integrity checking */
 	BIP_IP_CHECKSUM		= 1 << 4, /* IP checksum */
 	BIP_COPY_USER		= 1 << 5, /* Kernel bounce buffer in use */
+	BIP_CHECK_GUARD		= 1 << 6, /* guard check */
+	BIP_CHECK_REFTAG	= 1 << 7, /* reftag check */
+	BIP_CHECK_APPTAG	= 1 << 8, /* apptag check */
 };
 
 struct bio_integrity_payload {
@@ -31,7 +34,8 @@ struct bio_integrity_payload {
 };
 
 #define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
-			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM)
+			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM | \
+			 BIP_CHECK_GUARD | BIP_CHECK_REFTAG | BIP_CHECK_APPTAG)
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 
-- 
2.25.1
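
For kernel-generated integrity, the flag selection added to bio_integrity_prep() above amounts to the following mapping. This is a minimal userspace sketch, not kernel code: `struct profile` and `default_check_flags()` are hypothetical stand-ins for the two `struct blk_integrity` fields the hunk reads, and the flag values follow the numbering of this patch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as numbered by this patch; illustrative only. */
enum {
	BIP_CHECK_GUARD  = 1 << 6,
	BIP_CHECK_REFTAG = 1 << 7,
};

/* Hypothetical stand-in for the integrity profile fields the hunk consults. */
struct profile {
	int  csum_type;		/* 0 means the profile has no checksum */
	bool has_ref_tag;	/* models bi->flags & BLK_INTEGRITY_REF_TAG */
};

/* Pick the default checks for block-layer generated PI. */
static unsigned int default_check_flags(const struct profile *bi)
{
	unsigned int flags = 0;

	assert(bi != NULL);
	if (bi->csum_type)
		flags |= BIP_CHECK_GUARD;
	if (bi->has_ref_tag)
		flags |= BIP_CHECK_REFTAG;
	return flags;
}
```

The application tag check is never enabled on this path, which matches the patch: BIP_CHECK_APPTAG is only meaningful when the submitter supplies the tag.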


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v7 08/10] nvme: add support for passing on the application tag
       [not found]   ` <CGME20241104141504epcas5p47e46a75f9248a37c9a4180de8e72b54c@epcas5p4.samsung.com>
@ 2024-11-04 14:05     ` Anuj Gupta
  0 siblings, 0 replies; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:05 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Kanchan Joshi, Anuj Gupta

From: Kanchan Joshi <[email protected]>

With a user integrity buffer, there is a way to specify the app_tag.
Set the corresponding protocol-specific flags and send the app_tag down.

Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
---
 drivers/nvme/host/core.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 79bd6b22e88d..3b329e036d33 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -872,6 +872,12 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 	return BLK_STS_OK;
 }
 
+static void nvme_set_app_tag(struct request *req, struct nvme_command *cmnd)
+{
+	cmnd->rw.lbat = cpu_to_le16(bio_integrity(req->bio)->app_tag);
+	cmnd->rw.lbatm = cpu_to_le16(0xffff);
+}
+
 static void nvme_set_ref_tag(struct nvme_ns *ns, struct nvme_command *cmnd,
 			      struct request *req)
 {
@@ -1012,6 +1018,10 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 				control |= NVME_RW_APPEND_PIREMAP;
 			nvme_set_ref_tag(ns, cmnd, req);
 		}
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_APPTAG)) {
+			control |= NVME_RW_PRINFO_PRCHK_APP;
+			nvme_set_app_tag(req, cmnd);
+		}
 	}
 
 	cmnd->rw.control = cpu_to_le16(control);
-- 
2.25.1
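
Taken together with the previous patch, the driver-side translation in nvme_setup_rw() can be pictured as below. This is a sketch under stated assumptions: `struct rw_cmd` and `setup_pi_checks()` are hypothetical, the PRCHK_* values stand in for the NVME_RW_PRINFO_PRCHK_* constants, the BIP flag values use the patch 07 numbering, and the little-endian conversion done by cpu_to_le16() is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BIP flag values as numbered by patch 07; illustrative only. */
enum {
	BIP_CHECK_GUARD  = 1 << 6,
	BIP_CHECK_REFTAG = 1 << 7,
	BIP_CHECK_APPTAG = 1 << 8,
};

/* Stand-ins for the NVME_RW_PRINFO_PRCHK_* control bits. */
enum {
	PRCHK_REF   = 1 << 10,
	PRCHK_APP   = 1 << 11,
	PRCHK_GUARD = 1 << 12,
};

/* Hypothetical subset of the read/write command fields touched here. */
struct rw_cmd {
	uint16_t control;
	uint16_t lbat;	/* application tag value */
	uint16_t lbatm;	/* application tag mask */
};

/* Translate block-layer check flags into per-command protection checks. */
static void setup_pi_checks(struct rw_cmd *cmd, unsigned int bip_flags,
			    uint16_t app_tag)
{
	assert(cmd != NULL);
	if (bip_flags & BIP_CHECK_GUARD)
		cmd->control |= PRCHK_GUARD;
	if (bip_flags & BIP_CHECK_REFTAG)
		cmd->control |= PRCHK_REF;
	if (bip_flags & BIP_CHECK_APPTAG) {
		cmd->control |= PRCHK_APP;
		cmd->lbat = app_tag;
		cmd->lbatm = 0xffff;	/* compare every bit of the tag */
	}
}
```

The driver no longer looks at the namespace PI type to decide what to check; each check is requested independently by whoever built the integrity payload.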


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v7 09/10] scsi: add support for user-meta interface
       [not found]   ` <CGME20241104141507epcas5p161e39cef85f8fa5f5ad59e959e070d0b@epcas5p1.samsung.com>
@ 2024-11-04 14:06     ` Anuj Gupta
  0 siblings, 0 replies; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:06 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Anuj Gupta

Add support for sending a user-meta buffer. Set the tags to be checked
using the flags specified by the user or the block layer.
With this change, BIP_CTRL_NOCHECK becomes unused. Remove it.

Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
 drivers/scsi/sd.c             |  4 ++--
 include/linux/bio-integrity.h | 16 +++++++---------
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ca4bc0ac76ad..d1a2ae0d4c29 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -814,14 +814,14 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
 		if (bio_integrity_flagged(bio, BIP_IP_CHECKSUM))
 			scmd->prot_flags |= SCSI_PROT_IP_CHECKSUM;
 
-		if (bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false)
+		if (bio_integrity_flagged(bio, BIP_CHECK_GUARD))
 			scmd->prot_flags |= SCSI_PROT_GUARD_CHECK;
 	}
 
 	if (dif != T10_PI_TYPE3_PROTECTION) {	/* DIX/DIF Type 0, 1, 2 */
 		scmd->prot_flags |= SCSI_PROT_REF_INCREMENT;
 
-		if (bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false)
+		if (bio_integrity_flagged(bio, BIP_CHECK_REFTAG))
 			scmd->prot_flags |= SCSI_PROT_REF_CHECK;
 	}
 
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index fe2bfe122db2..2195bc06dcde 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -7,13 +7,12 @@
 enum bip_flags {
 	BIP_BLOCK_INTEGRITY	= 1 << 0, /* block layer owns integrity data */
 	BIP_MAPPED_INTEGRITY	= 1 << 1, /* ref tag has been remapped */
-	BIP_CTRL_NOCHECK	= 1 << 2, /* disable HBA integrity checking */
-	BIP_DISK_NOCHECK	= 1 << 3, /* disable disk integrity checking */
-	BIP_IP_CHECKSUM		= 1 << 4, /* IP checksum */
-	BIP_COPY_USER		= 1 << 5, /* Kernel bounce buffer in use */
-	BIP_CHECK_GUARD		= 1 << 6, /* guard check */
-	BIP_CHECK_REFTAG	= 1 << 7, /* reftag check */
-	BIP_CHECK_APPTAG	= 1 << 8, /* apptag check */
+	BIP_DISK_NOCHECK	= 1 << 2, /* disable disk integrity checking */
+	BIP_IP_CHECKSUM		= 1 << 3, /* IP checksum */
+	BIP_COPY_USER		= 1 << 4, /* Kernel bounce buffer in use */
+	BIP_CHECK_GUARD		= 1 << 5, /* guard check */
+	BIP_CHECK_REFTAG	= 1 << 6, /* reftag check */
+	BIP_CHECK_APPTAG	= 1 << 7, /* apptag check */
 };
 
 struct bio_integrity_payload {
@@ -33,8 +32,7 @@ struct bio_integrity_payload {
 	struct bio_vec		bip_inline_vecs[];/* embedded bvec array */
 };
 
-#define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
-			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM | \
+#define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_IP_CHECKSUM | \
 			 BIP_CHECK_GUARD | BIP_CHECK_REFTAG | BIP_CHECK_APPTAG)
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v7 10/10] block: add support to pass user meta buffer
       [not found]   ` <CGME20241104141509epcas5p4ed0c68c42ccad27f9a38dc0c0ef7628d@epcas5p4.samsung.com>
@ 2024-11-04 14:06     ` Anuj Gupta
  0 siblings, 0 replies; 26+ messages in thread
From: Anuj Gupta @ 2024-11-04 14:06 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel, Kanchan Joshi, Anuj Gupta

From: Kanchan Joshi <[email protected]>

If an iocb contains metadata, extract it and prepare the bip.
Based on the flags specified by the user, mark the corresponding
guard/app/ref tags to be checked in the bip.

Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
---
 block/bio-integrity.c         | 50 +++++++++++++++++++++++++++++++++++
 block/fops.c                  | 42 ++++++++++++++++++++++-------
 include/linux/bio-integrity.h |  7 +++++
 3 files changed, 90 insertions(+), 9 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 3bee43b87001..5d81ad9a3d20 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -364,6 +364,55 @@ int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
 	return ret;
 }
 
+static void bio_uio_meta_to_bip(struct bio *bio, struct uio_meta *meta)
+{
+	struct bio_integrity_payload *bip = bio_integrity(bio);
+
+	if (meta->flags & IO_INTEGRITY_CHK_GUARD)
+		bip->bip_flags |= BIP_CHECK_GUARD;
+	if (meta->flags & IO_INTEGRITY_CHK_APPTAG)
+		bip->bip_flags |= BIP_CHECK_APPTAG;
+	if (meta->flags & IO_INTEGRITY_CHK_REFTAG)
+		bip->bip_flags |= BIP_CHECK_REFTAG;
+
+	bip->app_tag = meta->app_tag;
+}
+
+int bio_integrity_map_iter(struct bio *bio, struct uio_meta *meta)
+{
+	struct blk_integrity *bi = blk_get_integrity(bio->bi_bdev->bd_disk);
+	unsigned int integrity_bytes;
+	int ret;
+	struct iov_iter it;
+
+	if (!bi)
+		return -EINVAL;
+	/*
+	 * original meta iterator can be bigger.
+	 * process integrity info corresponding to current data buffer only.
+	 */
+	it = meta->iter;
+	integrity_bytes = bio_integrity_bytes(bi, bio_sectors(bio));
+	if (it.count < integrity_bytes)
+		return -EINVAL;
+
+	/* should fit into two bytes */
+	BUILD_BUG_ON(IO_INTEGRITY_VALID_FLAGS >= (1 << 16));
+
+	if (meta->flags && (meta->flags & ~IO_INTEGRITY_VALID_FLAGS))
+		return -EINVAL;
+
+	it.count = integrity_bytes;
+	ret = bio_integrity_map_user(bio, &it);
+	if (!ret) {
+		bio_uio_meta_to_bip(bio, meta);
+		bip_set_seed(bio_integrity(bio), meta->seed);
+		iov_iter_advance(&meta->iter, integrity_bytes);
+		meta->seed += bio_integrity_intervals(bi, bio_sectors(bio));
+	}
+	return ret;
+}
+
 /**
  * bio_integrity_prep - Prepare bio for integrity I/O
  * @bio:	bio to prepare
@@ -564,6 +613,7 @@ int bio_integrity_clone(struct bio *bio, struct bio *bio_src,
 	bip->bip_vec = bip_src->bip_vec;
 	bip->bip_iter = bip_src->bip_iter;
 	bip->bip_flags = bip_src->bip_flags & BIP_CLONE_FLAGS;
+	bip->app_tag = bip_src->app_tag;
 
 	return 0;
 }
diff --git a/block/fops.c b/block/fops.c
index 2d01c9007681..3cf7e15eabbc 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -54,6 +54,7 @@ static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
 	struct bio bio;
 	ssize_t ret;
 
+	WARN_ON_ONCE(iocb->ki_flags & IOCB_HAS_METADATA);
 	if (nr_pages <= DIO_INLINE_BIO_VECS)
 		vecs = inline_vecs;
 	else {
@@ -128,6 +129,9 @@ static void blkdev_bio_end_io(struct bio *bio)
 	if (bio->bi_status && !dio->bio.bi_status)
 		dio->bio.bi_status = bio->bi_status;
 
+	if (dio->iocb->ki_flags & IOCB_HAS_METADATA)
+		bio_integrity_unmap_user(bio);
+
 	if (atomic_dec_and_test(&dio->ref)) {
 		if (!(dio->flags & DIO_IS_SYNC)) {
 			struct kiocb *iocb = dio->iocb;
@@ -221,14 +225,16 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 			 * a retry of this from blocking context.
 			 */
 			if (unlikely(iov_iter_count(iter))) {
-				bio_release_pages(bio, false);
-				bio_clear_flag(bio, BIO_REFFED);
-				bio_put(bio);
-				blk_finish_plug(&plug);
-				return -EAGAIN;
+				ret = -EAGAIN;
+				goto fail;
 			}
 			bio->bi_opf |= REQ_NOWAIT;
 		}
+		if (!is_sync && (iocb->ki_flags & IOCB_HAS_METADATA)) {
+			ret = bio_integrity_map_iter(bio, iocb->private);
+			if (unlikely(ret))
+				goto fail;
+		}
 
 		if (is_read) {
 			if (dio->flags & DIO_SHOULD_DIRTY)
@@ -269,6 +275,12 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 
 	bio_put(&dio->bio);
 	return ret;
+fail:
+	bio_release_pages(bio, false);
+	bio_clear_flag(bio, BIO_REFFED);
+	bio_put(bio);
+	blk_finish_plug(&plug);
+	return ret;
 }
 
 static void blkdev_bio_end_io_async(struct bio *bio)
@@ -286,6 +298,9 @@ static void blkdev_bio_end_io_async(struct bio *bio)
 		ret = blk_status_to_errno(bio->bi_status);
 	}
 
+	if (iocb->ki_flags & IOCB_HAS_METADATA)
+		bio_integrity_unmap_user(bio);
+
 	iocb->ki_complete(iocb, ret);
 
 	if (dio->flags & DIO_SHOULD_DIRTY) {
@@ -330,10 +345,8 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
 		bio_iov_bvec_set(bio, iter);
 	} else {
 		ret = bio_iov_iter_get_pages(bio, iter);
-		if (unlikely(ret)) {
-			bio_put(bio);
-			return ret;
-		}
+		if (unlikely(ret))
+			goto out_bio_put;
 	}
 	dio->size = bio->bi_iter.bi_size;
 
@@ -346,6 +359,13 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
 		task_io_account_write(bio->bi_iter.bi_size);
 	}
 
+	if (iocb->ki_flags & IOCB_HAS_METADATA) {
+		ret = bio_integrity_map_iter(bio, iocb->private);
+		WRITE_ONCE(iocb->private, NULL);
+		if (unlikely(ret))
+			goto out_bio_put;
+	}
+
 	if (iocb->ki_flags & IOCB_ATOMIC)
 		bio->bi_opf |= REQ_ATOMIC;
 
@@ -360,6 +380,10 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
 		submit_bio(bio);
 	}
 	return -EIOCBQUEUED;
+
+out_bio_put:
+	bio_put(bio);
+	return ret;
 }
 
 static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index 2195bc06dcde..de0a6c9de4d1 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -23,6 +23,7 @@ struct bio_integrity_payload {
 	unsigned short		bip_vcnt;	/* # of integrity bio_vecs */
 	unsigned short		bip_max_vcnt;	/* integrity bio_vec slots */
 	unsigned short		bip_flags;	/* control flags */
+	u16			app_tag;	/* application tag value */
 
 	struct bvec_iter	bio_iter;	/* for rewinding parent bio */
 
@@ -78,6 +79,7 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio, gfp_t gfp,
 int bio_integrity_add_page(struct bio *bio, struct page *page, unsigned int len,
 		unsigned int offset);
 int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter);
+int bio_integrity_map_iter(struct bio *bio, struct uio_meta *meta);
 void bio_integrity_unmap_user(struct bio *bio);
 bool bio_integrity_prep(struct bio *bio);
 void bio_integrity_advance(struct bio *bio, unsigned int bytes_done);
@@ -108,6 +110,11 @@ static int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
 	return -EINVAL;
 }
 
+static inline int bio_integrity_map_iter(struct bio *bio, struct uio_meta *meta)
+{
+	return -EINVAL;
+}
+
 static inline void bio_integrity_unmap_user(struct bio *bio)
 {
 }
-- 
2.25.1
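
The per-bio slicing done by bio_integrity_map_iter() above is easier to follow with numbers. The sketch below is a userspace model, not the kernel implementation: `struct profile`, `struct meta_cursor` and `map_slice()` are hypothetical, and the interval arithmetic is assumed to match bio_integrity_intervals() and bio_integrity_bytes(). Each bio consumes exactly the integrity bytes that cover its own sectors, and the seed advances by the number of intervals it covered.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the integrity profile of the device. */
struct profile {
	unsigned int interval_exp;	/* log2 of the protection interval in bytes */
	unsigned int tuple_size;	/* integrity bytes per interval */
};

/* Models the part of struct uio_meta that is consumed bio by bio. */
struct meta_cursor {
	size_t   remaining;	/* bytes left in the user meta buffer */
	uint32_t seed;		/* seed to use for the next bio */
};

/* Number of protection intervals covered by 'sectors' 512-byte sectors. */
static unsigned int intervals(const struct profile *bi, unsigned int sectors)
{
	assert(bi != NULL && bi->interval_exp >= 9);
	return sectors >> (bi->interval_exp - 9);
}

/* Carve the slice for one bio out of the cursor, or fail if it is too short. */
static int map_slice(const struct profile *bi, struct meta_cursor *m,
		     unsigned int sectors, size_t *len, uint32_t *seed)
{
	unsigned int n = intervals(bi, sectors);
	size_t bytes = (size_t)n * bi->tuple_size;

	if (m->remaining < bytes)
		return -EINVAL;
	*len = bytes;
	*seed = m->seed;
	m->remaining -= bytes;	/* models iov_iter_advance() on meta->iter */
	m->seed += n;		/* next bio starts n intervals further on */
	return 0;
}
```

With a 4096-byte interval and 8-byte tuples, a 16 KiB I/O issued as two 8 KiB bios uses 16 bytes of metadata per bio, and the second bio starts with the original seed plus 2.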


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 04/10] fs, iov_iter: define meta io descriptor
  2024-11-04 14:05     ` [PATCH v7 04/10] fs, iov_iter: define meta io descriptor Anuj Gupta
@ 2024-11-05  9:55       ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2024-11-05  9:55 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
	linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi

On Mon, Nov 04, 2024 at 07:35:55PM +0530, Anuj Gupta wrote:
> Add flags to describe checks for integrity meta buffer. Also, introduce
> a new 'uio_meta' structure that upper layers can use to pass the
> meta/integrity information.
> 
> Signed-off-by: Kanchan Joshi <[email protected]>
> Signed-off-by: Anuj Gupta <[email protected]>

I'm pretty sure I reviewed this already last time, but here we go
again in case I'm misremembering:

Reviewed-by: Christoph Hellwig <[email protected]>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-04 14:05     ` [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write Anuj Gupta
@ 2024-11-05  9:56       ` Christoph Hellwig
  2024-11-05 13:04         ` Anuj gupta
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2024-11-05  9:56 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
	linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi

On Mon, Nov 04, 2024 at 07:35:57PM +0530, Anuj Gupta wrote:
> read/write. A new meta_type field is introduced in SQE which indicates
> the type of metadata being passed.

I still object to this completely pointless and ill-defined field.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 02/10] block: copy back bounce buffer to user-space correctly in case of split
  2024-11-04 14:05     ` [PATCH v7 02/10] block: copy back bounce buffer to user-space correctly in case of split Anuj Gupta
@ 2024-11-05 10:03       ` Christoph Hellwig
  2024-11-05 13:15         ` Anuj gupta
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2024-11-05 10:03 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
	brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
	linux-scsi, vishak.g, linux-fsdevel

On Mon, Nov 04, 2024 at 07:35:53PM +0530, Anuj Gupta wrote:
> From: Christoph Hellwig <[email protected]>
> 
> Copy back the bounce buffer to user-space in entirety when the parent
> bio completes. The existing code uses bip_iter.bi_size for sizing the
> copy, which can be modified. So move away from that and fetch it from
> the vector passed to the block layer. While at it, switch to using
> better variable names.
> 
> Fixes: 492c5d455969f ("block: bio-integrity: directly map user buffers")
> Signed-off-by: Anuj Gupta <[email protected]>
> [hch: better names for variables]
> Signed-off-by: Christoph Hellwig <[email protected]>

This shouldn't really have a From: line for me, as it wasn't my patch
originally.  But if you insist on re-attributing it, my signoff should
come first, as signoffs are supposed to form a chain from the original
author to the submitter.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-05  9:56       ` Christoph Hellwig
@ 2024-11-05 13:04         ` Anuj gupta
  2024-11-05 13:56           ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Anuj gupta @ 2024-11-05 13:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Anuj Gupta, axboe, kbusch, martin.petersen, asml.silence, brauner,
	jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
	linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi

On Tue, Nov 5, 2024 at 3:26 PM Christoph Hellwig <[email protected]> wrote:
>
> On Mon, Nov 04, 2024 at 07:35:57PM +0530, Anuj Gupta wrote:
> > read/write. A new meta_type field is introduced in SQE which indicates
> > the type of metadata being passed.
>
> I still object to this completely pointless and ill-defined field.

The field is used only at the io_uring level, where it helps in using
the SQE space flexibly.
Overall, all the other pieces are sorted and only the consensus on the
io_uring bits is missing. This version is also an attempt to reach it.
We will have to see what form Jens/Pavel would like this part to take.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 02/10] block: copy back bounce buffer to user-space correctly in case of split
  2024-11-05 10:03       ` Christoph Hellwig
@ 2024-11-05 13:15         ` Anuj gupta
  0 siblings, 0 replies; 26+ messages in thread
From: Anuj gupta @ 2024-11-05 13:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Anuj Gupta, axboe, kbusch, martin.petersen, asml.silence, brauner,
	jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
	linux-scsi, vishak.g, linux-fsdevel

> This shouldn't really have a From: line for me, as it wasn't my patch
> originally.  But if you insist on re-attributing it, my signoff should
> come first, as signoffs are supposed to form a chain from the original
> author to the submitter.
>
Will change the sign-off order if I have to iterate.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-05 13:04         ` Anuj gupta
@ 2024-11-05 13:56           ` Christoph Hellwig
  2024-11-05 15:51             ` Kanchan Joshi
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2024-11-05 13:56 UTC (permalink / raw)
  To: Anuj gupta
  Cc: Christoph Hellwig, Anuj Gupta, axboe, kbusch, martin.petersen,
	asml.silence, brauner, jack, viro, io-uring, linux-nvme,
	linux-block, gost.dev, linux-scsi, vishak.g, linux-fsdevel,
	Kanchan Joshi

On Tue, Nov 05, 2024 at 06:34:29PM +0530, Anuj gupta wrote:
> The field is used only at the io_uring level, where it helps in using
> the SQE space flexibly.

How so?  There is absolutely no documentation for it in either the
code or the commit log.  And if it is about SQE space management,
meta_type is about the most confusing possible name as well.  So someone
please write down how it is supposed to work and come up with a name
that remotely makes sense for that.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-05 13:56           ` Christoph Hellwig
@ 2024-11-05 15:51             ` Kanchan Joshi
  2024-11-05 16:00               ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Kanchan Joshi @ 2024-11-05 15:51 UTC (permalink / raw)
  To: Christoph Hellwig, Anuj gupta
  Cc: Anuj Gupta, axboe, kbusch, martin.petersen, asml.silence, brauner,
	jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
	linux-scsi, vishak.g, linux-fsdevel

On 11/5/2024 7:26 PM, Christoph Hellwig wrote:
> On Tue, Nov 05, 2024 at 06:34:29PM +0530, Anuj gupta wrote:
>> The field is used only at the io_uring level, where it helps in using
>> the SQE space flexibly.
> 
> How so?  There is absolutely no documentation for it in either the
> code or the commit log.  And if it is about SQE space management,
> meta_type is about the most confusing possible name as well.  So someone
> please write down how it is supposed to work and come up with a name
> that remotely makes sense for that.

I can add the documentation (if this version is palatable to Jens/Pavel),
but this was discussed in the previous iteration:

1. Each meta type may have a different space requirement in the SQE.

Only PI needs so much space that it cannot fit in the first SQE.
The SQE128 requirement applies only to the PI type.
A different meta type may fit into the first SQE, in which case we
don't have to mandate SQE128.

2. If two meta types are known not to co-exist, they can share the
same place within the SQE. Since each meta type is a flag, io_uring can
check which combinations are valid and return an error in case of
incompatibility.

3. The previous version relied on the SQE128 flag: if the user set up
the ring that way, it was assumed that PI information was sent.
This is conveyed more explicitly now: if the user passes the
META_TYPE_PI flag, it has sent the PI. See this comment in the code:

+       /* if sqe->meta_type is META_TYPE_PI, last 32 bytes are for PI */
+       union {

If this flag is not passed, parsing of the second SQE is skipped. That
matches the current behavior, since today one can also send a regular
(non-PI) read/write on an SQE128 ring.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-05 15:51             ` Kanchan Joshi
@ 2024-11-05 16:00               ` Christoph Hellwig
  2024-11-05 16:23                 ` Keith Busch
  2024-11-05 16:38                 ` Kanchan Joshi
  0 siblings, 2 replies; 26+ messages in thread
From: Christoph Hellwig @ 2024-11-05 16:00 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: Christoph Hellwig, Anuj gupta, Anuj Gupta, axboe, kbusch,
	martin.petersen, asml.silence, brauner, jack, viro, io-uring,
	linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel

On Tue, Nov 05, 2024 at 09:21:27PM +0530, Kanchan Joshi wrote:
> I can add the documentation (if this version is palatable to Jens/Pavel),
> but this was discussed in the previous iteration:
> 
> 1. Each meta type may have a different space requirement in the SQE.
> 
> Only PI needs so much space that it cannot fit in the first SQE.
> The SQE128 requirement applies only to the PI type.
> A different meta type may fit into the first SQE, in which case we
> don't have to mandate SQE128.

Ok, I'm really confused now.  The way I understood Anuj was that this
is NOT about block level metadata, but about other uses of the big SQE.

Which version is right?  Or did I just completely misunderstand Anuj?

> 2. If two meta types are known not to co-exist, they can share the
> same place within the SQE. Since each meta type is a flag, io_uring can
> check which combinations are valid and return an error in case of
> incompatibility.

And this sounds like what you are referring to is not actually block
metadata as in this patchset or NVMe (or, weirdly enough, integrity in
the block layer code).

> 3. The previous version relied on the SQE128 flag: if the user set up
> the ring that way, it was assumed that PI information was sent.
> This is conveyed more explicitly now: if the user passes the
> META_TYPE_PI flag, it has sent the PI. See this comment in the code:
> 
> +       /* if sqe->meta_type is META_TYPE_PI, last 32 bytes are for PI */
> +       union {
> 
> If this flag is not passed, parsing of the second SQE is skipped. That
> matches the current behavior, since today one can also send a regular
> (non-PI) read/write on an SQE128 ring.

And while I don't understand how this ties in with the previous
statements, this makes sense.  If you only want to send a pointer (+len)
to metadata you can use the normal 64-byte SQE.  If you want to send
a PI tuple you need SQE128.  Is that what the various statements above
try to express?  If so the right API to me would be to have two flags:

 - a flag that a pointer to metadata is passed.  This can work with
   a 64-byte SQE.
 - another flag that a PI tuple is passed.  This requires a 128-byte
   SQE and also the previous flag.
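
The dependency between the two proposed flags could be validated along these lines. Everything here is hypothetical: the flag names, their values and `validate_ext_flags()` do not exist in the series and only illustrate the rule described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flags for the two-flag proposal; not part of the patchset. */
enum {
	EXT_FLAG_META_PTR = 1 << 0,	/* pointer (+len) to metadata is passed */
	EXT_FLAG_PI_TUPLE = 1 << 1,	/* PI tuple is passed in the big SQE */
};

/* Reject unknown bits and enforce: PI tuple => metadata pointer and SQE128. */
static int validate_ext_flags(unsigned int flags, bool ring_is_sqe128)
{
	if (flags & ~(unsigned int)(EXT_FLAG_META_PTR | EXT_FLAG_PI_TUPLE))
		return -EINVAL;
	if (flags & EXT_FLAG_PI_TUPLE) {
		if (!(flags & EXT_FLAG_META_PTR))
			return -EINVAL;
		if (!ring_is_sqe128)
			return -EINVAL;
	}
	return 0;
}
```

A metadata pointer on its own is accepted on either ring size, while the PI tuple is only accepted together with the pointer flag on an SQE128 ring.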


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-05 16:00               ` Christoph Hellwig
@ 2024-11-05 16:23                 ` Keith Busch
  2024-11-05 16:50                   ` Kanchan Joshi
  2024-11-06  5:29                   ` Christoph Hellwig
  2024-11-05 16:38                 ` Kanchan Joshi
  1 sibling, 2 replies; 26+ messages in thread
From: Keith Busch @ 2024-11-05 16:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kanchan Joshi, Anuj gupta, Anuj Gupta, axboe, martin.petersen,
	asml.silence, brauner, jack, viro, io-uring, linux-nvme,
	linux-block, gost.dev, linux-scsi, vishak.g, linux-fsdevel

On Tue, Nov 05, 2024 at 05:00:51PM +0100, Christoph Hellwig wrote:
> On Tue, Nov 05, 2024 at 09:21:27PM +0530, Kanchan Joshi wrote:
> > I can add the documentation (if this version is palatable to Jens/Pavel),
> > but this was discussed in the previous iteration:
> > 
> > 1. Each meta type may have a different space requirement in the SQE.
> > 
> > Only PI needs so much space that it cannot fit in the first SQE.
> > The SQE128 requirement applies only to the PI type.
> > A different meta type may fit into the first SQE, in which case we
> > don't have to mandate SQE128.
> 
> Ok, I'm really confused now.  The way I understood Anuj was that this
> is NOT about block level metadata, but about other uses of the big SQE.
> 
> Which version is right?  Or did I just completely misunderstand Anuj?

Let's not call this "meta_type". Can we use something that has a less
overloaded meaning, like "sqe_extended_capabilities", or "ecap", or
something like that?
 
> > 2. If two meta types are known not to co-exist, they can share the
> > same place within the SQE. Since each meta type is a flag, io_uring can
> > check which combinations are valid and return an error in case of
> > incompatibility.
> 
> And this sounds like what you are referring to is not actually block
> metadata as in this patchset or NVMe (or, weirdly enough, integrity in
> the block layer code).
> 
> > 3. Previous version was relying on SQE128 flag. If user set the ring 
> > that way, it is assumed that PI information was sent.
> > This is more explicitly conveyed now - if user passed META_TYPE_PI flag, 
> > it has sent the PI. This comment in the code:
> > 
> > +       /* if sqe->meta_type is META_TYPE_PI, last 32 bytes are for PI */
> > +       union {
> > 
> > If this flag is not passed, parsing of second SQE is skipped, which is 
> > the current behavior as now also one can send regular (non pi) 
> > read/write on SQE128 ring.
> 
> And while I don't understand how this threads in with the previous
> statements, this makes sense.  If you only want to send a pointer (+len)
> to metadata you can use the normal 64-byte SQE.  If you want to send
> a PI tuple you need SQE128.  Is that what the various above statements
> try to express?  If so the right API to me would be to have two flags:
> 
>  - a flag that a pointer to metadata is passed.  This can work with
>    a 64-byte SQE.
>  - another flag that a PI tuple is passed.  This requires a 128-byte
>    SQE and also the previous flag.

I don't think anything done so far aligns with what Pavel had in mind.
Let me try to lay out what I think he's going for. Just bear with me,
this is just a hypothetical example.

  This patch adds a PI extension.
  Later, let's say write streams needs another extension.
  Then key per-IO wants another extension.
  Then someone else adds wizbang-awesome-feature extension.

Let's say you have a device that can do all 4, or any combination of them.
Pavel wants a solution that is future proof to such a scenario. So not
just a single new "meta_type" with its structure, but a list of types in
no particular order, and their structures.

That list can exist either in the extended SQE, or in some other user
address that the kernel will need to copy.
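
As a rough illustration of the "list of types in no particular order"
idea (not code from this series), the sketch below walks a buffer of
(header, payload) records and reports which extensions were found. Every
name in it (ext_hdr, EXT_PI, ext_list_parse, and so on) is hypothetical,
and the record layout is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical extension identifiers; none of these exist in io_uring. */
enum ext_type { EXT_PI = 1, EXT_WRITE_STREAM = 2, EXT_KEY_PER_IO = 3 };

/* Hypothetical header that precedes each extension payload. */
struct ext_hdr {
	uint16_t type;	/* one of enum ext_type */
	uint16_t len;	/* payload bytes following this header */
};

/*
 * Walk a list of (header, payload) records in any order. Returns a
 * bitmask of the extension types seen, or -1 if the list is malformed.
 */
static int ext_list_parse(const uint8_t *buf, size_t size)
{
	int seen = 0;
	size_t off = 0;

	assert(buf != NULL || size == 0);
	while (off < size) {
		struct ext_hdr hdr;

		if (size - off < sizeof(hdr))
			return -1;	/* truncated header */
		memcpy(&hdr, buf + off, sizeof(hdr));
		off += sizeof(hdr);
		if (hdr.len > size - off)
			return -1;	/* payload runs past the buffer */
		if (hdr.type < EXT_PI || hdr.type > EXT_KEY_PER_IO)
			return -1;	/* unknown extension */
		if (seen & (1 << hdr.type))
			return -1;	/* same extension listed twice */
		seen |= 1 << hdr.type;
		off += hdr.len;	/* skip the payload; a real parser would decode it */
	}
	return seen;
}
```

The same walk would apply whether the buffer is the second half of a
128-byte SQE or a region copied in from a user address; only the source
of buf changes.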


* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-05 16:00               ` Christoph Hellwig
  2024-11-05 16:23                 ` Keith Busch
@ 2024-11-05 16:38                 ` Kanchan Joshi
  2024-11-06  5:33                   ` Christoph Hellwig
  1 sibling, 1 reply; 26+ messages in thread
From: Kanchan Joshi @ 2024-11-05 16:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Anuj gupta, Anuj Gupta, axboe, kbusch, martin.petersen,
	asml.silence, brauner, jack, viro, io-uring, linux-nvme,
	linux-block, gost.dev, linux-scsi, vishak.g, linux-fsdevel

On 11/5/2024 9:30 PM, Christoph Hellwig wrote:
> On Tue, Nov 05, 2024 at 09:21:27PM +0530, Kanchan Joshi wrote:
>> Can add the documentation (if this version is palatable for Jens/Pavel),
>> but this was discussed in previous iteration:
>>
>> 1. Each meta type may have different space requirement in SQE.
>>
>> Only for PI, we need so much space that we can't fit that in first SQE.
>> The SQE128 requirement is only for PI type.
>> Another different meta type may just fit into the first SQE. For that we
>> don't have to mandate SQE128.
> 
> Ok, I'm really confused now.  The way I understood Anuj was that this
> is NOT about block level metadata, but about other uses of the big SQE.
> 
> Which version is right?  Or did I just completely misunderstand Anuj?

We both mean the same. Currently read/write don't [need to] use big SQE 
as all the information is there in the first SQE.
Down the line there may be users fighting for space in SQE. The flag 
(meta_type) may help a bit when that happens.

>> 2. If two meta types are known not to co-exist, they can be kept in the
>> same place within SQE. Since each meta-type is a flag, we can check what
>> combinations are valid within io_uring and throw the error in case of
>> incompatibility.
> 
> And this sounds like what you refer to is not actually block metadata
> as in this patchset or nvme, (or weirdly enough integrity in the block
> layer code).

Right, not about block metadata/pi. But some extra information 
(different in size/semantics etc.) that user wants to pass into SQE 
along with read/write.

>> 3. Previous version was relying on SQE128 flag. If user set the ring
>> that way, it is assumed that PI information was sent.
>> This is more explicitly conveyed now - if user passed META_TYPE_PI flag,
>> it has sent the PI. This comment in the code:
>>
>> +       /* if sqe->meta_type is META_TYPE_PI, last 32 bytes are for PI */
>> +       union {
>>
>> If this flag is not passed, parsing of second SQE is skipped, which is
>> the current behavior as now also one can send regular (non pi)
>> read/write on SQE128 ring.
> 
> And while I don't understand how this threads in with the previous
> statements, this makes sense.  If you only want to send a pointer (+len)
> to metadata you can use the normal 64-byte SQE.  If you want to send
> a PI tuple you need SQE128.  Is that what the various above statements
> try to express? 

Not exactly. You are talking about pi-type 0 (which only requires meta 
buffer/len) versus !0 pi-type. We thought about it, but decided to keep 
asking for SQE128 regardless of that (pi 0 or non-zero). In both cases 
user will set meta-buffer/len, and other type-specific flags are taken
care of by the low-level code. This keeps things simple and at io_uring
level we don't have to distinguish that case.

What I rather meant in this statement was - one can set up a ring with
SQE128 today and send IORING_OP_READ/IORING_OP_WRITE. That goes fine 
without any processing/error as SQE128 is skipped completely. So relying 
only on SQE128 flag to detect the presence of PI is a bit fragile.
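
A minimal sketch of that check, assuming PI presence is keyed on the
per-SQE flag and the ring flag is only a precondition. The two macro
values are the ones quoted in this thread; the helper name sqe_has_pi
and the -EINVAL choice are assumptions for illustration, not the code in
the patch.

```c
#include <errno.h>
#include <stdint.h>

#define IORING_SETUP_SQE128	(1U << 10)	/* ring-level flag: SQEs are 128 byte */
#define META_TYPE_PI		(1U << 0)	/* per-SQE flag from this series */

/*
 * Hypothetical helper: should the second half of the SQE be parsed as PI?
 * Returns 1 if PI is present, 0 if not, and -EINVAL if PI is requested
 * on a ring that only has 64-byte SQEs.
 */
static int sqe_has_pi(uint32_t ring_flags, uint16_t meta_type)
{
	if (!(meta_type & META_TYPE_PI))
		return 0;	/* plain read/write, even on an SQE128 ring */
	if (!(ring_flags & IORING_SETUP_SQE128))
		return -EINVAL;	/* no second half to hold the PI fields */
	return 1;
}
```

With this shape, a regular read/write on an SQE128 ring keeps working
unchanged because the second half is never looked at.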


* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-05 16:23                 ` Keith Busch
@ 2024-11-05 16:50                   ` Kanchan Joshi
  2024-11-06  5:29                   ` Christoph Hellwig
  1 sibling, 0 replies; 26+ messages in thread
From: Kanchan Joshi @ 2024-11-05 16:50 UTC (permalink / raw)
  To: Keith Busch, Christoph Hellwig
  Cc: Anuj gupta, Anuj Gupta, axboe, martin.petersen, asml.silence,
	brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
	linux-scsi, vishak.g, linux-fsdevel

On 11/5/2024 9:53 PM, Keith Busch wrote:
> On Tue, Nov 05, 2024 at 05:00:51PM +0100, Christoph Hellwig wrote:
>> On Tue, Nov 05, 2024 at 09:21:27PM +0530, Kanchan Joshi wrote:
>>> Can add the documentation (if this version is palatable for Jens/Pavel),
>>> but this was discussed in previous iteration:
>>>
>>> 1. Each meta type may have different space requirement in SQE.
>>>
>>> Only for PI, we need so much space that we can't fit that in first SQE.
>>> The SQE128 requirement is only for PI type.
>>> Another different meta type may just fit into the first SQE. For that we
>>> don't have to mandate SQE128.
>>
>> Ok, I'm really confused now.  The way I understood Anuj was that this
>> is NOT about block level metadata, but about other uses of the big SQE.
>>
>> Which version is right?  Or did I just completely misunderstand Anuj?
> 
> Let's not call this "meta_type". Can we use something that has a less
> overloaded meaning, like "sqe_extended_capabilities", or "ecap", or
> something like that.
>   

Right, something like that. We need to change it.
Seems a useful thing is not being seen that way because of its name.

>>> 2. If two meta types are known not to co-exist, they can be kept in the
>>> same place within SQE. Since each meta-type is a flag, we can check what
>>> combinations are valid within io_uring and throw the error in case of
>>> incompatibility.
>>
>> And this sounds like what you refer to is not actually block metadata
>> as in this patchset or nvme, (or weirdly enough integrity in the block
>> layer code).
>>
>>> 3. Previous version was relying on SQE128 flag. If user set the ring
>>> that way, it is assumed that PI information was sent.
>>> This is more explicitly conveyed now - if user passed META_TYPE_PI flag,
>>> it has sent the PI. This comment in the code:
>>>
>>> +       /* if sqe->meta_type is META_TYPE_PI, last 32 bytes are for PI */
>>> +       union {
>>>
>>> If this flag is not passed, parsing of second SQE is skipped, which is
>>> the current behavior as now also one can send regular (non pi)
>>> read/write on SQE128 ring.
>>
>> And while I don't understand how this threads in with the previous
>> statements, this makes sense.  If you only want to send a pointer (+len)
>> to metadata you can use the normal 64-byte SQE.  If you want to send
>> a PI tuple you need SQE128.  Is that what the various above statements
>> try to express?  If so the right API to me would be to have two flags:
>>
>>   - a flag that a pointer to metadata is passed.  This can work with
>>     a 64-byte SQE.
>>   - another flag that a PI tuple is passed.  This requires a 128-byte
>>     SQE and also the previous flag.
> 
> I don't think anything done so far aligns with what Pavel had in mind.
> Let me try to lay out what I think he's going for. Just bear with me,
> this is just a hypothetical example.

I have the same example in mind.


>    This patch adds a PI extension.
>    Later, let's say write streams needs another extension.
>    Then key per-IO wants another extension.
>    Then someone else adds wizbang-awesome-feature extension.
> 
> Let's say you have a device that can do all 4, or any combination of them.
> Pavel wants a solution that is future proof to such a scenario. So not
> just a single new "meta_type" with its structure, but a list of types in
> no particular order, and their structures.
> 
> That list can exist either in the extended SQE, or in some other user
> address that the kernel will need to copy.

That list is the meta_type bit-flags this series creates.

For some future meta_type there can be a "META_TYPE_XYZ_INDIRECT" flag, and
that will mean the extra information needs to be fetched via copy_from_user.
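
A user-space sketch of how such an indirect flag could be handled. This
is an illustration only: META_TYPE_XYZ, META_TYPE_XYZ_INDIRECT,
struct xyz_info and xyz_fetch are hypothetical, and copy_from_user is
replaced by a memcpy-based stand-in so the example can run outside the
kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define META_TYPE_XYZ		(1U << 1)	/* hypothetical: info is inline in the SQE */
#define META_TYPE_XYZ_INDIRECT	(1U << 2)	/* hypothetical: SQE carries a pointer */

struct xyz_info { uint64_t value; };		/* hypothetical payload */

/* Stand-in for copy_from_user(); returns 0 on success like the kernel helper. */
static int fake_copy_from_user(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/* Returns 1 if XYZ info was loaded into *out, 0 if absent, -1 on error. */
static int xyz_fetch(uint16_t meta_type, const void *sqe_area,
		     uint64_t user_addr, struct xyz_info *out)
{
	assert(out != NULL);
	if ((meta_type & META_TYPE_XYZ) && (meta_type & META_TYPE_XYZ_INDIRECT))
		return -1;	/* inline and indirect are mutually exclusive here */
	if (meta_type & META_TYPE_XYZ) {
		memcpy(out, sqe_area, sizeof(*out));	/* already inside the SQE */
		return 1;
	}
	if (meta_type & META_TYPE_XYZ_INDIRECT) {
		const void *src = (const void *)(uintptr_t)user_addr;

		if (fake_copy_from_user(out, src, sizeof(*out)))
			return -1;
		return 1;
	}
	return 0;
}
```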


* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-05 16:23                 ` Keith Busch
  2024-11-05 16:50                   ` Kanchan Joshi
@ 2024-11-06  5:29                   ` Christoph Hellwig
  2024-11-06  6:00                     ` Kanchan Joshi
  1 sibling, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2024-11-06  5:29 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, Kanchan Joshi, Anuj gupta, Anuj Gupta, axboe,
	martin.petersen, asml.silence, brauner, jack, viro, io-uring,
	linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel

On Tue, Nov 05, 2024 at 09:23:19AM -0700, Keith Busch wrote:
> > > The SQE128 requirement is only for PI type.
> > > Another different meta type may just fit into the first SQE. For that we 
> > > don't have to mandate SQE128.
> > 
> > Ok, I'm really confused now.  The way I understood Anuj was that this
> > is NOT about block level metadata, but about other uses of the big SQE.
> > 
> > Which version is right?  Or did I just completely misunderstand Anuj?
> 
> Let's not call this "meta_type". Can we use something that has a less
> overloaded meaning, like "sqe_extended_capabilities", or "ecap", or
> something like that.

So it's just a flag that a 128-byte SQE is used?  Don't we know that
implicitly from the sq?

> >  - a flag that a pointer to metadata is passed.  This can work with
> >    a 64-byte SQE.
> >  - another flag that a PI tuple is passed.  This requires a 128-byte
> >    SQE and also the previous flag.
> 
> I don't think anything done so far aligns with what Pavel had in mind.
> Let me try to lay out what I think he's going for. Just bear with me,
> this is just a hypothetical example.
> 
>   This patch adds a PI extension.
>   Later, let's say write streams needs another extension.
>   Then key per-IO wants another extension.
>   Then someone else adds wizbang-awesome-feature extension.
> 
> Let's say you have a device that can do all 4, or any combination of them.
> Pavel wants a solution that is future proof to such a scenario. So not
> just a single new "meta_type" with its structure, but a list of types in
> no particular order, and their structures.

But why do we need the type at all?  Each of them obviously needs two
things:

 1) some space to actually store the extra fields
 2) a flag that the additional values are passed

any single value is not going to help with supporting arbitrary
combinations, because well, you can mix and match, and you need
space for all of them even if you are not using all of them.



* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-05 16:38                 ` Kanchan Joshi
@ 2024-11-06  5:33                   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2024-11-06  5:33 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: Christoph Hellwig, Anuj gupta, Anuj Gupta, axboe, kbusch,
	martin.petersen, asml.silence, brauner, jack, viro, io-uring,
	linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel

On Tue, Nov 05, 2024 at 10:08:46PM +0530, Kanchan Joshi wrote:
> We both mean the same. Currently read/write don't [need to] use big SQE 
> as all the information is there in the first SQE.
> Down the line there may be users fighting for space in SQE. The flag 
> (meta_type) may help a bit when that happens.

IFF we ever have a fight we need to split the command or add an even bigger
SQE.

> What I rather meant in this statement was - one can set up a ring with 
> SQE128 today and send IORING_OP_READ/IORING_OP_WRITE. That goes fine 
> without any processing/error as SQE128 is skipped completely. So relying 
> only on SQE128 flag to detect the presence of PI is a bit fragile.

Maybe the right answer is to add

READ_LARGE/WRITE_LARGE (better names welcome) commands that are defined
to use the entire 128-byte SQE, and then we have a bitmap of what extra
features are supported in it, with descriptive names for each feature.
Not trying to have one command for 64 vs 128 byte SQE might also be
useful to have a more straightforward layout in general (although I
haven't checked for that, just speaking from experience in other
protocols).
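
To make the suggestion concrete, here is a sketch of what a dedicated
128-byte command with a feature bitmap could look like. Nothing below
exists in io_uring: the opcodes, the LARGE_FEAT_* names, the struct
layout and the validation helper are all hypothetical and only show the
shape of the idea.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes; io_uring defines no READ_LARGE/WRITE_LARGE. */
enum { OP_READ_LARGE = 1, OP_WRITE_LARGE = 2 };

/* Hypothetical feature bits, one descriptive name per extra feature. */
#define LARGE_FEAT_PI		(1U << 0)
#define LARGE_FEAT_WRITE_STREAM	(1U << 1)
#define LARGE_FEAT_ALL		(LARGE_FEAT_PI | LARGE_FEAT_WRITE_STREAM)

/* Hypothetical fixed 128-byte layout; not the real struct io_uring_sqe. */
struct large_rw_sqe {
	uint8_t  opcode;
	uint8_t  pad0[3];
	uint32_t features;	/* bitmap of LARGE_FEAT_* */
	uint64_t off;
	uint64_t addr;
	uint32_t len;
	uint32_t pad1;
	uint8_t  ext[96];	/* room for the per-feature fields */
};
_Static_assert(sizeof(struct large_rw_sqe) == 128, "must fill the big SQE");

/* Accept only the large opcodes and only feature bits that are defined. */
static int large_rw_validate(const struct large_rw_sqe *sqe)
{
	assert(sqe != NULL);
	if (sqe->opcode != OP_READ_LARGE && sqe->opcode != OP_WRITE_LARGE)
		return -EINVAL;
	if (sqe->features & ~LARGE_FEAT_ALL)
		return -EINVAL;	/* unknown feature bit */
	return 0;
}
```

Because the opcode itself implies the 128-byte layout, the parser never
has to guess whether the second half of the SQE carries anything.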


* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-06  5:29                   ` Christoph Hellwig
@ 2024-11-06  6:00                     ` Kanchan Joshi
  2024-11-06  6:12                       ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Kanchan Joshi @ 2024-11-06  6:00 UTC (permalink / raw)
  To: Christoph Hellwig, Keith Busch
  Cc: Anuj gupta, Anuj Gupta, axboe, martin.petersen, asml.silence,
	brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
	linux-scsi, vishak.g, linux-fsdevel

On 11/6/2024 10:59 AM, Christoph Hellwig wrote:
> On Tue, Nov 05, 2024 at 09:23:19AM -0700, Keith Busch wrote:
>>>> The SQE128 requirement is only for PI type.
>>>> Another different meta type may just fit into the first SQE. For that we
>>>> don't have to mandate SQE128.
>>>
>>> Ok, I'm really confused now.  The way I understood Anuj was that this
>>> is NOT about block level metadata, but about other uses of the big SQE.
>>>
>>> Which version is right?  Or did I just completely misunderstand Anuj?
>>
>> Let's not call this "meta_type". Can we use something that has a less
>> overloaded meaning, like "sqe_extended_capabilities", or "ecap", or
>> something like that.
> 
> So it's just a flag that a 128-byte SQE is used?

No, this flag tells that the user decided to send PI in the SQE. And this
flag is kept in the first half of the SQE (which always exists). It is just
an additional detail/requirement that the PI fields are kept in SQE128
(which is opt in).

>  Don't we know that
> implicitly from the sq?

Yes, we have a separate ring-level flag for that.

#define IORING_SETUP_SQE128             (1U << 10) /* SQEs are 128 byte */

>>>   - a flag that a pointer to metadata is passed.  This can work with
>>>     a 64-byte SQE.
>>>   - another flag that a PI tuple is passed.  This requires a 128-byte
>>>     SQE and also the previous flag.
>>
>> I don't think anything done so far aligns with what Pavel had in mind.
>> Let me try to lay out what I think he's going for. Just bear with me,
>> this is just a hypothetical example.
>>
>>    This patch adds a PI extension.
>>    Later, let's say write streams needs another extension.
>>    Then key per-IO wants another extension.
>>    Then someone else adds wizbang-awesome-feature extension.
>>
>> Let's say you have a device that can do all 4, or any combination of them.
>> Pavel wants a solution that is future proof to such a scenario. So not
>> just a single new "meta_type" with its structure, but a list of types in
>> no particular order, and their structures.
> 
> But why do we need the type at all?  Each of them obviously needs two
> things:
> 
>   1) some space to actually store the extra fields
>   2) a flag that the additional values are passed

Yes, this is exactly how the patch is implemented. 'meta-type' is the 
flag that tells additional values (representing PI info) are passed.

> any single value is not going to help with supporting arbitrary
> combinations,

Not a single value. It is a u16 field, so it can represent 16 possible 
flags.
This part in the patch:

+enum io_uring_sqe_meta_type_bits {
+       META_TYPE_PI_BIT,
+       /* not a real meta type; just to make sure that we don't overflow */
+       META_TYPE_LAST_BIT,
+};
+
+/* meta type flags */
+#define META_TYPE_PI   (1U << META_TYPE_PI_BIT)

For future users, one can add things like META_TYPE_KPIO_BIT or
META_TYPE_WRITE_HINT_BIT if they need to send extra information in SQE.

Note that these users may not require SQE128. It all depends on how much
extra information is required. We still have some free space in the first
SQE.

> because well, you can mix and match, and you need
> space for all of them even if you are not using all of them.

Mix-and-match can be detected with the above flags, and so can the case
where two types don't go well together. For such types we can reuse the
space.
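
A sketch of that validation, under stated assumptions: the PI bit and
the LAST_BIT sentinel follow the enum in the patch, while the KPIO and
WRITE_HINT bits, the claim that those two cannot co-exist, and the
helper name meta_type_validate are invented here purely for
illustration.

```c
#include <errno.h>
#include <stdint.h>

enum io_uring_sqe_meta_type_bits {
	META_TYPE_PI_BIT,
	META_TYPE_KPIO_BIT,		/* hypothetical future user */
	META_TYPE_WRITE_HINT_BIT,	/* hypothetical future user */
	/* not a real meta type; just to make sure that we don't overflow */
	META_TYPE_LAST_BIT,
};
_Static_assert(META_TYPE_LAST_BIT <= 16, "meta_type is a u16");

/* meta type flags */
#define META_TYPE_PI		(1U << META_TYPE_PI_BIT)
#define META_TYPE_KPIO		(1U << META_TYPE_KPIO_BIT)
#define META_TYPE_WRITE_HINT	(1U << META_TYPE_WRITE_HINT_BIT)
#define META_TYPE_ALL		((1U << META_TYPE_LAST_BIT) - 1)

/* Returns 0 if the flag combination is acceptable, -EINVAL otherwise. */
static int meta_type_validate(uint16_t meta_type)
{
	if (meta_type & ~META_TYPE_ALL)
		return -EINVAL;	/* unknown flag */
	/*
	 * Assumed for illustration: KPIO and WRITE_HINT never co-exist,
	 * so they could share the same bytes of the SQE.
	 */
	if ((meta_type & META_TYPE_KPIO) && (meta_type & META_TYPE_WRITE_HINT))
		return -EINVAL;
	return 0;
}
```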




* Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
  2024-11-06  6:00                     ` Kanchan Joshi
@ 2024-11-06  6:12                       ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2024-11-06  6:12 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: Christoph Hellwig, Keith Busch, Anuj gupta, Anuj Gupta, axboe,
	martin.petersen, asml.silence, brauner, jack, viro, io-uring,
	linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
	linux-fsdevel

On Wed, Nov 06, 2024 at 11:30:45AM +0530, Kanchan Joshi wrote:
> >   1) some space to actually store the extra fields
> >   2) a flag that the additional values are passed
> 
> Yes, this is exactly how the patch is implemented. 'meta-type' is the 
> flag that tells additional values (representing PI info) are passed.
> 
> > any single value is not going to help with supporting arbitrary
> > combinations,
> 
> Not a single value. It is a u16 field, so it can represent 16 possible 
> flags.
> This part in the patch:
> 
> +enum io_uring_sqe_meta_type_bits {
> +       META_TYPE_PI_BIT,
> +       /* not a real meta type; just to make sure that we don't overflow */
> +       META_TYPE_LAST_BIT,
> +};

Well, then it's grossly misnamed and underdocumented.  For one the
meta name simply is wrong because it's about all extra features.
Second a type implies an enumeration of types, not a set of flags.

So if you actually name this extended_features or similar and clearly
document it, it might actually make sense.



Thread overview: 26+ messages
     [not found] <CGME20241104141427epcas5p2174ded627e2d785294ac4977b011a75b@epcas5p2.samsung.com>
2024-11-04 14:05 ` [PATCH v7 00/10] Read/Write with meta/integrity Anuj Gupta
     [not found]   ` <CGME20241104141445epcas5p3fa11a5bebe88ac2bb3541850369591f7@epcas5p3.samsung.com>
2024-11-04 14:05     ` [PATCH v7 01/10] block: define set of integrity flags to be inherited by cloned bip Anuj Gupta
     [not found]   ` <CGME20241104141448epcas5p4179505e12f9cf45fd792dc6da6afce8e@epcas5p4.samsung.com>
2024-11-04 14:05     ` [PATCH v7 02/10] block: copy back bounce buffer to user-space correctly in case of split Anuj Gupta
2024-11-05 10:03       ` Christoph Hellwig
2024-11-05 13:15         ` Anuj gupta
     [not found]   ` <CGME20241104141451epcas5p2aef1f93e905c27e34b3e16d89ff39245@epcas5p2.samsung.com>
2024-11-04 14:05     ` [PATCH v7 03/10] block: modify bio_integrity_map_user to accept iov_iter as argument Anuj Gupta
     [not found]   ` <CGME20241104141453epcas5p201e4aabfa7aa1f4af1cdf07228f8d4e7@epcas5p2.samsung.com>
2024-11-04 14:05     ` [PATCH v7 04/10] fs, iov_iter: define meta io descriptor Anuj Gupta
2024-11-05  9:55       ` Christoph Hellwig
     [not found]   ` <CGME20241104141456epcas5p38fef2ccde087de84ffc6f479f50e8071@epcas5p3.samsung.com>
2024-11-04 14:05     ` [PATCH v7 05/10] fs: introduce IOCB_HAS_METADATA for metadata Anuj Gupta
     [not found]   ` <CGME20241104141459epcas5p27991e140158b1e7294b4d6c4e767373c@epcas5p2.samsung.com>
2024-11-04 14:05     ` [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write Anuj Gupta
2024-11-05  9:56       ` Christoph Hellwig
2024-11-05 13:04         ` Anuj gupta
2024-11-05 13:56           ` Christoph Hellwig
2024-11-05 15:51             ` Kanchan Joshi
2024-11-05 16:00               ` Christoph Hellwig
2024-11-05 16:23                 ` Keith Busch
2024-11-05 16:50                   ` Kanchan Joshi
2024-11-06  5:29                   ` Christoph Hellwig
2024-11-06  6:00                     ` Kanchan Joshi
2024-11-06  6:12                       ` Christoph Hellwig
2024-11-05 16:38                 ` Kanchan Joshi
2024-11-06  5:33                   ` Christoph Hellwig
     [not found]   ` <CGME20241104141501epcas5p38203d98ce0b2ac95cc45e02a142e84ef@epcas5p3.samsung.com>
2024-11-04 14:05     ` [PATCH v7 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags Anuj Gupta
     [not found]   ` <CGME20241104141504epcas5p47e46a75f9248a37c9a4180de8e72b54c@epcas5p4.samsung.com>
2024-11-04 14:05     ` [PATCH v7 08/10] nvme: add support for passing on the application tag Anuj Gupta
     [not found]   ` <CGME20241104141507epcas5p161e39cef85f8fa5f5ad59e959e070d0b@epcas5p1.samsung.com>
2024-11-04 14:06     ` [PATCH v7 09/10] scsi: add support for user-meta interface Anuj Gupta
     [not found]   ` <CGME20241104141509epcas5p4ed0c68c42ccad27f9a38dc0c0ef7628d@epcas5p4.samsung.com>
2024-11-04 14:06     ` [PATCH v7 10/10] block: add support to pass user meta buffer Anuj Gupta
