public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH v3 00/10] Read/Write with meta/integrity
       [not found] <CGME20240823104552epcas5p226dbbbd448cd0ee0955ffdd3ad1b112d@epcas5p2.samsung.com>
@ 2024-08-23 10:38 ` Anuj Gupta
       [not found]   ` <CGME20240823104616epcas5p4bd315bd116ea7e32b1abf7e174af64a1@epcas5p4.samsung.com>
                     ` (10 more replies)
  0 siblings, 11 replies; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Anuj Gupta

This adds a new io_uring interface to exchange meta along with read/write.

Interface:
Meta information is represented using a newly introduced 'struct io_uring_meta'.
Application sets up a SQE128 ring, and prepares io_uring_meta within second
SQE. Application populates 'struct io_uring_meta' fields as below:

* meta_type: describes type of meta that is passed. Currently one type
"Integrity" is supported.
* meta_flags: these are meta-type specific flags. Three flags are exposed for
integrity type, namely INTEGRITY_CHK_GUARD/APPTAG/REFTAG.
* meta_len: length of the meta buffer
* meta_addr: address of the meta buffer
* app_tag: optional application-specific 16b value; this goes along with
INTEGRITY_CHK_APPTAG flag.

Block path (direct IO) , NVMe and SCSI driver are modified to support
this.

The first three patches are required to make the user metadata split
work correctly.
Patch 4,5 are prep patches.
Patch 6 adds the io_uring support.
Patch 7 gives us unified interface for user and kernel generated
integrity.
Patch 8 adds the support for block direct IO, patch 9 for NVMe, and
patch 10 for SCSI.

Some of the design choices came from this discussion [2].

Example program on how to use the interface is appended below [3]

Tree:
https://github.com/SamsungDS/linux/tree/feat/pi_us_v3
Testing:
has been done by modifying fio to use this interface.
https://github.sec.samsung.net/DS8-MemoryOpenSource/fio/tree/feat/test-meta-v4

Changes since v2:
https://lore.kernel.org/linux-block/[email protected]/
- io_uring error handling styling (Gabriel)
- add documented helper to get metadata bytes from data iter (hch)
- during clone specify "what flags to clone" rather than
"what not to clone" (hch)
- Move uio_meta defination to bio-integrity.h (hch)
- Rename apptag field to app_tag (hch)
- Change datatype of flags field in uio_meta to bitwise (hch)
- Don't introduce BIP_USER_CHK_FOO flags (hch, martin)
- Driver should rely on block layer flags instead of seeing if it is
user-passthrough (hch)
- update the scsi code for handling user-meta (hch, martin)

Changes since v1:
https://lore.kernel.org/linux-block/[email protected]/
- Do not use new opcode for meta, and also add the provision to introduce new
meta types beyond integrity (Pavel)
- Stuff IOCB_HAS_META check in need_complete_io (Jens)
- Split meta handling in NVMe into a separate handler (Keith)
- Add meta handling for __blkdev_direct_IO too (Keith)
- Don't inherit BIP_COPY_USER flag for cloned bio's (Christoph)
- Better commit descriptions (Christoph)

Changes since RFC:
- modify io_uring plumbing based on recent async handling state changes
- fixes/enhancements to correctly handle the split for meta buffer
- add flags to specify guard/reftag/apptag checks
- add support to send apptag

[1] https://lore.kernel.org/linux-block/[email protected]/

[2] https://lore.kernel.org/linux-block/[email protected]/

[3]

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <linux/io_uring.h>
#include <linux/types.h>
#include "liburing.h"

/* write data/meta. read both. compare. send apptag too.
* prerequisite:
* unprotected xfer: format namespace with 4KB + 8b, pi_type = 0
* protected xfer: format namespace with 4KB + 8b, pi_type = 1
*/

#define DATA_LEN 4096
#define META_LEN 8

struct t10_pi_tuple {
        __be16  guard;
        __be16  apptag;
        __be32  reftag;
};

int main(int argc, char *argv[])
{
         struct io_uring ring;
         struct io_uring_sqe *sqe = NULL;
         struct io_uring_cqe *cqe = NULL;
         void *wdb,*rdb;
         char wmb[META_LEN], rmb[META_LEN];
         char *data_str = "data buffer";
         char *meta_str = "meta";
         int fd, ret, blksize;
         struct stat fstat;
         unsigned long long offset = DATA_LEN;
         struct t10_pi_tuple *pi;
         struct io_uring_meta *md;

         if (argc != 2) {
                 fprintf(stderr, "Usage: %s <block-device>", argv[0]);
                 return 1;
         };

         if (stat(argv[1], &fstat) == 0) {
                 blksize = (int)fstat.st_blksize;
         } else {
                 perror("stat");
                 return 1;
         }

         if (posix_memalign(&wdb, blksize, DATA_LEN)) {
                 perror("posix_memalign failed");
                 return 1;
         }
         if (posix_memalign(&rdb, blksize, DATA_LEN)) {
                 perror("posix_memalign failed");
                 return 1;
         }

         strcpy(wdb, data_str);
         strcpy(wmb, meta_str);

         fd = open(argv[1], O_RDWR | O_DIRECT);
         if (fd < 0) {
                 printf("Error in opening device\n");
                 return 0;
         }

         ret = io_uring_queue_init(8, &ring, IORING_SETUP_SQE128);
         if (ret) {
                 fprintf(stderr, "ring setup failed: %d\n", ret);
                 return 1;
         }

         /* write data + meta-buffer to device */
         sqe = io_uring_get_sqe(&ring);
         if (!sqe) {
                 fprintf(stderr, "get sqe failed\n");
                 return 1;
         }

         io_uring_prep_write(sqe, fd, wdb, DATA_LEN, offset);

         md = (struct io_uring_meta *) sqe->big_sqe_cmd;
         md->meta_type = META_TYPE_INTEGRITY;
         md->meta_addr = (__u64)wmb;
         md->meta_len = META_LEN;
         /* flags to ask for guard/reftag/apptag*/
         md->meta_flags = INTEGRITY_CHK_APPTAG;
         md->app_tag = 0x1234;

         pi = (struct t10_pi_tuple *)wmb;
         pi->apptag = 0x3412;

         ret = io_uring_submit(&ring);
         if (ret <= 0) {
                 fprintf(stderr, "sqe submit failed: %d\n", ret);
                 return 1;
         }

         ret = io_uring_wait_cqe(&ring, &cqe);
         if (!cqe) {
                 fprintf(stderr, "cqe is NULL :%d\n", ret);
                 return 1;
         }
         if (cqe->res < 0) {
                 fprintf(stderr, "write cqe failure: %d", cqe->res);
                 return 1;
         }

         io_uring_cqe_seen(&ring, cqe);

         /* read data + meta-buffer back from device */
         sqe = io_uring_get_sqe(&ring);
         if (!sqe) {
                 fprintf(stderr, "get sqe failed\n");
                 return 1;
         }

         io_uring_prep_read(sqe, fd, rdb, DATA_LEN, offset);

         md = (struct io_uring_meta *) sqe->big_sqe_cmd;
         md->meta_type = META_TYPE_INTEGRITY;
         md->meta_addr = (__u64)rmb;
         md->meta_len = META_LEN;
         md->meta_flags = INTEGRITY_CHK_APPTAG;
         md->app_tag = 0x1234;

         ret = io_uring_submit(&ring);
         if (ret <= 0) {
                 fprintf(stderr, "sqe submit failed: %d\n", ret);
                 return 1;
         }

         ret = io_uring_wait_cqe(&ring, &cqe);
         if (!cqe) {
                 fprintf(stderr, "cqe is NULL :%d\n", ret);
                 return 1;
         }

         if (cqe->res < 0) {
                 fprintf(stderr, "read cqe failure: %d", cqe->res);
                 return 1;
         }
         io_uring_cqe_seen(&ring, cqe);

         if (strncmp(wmb, rmb, META_LEN))
                 printf("Failure: meta mismatch!, wmb=%s, rmb=%s\n", wmb, rmb);

         if (strncmp(wdb, rdb, DATA_LEN))
                 printf("Failure: data mismatch!\n");

         io_uring_queue_exit(&ring);
         free(rdb);
         free(wdb);
         return 0;
}

Anuj Gupta (7):
  block: define set of integrity flags to be inherited by cloned bip
  block: introduce a helper to determine metadata bytes from data iter
  block: handle split correctly for user meta bounce buffer
  block: modify bio_integrity_map_user to accept iov_iter as argument
  io_uring/rw: add support to send meta along with read/write
  block,nvme: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  scsi: add support for user-meta interface

Kanchan Joshi (3):
  block: define meta io descriptor
  block: add support to pass user meta buffer
  nvme: add handling for app_tag

 block/bio-integrity.c         | 71 ++++++++++++++++++++++++++++++-----
 block/fops.c                  | 25 ++++++++++++
 block/t10-pi.c                |  6 +++
 drivers/nvme/host/core.c      | 24 +++++++-----
 drivers/nvme/host/ioctl.c     | 11 +++++-
 drivers/scsi/sd.c             | 25 +++++++++++-
 drivers/scsi/sd_dif.c         |  2 +-
 include/linux/bio-integrity.h | 33 ++++++++++++++--
 include/linux/blk-integrity.h | 17 +++++++++
 include/linux/fs.h            |  1 +
 include/uapi/linux/io_uring.h | 32 ++++++++++++++++
 io_uring/io_uring.c           |  6 +++
 io_uring/rw.c                 | 70 ++++++++++++++++++++++++++++++++--
 io_uring/rw.h                 | 10 ++++-
 14 files changed, 302 insertions(+), 31 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v3 01/10] block: define set of integrity flags to be inherited by cloned bip
       [not found]   ` <CGME20240823104616epcas5p4bd315bd116ea7e32b1abf7e174af64a1@epcas5p4.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  2024-08-24  8:24       ` Christoph Hellwig
  2024-08-29  3:05       ` Martin K. Petersen
  0 siblings, 2 replies; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Anuj Gupta

Introduce BIP_CLONE_FLAGS describing integrity flags that should be
inherited in the cloned bip from the parent.

Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
---
 block/bio-integrity.c         | 2 +-
 include/linux/bio-integrity.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 8d1fb38f745f..0b69f2b003c3 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -567,7 +567,7 @@ int bio_integrity_clone(struct bio *bio, struct bio *bio_src,
 
 	bip->bip_vec = bip_src->bip_vec;
 	bip->bip_iter = bip_src->bip_iter;
-	bip->bip_flags = bip_src->bip_flags & ~BIP_BLOCK_INTEGRITY;
+	bip->bip_flags = bip_src->bip_flags & BIP_CLONE_FLAGS;
 
 	return 0;
 }
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index dd831c269e99..485d8a43017a 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -30,6 +30,9 @@ struct bio_integrity_payload {
 	struct bio_vec		bip_inline_vecs[];/* embedded bvec array */
 };
 
+#define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
+			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM)
+
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 
 #define bip_for_each_vec(bvl, bip, iter)				\
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 02/10] block: introduce a helper to determine metadata bytes from data iter
       [not found]   ` <CGME20240823104618epcas5p4b9983678886dceed75edd9cbec9341b2@epcas5p4.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  2024-08-24  8:24       ` Christoph Hellwig
  2024-08-29  3:06       ` Martin K. Petersen
  0 siblings, 2 replies; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Anuj Gupta

Introduce a new helper bio_iter_integrity_bytes to determine the number
of metadata bytes corresponding to data iter.

Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
---
 include/linux/blk-integrity.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index de98049b7ded..2ff65c933c50 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -76,6 +76,15 @@ static inline unsigned int bio_integrity_bytes(struct blk_integrity *bi,
 	return bio_integrity_intervals(bi, sectors) * bi->tuple_size;
 }
 
+/*
+ * Return the integrity bytes corresponding to data iter
+ */
+static inline unsigned int bio_iter_integrity_bytes(struct blk_integrity *bi,
+						    struct bvec_iter iter)
+{
+	return bio_integrity_bytes(bi, bvec_iter_sectors(iter));
+}
+
 static inline bool blk_integrity_rq(struct request *rq)
 {
 	return rq->cmd_flags & REQ_INTEGRITY;
@@ -132,6 +141,13 @@ static inline unsigned int bio_integrity_bytes(struct blk_integrity *bi,
 {
 	return 0;
 }
+
+static inline unsigned int bio_iter_integrity_bytes(struct blk_integrity *bi,
+						    struct bvec_iter iter)
+{
+	return 0;
+}
+
 static inline int blk_integrity_rq(struct request *rq)
 {
 	return 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 03/10] block: handle split correctly for user meta bounce buffer
       [not found]   ` <CGME20240823104620epcas5p2118c152963d6cadfbc9968790ac0e536@epcas5p2.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  2024-08-24  8:31       ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Anuj Gupta

Copy back the bounce buffer to user-space in entirety when the parent
bio completes.

Signed-off-by: Anuj Gupta <[email protected]>
---
 block/bio-integrity.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 0b69f2b003c3..d8b810a2b4bf 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -118,9 +118,11 @@ static void bio_integrity_unpin_bvec(struct bio_vec *bv, int nr_vecs,
 
 static void bio_integrity_uncopy_user(struct bio_integrity_payload *bip)
 {
+	struct bio *bio = bip->bip_bio;
+	struct blk_integrity *bi = blk_get_integrity(bio->bi_bdev->bd_disk);
 	unsigned short nr_vecs = bip->bip_max_vcnt - 1;
 	struct bio_vec *copy = &bip->bip_vec[1];
-	size_t bytes = bip->bip_iter.bi_size;
+	size_t bytes = bio_iter_integrity_bytes(bi, bip->bio_iter);
 	struct iov_iter iter;
 	int ret;
 
@@ -253,6 +255,7 @@ static int bio_integrity_copy_user(struct bio *bio, struct bio_vec *bvec,
 	bip->bip_flags |= BIP_COPY_USER;
 	bip->bip_iter.bi_sector = seed;
 	bip->bip_vcnt = nr_vecs;
+	bip->bio_iter = bio->bi_iter;
 	return 0;
 free_bip:
 	bio_integrity_free(bio);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 04/10] block: modify bio_integrity_map_user to accept iov_iter as argument
       [not found]   ` <CGME20240823104622epcas5p2e3b29f793eff9857c5712b3d6d327ed5@epcas5p2.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  0 siblings, 0 replies; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Anuj Gupta, Kanchan Joshi

This patch refactors bio_integrity_map_user to accept iov_iter as
argument. This is a prep patch.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
 block/bio-integrity.c         | 12 +++++-------
 drivers/nvme/host/ioctl.c     | 11 +++++++++--
 include/linux/bio-integrity.h |  6 +++---
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index d8b810a2b4bf..aaf67eb427ab 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -310,17 +310,16 @@ static unsigned int bvec_from_pages(struct bio_vec *bvec, struct page **pages,
 	return nr_bvecs;
 }
 
-int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes,
+int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter,
 			   u32 seed)
 {
 	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
 	unsigned int align = blk_lim_dma_alignment_and_pad(&q->limits);
 	struct page *stack_pages[UIO_FASTIOV], **pages = stack_pages;
 	struct bio_vec stack_vec[UIO_FASTIOV], *bvec = stack_vec;
+	size_t offset, bytes = iter->count;
 	unsigned int direction, nr_bvecs;
-	struct iov_iter iter;
 	int ret, nr_vecs;
-	size_t offset;
 	bool copy;
 
 	if (bio_integrity(bio))
@@ -333,8 +332,7 @@ int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes,
 	else
 		direction = ITER_SOURCE;
 
-	iov_iter_ubuf(&iter, direction, ubuf, bytes);
-	nr_vecs = iov_iter_npages(&iter, BIO_MAX_VECS + 1);
+	nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS + 1);
 	if (nr_vecs > BIO_MAX_VECS)
 		return -E2BIG;
 	if (nr_vecs > UIO_FASTIOV) {
@@ -344,8 +342,8 @@ int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes,
 		pages = NULL;
 	}
 
-	copy = !iov_iter_is_aligned(&iter, align, align);
-	ret = iov_iter_extract_pages(&iter, &pages, bytes, nr_vecs, 0, &offset);
+	copy = !iov_iter_is_aligned(iter, align, align);
+	ret = iov_iter_extract_pages(iter, &pages, bytes, nr_vecs, 0, &offset);
 	if (unlikely(ret < 0))
 		goto free_bvec;
 
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index f1d58e70933f..d8628e7071cb 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -146,8 +146,15 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer,
 	if (bdev) {
 		bio_set_dev(bio, bdev);
 		if (meta_buffer && meta_len) {
-			ret = bio_integrity_map_user(bio, meta_buffer, meta_len,
-						     meta_seed);
+			struct iov_iter iter;
+			unsigned int direction;
+
+			if (bio_data_dir(bio) == READ)
+				direction = ITER_DEST;
+			else
+				direction = ITER_SOURCE;
+			iov_iter_ubuf(&iter, direction, meta_buffer, meta_len);
+			ret = bio_integrity_map_user(bio, &iter, meta_seed);
 			if (ret)
 				goto out_unmap;
 			req->cmd_flags |= REQ_INTEGRITY;
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index 485d8a43017a..5313811dc1ce 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -75,7 +75,7 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio, gfp_t gfp,
 		unsigned int nr);
 int bio_integrity_add_page(struct bio *bio, struct page *page, unsigned int len,
 		unsigned int offset);
-int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t len, u32 seed);
+int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter, u32 seed);
 void bio_integrity_unmap_user(struct bio *bio);
 bool bio_integrity_prep(struct bio *bio);
 void bio_integrity_advance(struct bio *bio, unsigned int bytes_done);
@@ -101,8 +101,8 @@ static inline void bioset_integrity_free(struct bio_set *bs)
 {
 }
 
-static inline int bio_integrity_map_user(struct bio *bio, void __user *ubuf,
-					 ssize_t len, u32 seed)
+static inline int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter,
+					u32 seed)
 {
 	return -EINVAL;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 05/10] block: define meta io descriptor
       [not found]   ` <CGME20240823104624epcas5p40c1b0f3516100f69cbd31d45867cd289@epcas5p4.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  2024-08-24  8:31       ` Christoph Hellwig
  2024-08-29  3:05       ` Martin K. Petersen
  0 siblings, 2 replies; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi

From: Kanchan Joshi <[email protected]>

Introduces a new 'uio_meta' structure that upper layer can
use to pass the meta/integrity information.

Signed-off-by: Kanchan Joshi <[email protected]>
---
 include/linux/bio-integrity.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index 5313811dc1ce..a1a9031a5985 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -30,6 +30,15 @@ struct bio_integrity_payload {
 	struct bio_vec		bip_inline_vecs[];/* embedded bvec array */
 };
 
+/* flags for integrity meta */
+typedef __u16 __bitwise meta_flags_t;
+
+struct uio_meta {
+	meta_flags_t	flags;
+	u16		app_tag;
+	struct		iov_iter iter;
+};
+
 #define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
 			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM)
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 06/10] io_uring/rw: add support to send meta along with read/write
       [not found]   ` <CGME20240823104627epcas5p2abcd2283f6fb3301e1a8e828e3c270ae@epcas5p2.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  2024-08-24  8:33       ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Anuj Gupta, Kanchan Joshi

This patch adds the capability of sending meta along with read/write.
This meta is represented by a newly introduced 'struct io_uring_meta'
which specifies information such as meta type/flags/buffer/length and
apptag.
Application sets up a SQE128 ring, prepares io_uring_meta within the
second SQE.
The patch processes the user-passed information to prepare uio_meta
descriptor and passes it down using kiocb->private.

Meta exchange is supported only for direct IO.
Also vectored read/write operations with meta are not supported
currently.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 include/linux/fs.h            |  1 +
 include/uapi/linux/io_uring.h | 32 ++++++++++++++++
 io_uring/io_uring.c           |  6 +++
 io_uring/rw.c                 | 70 +++++++++++++++++++++++++++++++++--
 io_uring/rw.h                 | 10 ++++-
 5 files changed, 115 insertions(+), 4 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index fb0426f349fc..aec78bf3040c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -330,6 +330,7 @@ struct readahead_control;
 #define IOCB_NOIO		(1 << 20)
 /* can use bio alloc cache */
 #define IOCB_ALLOC_CACHE	(1 << 21)
+#define IOCB_HAS_META		(1 << 22)
 /*
  * IOCB_DIO_CALLER_COMP can be set by the iocb owner, to indicate that the
  * iocb completion can be passed back to the owner for execution from a safe
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 042eab793e26..09e6cc022669 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -105,8 +105,40 @@ struct io_uring_sqe {
 		 */
 		__u8	cmd[0];
 	};
+	/*
+	 * If the ring is initialized with IORING_SETUP_SQE128, then
+	 * this field is starting offset for 64 bytes of data. For meta io
+	 * this contains 'struct io_uring_meta'
+	 */
+	__u8	big_sqe_cmd[0];
 };
 
+enum io_uring_sqe_meta_type_bits {
+	META_TYPE_INTEGRITY_BIT,
+	/* not a real meta type; just to make sure that we don't overflow */
+	META_TYPE_LAST_BIT,
+};
+
+/* meta type flags */
+#define META_TYPE_INTEGRITY	(1U << META_TYPE_INTEGRITY_BIT)
+
+/* this goes to SQE128 */
+struct io_uring_meta {
+	__u16		meta_type;
+	__u16		meta_flags;
+	__u32		meta_len;
+	__u64		meta_addr;
+	__u16		app_tag;
+	__u8		pad[46];
+};
+
+/*
+ * flags for integrity meta
+ */
+#define INTEGRITY_CHK_GUARD	(1U << 0) /* enforce guard check */
+#define INTEGRITY_CHK_APPTAG	(1U << 1) /* enforce app tag check */
+#define INTEGRITY_CHK_REFTAG	(1U << 2) /* enforce ref tag check */
+
 /*
  * If sqe->file_index is set to this for opcodes that instantiate a new
  * direct descriptor (like openat/openat2/accept), then io_uring will allocate
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index a53f2f25a80b..743201d37611 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3814,6 +3814,12 @@ static int __init io_uring_init(void)
 	/* top 8bits are for internal use */
 	BUILD_BUG_ON((IORING_URING_CMD_MASK & 0xff000000) != 0);
 
+	BUILD_BUG_ON(sizeof(struct io_uring_meta) >
+		     sizeof(struct io_uring_sqe));
+
+	BUILD_BUG_ON(META_TYPE_LAST_BIT >
+		     8 * sizeof_field(struct io_uring_meta, meta_type));
+
 	io_uring_optable_init();
 
 	/*
diff --git a/io_uring/rw.c b/io_uring/rw.c
index c004d21e2f12..fadc17813f76 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -23,6 +23,8 @@
 #include "poll.h"
 #include "rw.h"
 
+#define	INTEGRITY_VALID_FLAGS (INTEGRITY_CHK_GUARD | INTEGRITY_CHK_APPTAG | \
+			       INTEGRITY_CHK_REFTAG)
 struct io_rw {
 	/* NOTE: kiocb has the file as the first member, so don't do it here */
 	struct kiocb			kiocb;
@@ -247,6 +249,42 @@ static int io_prep_rw_setup(struct io_kiocb *req, int ddir, bool do_import)
 	return 0;
 }
 
+static int io_prep_rw_meta(struct io_kiocb *req, const struct io_uring_sqe *sqe,
+			   struct io_rw *rw, int ddir)
+{
+	const struct io_uring_meta *md = (struct io_uring_meta *)sqe->big_sqe_cmd;
+	u16 meta_type = READ_ONCE(md->meta_type);
+	const struct io_issue_def *def;
+	struct io_async_rw *io;
+	int ret;
+
+	if (!meta_type)
+		return 0;
+	if (!(meta_type & META_TYPE_INTEGRITY))
+		return -EINVAL;
+
+	/* should fit into two bytes */
+	BUILD_BUG_ON(INTEGRITY_VALID_FLAGS >= (1 << 16));
+
+	def = &io_issue_defs[req->opcode];
+	if (def->vectored)
+		return -EOPNOTSUPP;
+
+	io = req->async_data;
+	io->meta.flags = READ_ONCE(md->meta_flags);
+	if (io->meta.flags && (io->meta.flags & ~INTEGRITY_VALID_FLAGS))
+		return -EINVAL;
+
+	io->meta.app_tag = READ_ONCE(md->app_tag);
+	ret = import_ubuf(ddir, u64_to_user_ptr(READ_ONCE(md->meta_addr)),
+			  READ_ONCE(md->meta_len), &io->meta.iter);
+	if (unlikely(ret < 0))
+		return ret;
+	rw->kiocb.ki_flags |= IOCB_HAS_META;
+	iov_iter_save_state(&io->meta.iter, &io->iter_meta_state);
+	return ret;
+}
+
 static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		      int ddir, bool do_import)
 {
@@ -269,11 +307,18 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		rw->kiocb.ki_ioprio = get_current_ioprio();
 	}
 	rw->kiocb.dio_complete = NULL;
+	rw->kiocb.ki_flags = 0;
 
 	rw->addr = READ_ONCE(sqe->addr);
 	rw->len = READ_ONCE(sqe->len);
 	rw->flags = READ_ONCE(sqe->rw_flags);
-	return io_prep_rw_setup(req, ddir, do_import);
+	ret = io_prep_rw_setup(req, ddir, do_import);
+
+	if (unlikely(ret))
+		return ret;
+	if (unlikely(req->ctx->flags & IORING_SETUP_SQE128))
+		ret = io_prep_rw_meta(req, sqe, rw, ddir);
+	return ret;
 }
 
 int io_prep_read(struct io_kiocb *req, const struct io_uring_sqe *sqe)
@@ -400,7 +445,10 @@ static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req)
 static void io_resubmit_prep(struct io_kiocb *req)
 {
 	struct io_async_rw *io = req->async_data;
+	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
 
+	if (unlikely(rw->kiocb.ki_flags & IOCB_HAS_META))
+		iov_iter_restore(&io->meta.iter, &io->iter_meta_state);
 	iov_iter_restore(&io->iter, &io->iter_state);
 }
 
@@ -768,8 +816,12 @@ static inline int io_iter_do_read(struct io_rw *rw, struct iov_iter *iter)
 
 static bool need_complete_io(struct io_kiocb *req)
 {
+	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
+
+	/* Exclude meta IO as we don't support partial completion for that */
 	return req->flags & REQ_F_ISREG ||
-		S_ISBLK(file_inode(req->file)->i_mode);
+		S_ISBLK(file_inode(req->file)->i_mode) ||
+		!(rw->kiocb.ki_flags & IOCB_HAS_META);
 }
 
 static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type)
@@ -786,7 +838,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type)
 	if (!(req->flags & REQ_F_FIXED_FILE))
 		req->flags |= io_file_get_flags(file);
 
-	kiocb->ki_flags = file->f_iocb_flags;
+	kiocb->ki_flags |= file->f_iocb_flags;
 	ret = kiocb_set_rw_flags(kiocb, rw->flags, rw_type);
 	if (unlikely(ret))
 		return ret;
@@ -815,6 +867,14 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type)
 		kiocb->ki_complete = io_complete_rw;
 	}
 
+	if (unlikely(kiocb->ki_flags & IOCB_HAS_META)) {
+		struct io_async_rw *io = req->async_data;
+
+		if (!(req->file->f_flags & O_DIRECT))
+			return -EOPNOTSUPP;
+		kiocb->private = &io->meta;
+	}
+
 	return 0;
 }
 
@@ -881,6 +941,8 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
 	 * manually if we need to.
 	 */
 	iov_iter_restore(&io->iter, &io->iter_state);
+	if (unlikely(kiocb->ki_flags & IOCB_HAS_META))
+		iov_iter_restore(&io->meta.iter, &io->iter_meta_state);
 
 	do {
 		/*
@@ -1091,6 +1153,8 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	} else {
 ret_eagain:
 		iov_iter_restore(&io->iter, &io->iter_state);
+		if (unlikely(kiocb->ki_flags & IOCB_HAS_META))
+			iov_iter_restore(&io->meta.iter, &io->iter_meta_state);
 		if (kiocb->ki_flags & IOCB_WRITE)
 			io_req_end_write(req);
 		return -EAGAIN;
diff --git a/io_uring/rw.h b/io_uring/rw.h
index 3f432dc75441..ce7a865fac95 100644
--- a/io_uring/rw.h
+++ b/io_uring/rw.h
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
 #include <linux/pagemap.h>
+#include <linux/bio-integrity.h>
 
 struct io_async_rw {
 	size_t				bytes_done;
@@ -9,7 +10,14 @@ struct io_async_rw {
 	struct iovec			fast_iov;
 	struct iovec			*free_iovec;
 	int				free_iov_nr;
-	struct wait_page_queue		wpq;
+	/* wpq is for buffered io, while meta fields are used with direct io*/
+	union {
+		struct wait_page_queue		wpq;
+		struct {
+			struct uio_meta			meta;
+			struct iov_iter_state		iter_meta_state;
+		};
+	};
 };
 
 int io_prep_read_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
       [not found]   ` <CGME20240823104629epcas5p3fea0cb7e66b0446ddacf7648c08c3ba8@epcas5p3.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  2024-08-24  8:35       ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Anuj Gupta, Kanchan Joshi

This patch introduces BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags which
indicate how the hardware should check the integrity payload. The
driver can now just rely on block layer flags, and doesn't need to
know the integrity source. Submitter of PI decides which tags to check.
This would also give us a unified interface for user and kernel
generated integrity.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 block/bio-integrity.c         |  5 +++++
 drivers/nvme/host/core.c      | 12 +++---------
 include/linux/bio-integrity.h |  6 +++++-
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index aaf67eb427ab..7fbf8c307a36 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -444,6 +444,11 @@ bool bio_integrity_prep(struct bio *bio)
 	if (bi->csum_type == BLK_INTEGRITY_CSUM_IP)
 		bip->bip_flags |= BIP_IP_CHECKSUM;
 
+	/* describe what tags to check in payload */
+	if (bi->csum_type)
+		bip->bip_flags |= BIP_CHECK_GUARD;
+	if (bi->flags & BLK_INTEGRITY_REF_TAG)
+		bip->bip_flags |= BIP_CHECK_REFTAG;
 	if (bio_integrity_add_page(bio, virt_to_page(buf), len,
 			offset_in_page(buf)) < len) {
 		printk(KERN_ERR "could not attach integrity payload\n");
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 33fa01c599ad..d4c366df8f12 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1002,19 +1002,13 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 				return BLK_STS_NOTSUPP;
 			control |= NVME_RW_PRINFO_PRACT;
 		}
-
-		switch (ns->head->pi_type) {
-		case NVME_NS_DPS_PI_TYPE3:
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_GUARD))
 			control |= NVME_RW_PRINFO_PRCHK_GUARD;
-			break;
-		case NVME_NS_DPS_PI_TYPE1:
-		case NVME_NS_DPS_PI_TYPE2:
-			control |= NVME_RW_PRINFO_PRCHK_GUARD |
-					NVME_RW_PRINFO_PRCHK_REF;
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_REFTAG)) {
+			control |= NVME_RW_PRINFO_PRCHK_REF;
 			if (op == nvme_cmd_zone_append)
 				control |= NVME_RW_APPEND_PIREMAP;
 			nvme_set_ref_tag(ns, cmnd, req);
-			break;
 		}
 	}
 
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index a1a9031a5985..c7c0121689e1 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -11,6 +11,9 @@ enum bip_flags {
 	BIP_DISK_NOCHECK	= 1 << 3, /* disable disk integrity checking */
 	BIP_IP_CHECKSUM		= 1 << 4, /* IP checksum */
 	BIP_COPY_USER		= 1 << 5, /* Kernel bounce buffer in use */
+	BIP_CHECK_GUARD		= 1 << 6,
+	BIP_CHECK_REFTAG	= 1 << 7,
+	BIP_CHECK_APPTAG	= 1 << 8,
 };
 
 struct bio_integrity_payload {
@@ -40,7 +43,8 @@ struct uio_meta {
 };
 
 #define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
-			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM)
+			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM | \
+			 BIP_CHECK_GUARD | BIP_CHECK_REFTAG | BIP_CHECK_APPTAG)
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 07/10] block,nvme: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
       [not found]   ` <CGME20240823104631epcas5p4f83b92081107fbefca78008ee319ff7e@epcas5p4.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  0 siblings, 0 replies; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Anuj Gupta, Kanchan Joshi

This patch introduces BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags which
indicate how the hardware should check the integrity payload. The
driver can now just rely on block layer flags, and doesn't need to
know the integrity source. Submitter of PI decides which tags to check.
This would also give us a unified interface for user and kernel
generated integrity.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 block/bio-integrity.c         |  5 +++++
 drivers/nvme/host/core.c      | 12 +++---------
 include/linux/bio-integrity.h |  6 +++++-
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index aaf67eb427ab..7fbf8c307a36 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -444,6 +444,11 @@ bool bio_integrity_prep(struct bio *bio)
 	if (bi->csum_type == BLK_INTEGRITY_CSUM_IP)
 		bip->bip_flags |= BIP_IP_CHECKSUM;
 
+	/* describe what tags to check in payload */
+	if (bi->csum_type)
+		bip->bip_flags |= BIP_CHECK_GUARD;
+	if (bi->flags & BLK_INTEGRITY_REF_TAG)
+		bip->bip_flags |= BIP_CHECK_REFTAG;
 	if (bio_integrity_add_page(bio, virt_to_page(buf), len,
 			offset_in_page(buf)) < len) {
 		printk(KERN_ERR "could not attach integrity payload\n");
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 33fa01c599ad..d4c366df8f12 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1002,19 +1002,13 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 				return BLK_STS_NOTSUPP;
 			control |= NVME_RW_PRINFO_PRACT;
 		}
-
-		switch (ns->head->pi_type) {
-		case NVME_NS_DPS_PI_TYPE3:
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_GUARD))
 			control |= NVME_RW_PRINFO_PRCHK_GUARD;
-			break;
-		case NVME_NS_DPS_PI_TYPE1:
-		case NVME_NS_DPS_PI_TYPE2:
-			control |= NVME_RW_PRINFO_PRCHK_GUARD |
-					NVME_RW_PRINFO_PRCHK_REF;
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_REFTAG)) {
+			control |= NVME_RW_PRINFO_PRCHK_REF;
 			if (op == nvme_cmd_zone_append)
 				control |= NVME_RW_APPEND_PIREMAP;
 			nvme_set_ref_tag(ns, cmnd, req);
-			break;
 		}
 	}
 
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index a1a9031a5985..c7c0121689e1 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -11,6 +11,9 @@ enum bip_flags {
 	BIP_DISK_NOCHECK	= 1 << 3, /* disable disk integrity checking */
 	BIP_IP_CHECKSUM		= 1 << 4, /* IP checksum */
 	BIP_COPY_USER		= 1 << 5, /* Kernel bounce buffer in use */
+	BIP_CHECK_GUARD		= 1 << 6,
+	BIP_CHECK_REFTAG	= 1 << 7,
+	BIP_CHECK_APPTAG	= 1 << 8,
 };
 
 struct bio_integrity_payload {
@@ -40,7 +43,8 @@ struct uio_meta {
 };
 
 #define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
-			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM)
+			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM | \
+			 BIP_CHECK_GUARD | BIP_CHECK_REFTAG | BIP_CHECK_APPTAG)
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 08/10] block: add support to pass user meta buffer
       [not found]   ` <CGME20240823104634epcas5p4ef1af26cc7146b4e8b7a4a1844ffe476@epcas5p4.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  2024-08-24  8:44       ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi, Anuj Gupta

From: Kanchan Joshi <[email protected]>

If iocb contains the meta, extract that and prepare the bip.
Based on flags specified by the user, set corresponding guard/app/ref
tags to be checked in bip. Introduce BIP_INTEGRITY_USER flag to
indicate integrity payload is user address. Make sure that
->prepare_fn and ->complete_fn are skipped for user-owned meta buffer.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 block/bio-integrity.c         | 45 ++++++++++++++++++++++++++++++++++-
 block/fops.c                  | 25 +++++++++++++++++++
 block/t10-pi.c                |  6 +++++
 include/linux/bio-integrity.h | 13 +++++++++-
 4 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 7fbf8c307a36..02b766c2e57d 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -12,6 +12,7 @@
 #include <linux/bio.h>
 #include <linux/workqueue.h>
 #include <linux/slab.h>
+#include <uapi/linux/io_uring.h>
 #include "blk.h"
 
 static struct kmem_cache *bip_slab;
@@ -252,7 +253,7 @@ static int bio_integrity_copy_user(struct bio *bio, struct bio_vec *bvec,
 		goto free_bip;
 	}
 
-	bip->bip_flags |= BIP_COPY_USER;
+	bip->bip_flags |= BIP_INTEGRITY_USER | BIP_COPY_USER;
 	bip->bip_iter.bi_sector = seed;
 	bip->bip_vcnt = nr_vecs;
 	bip->bio_iter = bio->bi_iter;
@@ -274,6 +275,7 @@ static int bio_integrity_init_user(struct bio *bio, struct bio_vec *bvec,
 		return PTR_ERR(bip);
 
 	memcpy(bip->bip_vec, bvec, nr_vecs * sizeof(*bvec));
+	bip->bip_flags |= BIP_INTEGRITY_USER;
 	bip->bip_iter.bi_sector = seed;
 	bip->bip_iter.bi_size = len;
 	bip->bip_vcnt = nr_vecs;
@@ -310,6 +312,47 @@ static unsigned int bvec_from_pages(struct bio_vec *bvec, struct page **pages,
 	return nr_bvecs;
 }
 
+static void bio_uio_meta_to_bip(struct bio *bio, struct uio_meta *meta)
+{
+	struct bio_integrity_payload *bip = bio_integrity(bio);
+
+	if (meta->flags & INTEGRITY_CHK_GUARD)
+		bip->bip_flags |= BIP_CHECK_GUARD;
+	if (meta->flags & INTEGRITY_CHK_APPTAG)
+		bip->bip_flags |= BIP_CHECK_APPTAG;
+	if (meta->flags & INTEGRITY_CHK_REFTAG)
+		bip->bip_flags |= BIP_CHECK_REFTAG;
+
+	bip->app_tag = meta->app_tag;
+}
+
+int bio_integrity_map_iter(struct bio *bio, struct uio_meta *meta)
+{
+	struct blk_integrity *bi = blk_get_integrity(bio->bi_bdev->bd_disk);
+	unsigned int integrity_bytes;
+	int ret;
+	struct iov_iter it;
+
+	if (!bi)
+		return -EINVAL;
+	/*
+	 * original meta iterator can be bigger.
+	 * process integrity info corresponding to current data buffer only.
+	 */
+	it = meta->iter;
+	integrity_bytes = bio_integrity_bytes(bi, bio_sectors(bio));
+	if (it.count < integrity_bytes)
+		return -EINVAL;
+
+	it.count = integrity_bytes;
+	ret = bio_integrity_map_user(bio, &it, 0);
+	if (!ret) {
+		bio_uio_meta_to_bip(bio, meta);
+		iov_iter_advance(&meta->iter, integrity_bytes);
+	}
+	return ret;
+}
+
 int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter,
 			   u32 seed)
 {
diff --git a/block/fops.c b/block/fops.c
index 9825c1713a49..5c18676c17ab 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -154,6 +154,9 @@ static void blkdev_bio_end_io(struct bio *bio)
 		}
 	}
 
+	if (bio_integrity(bio) && (dio->iocb->ki_flags & IOCB_HAS_META))
+		bio_integrity_unmap_user(bio);
+
 	if (should_dirty) {
 		bio_check_pages_dirty(bio);
 	} else {
@@ -231,6 +234,16 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 			}
 			bio->bi_opf |= REQ_NOWAIT;
 		}
+		if (!is_sync && unlikely(iocb->ki_flags & IOCB_HAS_META)) {
+			ret = bio_integrity_map_iter(bio, iocb->private);
+			if (unlikely(ret)) {
+				bio_release_pages(bio, false);
+				bio_clear_flag(bio, BIO_REFFED);
+				bio_put(bio);
+				blk_finish_plug(&plug);
+				return ret;
+			}
+		}
 
 		if (is_read) {
 			if (dio->flags & DIO_SHOULD_DIRTY)
@@ -288,6 +301,9 @@ static void blkdev_bio_end_io_async(struct bio *bio)
 		ret = blk_status_to_errno(bio->bi_status);
 	}
 
+	if (bio_integrity(bio) && (iocb->ki_flags & IOCB_HAS_META))
+		bio_integrity_unmap_user(bio);
+
 	iocb->ki_complete(iocb, ret);
 
 	if (dio->flags & DIO_SHOULD_DIRTY) {
@@ -348,6 +364,15 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
 		task_io_account_write(bio->bi_iter.bi_size);
 	}
 
+	if (unlikely(iocb->ki_flags & IOCB_HAS_META)) {
+		ret = bio_integrity_map_iter(bio, iocb->private);
+		WRITE_ONCE(iocb->private, NULL);
+		if (unlikely(ret)) {
+			bio_put(bio);
+			return ret;
+		}
+	}
+
 	if (iocb->ki_flags & IOCB_ATOMIC)
 		bio->bi_opf |= REQ_ATOMIC;
 
diff --git a/block/t10-pi.c b/block/t10-pi.c
index e7052a728966..cb7bc4a88380 100644
--- a/block/t10-pi.c
+++ b/block/t10-pi.c
@@ -139,6 +139,8 @@ static void t10_pi_type1_prepare(struct request *rq)
 		/* Already remapped? */
 		if (bip->bip_flags & BIP_MAPPED_INTEGRITY)
 			break;
+		if (bip->bip_flags & BIP_INTEGRITY_USER)
+			break;
 
 		bip_for_each_vec(iv, bip, iter) {
 			unsigned int j;
@@ -188,6 +190,8 @@ static void t10_pi_type1_complete(struct request *rq, unsigned int nr_bytes)
 		struct bio_vec iv;
 		struct bvec_iter iter;
 
+		if (bip->bip_flags & BIP_INTEGRITY_USER)
+			break;
 		bip_for_each_vec(iv, bip, iter) {
 			unsigned int j;
 			void *p;
@@ -313,6 +317,8 @@ static void ext_pi_type1_prepare(struct request *rq)
 		/* Already remapped? */
 		if (bip->bip_flags & BIP_MAPPED_INTEGRITY)
 			break;
+		if (bip->bip_flags & BIP_INTEGRITY_USER)
+			break;
 
 		bip_for_each_vec(iv, bip, iter) {
 			unsigned int j;
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index c7c0121689e1..22ff2ae16444 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -14,6 +14,9 @@ enum bip_flags {
 	BIP_CHECK_GUARD		= 1 << 6,
 	BIP_CHECK_REFTAG	= 1 << 7,
 	BIP_CHECK_APPTAG	= 1 << 8,
+	BIP_INTEGRITY_USER      = 1 << 9, /* Integrity payload is user
+					   * address
+					   */
 };
 
 struct bio_integrity_payload {
@@ -24,6 +27,7 @@ struct bio_integrity_payload {
 	unsigned short		bip_vcnt;	/* # of integrity bio_vecs */
 	unsigned short		bip_max_vcnt;	/* integrity bio_vec slots */
 	unsigned short		bip_flags;	/* control flags */
+	u16			app_tag;
 
 	struct bvec_iter	bio_iter;	/* for rewinding parent bio */
 
@@ -44,7 +48,8 @@ struct uio_meta {
 
 #define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
 			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM | \
-			 BIP_CHECK_GUARD | BIP_CHECK_REFTAG | BIP_CHECK_APPTAG)
+			 BIP_CHECK_GUARD | BIP_CHECK_REFTAG | \
+			 BIP_CHECK_APPTAG | BIP_INTEGRITY_USER)
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 
@@ -89,6 +94,7 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio, gfp_t gfp,
 int bio_integrity_add_page(struct bio *bio, struct page *page, unsigned int len,
 		unsigned int offset);
 int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter, u32 seed);
+int bio_integrity_map_iter(struct bio *bio, struct uio_meta *meta);
 void bio_integrity_unmap_user(struct bio *bio);
 bool bio_integrity_prep(struct bio *bio);
 void bio_integrity_advance(struct bio *bio, unsigned int bytes_done);
@@ -120,6 +126,11 @@ static inline int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter,
 	return -EINVAL;
 }
 
+static inline int bio_integrity_map_iter(struct bio *bio, struct uio_meta *meta)
+{
+	return -EINVAL;
+}
+
 static inline void bio_integrity_unmap_user(struct bio *bio)
 {
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 09/10] nvme: add handling for app_tag
       [not found]   ` <CGME20240823104636epcas5p4825a6d2dd9e45cfbcc97895264662d30@epcas5p4.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  2024-08-24  8:49       ` Christoph Hellwig
  2024-08-29  3:00       ` Martin K. Petersen
  0 siblings, 2 replies; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi, Anuj Gupta

From: Kanchan Joshi <[email protected]>

With user integrity buffer, there is a way to specify the app_tag.
Set the corresponding protocol specific flags and send the app_tag down.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 drivers/nvme/host/core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index d4c366df8f12..af6940ec6e3c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -871,6 +871,13 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 	return BLK_STS_OK;
 }
 
+static void nvme_set_app_tag(struct nvme_command *cmnd, u16 apptag)
+{
+	cmnd->rw.apptag = cpu_to_le16(apptag);
+	/* use 0xfff as mask so that apptag is used in entirety */
+	cmnd->rw.appmask = cpu_to_le16(0xffff);
+}
+
 static void nvme_set_ref_tag(struct nvme_ns *ns, struct nvme_command *cmnd,
 			      struct request *req)
 {
@@ -1010,6 +1017,11 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 				control |= NVME_RW_APPEND_PIREMAP;
 			nvme_set_ref_tag(ns, cmnd, req);
 		}
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_APPTAG)) {
+			control |= NVME_RW_PRINFO_PRCHK_APP;
+			nvme_set_app_tag(cmnd,
+					 bio_integrity(req->bio)->app_tag);
+		}
 	}
 
 	cmnd->rw.control = cpu_to_le16(control);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 10/10] scsi: add support for user-meta interface
       [not found]   ` <CGME20240823104639epcas5p11dbab393122841419368a86b4bd5c04b@epcas5p1.samsung.com>
@ 2024-08-23 10:38     ` Anuj Gupta
  2024-08-24  8:52       ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Anuj Gupta @ 2024-08-23 10:38 UTC (permalink / raw)
  To: axboe, hch, kbusch, martin.petersen, asml.silence, krisman
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Anuj Gupta

Add support for sending user-meta buffer. Set tags to be checked
using flags specified by user/block-layer user and underlying DIF/DIX
configuration. Introduce BLK_INTEGRITY_APP_TAG to specify apptag.
This provides a way for upper layers to specify apptag checking.

Signed-off-by: Anuj Gupta <[email protected]>
---
 block/bio-integrity.c         |  2 ++
 drivers/scsi/sd.c             | 25 +++++++++++++++++++++++--
 drivers/scsi/sd_dif.c         |  2 +-
 include/linux/blk-integrity.h |  1 +
 4 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 02b766c2e57d..ff7de4fe74c4 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -492,6 +492,8 @@ bool bio_integrity_prep(struct bio *bio)
 		bip->bip_flags |= BIP_CHECK_GUARD;
 	if (bi->flags & BLK_INTEGRITY_REF_TAG)
 		bip->bip_flags |= BIP_CHECK_REFTAG;
+	if (bi->flags & BLK_INTEGRITY_APP_TAG)
+		bip->bip_flags |= BIP_CHECK_APPTAG;
 	if (bio_integrity_add_page(bio, virt_to_page(buf), len,
 			offset_in_page(buf)) < len) {
 		printk(KERN_ERR "could not attach integrity payload\n");
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 699f4f9674d9..6ebef140cec2 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -803,6 +803,23 @@ static unsigned int sd_prot_flag_mask(unsigned int prot_op)
 	return flag_mask[prot_op];
 }
 
+/*
+ * Can't check reftag alone or apptag alone
+ */
+static bool sd_prot_flags_valid(struct scsi_cmnd *scmd)
+{
+	struct request *rq = scsi_cmd_to_rq(scmd);
+	struct bio *bio = rq->bio;
+
+	if (bio_integrity_flagged(bio, BIP_CHECK_REFTAG) &&
+	    !bio_integrity_flagged(bio, BIP_CHECK_APPTAG))
+		return false;
+	if (!bio_integrity_flagged(bio, BIP_CHECK_REFTAG) &&
+	    bio_integrity_flagged(bio, BIP_CHECK_APPTAG))
+		return false;
+	return true;
+}
+
 static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
 					   unsigned int dix, unsigned int dif)
 {
@@ -815,14 +832,16 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
 		if (bio_integrity_flagged(bio, BIP_IP_CHECKSUM))
 			scmd->prot_flags |= SCSI_PROT_IP_CHECKSUM;
 
-		if (bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false)
+		if (bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false &&
+		    (bio_integrity_flagged(bio, BIP_CHECK_GUARD)))
 			scmd->prot_flags |= SCSI_PROT_GUARD_CHECK;
 	}
 
 	if (dif != T10_PI_TYPE3_PROTECTION) {	/* DIX/DIF Type 0, 1, 2 */
 		scmd->prot_flags |= SCSI_PROT_REF_INCREMENT;
 
-		if (bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false)
+		if ((bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false) &&
+			(!dix || bio_integrity_flagged(bio, BIP_CHECK_REFTAG)))
 			scmd->prot_flags |= SCSI_PROT_REF_CHECK;
 	}
 
@@ -1374,6 +1393,8 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
 	dif = scsi_host_dif_capable(cmd->device->host, sdkp->protection_type);
 	dld = sd_cdl_dld(sdkp, cmd);
 
+	if (!sd_prot_flags_valid(cmd))
+		goto fail;
 	if (dif || dix)
 		protect = sd_setup_protect_cmnd(cmd, dix, dif);
 	else
diff --git a/drivers/scsi/sd_dif.c b/drivers/scsi/sd_dif.c
index ae6ce6f5d622..6c53e3b9d7d7 100644
--- a/drivers/scsi/sd_dif.c
+++ b/drivers/scsi/sd_dif.c
@@ -50,7 +50,7 @@ void sd_dif_config_host(struct scsi_disk *sdkp, struct queue_limits *lim)
 		bi->csum_type = BLK_INTEGRITY_CSUM_CRC;
 
 	if (type != T10_PI_TYPE3_PROTECTION)
-		bi->flags |= BLK_INTEGRITY_REF_TAG;
+		bi->flags |= BLK_INTEGRITY_REF_TAG | BLK_INTEGRITY_APP_TAG;
 
 	bi->tuple_size = sizeof(struct t10_pi_tuple);
 
diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index 2ff65c933c50..865e0c4a7255 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -13,6 +13,7 @@ enum blk_integrity_flags {
 	BLK_INTEGRITY_DEVICE_CAPABLE	= 1 << 2,
 	BLK_INTEGRITY_REF_TAG		= 1 << 3,
 	BLK_INTEGRITY_STACKED		= 1 << 4,
+	BLK_INTEGRITY_APP_TAG		= 1 << 5,
 };
 
 const char *blk_integrity_profile_name(struct blk_integrity *bi);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 01/10] block: define set of integrity flags to be inherited by cloned bip
  2024-08-23 10:38     ` [PATCH v3 01/10] block: define set of integrity flags to be inherited by cloned bip Anuj Gupta
@ 2024-08-24  8:24       ` Christoph Hellwig
  2024-08-29  3:05       ` Martin K. Petersen
  1 sibling, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-24  8:24 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi

Looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 02/10] block: introduce a helper to determine metadata bytes from data iter
  2024-08-23 10:38     ` [PATCH v3 02/10] block: introduce a helper to determine metadata bytes from data iter Anuj Gupta
@ 2024-08-24  8:24       ` Christoph Hellwig
  2024-08-29  3:06       ` Martin K. Petersen
  1 sibling, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-24  8:24 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi

Looks good:

Reviewed-by: Christoph Hellwig <[email protected]>


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 03/10] block: handle split correctly for user meta bounce buffer
  2024-08-23 10:38     ` [PATCH v3 03/10] block: handle split correctly for user meta bounce buffer Anuj Gupta
@ 2024-08-24  8:31       ` Christoph Hellwig
  2024-08-28 11:18         ` Anuj Gupta
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-24  8:31 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi

On Fri, Aug 23, 2024 at 04:08:03PM +0530, Anuj Gupta wrote:
> Copy back the bounce buffer to user-space in entirety when the parent
> bio completes.

This looks odd to me.  The usual way to handle iterating the entire
submitter controlled data is to just iterate over the bvec array, as
done by bio_for_each_segment_all/bio_for_each_bvec_all for the bio
data.  I think you want to do the same here, probably with a
similar bip_for_each_bvec_all or similar helper.  That way you don't
need to stash away the iter.  Currently we have the field for that,
but I really want to split up struct bio_integrity_payload into
what is actually needed for the payload and stuff only needed for
the block layer autogenerated PI (bip_bio/bio_iter/bip_work).

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 05/10] block: define meta io descriptor
  2024-08-23 10:38     ` [PATCH v3 05/10] block: define meta io descriptor Anuj Gupta
@ 2024-08-24  8:31       ` Christoph Hellwig
  2024-08-29  3:05       ` Martin K. Petersen
  1 sibling, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-24  8:31 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi

On Fri, Aug 23, 2024 at 04:08:05PM +0530, Anuj Gupta wrote:
> +struct uio_meta {
> +	meta_flags_t	flags;
> +	u16		app_tag;
> +	struct		iov_iter iter;
> +};

Odd formatting here - the aligning tab goes before the field name,
not the name of the structure type.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 06/10] io_uring/rw: add support to send meta along with read/write
  2024-08-23 10:38     ` [PATCH v3 06/10] io_uring/rw: add support to send meta along with read/write Anuj Gupta
@ 2024-08-24  8:33       ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-24  8:33 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi

> +#define IOCB_HAS_META		(1 << 22)

.. METADATA?

Also the fs infrastructure should probably be split from io_uring.

> +/*
> + * flags for integrity meta
> + */
> +#define INTEGRITY_CHK_GUARD	(1U << 0) /* enforce guard check */
> +#define INTEGRITY_CHK_APPTAG	(1U << 1) /* enforce app tag check */
> +#define INTEGRITY_CHK_REFTAG	(1U << 2) /* enforce ref tag check */

This gets used all over the block layer.  I don't think it should be
in an io_uring specific header even if that is the initial user.

We might also gain a bit more flexibility by splitting the userspace
API from the in-kernel flags even if there is no strong needs for that
yet.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  2024-08-23 10:38     ` [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags Anuj Gupta
@ 2024-08-24  8:35       ` Christoph Hellwig
  2024-08-28 13:42         ` Kanchan Joshi
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-24  8:35 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi

This patch came in twice, once with a "block" and once with a
"block,nvme" prefix.  One should be fine, and I think just block is
the right one.

How do we communicate what flags can be combined to the upper layer
and userspace given the SCSI limitations here?

> --- a/include/linux/bio-integrity.h
> +++ b/include/linux/bio-integrity.h
> @@ -11,6 +11,9 @@ enum bip_flags {
>  	BIP_DISK_NOCHECK	= 1 << 3, /* disable disk integrity checking */
>  	BIP_IP_CHECKSUM		= 1 << 4, /* IP checksum */
>  	BIP_COPY_USER		= 1 << 5, /* Kernel bounce buffer in use */
> +	BIP_CHECK_GUARD		= 1 << 6,
> +	BIP_CHECK_REFTAG	= 1 << 7,
> +	BIP_CHECK_APPTAG	= 1 << 8,

Maybe describe the flags here like the other ones?


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 08/10] block: add support to pass user meta buffer
  2024-08-23 10:38     ` [PATCH v3 08/10] block: add support to pass user meta buffer Anuj Gupta
@ 2024-08-24  8:44       ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-24  8:44 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi

On Fri, Aug 23, 2024 at 04:08:09PM +0530, Anuj Gupta wrote:
> From: Kanchan Joshi <[email protected]>
> 
> If iocb contains the meta, extract that and prepare the bip.

If an iocb contains metadata, ...

> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -154,6 +154,9 @@ static void blkdev_bio_end_io(struct bio *bio)
>  		}
>  	}
>  
> +	if (bio_integrity(bio) && (dio->iocb->ki_flags & IOCB_HAS_META))
> +		bio_integrity_unmap_user(bio);

How could bio_integrity() be true here without the iocb flag?

> +		if (!is_sync && unlikely(iocb->ki_flags & IOCB_HAS_META)) {

unlikely is actively harmful here, as the code is likely if you use
the feature..

> +			ret = bio_integrity_map_iter(bio, iocb->private);
> +			if (unlikely(ret)) {
> +				bio_release_pages(bio, false);
> +				bio_clear_flag(bio, BIO_REFFED);
> +				bio_put(bio);
> +				blk_finish_plug(&plug);
> +				return ret;
> +			}

This duplicates the error handling done just above.  Please add a
goto label to de-duplicate it.

> +	if (unlikely(iocb->ki_flags & IOCB_HAS_META)) {
> +		ret = bio_integrity_map_iter(bio, iocb->private);
> +		WRITE_ONCE(iocb->private, NULL);
> +		if (unlikely(ret)) {
> +			bio_put(bio);
> +			return ret;

This probably also wants an out_bio_put label even if the duplication
is minimal so far.

You probably also want a WARN_ON for the iocb meta flag in
__blkdev_direct_IO_simple so that we don't get caught off guard
if someone adds a synchronous path using PI.

> diff --git a/block/t10-pi.c b/block/t10-pi.c
> index e7052a728966..cb7bc4a88380 100644
> --- a/block/t10-pi.c
> +++ b/block/t10-pi.c
> @@ -139,6 +139,8 @@ static void t10_pi_type1_prepare(struct request *rq)
>  		/* Already remapped? */
>  		if (bip->bip_flags & BIP_MAPPED_INTEGRITY)
>  			break;
> +		if (bip->bip_flags & BIP_INTEGRITY_USER)
> +			break;

This is wrong.  When submitting metadata on a partition the ref tag
does need to be remapped.  Please also add a tests that tests submitting
metadata on a partition so that we have a regression test for this.

> +	BIP_INTEGRITY_USER      = 1 << 9, /* Integrity payload is user
> +					   * address
> +					   */

.. and with the above fix this flag should not be needed.

>  };
>  
>  struct bio_integrity_payload {
> @@ -24,6 +27,7 @@ struct bio_integrity_payload {
>  	unsigned short		bip_vcnt;	/* # of integrity bio_vecs */
>  	unsigned short		bip_max_vcnt;	/* integrity bio_vec slots */
>  	unsigned short		bip_flags;	/* control flags */
> +	u16			app_tag;

Please document the field even if it seems obvious.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 09/10] nvme: add handling for app_tag
  2024-08-23 10:38     ` [PATCH v3 09/10] nvme: add handling for app_tag Anuj Gupta
@ 2024-08-24  8:49       ` Christoph Hellwig
  2024-08-29  3:00       ` Martin K. Petersen
  1 sibling, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-24  8:49 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi

Maybe name this "add support for passin on the application tag" ?

> +static void nvme_set_app_tag(struct nvme_command *cmnd, u16 apptag)
> +{
> +	cmnd->rw.apptag = cpu_to_le16(apptag);
> +	/* use 0xfff as mask so that apptag is used in entirety */
> +	cmnd->rw.appmask = cpu_to_le16(0xffff);

Can you throw in a patch to rename these field to match the field names
used in the specification?  I also don't think the comment is all that
useful.

> +}
> +
>  static void nvme_set_ref_tag(struct nvme_ns *ns, struct nvme_command *cmnd,
>  			      struct request *req)
>  {
> @@ -1010,6 +1017,11 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
>  				control |= NVME_RW_APPEND_PIREMAP;
>  			nvme_set_ref_tag(ns, cmnd, req);
>  		}
> +		if (bio_integrity_flagged(req->bio, BIP_CHECK_APPTAG)) {
> +			control |= NVME_RW_PRINFO_PRCHK_APP;
> +			nvme_set_app_tag(cmnd,
> +					 bio_integrity(req->bio)->app_tag);

Passing the request to nvme_set_app_tag would probably be a bit cleaner.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 10/10] scsi: add support for user-meta interface
  2024-08-23 10:38     ` [PATCH v3 10/10] scsi: add support for user-meta interface Anuj Gupta
@ 2024-08-24  8:52       ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-24  8:52 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi

> Add support for sending user-meta buffer. Set tags to be checked
> using flags specified by user/block-layer user and underlying DIF/DIX
> configuration. Introduce BLK_INTEGRITY_APP_TAG to specify apptag.
> This provides a way for upper layers to specify apptag checking.

We'll also need that flag for nvme, don't we?  It should also be
added in a block layer patch and not as part of a driver patch.

> +/*
> + * Can't check reftag alone or apptag alone
> + */
> +static bool sd_prot_flags_valid(struct scsi_cmnd *scmd)
> +{
> +	struct request *rq = scsi_cmd_to_rq(scmd);
> +	struct bio *bio = rq->bio;
> +
> +	if (bio_integrity_flagged(bio, BIP_CHECK_REFTAG) &&
> +	    !bio_integrity_flagged(bio, BIP_CHECK_APPTAG))
> +		return false;
> +	if (!bio_integrity_flagged(bio, BIP_CHECK_REFTAG) &&
> +	    bio_integrity_flagged(bio, BIP_CHECK_APPTAG))
> +		return false;
> +	return true;
> +}

We'll need to advertise this limitations to the application or in-kernel
user somehow..

> +		if ((bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false) &&
> +			(!dix || bio_integrity_flagged(bio, BIP_CHECK_REFTAG)))

Incorrect formatting.  This is better:

		if (!bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) &&
		    (!dix || bio_integrity_flagged(bio, BIP_CHECK_REFTAG)))


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 03/10] block: handle split correctly for user meta bounce buffer
  2024-08-24  8:31       ` Christoph Hellwig
@ 2024-08-28 11:18         ` Anuj Gupta
  2024-08-29  4:04           ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Anuj Gupta @ 2024-08-28 11:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, kbusch, martin.petersen, asml.silence, krisman, io-uring,
	linux-nvme, linux-block, gost.dev, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 2864 bytes --]

On Sat, Aug 24, 2024 at 10:31:16AM +0200, Christoph Hellwig wrote:
> On Fri, Aug 23, 2024 at 04:08:03PM +0530, Anuj Gupta wrote:
> > Copy back the bounce buffer to user-space in entirety when the parent
> > bio completes.
> 
> This looks odd to me.  The usual way to handle iterating the entire
> submitter controlled data is to just iterate over the bvec array, as
> done by bio_for_each_segment_all/bio_for_each_bvec_all for the bio
> data.  I think you want to do the same here, probably with a
> similar bip_for_each_bvec_all or similar helper.  That way you don't
> need to stash away the iter.  Currently we have the field for that,
> but I really want to split up struct bio_integrity_payload into
> what is actually needed for the payload and stuff only needed for
> the block layer autogenerated PI (bip_bio/bio_iter/bip_work).
 
I can add it [*], to iterate over the entire bvec array. But the original
bio_iter still needs to be stored during submission, to calculate the
number of bytes in the original integrity/metadata iter (as it could have
gotten split, and I don't have original integrity iter stored anywhere).
Do you have a different opinion?

[*]
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index ff7de4fe74c4..f1690c644e70 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -125,11 +125,23 @@ static void bio_integrity_uncopy_user(struct bio_integrity_payload *bip)
 	struct bio_vec *copy = &bip->bip_vec[1];
 	size_t bytes = bio_iter_integrity_bytes(bi, bip->bio_iter);
 	struct iov_iter iter;
-	int ret;
+	struct bio_vec *bvec;
+	struct bvec_iter_all iter_all;
 
 	iov_iter_bvec(&iter, ITER_DEST, copy, nr_vecs, bytes);
-	ret = copy_to_iter(bvec_virt(bip->bip_vec), bytes, &iter);
-	WARN_ON_ONCE(ret != bytes);
+	bip_for_each_segment_all(bvec, bip, iter_all) {
+		ssize_t ret;
+
+		ret = copy_page_to_iter(bvec->bv_page,
+					bvec->bv_offset,
+					bvec->bv_len,
+					&iter);
+
+		if (!iov_iter_count(&iter))
+			break;
+
+		WARN_ON_ONCE(ret < bvec->bv_len);
+	}
 
 	bio_integrity_unpin_bvec(copy, nr_vecs, true);
 }
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index 22ff2ae16444..3132ef6f27e0 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -46,6 +46,19 @@ struct uio_meta {
 	struct		iov_iter iter;
 };
 
+static inline bool bip_next_segment(const struct bio_integrity_payload *bip,
+				    struct bvec_iter_all *iter)
+{
+	if (iter->idx >= bip->bip_vcnt)
+		return false;
+
+	bvec_advance(&bip->bip_vec[iter->idx], iter);
+	return true;
+}
+
+#define bip_for_each_segment_all(bvl, bip, iter) \
+	for (bvl = bvec_init_iter_all(&iter); bip_next_segment((bip), &iter); )
+
 #define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
 			 BIP_DISK_NOCHECK | BIP_IP_CHECKSUM | \
 			 BIP_CHECK_GUARD | BIP_CHECK_REFTAG | \
-- 
2.25.1

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  2024-08-24  8:35       ` Christoph Hellwig
@ 2024-08-28 13:42         ` Kanchan Joshi
  2024-08-29  3:16           ` Martin K. Petersen
  2024-08-29  4:06           ` Christoph Hellwig
  0 siblings, 2 replies; 36+ messages in thread
From: Kanchan Joshi @ 2024-08-28 13:42 UTC (permalink / raw)
  To: Christoph Hellwig, Anuj Gupta
  Cc: axboe, kbusch, martin.petersen, asml.silence, krisman, io-uring,
	linux-nvme, linux-block, gost.dev, linux-scsi

On 8/24/2024 2:05 PM, Christoph Hellwig wrote:
> How do we communicate what flags can be combined to the upper layer
> and userspace given the SCSI limitations here?

Will adding a new read-only sysfs attribute (under the group [*]) that 
publishes all valid combinations be fine?

With Guard/Reftag/Apptag, we get 6 combinations.
For NVMe, all can be valid. For SCSI, maximum 4 can be valid. And we 
factor the pi-type in while listing what all is valid.
For example: 010 or 001 is not valid for SCSI and should not be shown by 
this.

[*]
const struct attribute_group blk_integrity_attr_group = {
          .name = "integrity",
          .attrs = integrity_attrs,
};

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 09/10] nvme: add handling for app_tag
  2024-08-23 10:38     ` [PATCH v3 09/10] nvme: add handling for app_tag Anuj Gupta
  2024-08-24  8:49       ` Christoph Hellwig
@ 2024-08-29  3:00       ` Martin K. Petersen
  2024-08-29 10:18         ` Kanchan Joshi
  1 sibling, 1 reply; 36+ messages in thread
From: Martin K. Petersen @ 2024-08-29  3:00 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi


Anuj,

> With user integrity buffer, there is a way to specify the app_tag. Set
> the corresponding protocol specific flags and send the app_tag down.

This assumes the app tag is the same for every block in the I/O? That's
not how it's typically used (to the extent that it is used at all due to
the value 0xffff acting as escape).

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 05/10] block: define meta io descriptor
  2024-08-23 10:38     ` [PATCH v3 05/10] block: define meta io descriptor Anuj Gupta
  2024-08-24  8:31       ` Christoph Hellwig
@ 2024-08-29  3:05       ` Martin K. Petersen
  1 sibling, 0 replies; 36+ messages in thread
From: Martin K. Petersen @ 2024-08-29  3:05 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
	Kanchan Joshi


Anuj,

> +struct uio_meta {
> +	meta_flags_t	flags;
> +	u16		app_tag;
> +	struct		iov_iter iter;
> +};

What about the ref tag seed? Or does the proposed API assume that the
first block of the I/O always has a ref tag of 0?

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 01/10] block: define set of integrity flags to be inherited by cloned bip
  2024-08-23 10:38     ` [PATCH v3 01/10] block: define set of integrity flags to be inherited by cloned bip Anuj Gupta
  2024-08-24  8:24       ` Christoph Hellwig
@ 2024-08-29  3:05       ` Martin K. Petersen
  1 sibling, 0 replies; 36+ messages in thread
From: Martin K. Petersen @ 2024-08-29  3:05 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi


Anuj,

> Introduce BIP_CLONE_FLAGS describing integrity flags that should be
> inherited in the cloned bip from the parent.

Reviewed-by: Martin K. Petersen <[email protected]>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 02/10] block: introduce a helper to determine metadata bytes from data iter
  2024-08-23 10:38     ` [PATCH v3 02/10] block: introduce a helper to determine metadata bytes from data iter Anuj Gupta
  2024-08-24  8:24       ` Christoph Hellwig
@ 2024-08-29  3:06       ` Martin K. Petersen
  1 sibling, 0 replies; 36+ messages in thread
From: Martin K. Petersen @ 2024-08-29  3:06 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: axboe, hch, kbusch, martin.petersen, asml.silence, krisman,
	io-uring, linux-nvme, linux-block, gost.dev, linux-scsi


Anuj,

> Introduce a new helper bio_iter_integrity_bytes to determine the number
> of metadata bytes corresponding to data iter.

Reviewed-by: Martin K. Petersen <[email protected]>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  2024-08-28 13:42         ` Kanchan Joshi
@ 2024-08-29  3:16           ` Martin K. Petersen
  2024-08-29  4:06             ` Christoph Hellwig
  2024-08-29 13:29             ` Anuj gupta
  2024-08-29  4:06           ` Christoph Hellwig
  1 sibling, 2 replies; 36+ messages in thread
From: Martin K. Petersen @ 2024-08-29  3:16 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: Christoph Hellwig, Anuj Gupta, axboe, kbusch, martin.petersen,
	asml.silence, krisman, io-uring, linux-nvme, linux-block,
	gost.dev, linux-scsi


Kanchan,

> With Guard/Reftag/Apptag, we get 6 combinations. For NVMe, all can be
> valid. For SCSI, maximum 4 can be valid. And we factor the pi-type in
> while listing what all is valid. For example: 010 or 001 is not valid
> for SCSI and should not be shown by this.

I thought we had tentatively agreed to let the block layer integrity
flags only describe what the controller should do? And then let sd.c
decide what to do about RDPROTECT/WRPROTECT (since host-to-target is a
different protection envelope anyway). That is kind of how it works
already.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 03/10] block: handle split correctly for user meta bounce buffer
  2024-08-28 11:18         ` Anuj Gupta
@ 2024-08-29  4:04           ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-29  4:04 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: Christoph Hellwig, axboe, kbusch, martin.petersen, asml.silence,
	krisman, io-uring, linux-nvme, linux-block, gost.dev, linux-scsi

On Wed, Aug 28, 2024 at 04:48:06PM +0530, Anuj Gupta wrote:
> I can add it [*], to iterate over the entire bvec array. But the original
> bio_iter still needs to be stored during submission, to calculate the
> number of bytes in the original integrity/metadata iter (as it could have
> gotten split, and I don't have original integrity iter stored anywhere).
> Do you have a different opinion?

Just like for the bio data, the original submitter should never use the
iter, but the _all iter. 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  2024-08-28 13:42         ` Kanchan Joshi
  2024-08-29  3:16           ` Martin K. Petersen
@ 2024-08-29  4:06           ` Christoph Hellwig
  1 sibling, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-29  4:06 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: Christoph Hellwig, Anuj Gupta, axboe, kbusch, martin.petersen,
	asml.silence, krisman, io-uring, linux-nvme, linux-block,
	gost.dev, linux-scsi

On Wed, Aug 28, 2024 at 07:12:22PM +0530, Kanchan Joshi wrote:
> On 8/24/2024 2:05 PM, Christoph Hellwig wrote:
> > How do we communicate what flags can be combined to the upper layer
> > and userspace given the SCSI limitations here?
> 
> Will adding a new read-only sysfs attribute (under the group [*]) that 
> publishes all valid combinations be fine?
> 

That's not a good application interface.  Finding the sysfs file is
already a pain for direct users of the block device, and almost
impossible when going through a file system.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  2024-08-29  3:16           ` Martin K. Petersen
@ 2024-08-29  4:06             ` Christoph Hellwig
  2024-08-29 13:29             ` Anuj gupta
  1 sibling, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2024-08-29  4:06 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Kanchan Joshi, Christoph Hellwig, Anuj Gupta, axboe, kbusch,
	asml.silence, krisman, io-uring, linux-nvme, linux-block,
	gost.dev, linux-scsi

On Wed, Aug 28, 2024 at 11:16:53PM -0400, Martin K. Petersen wrote:
> I thought we had tentatively agreed to let the block layer integrity
> flags only describe what the controller should do? And then let sd.c
> decide what to do about RDPROTECT/WRPROTECT (since host-to-target is a
> different protection envelope anyway). That is kind of how it works
> already.

Yes, that should easier to manage.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 09/10] nvme: add handling for app_tag
  2024-08-29  3:00       ` Martin K. Petersen
@ 2024-08-29 10:18         ` Kanchan Joshi
  2024-09-13  2:05           ` Martin K. Petersen
  0 siblings, 1 reply; 36+ messages in thread
From: Kanchan Joshi @ 2024-08-29 10:18 UTC (permalink / raw)
  To: Martin K. Petersen, Anuj Gupta
  Cc: axboe, hch, kbusch, asml.silence, krisman, io-uring, linux-nvme,
	linux-block, gost.dev, linux-scsi

On 8/29/2024 8:30 AM, Martin K. Petersen wrote:
> 
> Anuj,
> 
>> With user integrity buffer, there is a way to specify the app_tag. Set
>> the corresponding protocol specific flags and send the app_tag down.
> 
> This assumes the app tag is the same for every block in the I/O? That's
> not how it's typically used (to the extent that it is used at all due to
> the value 0xffff acting as escape).
> 

NVMe spec allows only one value to be passed in each read/write command 
(LBAT for write, and ELBAT for read). And that's what controller checks 
for the entire block range covered by one command.
So per-io tag rather than per-block tag. The integrity buffer creator is 
supposed to put the same application-tag for each block if it is sending 
multi-block IO.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  2024-08-29  3:16           ` Martin K. Petersen
  2024-08-29  4:06             ` Christoph Hellwig
@ 2024-08-29 13:29             ` Anuj gupta
  2024-09-12 12:40               ` Anuj Gupta
  2024-09-13  2:06               ` Martin K. Petersen
  1 sibling, 2 replies; 36+ messages in thread
From: Anuj gupta @ 2024-08-29 13:29 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Kanchan Joshi, Christoph Hellwig, Anuj Gupta, axboe, kbusch,
	asml.silence, krisman, io-uring, linux-nvme, linux-block,
	gost.dev, linux-scsi

On Thu, Aug 29, 2024 at 8:47 AM Martin K. Petersen
<[email protected]> wrote:
>
>
> Kanchan,
>
> > With Guard/Reftag/Apptag, we get 6 combinations. For NVMe, all can be
> > valid. For SCSI, maximum 4 can be valid. And we factor the pi-type in
> > while listing what all is valid. For example: 010 or 001 is not valid
> > for SCSI and should not be shown by this.
>
> I thought we had tentatively agreed to let the block layer integrity
> flags only describe what the controller should do? And then let sd.c
> decide what to do about RDPROTECT/WRPROTECT (since host-to-target is a
> different protection envelope anyway). That is kind of how it works
> already.
>
Do you see that this patch (and this set of flags) are fine?
If not, which specific flags do you suggest should be introduced?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  2024-08-29 13:29             ` Anuj gupta
@ 2024-09-12 12:40               ` Anuj Gupta
  2024-09-13  2:06               ` Martin K. Petersen
  1 sibling, 0 replies; 36+ messages in thread
From: Anuj Gupta @ 2024-09-12 12:40 UTC (permalink / raw)
  To: Anuj gupta
  Cc: Martin K. Petersen, Kanchan Joshi, Christoph Hellwig, axboe,
	kbusch, asml.silence, krisman, io-uring, linux-nvme, linux-block,
	gost.dev, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 1015 bytes --]

Martin, Christoph

On 29/08/24 06:59PM, Anuj gupta wrote:
>On Thu, Aug 29, 2024 at 8:47 AM Martin K. Petersen
><[email protected]> wrote:
>>
>>
>> Kanchan,
>>
>> > With Guard/Reftag/Apptag, we get 6 combinations. For NVMe, all can be
>> > valid. For SCSI, maximum 4 can be valid. And we factor the pi-type in
>> > while listing what all is valid. For example: 010 or 001 is not valid
>> > for SCSI and should not be shown by this.
>>
>> I thought we had tentatively agreed to let the block layer integrity
>> flags only describe what the controller should do? And then let sd.c
>> decide what to do about RDPROTECT/WRPROTECT (since host-to-target is a
>> different protection envelope anyway). That is kind of how it works
>> already.
>>
>Do you see that this patch (and this set of flags) are fine?
>If not, which specific flags do you suggest should be introduced?

While other things are sorted for next iteration, it's not fully clear
what are we missing for this part. Can you comment on the above?

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 09/10] nvme: add handling for app_tag
  2024-08-29 10:18         ` Kanchan Joshi
@ 2024-09-13  2:05           ` Martin K. Petersen
  0 siblings, 0 replies; 36+ messages in thread
From: Martin K. Petersen @ 2024-09-13  2:05 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: Martin K. Petersen, Anuj Gupta, axboe, hch, kbusch, asml.silence,
	krisman, io-uring, linux-nvme, linux-block, gost.dev, linux-scsi


Kanchan,

>> This assumes the app tag is the same for every block in the I/O?
>> That's not how it's typically used (to the extent that it is used at
>> all due to the value 0xffff acting as escape).
>> 
>
> NVMe spec allows only one value to be passed in each read/write
> command (LBAT for write, and ELBAT for read). And that's what
> controller checks for the entire block range covered by one command.
> So per-io tag rather than per-block tag. The integrity buffer creator
> is supposed to put the same application-tag for each block if it is
> sending multi-block IO.

I am OK with that approach as long as the mask is only applied when
checking is enabled.

I.e. I don't have a use case for checking less than the full app tag.
But almost all users of PI I am aware of depend on being able to put
different values in the app tag for different blocks in the I/O when
checking is disabled.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
  2024-08-29 13:29             ` Anuj gupta
  2024-09-12 12:40               ` Anuj Gupta
@ 2024-09-13  2:06               ` Martin K. Petersen
  1 sibling, 0 replies; 36+ messages in thread
From: Martin K. Petersen @ 2024-09-13  2:06 UTC (permalink / raw)
  To: Anuj gupta
  Cc: Martin K. Petersen, Kanchan Joshi, Christoph Hellwig, Anuj Gupta,
	axboe, kbusch, asml.silence, krisman, io-uring, linux-nvme,
	linux-block, gost.dev, linux-scsi


Anuj,

> Do you see that this patch (and this set of flags) are fine?
> If not, which specific flags do you suggest should be introduced?

The three exposed BIP flags are fine. We'll deal with the target flag
peculiarities in the SCSI disk driver.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2024-09-13  2:06 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CGME20240823104552epcas5p226dbbbd448cd0ee0955ffdd3ad1b112d@epcas5p2.samsung.com>
2024-08-23 10:38 ` [PATCH v3 00/10] Read/Write with meta/integrity Anuj Gupta
     [not found]   ` <CGME20240823104616epcas5p4bd315bd116ea7e32b1abf7e174af64a1@epcas5p4.samsung.com>
2024-08-23 10:38     ` [PATCH v3 01/10] block: define set of integrity flags to be inherited by cloned bip Anuj Gupta
2024-08-24  8:24       ` Christoph Hellwig
2024-08-29  3:05       ` Martin K. Petersen
     [not found]   ` <CGME20240823104618epcas5p4b9983678886dceed75edd9cbec9341b2@epcas5p4.samsung.com>
2024-08-23 10:38     ` [PATCH v3 02/10] block: introduce a helper to determine metadata bytes from data iter Anuj Gupta
2024-08-24  8:24       ` Christoph Hellwig
2024-08-29  3:06       ` Martin K. Petersen
     [not found]   ` <CGME20240823104620epcas5p2118c152963d6cadfbc9968790ac0e536@epcas5p2.samsung.com>
2024-08-23 10:38     ` [PATCH v3 03/10] block: handle split correctly for user meta bounce buffer Anuj Gupta
2024-08-24  8:31       ` Christoph Hellwig
2024-08-28 11:18         ` Anuj Gupta
2024-08-29  4:04           ` Christoph Hellwig
     [not found]   ` <CGME20240823104622epcas5p2e3b29f793eff9857c5712b3d6d327ed5@epcas5p2.samsung.com>
2024-08-23 10:38     ` [PATCH v3 04/10] block: modify bio_integrity_map_user to accept iov_iter as argument Anuj Gupta
     [not found]   ` <CGME20240823104624epcas5p40c1b0f3516100f69cbd31d45867cd289@epcas5p4.samsung.com>
2024-08-23 10:38     ` [PATCH v3 05/10] block: define meta io descriptor Anuj Gupta
2024-08-24  8:31       ` Christoph Hellwig
2024-08-29  3:05       ` Martin K. Petersen
     [not found]   ` <CGME20240823104627epcas5p2abcd2283f6fb3301e1a8e828e3c270ae@epcas5p2.samsung.com>
2024-08-23 10:38     ` [PATCH v3 06/10] io_uring/rw: add support to send meta along with read/write Anuj Gupta
2024-08-24  8:33       ` Christoph Hellwig
     [not found]   ` <CGME20240823104629epcas5p3fea0cb7e66b0446ddacf7648c08c3ba8@epcas5p3.samsung.com>
2024-08-23 10:38     ` [PATCH v3 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags Anuj Gupta
2024-08-24  8:35       ` Christoph Hellwig
2024-08-28 13:42         ` Kanchan Joshi
2024-08-29  3:16           ` Martin K. Petersen
2024-08-29  4:06             ` Christoph Hellwig
2024-08-29 13:29             ` Anuj gupta
2024-09-12 12:40               ` Anuj Gupta
2024-09-13  2:06               ` Martin K. Petersen
2024-08-29  4:06           ` Christoph Hellwig
     [not found]   ` <CGME20240823104631epcas5p4f83b92081107fbefca78008ee319ff7e@epcas5p4.samsung.com>
2024-08-23 10:38     ` [PATCH v3 07/10] block,nvme: " Anuj Gupta
     [not found]   ` <CGME20240823104634epcas5p4ef1af26cc7146b4e8b7a4a1844ffe476@epcas5p4.samsung.com>
2024-08-23 10:38     ` [PATCH v3 08/10] block: add support to pass user meta buffer Anuj Gupta
2024-08-24  8:44       ` Christoph Hellwig
     [not found]   ` <CGME20240823104636epcas5p4825a6d2dd9e45cfbcc97895264662d30@epcas5p4.samsung.com>
2024-08-23 10:38     ` [PATCH v3 09/10] nvme: add handling for app_tag Anuj Gupta
2024-08-24  8:49       ` Christoph Hellwig
2024-08-29  3:00       ` Martin K. Petersen
2024-08-29 10:18         ` Kanchan Joshi
2024-09-13  2:05           ` Martin K. Petersen
     [not found]   ` <CGME20240823104639epcas5p11dbab393122841419368a86b4bd5c04b@epcas5p1.samsung.com>
2024-08-23 10:38     ` [PATCH v3 10/10] scsi: add support for user-meta interface Anuj Gupta
2024-08-24  8:52       ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox