* [PATCH v5 00/10] Read/Write with meta/integrity
[not found] <CGME20241029163153epcas5p4ab83a94429a227bfc262423aa8a8dd26@epcas5p4.samsung.com>
@ 2024-10-29 16:23 ` Anuj Gupta
[not found] ` <CGME20241029163212epcas5p343cd56d66b58a9e7e8e1faa98067891d@epcas5p3.samsung.com>
` (9 more replies)
0 siblings, 10 replies; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:23 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Anuj Gupta
This adds a new io_uring interface to exchange meta along with read/write.
The patchset is on top of block for-next [1] and Keith's cleanup patch [2].
Interface:
A new meta_type field is introduced in the SQE, which describes the type of
meta that is passed. Currently only one type, "PI", is supported. Meta
information is represented using a newly introduced 'struct io_uring_meta_pi'.
The application sets up a SQE128 ring and prepares io_uring_meta_pi within
the second SQE. The application populates the 'struct io_uring_meta_pi'
fields as below:
* pi_flags: these are meta-type specific flags. Three flags are exposed for
integrity type, namely IO_INTEGRITY_CHK_GUARD/APPTAG/REFTAG.
* len: length of the meta buffer
* addr: address of the meta buffer
* seed: seed value for ref tag remapping
* app_tag: optional application-specific 16b value; this goes along with
INTEGRITY_CHK_APPTAG flag.
* rsvd: reserved space for storage tag.
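The field list above can be sketched in userspace C as follows. This is only an illustration under stated assumptions: 'struct demo_meta_pi', the DEMO_* flag names and demo_fill_meta_pi() are hypothetical local stand-ins, with the field order taken from the io_uring patch in this series and the flag values from its fs.h hunk. A real program would use the definitions from the patched uapi headers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirror of 'struct io_uring_meta_pi' from this series. */
struct demo_meta_pi {
	uint16_t pi_flags;	/* meta-type specific flags */
	uint16_t app_tag;	/* application tag, used with the APPTAG check */
	uint32_t len;		/* length of the meta buffer */
	uint64_t addr;		/* address of the meta buffer */
	uint64_t seed;		/* seed for ref tag remapping */
	uint64_t rsvd;		/* reserved, must stay zero */
};

/* Hypothetical names; values mirror IO_INTEGRITY_CHK_* in the series. */
#define DEMO_CHK_GUARD	(1U << 0)
#define DEMO_CHK_REFTAG	(1U << 1)
#define DEMO_CHK_APPTAG	(1U << 2)

/* The descriptor has to fit in the second 64 bytes of a 128-byte SQE. */
static_assert(sizeof(struct demo_meta_pi) == 32, "unexpected layout");
static_assert(sizeof(struct demo_meta_pi) <= 64, "must fit in second SQE");

/* Fill the descriptor; zeroing first keeps the reserved field at 0. */
static inline void demo_fill_meta_pi(struct demo_meta_pi *md, void *meta_buf,
				     uint32_t meta_len, uint16_t flags,
				     uint16_t app_tag, uint64_t seed)
{
	memset(md, 0, sizeof(*md));
	md->pi_flags = flags;
	md->app_tag = app_tag;
	md->len = meta_len;
	md->addr = (uint64_t)(uintptr_t)meta_buf;
	md->seed = seed;
}
```

Zeroing the whole descriptor before filling it matters because the prep path in this series rejects a non-zero rsvd field with -EINVAL.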
Block path (direct IO), NVMe and SCSI drivers are modified to support
this.
Patch 1 is an enhancement patch.
Patch 2 is required to make the bounce buffer copy back work correctly.
Patches 3 to 5 are prep patches.
Patch 6 adds the io_uring support.
Patch 7 gives us a unified interface for user and kernel generated
integrity.
Patch 8 adds support in SCSI and patch 9 in NVMe.
Patch 10 adds the support for block direct IO.
Some of the design choices came from this discussion [3].
An example program showing how to use the interface is appended below [4].
(It also tests whether reftag remapping happens correctly.)
Tree:
https://github.com/SamsungDS/linux/tree/feat/pi_us_v5
Testing:
This has been done by modifying fio to use this interface.
https://github.com/SamsungDS/fio/tree/priv/feat/pi-test-v6
Changes since v4:
https://lore.kernel.org/linux-block/[email protected]/
- better variable names to describe bounce buffer copy back (hch)
- move definition of flags in the same patch introducing uio_meta (hch)
- move uio_meta definition to include/linux/uio.h (hch)
- bump seed size in uio_meta to 8 bytes (martin)
- move flags definition to include/uapi/linux/fs.h (hch)
- s/meta/metadata in commit description of io-uring (hch)
- rearrange the meta fields in sqe for cleaner layout
- partial submission case is not applicable, as we are only plumbing for async case
- s/META_TYPE_INTEGRITY/META_TYPE_PI (hch, martin)
- remove unlikely branching (hch)
- Better formatting, misc cleanups, better commit descriptions, reordering commits (hch)
Changes since v3:
https://lore.kernel.org/linux-block/[email protected]/
- add reftag seed support (Martin)
- fix incorrect formatting in uio_meta (hch)
- s/IOCB_HAS_META/IOCB_HAS_METADATA (hch)
- move integrity check flags to block layer header (hch)
- add comments for BIP_CHECK_GUARD/REFTAG/APPTAG flags (hch)
- remove bio_integrity check during completion if IOCB_HAS_METADATA is set (hch)
- use goto label to get rid of duplicate error handling (hch)
- add warn_on if trying to do sync io with iocb_has_metadata flag (hch)
- remove check for disabling reftag remapping (hch)
- remove BIP_INTEGRITY_USER flag (hch)
- add comment for app_tag field introduced in bio_integrity_payload (hch)
- pass request to nvme_set_app_tag function (hch)
- right indentation at a place in scsi patch (hch)
- move IOCB_HAS_METADATA to a separate fs patch (hch)
Changes since v2:
https://lore.kernel.org/linux-block/[email protected]/
- io_uring error handling styling (Gabriel)
- add documented helper to get metadata bytes from data iter (hch)
- during clone specify "what flags to clone" rather than
"what not to clone" (hch)
- Move uio_meta definition to bio-integrity.h (hch)
- Rename apptag field to app_tag (hch)
- Change datatype of flags field in uio_meta to bitwise (hch)
- Don't introduce BIP_USER_CHK_FOO flags (hch, martin)
- Driver should rely on block layer flags instead of seeing if it is
user-passthrough (hch)
- update the scsi code for handling user-meta (hch, martin)
Changes since v1:
https://lore.kernel.org/linux-block/[email protected]/
- Do not use new opcode for meta, and also add the provision to introduce new
meta types beyond integrity (Pavel)
- Stuff IOCB_HAS_META check in need_complete_io (Jens)
- Split meta handling in NVMe into a separate handler (Keith)
- Add meta handling for __blkdev_direct_IO too (Keith)
- Don't inherit BIP_COPY_USER flag for cloned bio's (Christoph)
- Better commit descriptions (Christoph)
Changes since RFC:
- modify io_uring plumbing based on recent async handling state changes
- fixes/enhancements to correctly handle the split for meta buffer
- add flags to specify guard/reftag/apptag checks
- add support to send apptag
[1] https://git.kernel.dk/cgit/linux-block/log/?h=for-next
[2] https://lore.kernel.org/linux-block/[email protected]/
[3] https://lore.kernel.org/linux-block/[email protected]/
[4]
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <linux/fs.h>
#include <linux/io_uring.h>
#include <linux/types.h>
#include "liburing.h"
/*
* write data/meta. read both. compare. send apptag too.
* prerequisite:
* protected xfer: format namespace with 4KB + 8b, pi_type = 1
* For testing reftag remapping on device-mapper, create a
* device-mapper and run this program. Device mapper creation:
* # echo 0 80 linear /dev/nvme0n1 0 > /tmp/table
* # echo 80 160 linear /dev/nvme0n1 200 >> /tmp/table
* # dmsetup create two /tmp/table
* # ./a.out /dev/dm-0
*/
#define DATA_LEN 4096
#define META_LEN 8
struct t10_pi_tuple {
__be16 guard;
__be16 apptag;
__be32 reftag;
};
int main(int argc, char *argv[])
{
struct io_uring ring;
struct io_uring_sqe *sqe = NULL;
struct io_uring_cqe *cqe = NULL;
void *wdb,*rdb;
char wmb[META_LEN], rmb[META_LEN];
char *data_str = "data buffer";
int fd, ret, blksize;
struct stat fstat;
unsigned long long offset = DATA_LEN * 10;
struct t10_pi_tuple *pi;
struct io_uring_meta_pi *md;
if (argc != 2) {
fprintf(stderr, "Usage: %s <block-device>\n", argv[0]);
return 1;
}
if (stat(argv[1], &fstat) == 0) {
blksize = (int)fstat.st_blksize;
} else {
perror("stat");
return 1;
}
if (posix_memalign(&wdb, blksize, DATA_LEN)) {
perror("posix_memalign failed");
return 1;
}
if (posix_memalign(&rdb, blksize, DATA_LEN)) {
perror("posix_memalign failed");
return 1;
}
memset(wdb, 0, DATA_LEN);
fd = open(argv[1], O_RDWR | O_DIRECT);
if (fd < 0) {
perror("open");
return 1;
}
ret = io_uring_queue_init(8, &ring, IORING_SETUP_SQE128);
if (ret) {
fprintf(stderr, "ring setup failed: %d\n", ret);
return 1;
}
/* write data + meta-buffer to device */
sqe = io_uring_get_sqe(&ring);
if (!sqe) {
fprintf(stderr, "get sqe failed\n");
return 1;
}
io_uring_prep_write(sqe, fd, wdb, DATA_LEN, offset);
sqe->meta_type = META_TYPE_PI;
md = (struct io_uring_meta_pi *) sqe->big_sqe;
md->addr = (__u64)wmb;
md->len = META_LEN;
/* flags to ask for guard/reftag/apptag*/
md->pi_flags = IO_INTEGRITY_CHK_GUARD | IO_INTEGRITY_CHK_REFTAG | IO_INTEGRITY_CHK_APPTAG;
md->app_tag = 0x1234;
md->seed = 10;
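pi = (struct t10_pi_tuple *)wmb;
pi->guard = 0;
/* PI fields are big-endian; constants are pre-swapped for a little-endian host */
pi->reftag = 0x0A000000; /* 10, matches md->seed */
pi->apptag = 0x3412; /* 0x1234, matches md->app_tag */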
ret = io_uring_submit(&ring);
if (ret <= 0) {
fprintf(stderr, "sqe submit failed: %d\n", ret);
return 1;
}
ret = io_uring_wait_cqe(&ring, &cqe);
if (ret < 0 || !cqe) {
fprintf(stderr, "wait cqe failed: %d\n", ret);
return 1;
}
if (cqe->res < 0) {
fprintf(stderr, "write cqe failure: %d\n", cqe->res);
return 1;
}
io_uring_cqe_seen(&ring, cqe);
/* read data + meta-buffer back from device */
sqe = io_uring_get_sqe(&ring);
if (!sqe) {
fprintf(stderr, "get sqe failed\n");
return 1;
}
io_uring_prep_read(sqe, fd, rdb, DATA_LEN, offset);
sqe->meta_type = META_TYPE_PI;
md = (struct io_uring_meta_pi *) sqe->big_sqe;
md->addr = (__u64)rmb;
md->len = META_LEN;
md->pi_flags = IO_INTEGRITY_CHK_GUARD | IO_INTEGRITY_CHK_REFTAG | IO_INTEGRITY_CHK_APPTAG;
md->app_tag = 0x1234;
md->seed = 10;
ret = io_uring_submit(&ring);
if (ret <= 0) {
fprintf(stderr, "sqe submit failed: %d\n", ret);
return 1;
}
ret = io_uring_wait_cqe(&ring, &cqe);
if (ret < 0 || !cqe) {
fprintf(stderr, "wait cqe failed: %d\n", ret);
return 1;
}
if (cqe->res < 0) {
fprintf(stderr, "read cqe failure: %d\n", cqe->res);
return 1;
}
pi = (struct t10_pi_tuple *)rmb;
if (pi->apptag != 0x3412)
printf("Failure: apptag mismatch!\n");
if (pi->reftag != 0x0A000000)
printf("Failure: reftag mismatch!\n");
io_uring_cqe_seen(&ring, cqe);
/* buffers hold binary data (and may start with NUL), so use memcmp */
if (memcmp(wmb, rmb, META_LEN))
printf("Failure: meta mismatch!\n");
if (memcmp(wdb, rdb, DATA_LEN))
printf("Failure: data mismatch!\n");
io_uring_queue_exit(&ring);
free(rdb);
free(wdb);
return 0;
}
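The example program hardcodes byte-swapped constants for the PI tuple, which only works on a little-endian host. A host-independent way to build the tuple could look like the sketch below. The demo_* names are hypothetical; the guard is assumed to be the 16-bit T10-DIF CRC (polynomial 0x8BB7, initial value 0, no reflection), which matches the 8-byte PI format the program expects but is not something the cover letter spells out.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the 8-byte PI tuple; all fields are big-endian. */
struct demo_pi_tuple {
	uint16_t guard;
	uint16_t apptag;
	uint32_t reftag;
};

static_assert(sizeof(struct demo_pi_tuple) == 8, "PI tuple must be 8 bytes");

/* Bitwise CRC16 with the T10-DIF polynomial, MSB first, init 0. */
static inline uint16_t demo_crc_t10dif(const uint8_t *buf, size_t len)
{
	uint16_t crc = 0;

	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)(buf[i] << 8);
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x8BB7);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}

/* Build a PI tuple in wire (big-endian) order on any host. */
static inline void demo_build_pi(struct demo_pi_tuple *pi, const void *data,
				 size_t len, uint16_t app_tag, uint32_t ref_tag)
{
	pi->guard = htons(demo_crc_t10dif(data, len));
	pi->apptag = htons(app_tag);
	pi->reftag = htonl(ref_tag);
}
```

With this CRC definition an all-zero data buffer yields a guard of 0, which is consistent with the program writing a zeroed buffer with pi->guard = 0.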
Anuj Gupta (7):
block: define set of integrity flags to be inherited by cloned bip
block: modify bio_integrity_map_user to accept iov_iter as argument
fs, iov_iter: define meta io descriptor
fs: introduce IOCB_HAS_METADATA for metadata
io_uring/rw: add support to send metadata along with read/write
block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
scsi: add support for user-meta interface
Christoph Hellwig (1):
block: copy back bounce buffer to user-space correctly in case of
split
Kanchan Joshi (2):
nvme: add support for passing on the application tag
block: add support to pass user meta buffer
block/bio-integrity.c | 84 ++++++++++++++++++++++++++++-------
block/blk-integrity.c | 10 ++++-
block/fops.c | 42 ++++++++++++++----
drivers/nvme/host/core.c | 21 +++++----
drivers/scsi/sd.c | 4 +-
include/linux/bio-integrity.h | 19 ++++++--
include/linux/fs.h | 1 +
include/linux/uio.h | 10 +++++
include/uapi/linux/fs.h | 9 ++++
include/uapi/linux/io_uring.h | 29 ++++++++++++
io_uring/io_uring.c | 9 ++++
io_uring/rw.c | 79 +++++++++++++++++++++++++++++++-
io_uring/rw.h | 14 +++++-
13 files changed, 290 insertions(+), 41 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v5 01/10] block: define set of integrity flags to be inherited by cloned bip
[not found] ` <CGME20241029163212epcas5p343cd56d66b58a9e7e8e1faa98067891d@epcas5p3.samsung.com>
@ 2024-10-29 16:23 ` Anuj Gupta
0 siblings, 0 replies; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:23 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Anuj Gupta
Introduce BIP_CLONE_FLAGS describing integrity flags that should be
inherited in the cloned bip from the parent.
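The difference between the old deny-list mask and the new allow-list mask can be illustrated in userspace as below. The DEMO_* constants and both helpers are hypothetical; the bit positions are assumed to mirror enum bip_flags and should be checked against include/linux/bio-integrity.h.

```c
#include <assert.h>

/* Hypothetical stand-ins for enum bip_flags (bit positions are assumptions). */
enum demo_bip_flags {
	DEMO_BIP_BLOCK_INTEGRITY	= 1 << 0,
	DEMO_BIP_MAPPED_INTEGRITY	= 1 << 1,
	DEMO_BIP_CTRL_NOCHECK		= 1 << 2,
	DEMO_BIP_DISK_NOCHECK		= 1 << 3,
	DEMO_BIP_IP_CHECKSUM		= 1 << 4,
	DEMO_BIP_COPY_USER		= 1 << 5,
};

/* Allow-list of flags a clone inherits, mirroring BIP_CLONE_FLAGS. */
#define DEMO_BIP_CLONE_FLAGS (DEMO_BIP_MAPPED_INTEGRITY | \
			      DEMO_BIP_CTRL_NOCHECK | \
			      DEMO_BIP_DISK_NOCHECK | \
			      DEMO_BIP_IP_CHECKSUM)

/* The copy-user flag is deliberately outside the allow-list. */
static_assert((DEMO_BIP_CLONE_FLAGS & DEMO_BIP_COPY_USER) == 0,
	      "COPY_USER must not be inherited");

/* Old behaviour: inherit everything except one named flag. */
static inline unsigned int demo_clone_flags_old(unsigned int src)
{
	return src & ~DEMO_BIP_BLOCK_INTEGRITY;
}

/* New behaviour: inherit only flags that are explicitly listed. */
static inline unsigned int demo_clone_flags_new(unsigned int src)
{
	return src & DEMO_BIP_CLONE_FLAGS;
}
```

With an allow-list, a flag added later is not inherited by clones unless someone lists it, whereas the deny-list inherits any new flag by default.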
Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
---
block/bio-integrity.c | 2 +-
include/linux/bio-integrity.h | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 2a4bd6611692..a448a25d13de 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -559,7 +559,7 @@ int bio_integrity_clone(struct bio *bio, struct bio *bio_src,
bip->bip_vec = bip_src->bip_vec;
bip->bip_iter = bip_src->bip_iter;
- bip->bip_flags = bip_src->bip_flags & ~BIP_BLOCK_INTEGRITY;
+ bip->bip_flags = bip_src->bip_flags & BIP_CLONE_FLAGS;
return 0;
}
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index dbf0f74c1529..0f0cf10222e8 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -30,6 +30,9 @@ struct bio_integrity_payload {
struct bio_vec bip_inline_vecs[];/* embedded bvec array */
};
+#define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
+ BIP_DISK_NOCHECK | BIP_IP_CHECKSUM)
+
#ifdef CONFIG_BLK_DEV_INTEGRITY
#define bip_for_each_vec(bvl, bip, iter) \
--
2.25.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 02/10] block: copy back bounce buffer to user-space correctly in case of split
[not found] ` <CGME20241029163214epcas5p1069ca93a2a9d8840e4f142cc4b713775@epcas5p1.samsung.com>
@ 2024-10-29 16:23 ` Anuj Gupta
0 siblings, 0 replies; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:23 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Anuj Gupta
From: Christoph Hellwig <[email protected]>
Copy back the bounce buffer to user-space in its entirety when the parent
bio completes. The existing code uses bip_iter.bi_size for sizing the
copy, which can be modified (for example, when the bio is split). So move
away from that and fetch the size from the vector passed to the block layer.
While at it, switch to using better variable names.
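A small userspace model of the sizing problem is sketched below. All names are hypothetical: 'iter_size' stands in for bip_iter.bi_size, 'bounce_len' for the length of the bounce bvec, and demo_advance() for whatever shrinks the iterator before completion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the fields that matter for the copy-back. */
struct demo_bip {
	size_t iter_size;		/* remaining bytes in the iterator */
	size_t bounce_len;		/* full length of the bounce buffer */
	const unsigned char *bounce;	/* kernel-side bounce buffer */
};

/* Consuming part of the payload shrinks the iterator, not the buffer. */
static inline void demo_advance(struct demo_bip *bip, size_t done)
{
	assert(done <= bip->iter_size);
	bip->iter_size -= done;
}

/* Old sizing: copies only what the (possibly advanced) iterator reports. */
static inline size_t demo_uncopy_old(const struct demo_bip *bip,
				     unsigned char *user)
{
	memcpy(user, bip->bounce, bip->iter_size);
	return bip->iter_size;
}

/* New sizing: always copies the whole bounce buffer. */
static inline size_t demo_uncopy_new(const struct demo_bip *bip,
				     unsigned char *user)
{
	memcpy(user, bip->bounce, bip->bounce_len);
	return bip->bounce_len;
}
```

In this model, once the iterator has been advanced the old helper leaves the tail of the user buffer untouched, while the new helper still returns every byte.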
Fixes: 492c5d455969f ("block: bio-integrity: directly map user buffers")
Signed-off-by: Anuj Gupta <[email protected]>
[hch: better names for variables]
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
---
block/bio-integrity.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index a448a25d13de..4341b0d4efa1 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -118,17 +118,18 @@ static void bio_integrity_unpin_bvec(struct bio_vec *bv, int nr_vecs,
static void bio_integrity_uncopy_user(struct bio_integrity_payload *bip)
{
- unsigned short nr_vecs = bip->bip_max_vcnt - 1;
- struct bio_vec *copy = &bip->bip_vec[1];
- size_t bytes = bip->bip_iter.bi_size;
- struct iov_iter iter;
+ unsigned short orig_nr_vecs = bip->bip_max_vcnt - 1;
+ struct bio_vec *orig_bvecs = &bip->bip_vec[1];
+ struct bio_vec *bounce_bvec = &bip->bip_vec[0];
+ size_t bytes = bounce_bvec->bv_len;
+ struct iov_iter orig_iter;
int ret;
- iov_iter_bvec(&iter, ITER_DEST, copy, nr_vecs, bytes);
- ret = copy_to_iter(bvec_virt(bip->bip_vec), bytes, &iter);
+ iov_iter_bvec(&orig_iter, ITER_DEST, orig_bvecs, orig_nr_vecs, bytes);
+ ret = copy_to_iter(bvec_virt(bounce_bvec), bytes, &orig_iter);
WARN_ON_ONCE(ret != bytes);
- bio_integrity_unpin_bvec(copy, nr_vecs, true);
+ bio_integrity_unpin_bvec(orig_bvecs, orig_nr_vecs, true);
}
/**
--
2.25.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 03/10] block: modify bio_integrity_map_user to accept iov_iter as argument
[not found] ` <CGME20241029163217epcas5p414d493b7a89c6bd092afd28c4eeea24c@epcas5p4.samsung.com>
@ 2024-10-29 16:23 ` Anuj Gupta
2024-10-29 21:31 ` Keith Busch
0 siblings, 1 reply; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:23 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Anuj Gupta, Kanchan Joshi
This patch refactors bio_integrity_map_user to accept iov_iter as
argument. This is a prep patch.
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
block/bio-integrity.c | 12 +++++-------
block/blk-integrity.c | 10 +++++++++-
include/linux/bio-integrity.h | 5 ++---
3 files changed, 16 insertions(+), 11 deletions(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 4341b0d4efa1..f56d01cec689 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -302,16 +302,15 @@ static unsigned int bvec_from_pages(struct bio_vec *bvec, struct page **pages,
return nr_bvecs;
}
-int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes)
+int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
{
struct request_queue *q = bdev_get_queue(bio->bi_bdev);
unsigned int align = blk_lim_dma_alignment_and_pad(&q->limits);
struct page *stack_pages[UIO_FASTIOV], **pages = stack_pages;
struct bio_vec stack_vec[UIO_FASTIOV], *bvec = stack_vec;
+ size_t offset, bytes = iter->count;
unsigned int direction, nr_bvecs;
- struct iov_iter iter;
int ret, nr_vecs;
- size_t offset;
bool copy;
if (bio_integrity(bio))
@@ -324,8 +323,7 @@ int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes)
else
direction = ITER_SOURCE;
- iov_iter_ubuf(&iter, direction, ubuf, bytes);
- nr_vecs = iov_iter_npages(&iter, BIO_MAX_VECS + 1);
+ nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS + 1);
if (nr_vecs > BIO_MAX_VECS)
return -E2BIG;
if (nr_vecs > UIO_FASTIOV) {
@@ -335,8 +333,8 @@ int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes)
pages = NULL;
}
- copy = !iov_iter_is_aligned(&iter, align, align);
- ret = iov_iter_extract_pages(&iter, &pages, bytes, nr_vecs, 0, &offset);
+ copy = !iov_iter_is_aligned(iter, align, align);
+ ret = iov_iter_extract_pages(iter, &pages, bytes, nr_vecs, 0, &offset);
if (unlikely(ret < 0))
goto free_bvec;
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index b180cac61a9d..4a29754f1bc2 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -115,8 +115,16 @@ EXPORT_SYMBOL(blk_rq_map_integrity_sg);
int blk_rq_integrity_map_user(struct request *rq, void __user *ubuf,
ssize_t bytes)
{
- int ret = bio_integrity_map_user(rq->bio, ubuf, bytes);
+ int ret;
+ struct iov_iter iter;
+ unsigned int direction;
+ if (op_is_write(req_op(rq)))
+ direction = ITER_DEST;
+ else
+ direction = ITER_SOURCE;
+ iov_iter_ubuf(&iter, direction, ubuf, bytes);
+ ret = bio_integrity_map_user(rq->bio, &iter);
if (ret)
return ret;
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index 0f0cf10222e8..58ff9988433a 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -75,7 +75,7 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio, gfp_t gfp,
unsigned int nr);
int bio_integrity_add_page(struct bio *bio, struct page *page, unsigned int len,
unsigned int offset);
-int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t len);
+int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter);
void bio_integrity_unmap_user(struct bio *bio);
bool bio_integrity_prep(struct bio *bio);
void bio_integrity_advance(struct bio *bio, unsigned int bytes_done);
@@ -101,8 +101,7 @@ static inline void bioset_integrity_free(struct bio_set *bs)
{
}
-static inline int bio_integrity_map_user(struct bio *bio, void __user *ubuf,
- ssize_t len)
+static int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
{
return -EINVAL;
}
--
2.25.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 04/10] fs, iov_iter: define meta io descriptor
[not found] ` <CGME20241029163220epcas5p2207d4c54b8c4811e973fca601fd7e3f5@epcas5p2.samsung.com>
@ 2024-10-29 16:23 ` Anuj Gupta
2024-10-30 5:03 ` Christoph Hellwig
0 siblings, 1 reply; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:23 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Anuj Gupta, Kanchan Joshi
Add flags to describe checks for the integrity meta buffer. Also, introduce
a new 'uio_meta' structure that the upper layer can use to pass the
meta/integrity information.
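One plausible use of the valid-flags mask defined by this patch is to reject unknown bits early. The sketch below is an assumption about how a consumer might do that; demo_check_pi_flags() is a hypothetical helper, and where (or whether) the series performs such a check is not shown in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical names; values mirror IO_INTEGRITY_CHK_* from this patch. */
#define DEMO_IO_INTEGRITY_CHK_GUARD	(1U << 0)
#define DEMO_IO_INTEGRITY_CHK_REFTAG	(1U << 1)
#define DEMO_IO_INTEGRITY_CHK_APPTAG	(1U << 2)

#define DEMO_IO_INTEGRITY_VALID_FLAGS (DEMO_IO_INTEGRITY_CHK_GUARD | \
				       DEMO_IO_INTEGRITY_CHK_REFTAG | \
				       DEMO_IO_INTEGRITY_CHK_APPTAG)

static_assert(DEMO_IO_INTEGRITY_VALID_FLAGS == 0x7, "three low bits expected");

/* Return 0 when only known check flags are set, -EINVAL otherwise. */
static inline int demo_check_pi_flags(uint16_t flags)
{
	if (flags & ~DEMO_IO_INTEGRITY_VALID_FLAGS)
		return -EINVAL;
	return 0;
}
```

Rejecting unknown bits keeps them available for later extensions, since no existing caller can have been relying on them.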
Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
---
include/linux/uio.h | 10 ++++++++++
include/uapi/linux/fs.h | 9 +++++++++
2 files changed, 19 insertions(+)
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 853f9de5aa05..eb3eee957a7d 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -82,6 +82,16 @@ struct iov_iter {
};
};
+/* flags for integrity meta */
+typedef __u16 __bitwise uio_meta_flags_t;
+
+struct uio_meta {
+ uio_meta_flags_t flags;
+ u16 app_tag;
+ u64 seed;
+ struct iov_iter iter;
+};
+
static inline const struct iovec *iter_iov(const struct iov_iter *iter)
{
if (iter->iter_type == ITER_UBUF)
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 753971770733..9070ef19f0a3 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -40,6 +40,15 @@
#define BLOCK_SIZE_BITS 10
#define BLOCK_SIZE (1<<BLOCK_SIZE_BITS)
+/* flags for integrity meta */
+#define IO_INTEGRITY_CHK_GUARD (1U << 0) /* enforce guard check */
+#define IO_INTEGRITY_CHK_REFTAG (1U << 1) /* enforce ref check */
+#define IO_INTEGRITY_CHK_APPTAG (1U << 2) /* enforce app check */
+
+#define IO_INTEGRITY_VALID_FLAGS (IO_INTEGRITY_CHK_GUARD | \
+ IO_INTEGRITY_CHK_REFTAG | \
+ IO_INTEGRITY_CHK_APPTAG)
+
#define SEEK_SET 0 /* seek relative to beginning of file */
#define SEEK_CUR 1 /* seek relative to current file position */
#define SEEK_END 2 /* seek relative to end of file */
--
2.25.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 05/10] fs: introduce IOCB_HAS_METADATA for metadata
[not found] ` <CGME20241029163222epcas5p4f46c83e92322214e00212cec15d29489@epcas5p4.samsung.com>
@ 2024-10-29 16:23 ` Anuj Gupta
0 siblings, 0 replies; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:23 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Anuj Gupta
Introduce an IOCB_HAS_METADATA flag for the kiocb struct, for handling
requests containing meta payload.
Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
include/linux/fs.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4b5cad44a126..7f14675b02df 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -346,6 +346,7 @@ struct readahead_control;
#define IOCB_DIO_CALLER_COMP (1 << 22)
/* kiocb is a read or write operation submitted by fs/aio.c. */
#define IOCB_AIO_RW (1 << 23)
+#define IOCB_HAS_METADATA (1 << 24)
/* for use in trace events */
#define TRACE_IOCB_STRINGS \
--
2.25.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 06/10] io_uring/rw: add support to send metadata along with read/write
[not found] ` <CGME20241029163225epcas5p24ec51c7a9b6b115757ed99cadcc3690c@epcas5p2.samsung.com>
@ 2024-10-29 16:23 ` Anuj Gupta
2024-10-29 23:24 ` Keith Busch
0 siblings, 1 reply; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:23 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Anuj Gupta, Kanchan Joshi
This patch adds the capability of sending metadata along with read/write.
A new meta_type field is introduced in SQE which indicates the type of
metadata being passed. This meta is represented by a newly introduced
'struct io_uring_meta_pi' which specifies information such as flags, buffer
length, seed and apptag. The application sets up a SQE128 ring and prepares
io_uring_meta_pi within the second SQE.
The patch processes the user-passed information to prepare uio_meta
descriptor and passes it down using kiocb->private.
Meta exchange is supported only for direct IO.
Also vectored read/write operations with meta are not supported
currently.
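The order of the prep-time checks in io_prep_rw_meta() from the diff in this patch can be modelled in userspace as below. 'struct demo_req' and demo_prep_meta_checks() are hypothetical; each field stands in for a value that the kernel reads from the SQE, the ring flags or the opcode table.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; value mirrors META_TYPE_PI (bit 0) from this patch. */
#define DEMO_META_TYPE_PI (1U << 0)

static_assert(DEMO_META_TYPE_PI == 1, "PI is the first meta type bit");

/* Hypothetical flattened view of what the prep path inspects. */
struct demo_req {
	bool ring_is_sqe128;	/* ring created with IORING_SETUP_SQE128 */
	bool vectored;		/* opcode is a vectored read/write */
	uint16_t meta_type;	/* sqe->meta_type */
	uint16_t pad4;		/* sqe->__pad4[0], must be zero */
	uint64_t rsvd;		/* io_uring_meta_pi.rsvd, must be zero */
};

/* Mirror of the validation order: 0 on success, negative errno otherwise. */
static inline int demo_prep_meta_checks(const struct demo_req *r)
{
	if (r->pad4)
		return -EINVAL;
	if (!(r->meta_type & DEMO_META_TYPE_PI))
		return -EINVAL;
	if (!r->ring_is_sqe128)
		return -EINVAL;
	if (r->rsvd)
		return -EINVAL;
	if (r->vectored)
		return -EOPNOTSUPP;
	return 0;
}
```

The direct-IO requirement is not part of this model: in the diff it is enforced later, in io_rw_init_file(), which returns -EOPNOTSUPP when the file was not opened with O_DIRECT.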
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
include/uapi/linux/io_uring.h | 29 +++++++++++++
io_uring/io_uring.c | 9 ++++
io_uring/rw.c | 79 ++++++++++++++++++++++++++++++++++-
io_uring/rw.h | 14 ++++++-
4 files changed, 128 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 024745283783..4dab2b904394 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -92,6 +92,10 @@ struct io_uring_sqe {
__u16 addr_len;
__u16 __pad3[1];
};
+ struct {
+ __u16 meta_type;
+ __u16 __pad4[1];
+ };
};
union {
struct {
@@ -105,6 +109,31 @@ struct io_uring_sqe {
*/
__u8 cmd[0];
};
+ /*
+ * If the ring is initialized with IORING_SETUP_SQE128, then
+ * this field is starting offset for 64 bytes of data. For meta io
+ * this contains 'struct io_uring_meta_pi'
+ */
+ __u8 big_sqe[0];
+};
+
+enum io_uring_sqe_meta_type_bits {
+ META_TYPE_PI_BIT,
+ /* not a real meta type; just to make sure that we don't overflow */
+ META_TYPE_LAST_BIT,
+};
+
+/* meta type flags */
+#define META_TYPE_PI (1U << META_TYPE_PI_BIT)
+
+/* this goes to SQE128 */
+struct io_uring_meta_pi {
+ __u16 pi_flags;
+ __u16 app_tag;
+ __u32 len;
+ __u64 addr;
+ __u64 seed;
+ __u64 rsvd;
};
/*
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 4514644fdf52..b3aeddeaba2f 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3875,10 +3875,13 @@ static int __init io_uring_init(void)
BUILD_BUG_SQE_ELEM(44, __s32, splice_fd_in);
BUILD_BUG_SQE_ELEM(44, __u32, file_index);
BUILD_BUG_SQE_ELEM(44, __u16, addr_len);
+ BUILD_BUG_SQE_ELEM(44, __u16, meta_type);
BUILD_BUG_SQE_ELEM(46, __u16, __pad3[0]);
+ BUILD_BUG_SQE_ELEM(46, __u16, __pad4[0]);
BUILD_BUG_SQE_ELEM(48, __u64, addr3);
BUILD_BUG_SQE_ELEM_SIZE(48, 0, cmd);
BUILD_BUG_SQE_ELEM(56, __u64, __pad2);
+ BUILD_BUG_SQE_ELEM_SIZE(64, 0, big_sqe);
BUILD_BUG_ON(sizeof(struct io_uring_files_update) !=
sizeof(struct io_uring_rsrc_update));
@@ -3902,6 +3905,12 @@ static int __init io_uring_init(void)
/* top 8bits are for internal use */
BUILD_BUG_ON((IORING_URING_CMD_MASK & 0xff000000) != 0);
+ BUILD_BUG_ON(sizeof(struct io_uring_meta_pi) >
+ sizeof(struct io_uring_sqe));
+
+ BUILD_BUG_ON(META_TYPE_LAST_BIT >
+ 8 * sizeof_field(struct io_uring_sqe, meta_type));
+
io_uring_optable_init();
/*
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 7ce1cbc048fa..bcff3ae76268 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -257,11 +257,58 @@ static int io_prep_rw_setup(struct io_kiocb *req, int ddir, bool do_import)
return 0;
}
+static inline void io_meta_save_state(struct io_async_rw *io)
+{
+ io->meta_state.seed = io->meta.seed;
+ iov_iter_save_state(&io->meta.iter, &io->meta_state.iter_meta);
+}
+
+static inline void io_meta_restore(struct io_async_rw *io)
+{
+ io->meta.seed = io->meta_state.seed;
+ iov_iter_restore(&io->meta.iter, &io->meta_state.iter_meta);
+}
+
+static int io_prep_rw_meta(struct io_kiocb *req, const struct io_uring_sqe *sqe,
+ struct io_rw *rw, int ddir, u16 meta_type)
+{
+ const struct io_uring_meta_pi *md = (struct io_uring_meta_pi *)sqe->big_sqe;
+ const struct io_issue_def *def;
+ struct io_async_rw *io;
+ int ret;
+
+ if (READ_ONCE(sqe->__pad4[0]))
+ return -EINVAL;
+ if (!(meta_type & META_TYPE_PI))
+ return -EINVAL;
+ if (!(req->ctx->flags & IORING_SETUP_SQE128))
+ return -EINVAL;
+ if (READ_ONCE(md->rsvd))
+ return -EINVAL;
+
+ def = &io_issue_defs[req->opcode];
+ if (def->vectored)
+ return -EOPNOTSUPP;
+
+ io = req->async_data;
+ io->meta.flags = READ_ONCE(md->pi_flags);
+ io->meta.app_tag = READ_ONCE(md->app_tag);
+ io->meta.seed = READ_ONCE(md->seed);
+ ret = import_ubuf(ddir, u64_to_user_ptr(READ_ONCE(md->addr)),
+ READ_ONCE(md->len), &io->meta.iter);
+ if (unlikely(ret < 0))
+ return ret;
+ rw->kiocb.ki_flags |= IOCB_HAS_METADATA;
+ io_meta_save_state(io);
+ return ret;
+}
+
static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
int ddir, bool do_import)
{
struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
unsigned ioprio;
+ u16 meta_type;
int ret;
rw->kiocb.ki_pos = READ_ONCE(sqe->off);
@@ -279,11 +326,20 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
rw->kiocb.ki_ioprio = get_current_ioprio();
}
rw->kiocb.dio_complete = NULL;
+ rw->kiocb.ki_flags = 0;
rw->addr = READ_ONCE(sqe->addr);
rw->len = READ_ONCE(sqe->len);
rw->flags = READ_ONCE(sqe->rw_flags);
- return io_prep_rw_setup(req, ddir, do_import);
+ ret = io_prep_rw_setup(req, ddir, do_import);
+
+ if (unlikely(ret))
+ return ret;
+
+ meta_type = READ_ONCE(sqe->meta_type);
+ if (meta_type)
+ ret = io_prep_rw_meta(req, sqe, rw, ddir, meta_type);
+ return ret;
}
int io_prep_read(struct io_kiocb *req, const struct io_uring_sqe *sqe)
@@ -410,7 +466,10 @@ static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req)
static void io_resubmit_prep(struct io_kiocb *req)
{
struct io_async_rw *io = req->async_data;
+ struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
+ if (rw->kiocb.ki_flags & IOCB_HAS_METADATA)
+ io_meta_restore(io);
iov_iter_restore(&io->iter, &io->iter_state);
}
@@ -795,7 +854,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type)
if (!(req->flags & REQ_F_FIXED_FILE))
req->flags |= io_file_get_flags(file);
- kiocb->ki_flags = file->f_iocb_flags;
+ kiocb->ki_flags |= file->f_iocb_flags;
ret = kiocb_set_rw_flags(kiocb, rw->flags, rw_type);
if (unlikely(ret))
return ret;
@@ -824,6 +883,18 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type)
kiocb->ki_complete = io_complete_rw;
}
+ if (kiocb->ki_flags & IOCB_HAS_METADATA) {
+ struct io_async_rw *io = req->async_data;
+
+ /*
+ * We have a union of meta fields with wpq used for buffered-io
+ * in io_async_rw, so fail it here.
+ */
+ if (!(req->file->f_flags & O_DIRECT))
+ return -EOPNOTSUPP;
+ kiocb->private = &io->meta;
+ }
+
return 0;
}
@@ -898,6 +969,8 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
* manually if we need to.
*/
iov_iter_restore(&io->iter, &io->iter_state);
+ if (kiocb->ki_flags & IOCB_HAS_METADATA)
+ io_meta_restore(io);
do {
/*
@@ -1102,6 +1175,8 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
} else {
ret_eagain:
iov_iter_restore(&io->iter, &io->iter_state);
+ if (kiocb->ki_flags & IOCB_HAS_METADATA)
+ io_meta_restore(io);
if (kiocb->ki_flags & IOCB_WRITE)
io_req_end_write(req);
return -EAGAIN;
diff --git a/io_uring/rw.h b/io_uring/rw.h
index 3f432dc75441..2d7656bd268d 100644
--- a/io_uring/rw.h
+++ b/io_uring/rw.h
@@ -2,6 +2,11 @@
#include <linux/pagemap.h>
+struct io_meta_state {
+ u32 seed;
+ struct iov_iter_state iter_meta;
+};
+
struct io_async_rw {
size_t bytes_done;
struct iov_iter iter;
@@ -9,7 +14,14 @@ struct io_async_rw {
struct iovec fast_iov;
struct iovec *free_iovec;
int free_iov_nr;
- struct wait_page_queue wpq;
+ /* wpq is for buffered io, while meta fields are used with direct io */
+ union {
+ struct wait_page_queue wpq;
+ struct {
+ struct uio_meta meta;
+ struct io_meta_state meta_state;
+ };
+ };
};
int io_prep_read_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe);
--
2.25.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v5 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
[not found] ` <CGME20241029163228epcas5p1cd9d1df3d8000250d58092ba82faa870@epcas5p1.samsung.com>
@ 2024-10-29 16:23 ` Anuj Gupta
2024-10-29 21:40 ` Keith Busch
2024-10-30 5:09 ` Christoph Hellwig
0 siblings, 2 replies; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:23 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Anuj Gupta, Kanchan Joshi
This patch introduces BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags which
indicate how the hardware should check the integrity payload.
BIP_CHECK_GUARD/REFTAG are conversions of existing semantics, while
BIP_CHECK_APPTAG is a new flag. The driver can now just rely on block
layer flags, and doesn't need to know the integrity source. Submitter
of PI decides which tags to check. This would also give us a unified
interface for user and kernel generated integrity.
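The driver-side effect of this change, as seen in the nvme_setup_rw() hunk, is that protection-check bits are derived from block layer flags rather than from the namespace PI type. A userspace sketch of that mapping follows. All DEMO_* names are hypothetical: the guard bit position is taken from this patch, the reftag bit position is assumed to follow it, and the NVMe PRCHK values are assumed to match include/linux/nvme.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the new bip flags. */
#define DEMO_BIP_CHECK_GUARD	(1U << 6)	/* from this patch */
#define DEMO_BIP_CHECK_REFTAG	(1U << 7)	/* assumed position */

/* Hypothetical stand-ins for the NVMe PRINFO check bits (assumed values). */
#define DEMO_NVME_PRCHK_REF	(1U << 10)
#define DEMO_NVME_PRCHK_GUARD	(1U << 12)

static_assert((DEMO_NVME_PRCHK_REF & DEMO_NVME_PRCHK_GUARD) == 0,
	      "check bits must be distinct");

/* Translate block layer check flags into NVMe command control bits. */
static inline uint16_t demo_nvme_prinfo(unsigned int bip_flags)
{
	uint16_t control = 0;

	if (bip_flags & DEMO_BIP_CHECK_GUARD)
		control |= DEMO_NVME_PRCHK_GUARD;
	if (bip_flags & DEMO_BIP_CHECK_REFTAG)
		control |= DEMO_NVME_PRCHK_REF;
	return control;
}
```

Because the mapping depends only on the flags, the same code path serves integrity generated by the kernel and integrity supplied by the user.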
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
block/bio-integrity.c | 5 +++++
drivers/nvme/host/core.c | 11 +++--------
include/linux/bio-integrity.h | 6 +++++-
3 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index f56d01cec689..3bee43b87001 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -434,6 +434,11 @@ bool bio_integrity_prep(struct bio *bio)
if (bi->csum_type == BLK_INTEGRITY_CSUM_IP)
bip->bip_flags |= BIP_IP_CHECKSUM;
+ /* describe what tags to check in payload */
+ if (bi->csum_type)
+ bip->bip_flags |= BIP_CHECK_GUARD;
+ if (bi->flags & BLK_INTEGRITY_REF_TAG)
+ bip->bip_flags |= BIP_CHECK_REFTAG;
if (bio_integrity_add_page(bio, virt_to_page(buf), len,
offset_in_page(buf)) < len) {
printk(KERN_ERR "could not attach integrity payload\n");
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3de7555a7de7..79bd6b22e88d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1004,18 +1004,13 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
control |= NVME_RW_PRINFO_PRACT;
}
- switch (ns->head->pi_type) {
- case NVME_NS_DPS_PI_TYPE3:
+ if (bio_integrity_flagged(req->bio, BIP_CHECK_GUARD))
control |= NVME_RW_PRINFO_PRCHK_GUARD;
- break;
- case NVME_NS_DPS_PI_TYPE1:
- case NVME_NS_DPS_PI_TYPE2:
- control |= NVME_RW_PRINFO_PRCHK_GUARD |
- NVME_RW_PRINFO_PRCHK_REF;
+ if (bio_integrity_flagged(req->bio, BIP_CHECK_REFTAG)) {
+ control |= NVME_RW_PRINFO_PRCHK_REF;
if (op == nvme_cmd_zone_append)
control |= NVME_RW_APPEND_PIREMAP;
nvme_set_ref_tag(ns, cmnd, req);
- break;
}
}
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index 58ff9988433a..fe2bfe122db2 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -11,6 +11,9 @@ enum bip_flags {
BIP_DISK_NOCHECK = 1 << 3, /* disable disk integrity checking */
BIP_IP_CHECKSUM = 1 << 4, /* IP checksum */
BIP_COPY_USER = 1 << 5, /* Kernel bounce buffer in use */
+ BIP_CHECK_GUARD = 1 << 6, /* guard check */
+ BIP_CHECK_REFTAG = 1 << 7, /* reftag check */
+ BIP_CHECK_APPTAG = 1 << 8, /* apptag check */
};
struct bio_integrity_payload {
@@ -31,7 +34,8 @@ struct bio_integrity_payload {
};
#define BIP_CLONE_FLAGS (BIP_MAPPED_INTEGRITY | BIP_CTRL_NOCHECK | \
- BIP_DISK_NOCHECK | BIP_IP_CHECKSUM)
+ BIP_DISK_NOCHECK | BIP_IP_CHECKSUM | \
+ BIP_CHECK_GUARD | BIP_CHECK_REFTAG | BIP_CHECK_APPTAG)
#ifdef CONFIG_BLK_DEV_INTEGRITY
--
2.25.1
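The flag plumbing in this patch can be sketched in userspace as follows. This is only an illustration, not code from the series: the BIP_CHECK_* values are copied from the hunk above, while the DEMO_PRCHK_* values are assumptions standing in for the driver's NVME_RW_PRINFO_PRCHK_* constants, and demo_prinfo_from_bip() is a hypothetical helper. The app tag case is included for completeness even though this patch only wires up guard and ref tag in the NVMe driver.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values copied from the bio-integrity.h hunk above. */
enum {
	BIP_CHECK_GUARD  = 1 << 6,
	BIP_CHECK_REFTAG = 1 << 7,
	BIP_CHECK_APPTAG = 1 << 8,
};

/* Illustrative stand-ins for the NVMe PRINFO check bits (values assumed). */
enum {
	DEMO_PRCHK_REF   = 1 << 10,
	DEMO_PRCHK_APP   = 1 << 11,
	DEMO_PRCHK_GUARD = 1 << 12,
};

/* Translate block-layer check flags into protocol check bits. */
static uint16_t demo_prinfo_from_bip(uint16_t bip_flags)
{
	uint16_t control = 0;

	if (bip_flags & BIP_CHECK_GUARD)
		control |= DEMO_PRCHK_GUARD;
	if (bip_flags & BIP_CHECK_REFTAG)
		control |= DEMO_PRCHK_REF;
	if (bip_flags & BIP_CHECK_APPTAG)
		control |= DEMO_PRCHK_APP;
	return control;
}
```

The point of the design is visible in the signature: the translation depends only on the flags attached to the bio, not on the namespace PI type or on who generated the integrity payload.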
* [PATCH v5 08/10] nvme: add support for passing on the application tag
[not found] ` <CGME20241029163230epcas5p18172a7e54687e454e4ecb65840810c4e@epcas5p1.samsung.com>
@ 2024-10-29 16:24 ` Anuj Gupta
2024-10-29 21:40 ` Keith Busch
0 siblings, 1 reply; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:24 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Kanchan Joshi, Anuj Gupta
From: Kanchan Joshi <[email protected]>
With user integrity buffer, there is a way to specify the app_tag.
Set the corresponding protocol specific flags and send the app_tag down.
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
drivers/nvme/host/core.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 79bd6b22e88d..3b329e036d33 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -872,6 +872,12 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
return BLK_STS_OK;
}
+static void nvme_set_app_tag(struct request *req, struct nvme_command *cmnd)
+{
+ cmnd->rw.lbat = cpu_to_le16(bio_integrity(req->bio)->app_tag);
+ cmnd->rw.lbatm = cpu_to_le16(0xffff);
+}
+
static void nvme_set_ref_tag(struct nvme_ns *ns, struct nvme_command *cmnd,
struct request *req)
{
@@ -1012,6 +1018,10 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
control |= NVME_RW_APPEND_PIREMAP;
nvme_set_ref_tag(ns, cmnd, req);
}
+ if (bio_integrity_flagged(req->bio, BIP_CHECK_APPTAG)) {
+ control |= NVME_RW_PRINFO_PRCHK_APP;
+ nvme_set_app_tag(req, cmnd);
+ }
}
cmnd->rw.control = cpu_to_le16(control);
--
2.25.1
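A small userspace sketch of what nvme_set_app_tag() hands to the device, under stated assumptions: demo_put_le16() mimics the byte layout that cpu_to_le16() produces, and demo_app_tag_matches() models the device-side comparison as a masked equality test. The masked-compare behavior is an assumption about how the mask field is interpreted, not something stated in the patch; both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Lay out a 16-bit value as little-endian bytes, as cpu_to_le16() would. */
static void demo_put_le16(uint8_t out[2], uint16_t v)
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)(v >> 8);
}

/*
 * Masked compare: only bits set in mask take part in the check
 * (assumed device behavior, for illustration only).
 */
static int demo_app_tag_matches(uint16_t stored, uint16_t expected, uint16_t mask)
{
	return (stored & mask) == (expected & mask);
}
```

With the mask fixed at 0xffff, as in the patch, every bit of the application tag is compared; a narrower mask would let some bits differ without failing the check.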
* [PATCH v5 09/10] scsi: add support for user-meta interface
[not found] ` <CGME20241029163233epcas5p497b3c81dcdf3c691a6f9c461bf0da7ac@epcas5p4.samsung.com>
@ 2024-10-29 16:24 ` Anuj Gupta
2024-10-30 5:10 ` Christoph Hellwig
0 siblings, 1 reply; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:24 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Anuj Gupta
Add support for sending user-meta buffer. Set tags to be checked
using flags specified by user/block-layer.
Signed-off-by: Anuj Gupta <[email protected]>
---
drivers/scsi/sd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ca4bc0ac76ad..d1a2ae0d4c29 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -814,14 +814,14 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
if (bio_integrity_flagged(bio, BIP_IP_CHECKSUM))
scmd->prot_flags |= SCSI_PROT_IP_CHECKSUM;
- if (bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false)
+ if (bio_integrity_flagged(bio, BIP_CHECK_GUARD))
scmd->prot_flags |= SCSI_PROT_GUARD_CHECK;
}
if (dif != T10_PI_TYPE3_PROTECTION) { /* DIX/DIF Type 0, 1, 2 */
scmd->prot_flags |= SCSI_PROT_REF_INCREMENT;
- if (bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false)
+ if (bio_integrity_flagged(bio, BIP_CHECK_REFTAG))
scmd->prot_flags |= SCSI_PROT_REF_CHECK;
}
--
2.25.1
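The semantic change in this hunk can be shown with a simplified userspace sketch: the old test used a single opt-out bit that gated both checks, while the new one opts into each check separately. All values below are illustrative assumptions, the demo_* helpers are hypothetical, and the surrounding DIX/DIF-type conditions of sd_setup_protect_cmnd() are deliberately left out.

```c
#include <assert.h>

enum {
	DEMO_BIP_CTRL_NOCHECK = 1 << 2,	/* value assumed for illustration */
	DEMO_BIP_CHECK_GUARD  = 1 << 6,
	DEMO_BIP_CHECK_REFTAG = 1 << 7,
};

/* Stand-ins for SCSI_PROT_GUARD_CHECK / SCSI_PROT_REF_CHECK (values assumed). */
enum {
	DEMO_PROT_GUARD_CHECK = 1 << 0,
	DEMO_PROT_REF_CHECK   = 1 << 1,
};

/* Old behavior: one opt-out bit disables both checks together. */
static unsigned demo_old_checks(unsigned bip_flags)
{
	if (bip_flags & DEMO_BIP_CTRL_NOCHECK)
		return 0;
	return DEMO_PROT_GUARD_CHECK | DEMO_PROT_REF_CHECK;
}

/* New behavior: each check is requested by its own flag. */
static unsigned demo_new_checks(unsigned bip_flags)
{
	unsigned prot = 0;

	if (bip_flags & DEMO_BIP_CHECK_GUARD)
		prot |= DEMO_PROT_GUARD_CHECK;
	if (bip_flags & DEMO_BIP_CHECK_REFTAG)
		prot |= DEMO_PROT_REF_CHECK;
	return prot;
}
```

Under the new scheme a bio with no flags set requests no checks, so the submitter of the integrity payload has to ask for the checks it wants.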
* [PATCH v5 10/10] block: add support to pass user meta buffer
[not found] ` <CGME20241029163235epcas5p340ce6d131cc7bf220db978a2d4dc24c2@epcas5p3.samsung.com>
@ 2024-10-29 16:24 ` Anuj Gupta
2024-10-29 21:52 ` Keith Busch
0 siblings, 1 reply; 24+ messages in thread
From: Anuj Gupta @ 2024-10-29 16:24 UTC (permalink / raw)
To: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Kanchan Joshi, Anuj Gupta
From: Kanchan Joshi <[email protected]>
If an iocb contains metadata, extract that and prepare the bip.
Based on flags specified by the user, set corresponding guard/app/ref
tags to be checked in bip.
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
block/bio-integrity.c | 50 +++++++++++++++++++++++++++++++++++
block/fops.c | 42 ++++++++++++++++++++++-------
include/linux/bio-integrity.h | 7 +++++
3 files changed, 90 insertions(+), 9 deletions(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 3bee43b87001..5d81ad9a3d20 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -364,6 +364,55 @@ int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
return ret;
}
+static void bio_uio_meta_to_bip(struct bio *bio, struct uio_meta *meta)
+{
+ struct bio_integrity_payload *bip = bio_integrity(bio);
+
+ if (meta->flags & IO_INTEGRITY_CHK_GUARD)
+ bip->bip_flags |= BIP_CHECK_GUARD;
+ if (meta->flags & IO_INTEGRITY_CHK_APPTAG)
+ bip->bip_flags |= BIP_CHECK_APPTAG;
+ if (meta->flags & IO_INTEGRITY_CHK_REFTAG)
+ bip->bip_flags |= BIP_CHECK_REFTAG;
+
+ bip->app_tag = meta->app_tag;
+}
+
+int bio_integrity_map_iter(struct bio *bio, struct uio_meta *meta)
+{
+ struct blk_integrity *bi = blk_get_integrity(bio->bi_bdev->bd_disk);
+ unsigned int integrity_bytes;
+ int ret;
+ struct iov_iter it;
+
+ if (!bi)
+ return -EINVAL;
+ /*
+ * original meta iterator can be bigger.
+ * process integrity info corresponding to current data buffer only.
+ */
+ it = meta->iter;
+ integrity_bytes = bio_integrity_bytes(bi, bio_sectors(bio));
+ if (it.count < integrity_bytes)
+ return -EINVAL;
+
+ /* should fit into two bytes */
+ BUILD_BUG_ON(IO_INTEGRITY_VALID_FLAGS >= (1 << 16));
+
+ if (meta->flags && (meta->flags & ~IO_INTEGRITY_VALID_FLAGS))
+ return -EINVAL;
+
+ it.count = integrity_bytes;
+ ret = bio_integrity_map_user(bio, &it);
+ if (!ret) {
+ bio_uio_meta_to_bip(bio, meta);
+ bip_set_seed(bio_integrity(bio), meta->seed);
+ iov_iter_advance(&meta->iter, integrity_bytes);
+ meta->seed += bio_integrity_intervals(bi, bio_sectors(bio));
+ }
+ return ret;
+}
+
/**
* bio_integrity_prep - Prepare bio for integrity I/O
* @bio: bio to prepare
@@ -564,6 +613,7 @@ int bio_integrity_clone(struct bio *bio, struct bio *bio_src,
bip->bip_vec = bip_src->bip_vec;
bip->bip_iter = bip_src->bip_iter;
bip->bip_flags = bip_src->bip_flags & BIP_CLONE_FLAGS;
+ bip->app_tag = bip_src->app_tag;
return 0;
}
diff --git a/block/fops.c b/block/fops.c
index 2d01c9007681..3cf7e15eabbc 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -54,6 +54,7 @@ static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
struct bio bio;
ssize_t ret;
+ WARN_ON_ONCE(iocb->ki_flags & IOCB_HAS_METADATA);
if (nr_pages <= DIO_INLINE_BIO_VECS)
vecs = inline_vecs;
else {
@@ -128,6 +129,9 @@ static void blkdev_bio_end_io(struct bio *bio)
if (bio->bi_status && !dio->bio.bi_status)
dio->bio.bi_status = bio->bi_status;
+ if (dio->iocb->ki_flags & IOCB_HAS_METADATA)
+ bio_integrity_unmap_user(bio);
+
if (atomic_dec_and_test(&dio->ref)) {
if (!(dio->flags & DIO_IS_SYNC)) {
struct kiocb *iocb = dio->iocb;
@@ -221,14 +225,16 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
* a retry of this from blocking context.
*/
if (unlikely(iov_iter_count(iter))) {
- bio_release_pages(bio, false);
- bio_clear_flag(bio, BIO_REFFED);
- bio_put(bio);
- blk_finish_plug(&plug);
- return -EAGAIN;
+ ret = -EAGAIN;
+ goto fail;
}
bio->bi_opf |= REQ_NOWAIT;
}
+ if (!is_sync && (iocb->ki_flags & IOCB_HAS_METADATA)) {
+ ret = bio_integrity_map_iter(bio, iocb->private);
+ if (unlikely(ret))
+ goto fail;
+ }
if (is_read) {
if (dio->flags & DIO_SHOULD_DIRTY)
@@ -269,6 +275,12 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
bio_put(&dio->bio);
return ret;
+fail:
+ bio_release_pages(bio, false);
+ bio_clear_flag(bio, BIO_REFFED);
+ bio_put(bio);
+ blk_finish_plug(&plug);
+ return ret;
}
static void blkdev_bio_end_io_async(struct bio *bio)
@@ -286,6 +298,9 @@ static void blkdev_bio_end_io_async(struct bio *bio)
ret = blk_status_to_errno(bio->bi_status);
}
+ if (iocb->ki_flags & IOCB_HAS_METADATA)
+ bio_integrity_unmap_user(bio);
+
iocb->ki_complete(iocb, ret);
if (dio->flags & DIO_SHOULD_DIRTY) {
@@ -330,10 +345,8 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
bio_iov_bvec_set(bio, iter);
} else {
ret = bio_iov_iter_get_pages(bio, iter);
- if (unlikely(ret)) {
- bio_put(bio);
- return ret;
- }
+ if (unlikely(ret))
+ goto out_bio_put;
}
dio->size = bio->bi_iter.bi_size;
@@ -346,6 +359,13 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
task_io_account_write(bio->bi_iter.bi_size);
}
+ if (iocb->ki_flags & IOCB_HAS_METADATA) {
+ ret = bio_integrity_map_iter(bio, iocb->private);
+ WRITE_ONCE(iocb->private, NULL);
+ if (unlikely(ret))
+ goto out_bio_put;
+ }
+
if (iocb->ki_flags & IOCB_ATOMIC)
bio->bi_opf |= REQ_ATOMIC;
@@ -360,6 +380,10 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
submit_bio(bio);
}
return -EIOCBQUEUED;
+
+out_bio_put:
+ bio_put(bio);
+ return ret;
}
static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index fe2bfe122db2..96ec559c24ef 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -24,6 +24,7 @@ struct bio_integrity_payload {
unsigned short bip_vcnt; /* # of integrity bio_vecs */
unsigned short bip_max_vcnt; /* integrity bio_vec slots */
unsigned short bip_flags; /* control flags */
+ u16 app_tag; /* application tag value */
struct bvec_iter bio_iter; /* for rewinding parent bio */
@@ -80,6 +81,7 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio, gfp_t gfp,
int bio_integrity_add_page(struct bio *bio, struct page *page, unsigned int len,
unsigned int offset);
int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter);
+int bio_integrity_map_iter(struct bio *bio, struct uio_meta *meta);
void bio_integrity_unmap_user(struct bio *bio);
bool bio_integrity_prep(struct bio *bio);
void bio_integrity_advance(struct bio *bio, unsigned int bytes_done);
@@ -110,6 +112,11 @@ static int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter)
return -EINVAL;
}
+static inline int bio_integrity_map_iter(struct bio *bio, struct uio_meta *meta)
+{
+ return -EINVAL;
+}
+
static inline void bio_integrity_unmap_user(struct bio *bio)
{
}
--
2.25.1
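The bookkeeping in bio_integrity_map_iter() can be sketched in userspace as below. This is a simplified model, not code from the series: it assumes the integrity byte count for a bio is the number of protection intervals times the per-interval tuple size, and every demo_* name is hypothetical. Each bio takes its slice of the metadata buffer and the current seed, then the cursor and seed move forward for the next bio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_profile {
	unsigned interval_exp;	/* log2 of the protection interval, e.g. 12 for 4096 bytes */
	unsigned tuple_size;	/* metadata bytes per interval, e.g. 8 */
};

struct demo_meta_cursor {
	size_t remaining;	/* metadata bytes not yet assigned to a bio */
	uint32_t seed;		/* ref tag seed for the next bio */
};

/* Number of protection intervals covered by a run of 512-byte sectors. */
static unsigned demo_intervals(const struct demo_profile *p, unsigned sectors)
{
	return sectors >> (p->interval_exp - 9);
}

/* Returns 0 on success, -1 if the metadata buffer is too small for this bio. */
static int demo_map_one_bio(const struct demo_profile *p,
			    struct demo_meta_cursor *c, unsigned sectors,
			    size_t *bytes_out, uint32_t *seed_out)
{
	unsigned intervals = demo_intervals(p, sectors);
	size_t bytes = (size_t)intervals * p->tuple_size;

	if (c->remaining < bytes)
		return -1;

	*bytes_out = bytes;		/* slice of metadata for this bio */
	*seed_out = c->seed;		/* seed this bio starts from */
	c->remaining -= bytes;		/* advance past the consumed slice */
	c->seed += intervals;		/* next bio continues the ref tag sequence */
	return 0;
}
```

For a 4096-byte interval with 8-byte tuples, a 16-sector bio covers 2 intervals and consumes 16 metadata bytes, so a second bio from the same request starts with the seed advanced by 2.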
* Re: [PATCH v5 03/10] block: modify bio_integrity_map_user to accept iov_iter as argument
2024-10-29 16:23 ` [PATCH v5 03/10] block: modify bio_integrity_map_user to accept iov_iter as argument Anuj Gupta
@ 2024-10-29 21:31 ` Keith Busch
0 siblings, 0 replies; 24+ messages in thread
From: Keith Busch @ 2024-10-29 21:31 UTC (permalink / raw)
To: Anuj Gupta
Cc: axboe, hch, martin.petersen, asml.silence, anuj1072538, brauner,
jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi
On Tue, Oct 29, 2024 at 09:53:55PM +0530, Anuj Gupta wrote:
> This patch refactors bio_integrity_map_user to accept iov_iter as
> argument. This is a prep patch.
Looks good.
Reviewed-by: Keith Busch <[email protected]>
* Re: [PATCH v5 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
2024-10-29 16:23 ` [PATCH v5 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags Anuj Gupta
@ 2024-10-29 21:40 ` Keith Busch
2024-10-30 5:09 ` Christoph Hellwig
1 sibling, 0 replies; 24+ messages in thread
From: Keith Busch @ 2024-10-29 21:40 UTC (permalink / raw)
To: Anuj Gupta
Cc: axboe, hch, martin.petersen, asml.silence, anuj1072538, brauner,
jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi
On Tue, Oct 29, 2024 at 09:53:59PM +0530, Anuj Gupta wrote:
> This patch introduces BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags which
> indicate how the hardware should check the integrity payload.
> BIP_CHECK_GUARD/REFTAG are conversion of existing semantics, while
> BIP_CHECK_APPTAG is a new flag. The driver can now just rely on block
> layer flags, and doesn't need to know the integrity source. Submitter
> of PI decides which tags to check. This would also give us a unified
> interface for user and kernel generated integrity.
Looks good.
Reviewed-by: Keith Busch <[email protected]>
* Re: [PATCH v5 08/10] nvme: add support for passing on the application tag
2024-10-29 16:24 ` [PATCH v5 08/10] nvme: add support for passing on the application tag Anuj Gupta
@ 2024-10-29 21:40 ` Keith Busch
0 siblings, 0 replies; 24+ messages in thread
From: Keith Busch @ 2024-10-29 21:40 UTC (permalink / raw)
To: Anuj Gupta
Cc: axboe, hch, martin.petersen, asml.silence, anuj1072538, brauner,
jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi
On Tue, Oct 29, 2024 at 09:54:00PM +0530, Anuj Gupta wrote:
> From: Kanchan Joshi <[email protected]>
>
> With user integrity buffer, there is a way to specify the app_tag.
> Set the corresponding protocol specific flags and send the app_tag down.
Looks good.
Reviewed-by: Keith Busch <[email protected]>
* Re: [PATCH v5 10/10] block: add support to pass user meta buffer
2024-10-29 16:24 ` [PATCH v5 10/10] block: add support to pass user meta buffer Anuj Gupta
@ 2024-10-29 21:52 ` Keith Busch
0 siblings, 0 replies; 24+ messages in thread
From: Keith Busch @ 2024-10-29 21:52 UTC (permalink / raw)
To: Anuj Gupta
Cc: axboe, hch, martin.petersen, asml.silence, anuj1072538, brauner,
jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi
On Tue, Oct 29, 2024 at 09:54:02PM +0530, Anuj Gupta wrote:
> From: Kanchan Joshi <[email protected]>
>
> If an iocb contains metadata, extract that and prepare the bip.
> Based on flags specified by the user, set corresponding guard/app/ref
> tags to be checked in bip.
Looks good.
Reviewed-by: Keith Busch <[email protected]>
* Re: [PATCH v5 06/10] io_uring/rw: add support to send metadata along with read/write
2024-10-29 16:23 ` [PATCH v5 06/10] io_uring/rw: add support to send metadata along with read/write Anuj Gupta
@ 2024-10-29 23:24 ` Keith Busch
2024-10-30 5:05 ` Kanchan Joshi
2024-11-07 17:30 ` Pavel Begunkov
0 siblings, 2 replies; 24+ messages in thread
From: Keith Busch @ 2024-10-29 23:24 UTC (permalink / raw)
To: Anuj Gupta
Cc: axboe, hch, martin.petersen, asml.silence, anuj1072538, brauner,
jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi
On Tue, Oct 29, 2024 at 09:53:58PM +0530, Anuj Gupta wrote:
> This patch adds the capability of sending metadata along with read/write.
> A new meta_type field is introduced in SQE which indicates the type of
> metadata being passed. This meta is represented by a newly introduced
> 'struct io_uring_meta_pi' which specifies information such as flags,buffer
> length,seed and apptag. Application sets up a SQE128 ring, prepares
> io_uring_meta_pi within the second SQE.
> The patch processes the user-passed information to prepare uio_meta
> descriptor and passes it down using kiocb->private.
>
> Meta exchange is supported only for direct IO.
> Also vectored read/write operations with meta are not supported
> currently.
It looks like it is reasonable to add support for fixed buffers too.
There would be implications for subsequent patches, mostly patch 10, but
it looks like we can do that.
Anyway, this patch mostly looks okay to me. I don't know about the whole
"meta_type" thing. My understanding from Pavel was wanting a way to
chain command specific extra options. For example, userspace metadata
and write hints, and this doesn't look like it can be extended to do
that.
* Re: [PATCH v5 04/10] fs, iov_iter: define meta io descriptor
2024-10-29 16:23 ` [PATCH v5 04/10] fs, iov_iter: define meta io descriptor Anuj Gupta
@ 2024-10-30 5:03 ` Christoph Hellwig
2024-10-30 11:17 ` Kanchan Joshi
0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2024-10-30 5:03 UTC (permalink / raw)
To: Anuj Gupta
Cc: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi
On Tue, Oct 29, 2024 at 09:53:56PM +0530, Anuj Gupta wrote:
> +/* flags for integrity meta */
> +typedef __u16 __bitwise uio_meta_flags_t;
> +
> +struct uio_meta {
> + uio_meta_flags_t flags;
.. this is a bitwise type
> +/* flags for integrity meta */
> +#define IO_INTEGRITY_CHK_GUARD (1U << 0) /* enforce guard check */
> +#define IO_INTEGRITY_CHK_REFTAG (1U << 1) /* enforce ref check */
> +#define IO_INTEGRITY_CHK_APPTAG (1U << 2) /* enforce app check */
.. but these aren't. Leading to warnings like:
CHECK block/bio-integrity.c
block/bio-integrity.c:371:17: warning: restricted uio_meta_flags_t degrades to integer
block/bio-integrity.c:373:17: warning: restricted uio_meta_flags_t degrades to integer
block/bio-integrity.c:375:17: warning: restricted uio_meta_flags_t degrades to integer
block/bio-integrity.c:402:33: warning: restricted uio_meta_flags_t degrades to integer
from sparse. Given that the flags are uapi, it's probably best
to just drop the __bitwise annotation.
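For contrast with the suggestion to drop the annotation, the usual way to keep a bitwise type quiet under sparse is to cast each constant to that type. The sketch below is a userspace illustration only: DEMO_BITWISE and DEMO_FORCE are hypothetical macros modeled on the kernel's __bitwise and __force, and the demo_* names are not from the patch. Since the flags here are uapi, this alternative would push the annotated type into the exported header, which is presumably why dropping __bitwise is the simpler fix.

```c
#include <assert.h>

/* Expand to sparse attributes only when sparse is the one parsing the file. */
#ifdef __CHECKER__
#define DEMO_BITWISE	__attribute__((bitwise))
#define DEMO_FORCE	__attribute__((force))
#else
#define DEMO_BITWISE
#define DEMO_FORCE
#endif

typedef unsigned short DEMO_BITWISE demo_meta_flags_t;

/* Constants carry the restricted type, so "flags & CONSTANT" stays typed. */
#define DEMO_CHK_GUARD	((DEMO_FORCE demo_meta_flags_t)(1U << 0))
#define DEMO_CHK_REFTAG	((DEMO_FORCE demo_meta_flags_t)(1U << 1))
#define DEMO_CHK_APPTAG	((DEMO_FORCE demo_meta_flags_t)(1U << 2))

static int demo_wants_guard(demo_meta_flags_t flags)
{
	return (flags & DEMO_CHK_GUARD) != 0;
}
```

With a regular compiler both macros expand to nothing, so the annotated and plain variants compile to the same code; the difference only shows up in static checking.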
* Re: [PATCH v5 06/10] io_uring/rw: add support to send metadata along with read/write
2024-10-29 23:24 ` Keith Busch
@ 2024-10-30 5:05 ` Kanchan Joshi
2024-10-30 5:08 ` Christoph Hellwig
2024-11-07 17:30 ` Pavel Begunkov
1 sibling, 1 reply; 24+ messages in thread
From: Kanchan Joshi @ 2024-10-30 5:05 UTC (permalink / raw)
To: Keith Busch, Anuj Gupta
Cc: axboe, hch, martin.petersen, asml.silence, anuj1072538, brauner,
jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel
On 10/30/2024 4:54 AM, Keith Busch wrote:
> On Tue, Oct 29, 2024 at 09:53:58PM +0530, Anuj Gupta wrote:
>> This patch adds the capability of sending metadata along with read/write.
>> A new meta_type field is introduced in SQE which indicates the type of
>> metadata being passed. This meta is represented by a newly introduced
>> 'struct io_uring_meta_pi' which specifies information such as flags,buffer
>> length,seed and apptag. Application sets up a SQE128 ring, prepares
>> io_uring_meta_pi within the second SQE.
>> The patch processes the user-passed information to prepare uio_meta
>> descriptor and passes it down using kiocb->private.
>>
>> Meta exchange is supported only for direct IO.
>> Also vectored read/write operations with meta are not supported
>> currently.
>
> It looks like it is reasonable to add support for fixed buffers too.
> There would be implications for subsequent patches, mostly patch 10, but
> it looks like we can do that.
Fixed buffers for data continue to be supported with this.
Do you mean fixed buffers for metadata?
We can take that as an incremental addition outside of this series which
is already touching various subsystems (io_uring, block, nvme, scsi, fs).
> Anyway, this patch mostly looks okay to me. I don't know about the whole
> "meta_type" thing. My understanding from Pavel was wanting a way to
> chain command specific extra options.
Right. During LSFMM, he mentioned Btrfs needed to send extra stuff with
read/write.
But in general, this is about seeing metadata as a generic term to
encode extra information into io_uring SQE.
It may not be uncommon for people to need to send extra stuff with
read/write and add specific processing for it. SQE->meta_type helps to
isolate all such processing from the common case when no extra stuff
is sent.
if (sqe->meta_type)
{
	if (type1(sqe->meta_type))
		process(type1);
	if (type2(sqe->meta_type))
		process(type2);
}
> For example, userspace metadata
> and write hints, and this doesn't look like it can be extended to do
> that.
It can be. In the past I used it to represent different types of write
hints.
It is just that in the current version, write hints are being sent
without any type.
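One way to make the dispatch idea above concrete is to treat meta_type as a bitmask, so that more than one kind of extra information can ride on the same SQE. This is purely a hypothetical userspace sketch: the series defines only a PI type, the write hint bit is invented for illustration, and the field width and all demo_* names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type bits; only a PI type exists in this series. */
#define DEMO_META_TYPE_PI		(1U << 0)
#define DEMO_META_TYPE_WRITEHINT	(1U << 1)	/* invented for illustration */
#define DEMO_META_TYPE_KNOWN	(DEMO_META_TYPE_PI | DEMO_META_TYPE_WRITEHINT)

struct demo_sqe {
	uint16_t meta_type;	/* field width assumed */
};

/* Returns a bitmask of the handlers that ran, or -1 for unknown type bits. */
static int demo_process_meta(const struct demo_sqe *sqe)
{
	int handled = 0;

	if (!sqe->meta_type)	/* common case: nothing extra to process */
		return 0;
	if (sqe->meta_type & ~DEMO_META_TYPE_KNOWN)
		return -1;
	if (sqe->meta_type & DEMO_META_TYPE_PI)
		handled |= DEMO_META_TYPE_PI;		/* PI processing would go here */
	if (sqe->meta_type & DEMO_META_TYPE_WRITEHINT)
		handled |= DEMO_META_TYPE_WRITEHINT;	/* hint processing would go here */
	return handled;
}
```

The single check on a zero meta_type keeps the common path cheap, while independent bits avoid forcing the types to be mutually exclusive.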
* Re: [PATCH v5 06/10] io_uring/rw: add support to send metadata along with read/write
2024-10-30 5:05 ` Kanchan Joshi
@ 2024-10-30 5:08 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2024-10-30 5:08 UTC (permalink / raw)
To: Kanchan Joshi
Cc: Keith Busch, Anuj Gupta, axboe, hch, martin.petersen,
asml.silence, anuj1072538, brauner, jack, viro, io-uring,
linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel
On Wed, Oct 30, 2024 at 10:35:19AM +0530, Kanchan Joshi wrote:
> if (sqe->meta_type)
> {
> 	if (type1(sqe->meta_type))
> 		process(type1);
> 	if (type2(sqe->meta_type))
> 		process(type2);
> }
This ensures that all of these are mutually incompatible, which doesn't
exactly scale. So as is, this weird meta_type thing (especially
overloading the meta name, which is unfortunate) feels actively harmful.
* Re: [PATCH v5 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
2024-10-29 16:23 ` [PATCH v5 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags Anuj Gupta
2024-10-29 21:40 ` Keith Busch
@ 2024-10-30 5:09 ` Christoph Hellwig
1 sibling, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2024-10-30 5:09 UTC (permalink / raw)
To: Anuj Gupta
Cc: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi
Looks good:
Reviewed-by: Christoph Hellwig <[email protected]>
* Re: [PATCH v5 09/10] scsi: add support for user-meta interface
2024-10-29 16:24 ` [PATCH v5 09/10] scsi: add support for user-meta interface Anuj Gupta
@ 2024-10-30 5:10 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2024-10-30 5:10 UTC (permalink / raw)
To: Anuj Gupta
Cc: axboe, hch, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel
On Tue, Oct 29, 2024 at 09:54:01PM +0530, Anuj Gupta wrote:
> Add support for sending user-meta buffer. Set tags to be checked
> using flags specified by user/block-layer.
Looks good:
Reviewed-by: Christoph Hellwig <[email protected]>
> - if (bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false)
> + if (bio_integrity_flagged(bio, BIP_CHECK_GUARD))
> scmd->prot_flags |= SCSI_PROT_GUARD_CHECK;
> }
>
> if (dif != T10_PI_TYPE3_PROTECTION) { /* DIX/DIF Type 0, 1, 2 */
> scmd->prot_flags |= SCSI_PROT_REF_INCREMENT;
>
> - if (bio_integrity_flagged(bio, BIP_CTRL_NOCHECK) == false)
> + if (bio_integrity_flagged(bio, BIP_CHECK_REFTAG))
BIP_CTRL_NOCHECK is unused now, and should probably go away.
* Re: [PATCH v5 04/10] fs, iov_iter: define meta io descriptor
2024-10-30 5:03 ` Christoph Hellwig
@ 2024-10-30 11:17 ` Kanchan Joshi
0 siblings, 0 replies; 24+ messages in thread
From: Kanchan Joshi @ 2024-10-30 11:17 UTC (permalink / raw)
To: Christoph Hellwig, Anuj Gupta
Cc: axboe, kbusch, martin.petersen, asml.silence, anuj1072538,
brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel
On 10/30/2024 10:33 AM, Christoph Hellwig wrote:
> .. but these aren't. Leading to warnings like:
>
> CHECK block/bio-integrity.c
> block/bio-integrity.c:371:17: warning: restricted uio_meta_flags_t degrades to integer
For some reason this does not show up in my setup.
But that only means my setup needs to be fixed, in addition to dropping
the __bitwise.
* Re: [PATCH v5 06/10] io_uring/rw: add support to send metadata along with read/write
2024-10-29 23:24 ` Keith Busch
2024-10-30 5:05 ` Kanchan Joshi
@ 2024-11-07 17:30 ` Pavel Begunkov
2024-11-08 7:12 ` Christoph Hellwig
1 sibling, 1 reply; 24+ messages in thread
From: Pavel Begunkov @ 2024-11-07 17:30 UTC (permalink / raw)
To: Keith Busch, Anuj Gupta
Cc: axboe, hch, martin.petersen, anuj1072538, brauner, jack, viro,
io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, vishak.g,
linux-fsdevel, Kanchan Joshi
On 10/29/24 23:24, Keith Busch wrote:
> On Tue, Oct 29, 2024 at 09:53:58PM +0530, Anuj Gupta wrote:
>> This patch adds the capability of sending metadata along with read/write.
>> A new meta_type field is introduced in SQE which indicates the type of
>> metadata being passed. This meta is represented by a newly introduced
>> 'struct io_uring_meta_pi' which specifies information such as flags,buffer
>> length,seed and apptag. Application sets up a SQE128 ring, prepares
>> io_uring_meta_pi within the second SQE.
>> The patch processes the user-passed information to prepare uio_meta
>> descriptor and passes it down using kiocb->private.
>>
>> Meta exchange is supported only for direct IO.
>> Also vectored read/write operations with meta are not supported
>> currently.
>
> It looks like it is reasonable to add support for fixed buffers too.
> There would be implications for subsequent patches, mostly patch 10, but
> it looks like we can do that.
>
> Anyway, this patch mostly looks okay to me. I don't know about the whole
> "meta_type" thing. My understanding from Pavel was wanting a way to
> chain command specific extra options. For example, userspace metadata
> and write hints, and this doesn't look like it can be extended to do
> that.
It makes sense to implement write hints as a meta/attribute type,
but depends on whether it's supposed to be widely supported by
different file types vs it being a block specific feature, and if
SQEs have space for it.
--
Pavel Begunkov
* Re: [PATCH v5 06/10] io_uring/rw: add support to send metadata along with read/write
2024-11-07 17:30 ` Pavel Begunkov
@ 2024-11-08 7:12 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2024-11-08 7:12 UTC (permalink / raw)
To: Pavel Begunkov
Cc: Keith Busch, Anuj Gupta, axboe, hch, martin.petersen, anuj1072538,
brauner, jack, viro, io-uring, linux-nvme, linux-block, gost.dev,
linux-scsi, vishak.g, linux-fsdevel, Kanchan Joshi
On Thu, Nov 07, 2024 at 05:30:36PM +0000, Pavel Begunkov wrote:
> It makes sense to implement write hints as a meta/attribute type,
> but depends on whether it's supposed to be widely supported by
> different file types vs it being a block specific feature, and if
> SQEs have space for it.
It makes sense everywhere. Implementing it for direct I/O on regular
files is mostly trivial and I'll do it once this series lands.