From: Anuj Gupta <[email protected]>
To: [email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected]
Cc: [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], Anuj Gupta <[email protected]>
Subject: [PATCH v9 00/11] Read/Write with meta/integrity
Date: Thu, 14 Nov 2024 16:15:06 +0530 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: CGME20241114105326epcas5p103b2c996293fa680092b97c747fdbd59@epcas5p1.samsung.com
This adds a new io_uring interface to pass additional attributes with
read/write. It can be done using two ways.
1. Passing the attribute information via user pointer.
2. Passing the attribute information inline with SQE/SQE128.
We have had discussions in the past about using the SQE128 space for passing
PI fields. Still if there are concerns about it, can we please consider
this patchset for inclusion only with the user pointer approach for now
(and drop patch 7)?
Example program for using the interface is appended below [1].
In the program, write is done via inline scheme and read is done via user
pointer scheme.
The patchset is on top of block/for-next.
Block path (direct IO) , NVMe and SCSI driver are modified to support
this.
Patch 1 is an enhancement patch.
Patch 2 is required to make the bounce buffer copy back work correctly.
Patch 3 to 5 are prep patches.
Patch 6 and 7 adds the io_uring support.
Patch 8 gives us unified interface for user and kernel generated
integrity.
Patch 9 adds support in SCSI and patch 10 in NVMe.
Patch 11 adds the support for block direct IO.
Changes since v8:
https://lore.kernel.org/io-uring/[email protected]/
- add option of the pass the PI information from user space via a
pointer (Pavel)
Changes since v7:
https://lore.kernel.org/io-uring/[email protected]/
- change the sign-off order (hch)
- add a check for doing metadata completion handling only for async-io
- change meta_type name to something more meaningful (hch, keith)
- add detail description in io-uring patch (hch)
Changes since v6:
https://lore.kernel.org/linux-block/[email protected]/
- io_uring changes (bring back meta_type, move PI to the end of SQE128)
- Fix robot warnings
Changes since v5:
https://lore.kernel.org/linux-block/[email protected]/
- remove meta_type field from SQE (hch, keith)
- remove __bitwise annotation (hch)
- remove BIP_CTRL_NOCHECK from scsi (hch)
Changes since v4:
https://lore.kernel.org/linux-block/[email protected]/
- better variable names to describe bounce buffer copy back (hch)
- move defintion of flags in the same patch introducing uio_meta (hch)
- move uio_meta definition to include/linux/uio.h (hch)
- bump seed size in uio_meta to 8 bytes (martin)
- move flags definition to include/uapi/linux/fs.h (hch)
- s/meta/metadata in commit description of io-uring (hch)
- rearrange the meta fields in sqe for cleaner layout
- partial submission case is not applicable as, we are only plumbing for async case
- s/META_TYPE_INTEGRITY/META_TYPE_PI (hch, martin)
- remove unlikely branching (hch)
- Better formatting, misc cleanups, better commit descriptions, reordering commits(hch)
Changes since v3:
https://lore.kernel.org/linux-block/[email protected]/
- add reftag seed support (Martin)
- fix incorrect formatting in uio_meta (hch)
- s/IOCB_HAS_META/IOCB_HAS_METADATA (hch)
- move integrity check flags to block layer header (hch)
- add comments for BIP_CHECK_GUARD/REFTAG/APPTAG flags (hch)
- remove bio_integrity check during completion if IOCB_HAS_METADATA is set (hch)
- use goto label to get rid of duplicate error handling (hch)
- add warn_on if trying to do sync io with iocb_has_metadata flag (hch)
- remove check for disabling reftag remapping (hch)
- remove BIP_INTEGRITY_USER flag (hch)
- add comment for app_tag field introduced in bio_integrity_payload (hch)
- pass request to nvme_set_app_tag function (hch)
- right indentation at a place in scsi patch (hch)
- move IOCB_HAS_METADATA to a separate fs patch (hch)
Changes since v2:
https://lore.kernel.org/linux-block/[email protected]/
- io_uring error handling styling (Gabriel)
- add documented helper to get metadata bytes from data iter (hch)
- during clone specify "what flags to clone" rather than
"what not to clone" (hch)
- Move uio_meta defination to bio-integrity.h (hch)
- Rename apptag field to app_tag (hch)
- Change datatype of flags field in uio_meta to bitwise (hch)
- Don't introduce BIP_USER_CHK_FOO flags (hch, martin)
- Driver should rely on block layer flags instead of seeing if it is
user-passthrough (hch)
- update the scsi code for handling user-meta (hch, martin)
Changes since v1:
https://lore.kernel.org/linux-block/[email protected]/
- Do not use new opcode for meta, and also add the provision to introduce new
meta types beyond integrity (Pavel)
- Stuff IOCB_HAS_META check in need_complete_io (Jens)
- Split meta handling in NVMe into a separate handler (Keith)
- Add meta handling for __blkdev_direct_IO too (Keith)
- Don't inherit BIP_COPY_USER flag for cloned bio's (Christoph)
- Better commit descriptions (Christoph)
Changes since RFC:
- modify io_uring plumbing based on recent async handling state changes
- fixes/enhancements to correctly handle the split for meta buffer
- add flags to specify guard/reftag/apptag checks
- add support to send apptag
[1]
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <linux/fs.h>
#include <linux/io_uring.h>
#include <linux/types.h>
#include "liburing.h"
/*
* write data/meta. read both. compare. send apptag too.
* prerequisite:
* protected xfer: format namespace with 4KB + 8b, pi_type = 1
* For testing reftag remapping on device-mapper, create a
* device-mapper and run this program. Device mapper creation:
* # echo 0 80 linear /dev/nvme0n1 0 > /tmp/table
* # echo 80 160 linear /dev/nvme0n1 200 >> /tmp/table
* # dmsetup create two /tmp/table
* # ./a.out /dev/dm-0
*/
#define DATA_LEN 4096
#define META_LEN 8
struct t10_pi_tuple {
__be16 guard;
__be16 apptag;
__be32 reftag;
};
int main(int argc, char *argv[])
{
struct io_uring ring;
struct io_uring_sqe *sqe = NULL;
struct io_uring_cqe *cqe = NULL;
void *wdb,*rdb;
char wmb[META_LEN], rmb[META_LEN];
int fd, ret, blksize;
struct stat fstat;
unsigned long long offset = DATA_LEN * 10;
struct t10_pi_tuple *pi;
struct io_uring_sqe_ext *sqe_ext;
struct io_uring_attr_pi pi_attr;
struct io_uring_attr_vec attr_vec;
if (argc != 2) {
fprintf(stderr, "Usage: %s <block-device>", argv[0]);
return 1;
};
if (stat(argv[1], &fstat) == 0) {
blksize = (int)fstat.st_blksize;
} else {
perror("stat");
return 1;
}
if (posix_memalign(&wdb, blksize, DATA_LEN)) {
perror("posix_memalign failed");
return 1;
}
if (posix_memalign(&rdb, blksize, DATA_LEN)) {
perror("posix_memalign failed");
return 1;
}
memset(wdb, 0, DATA_LEN);
fd = open(argv[1], O_RDWR | O_DIRECT);
if (fd < 0) {
printf("Error in opening device\n");
return 0;
}
ret = io_uring_queue_init(8, &ring, IORING_SETUP_SQE128);
if (ret) {
fprintf(stderr, "ring setup failed: %d\n", ret);
return 1;
}
/* write data + meta-buffer to device */
sqe = io_uring_get_sqe(&ring);
if (!sqe) {
fprintf(stderr, "get sqe failed\n");
return 1;
}
io_uring_prep_write(sqe, fd, wdb, DATA_LEN, offset);
sqe->attr_inline_flags = ATTR_FLAG_PI;
sqe_ext= (struct io_uring_sqe_ext *) (sqe + 1);
sqe_ext->rw_pi.addr = (__u64)wmb;
sqe_ext->rw_pi.len = META_LEN;
/* flags to ask for guard/reftag/apptag*/
sqe_ext->rw_pi.flags = IO_INTEGRITY_CHK_GUARD | IO_INTEGRITY_CHK_REFTAG | IO_INTEGRITY_CHK_APPTAG;
sqe_ext->rw_pi.app_tag = 0x1234;
sqe_ext->rw_pi.seed = 10;
pi = (struct t10_pi_tuple *)wmb;
pi->guard = 0;
pi->reftag = 0x0A000000;
pi->apptag = 0x3412;
ret = io_uring_submit(&ring);
if (ret <= 0) {
fprintf(stderr, "sqe submit failed: %d\n", ret);
return 1;
}
ret = io_uring_wait_cqe(&ring, &cqe);
if (!cqe) {
fprintf(stderr, "cqe is NULL :%d\n", ret);
return 1;
}
if (cqe->res < 0) {
fprintf(stderr, "write cqe failure: %d", cqe->res);
return 1;
}
io_uring_cqe_seen(&ring, cqe);
/* read data + meta-buffer back from device */
sqe = io_uring_get_sqe(&ring);
if (!sqe) {
fprintf(stderr, "get sqe failed\n");
return 1;
}
io_uring_prep_read(sqe, fd, rdb, DATA_LEN, offset);
sqe->nr_attr_indirect = 1;
sqe->attr_vec_addr = (__u64)&attr_vec;
attr_vec.type = ATTR_TYPE_PI;
attr_vec.addr = (__u64)&pi_attr;
pi_attr.addr = (__u64)rmb;
pi_attr.len = META_LEN;
pi_attr.flags = IO_INTEGRITY_CHK_GUARD | IO_INTEGRITY_CHK_REFTAG | IO_INTEGRITY_CHK_APPTAG;
pi_attr.app_tag = 0x1234;
pi_attr.seed = 10;
ret = io_uring_submit(&ring);
if (ret <= 0) {
fprintf(stderr, "sqe submit failed: %d\n", ret);
return 1;
}
ret = io_uring_wait_cqe(&ring, &cqe);
if (!cqe) {
fprintf(stderr, "cqe is NULL :%d\n", ret);
return 1;
}
if (cqe->res < 0) {
fprintf(stderr, "read cqe failure: %d", cqe->res);
return 1;
}
pi = (struct t10_pi_tuple *)rmb;
if (pi->apptag != 0x3412)
printf("Failure: apptag mismatch!\n");
if (pi->reftag != 0x0A000000)
printf("Failure: reftag mismatch!\n");
io_uring_cqe_seen(&ring, cqe);
pi = (struct t10_pi_tuple *)rmb;
if (strncmp(wmb, rmb, META_LEN))
printf("Failure: meta mismatch!, wmb=%s, rmb=%s\n", wmb, rmb);
if (strncmp(wdb, rdb, DATA_LEN))
printf("Failure: data mismatch!\n");
io_uring_queue_exit(&ring);
free(rdb);
free(wdb);
return 0;
}
Anuj Gupta (8):
block: define set of integrity flags to be inherited by cloned bip
block: modify bio_integrity_map_user to accept iov_iter as argument
fs, iov_iter: define meta io descriptor
fs: introduce IOCB_HAS_METADATA for metadata
io_uring: introduce attributes for read/write and PI support
io_uring: inline read/write attributes and PI
block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
scsi: add support for user-meta interface
Christoph Hellwig (1):
block: copy back bounce buffer to user-space correctly in case of
split
Kanchan Joshi (2):
nvme: add support for passing on the application tag
block: add support to pass user meta buffer
block/bio-integrity.c | 84 ++++++++++++++----
block/blk-integrity.c | 10 ++-
block/fops.c | 45 +++++++---
drivers/nvme/host/core.c | 21 +++--
drivers/scsi/sd.c | 4 +-
include/linux/bio-integrity.h | 25 ++++--
include/linux/fs.h | 1 +
include/linux/uio.h | 9 ++
include/uapi/linux/fs.h | 9 ++
include/uapi/linux/io_uring.h | 40 +++++++++
io_uring/io_uring.c | 5 ++
io_uring/rw.c | 160 +++++++++++++++++++++++++++++++++-
io_uring/rw.h | 14 ++-
13 files changed, 381 insertions(+), 46 deletions(-)
--
2.25.1
next parent reply other threads:[~2024-11-14 11:17 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20241114105326epcas5p103b2c996293fa680092b97c747fdbd59@epcas5p1.samsung.com>
2024-11-14 10:45 ` Anuj Gupta [this message]
[not found] ` <CGME20241114105352epcas5p109c1742fa8a6552296b9c104f2271308@epcas5p1.samsung.com>
2024-11-14 10:45 ` [PATCH v9 01/11] block: define set of integrity flags to be inherited by cloned bip Anuj Gupta
[not found] ` <CGME20241114105354epcas5p49a73947c3d37be4189023f66fb7ba413@epcas5p4.samsung.com>
2024-11-14 10:45 ` [PATCH v9 02/11] block: copy back bounce buffer to user-space correctly in case of split Anuj Gupta
[not found] ` <CGME20241114105357epcas5p41fd14282d4abfe564e858b37babe708a@epcas5p4.samsung.com>
2024-11-14 10:45 ` [PATCH v9 03/11] block: modify bio_integrity_map_user to accept iov_iter as argument Anuj Gupta
[not found] ` <CGME20241114105400epcas5p270b8062a0c4f26833a5b497f057d65a7@epcas5p2.samsung.com>
2024-11-14 10:45 ` [PATCH v9 04/11] fs, iov_iter: define meta io descriptor Anuj Gupta
[not found] ` <CGME20241114105402epcas5p41b1f6054a557f1bda2cfddfdfb9a9477@epcas5p4.samsung.com>
2024-11-14 10:45 ` [PATCH v9 05/11] fs: introduce IOCB_HAS_METADATA for metadata Anuj Gupta
[not found] ` <CGME20241114105405epcas5p24ca2fb9017276ff8a50ef447638fd739@epcas5p2.samsung.com>
2024-11-14 10:45 ` [PATCH v9 06/11] io_uring: introduce attributes for read/write and PI support Anuj Gupta
2024-11-14 12:16 ` Christoph Hellwig
2024-11-14 13:09 ` Pavel Begunkov
2024-11-14 15:19 ` Christoph Hellwig
2024-11-15 16:40 ` Pavel Begunkov
2024-11-15 17:12 ` Christoph Hellwig
2024-11-15 17:44 ` Jens Axboe
2024-11-15 18:00 ` Christoph Hellwig
2024-11-15 19:03 ` Pavel Begunkov
2024-11-18 12:49 ` Christoph Hellwig
2024-11-15 18:04 ` Matthew Wilcox
2024-11-20 17:35 ` Darrick J. Wong
2024-11-21 6:54 ` Christoph Hellwig
2024-11-15 13:29 ` Anuj gupta
2024-11-16 0:00 ` Pavel Begunkov
2024-11-16 0:32 ` Pavel Begunkov
2024-11-18 12:50 ` Christoph Hellwig
2024-11-18 16:59 ` Pavel Begunkov
2024-11-18 17:03 ` Christoph Hellwig
2024-11-18 17:45 ` Pavel Begunkov
2024-11-19 12:49 ` Christoph Hellwig
2024-11-16 23:09 ` kernel test robot
[not found] ` <CGME20241114105408epcas5p3c77cda2faf7ccb37abbfd8e95b4ad1f5@epcas5p3.samsung.com>
2024-11-14 10:45 ` [PATCH v9 07/11] io_uring: inline read/write attributes and PI Anuj Gupta
[not found] ` <CGME20241114105410epcas5p1c6a4e6141b073ccfd6277288f7d5e28b@epcas5p1.samsung.com>
2024-11-14 10:45 ` [PATCH v9 08/11] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags Anuj Gupta
[not found] ` <CGME20241114105413epcas5p2d7da8675df2de0d1efba3057144e691d@epcas5p2.samsung.com>
2024-11-14 10:45 ` [PATCH v9 09/11] nvme: add support for passing on the application tag Anuj Gupta
[not found] ` <CGME20241114105416epcas5p3a7aa552775cfe50f60ef89f7d982ea12@epcas5p3.samsung.com>
2024-11-14 10:45 ` [PATCH v9 10/11] scsi: add support for user-meta interface Anuj Gupta
[not found] ` <CGME20241114105418epcas5p1537d72b9016d10670cf97751704e2cc8@epcas5p1.samsung.com>
2024-11-14 10:45 ` [PATCH v9 11/11] block: add support to pass user meta buffer Anuj Gupta
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox