* [PATCH for-next v12 00/12] Fixed-buffer for uring-cmd/passthru
[not found] <CGME20220930063754epcas5p2aff33c952032713a39604388eacda910@epcas5p2.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
[not found] ` <CGME20220930063805epcas5p2c8eb80f32507f011baedc6d6b4d3f38d@epcas5p2.samsung.com>
` (12 more replies)
0 siblings, 13 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Anuj Gupta
Hi,
uring-cmd lacks the ability to leverage the pre-registered buffers.
This series adds that support in uring-cmd, and plumbs nvme passthrough
to work with it.
Patches 3-5 carve out a block helper that scsi and nvme then use to
avoid code duplication.
Patches 6 and 7 contain general nvme cleanups that were added over the
iterations.
Using registered buffers gave a ~20% IOPS increase, from 2.62M to 3.17M, in my setup.
Without fixedbufs
*****************
# taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B0 -O0 -n1 -u1 /dev/ng0n1
submitter=0, tid=3623, file=/dev/ng0n1, node=-1
polled=1, fixedbufs=0/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=2.62M, BW=1281MiB/s, IOS/call=32/31
IOPS=2.62M, BW=1277MiB/s, IOS/call=32/32
IOPS=2.62M, BW=1277MiB/s, IOS/call=32/32
IOPS=2.61M, BW=1276MiB/s, IOS/call=32/32
^CExiting on signal
Maximum IOPS=2.62M
With fixedbufs
**************
# taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -O0 -n1 -u1 /dev/ng0n1
submitter=0, tid=3627, file=/dev/ng0n1, node=-1
polled=1, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=3.17M, BW=1546MiB/s, IOS/call=32/31
IOPS=3.17M, BW=1546MiB/s, IOS/call=32/31
IOPS=3.17M, BW=1546MiB/s, IOS/call=32/32
IOPS=3.16M, BW=1544MiB/s, IOS/call=32/32
^CExiting on signal
Maximum IOPS=3.17M
Changes since v11:
Patch 2 - Add a check for flags (Jens)
Patch 3 - Moved the refactoring patches to start, before the nvme-refactoring
patches (Christoph)
Patch 3 - Initialize ret to 0, to prevent uninitialized variable warning
(kernel test robot)
Patch 4 - Added the onstack advantage part in the commit description (Christoph)
Patch 7 - Move blk_mq_free_request into nvme_map_user_request to handle error
scenarios, instead of doing it using goto in its callers; this helps in getting
rid of an uninitialized variable warning (kernel test robot)
Patch 10 - Folded it in with the next patch to avoid compiler warning for
unused static functions (Christoph)
Changes since v10:
- Patch 3: Fix overly long line (Christoph)
- Patch 4: create a helper in block-map for vectored and non-vectored-io, to be used by scsi and nvme (Christoph)
- Patch 5: Rename bio_map_get to blk_rq_map_bio_alloc and bio_map_put to blk_mq_map_bio_put (Christoph)
- Patch 6: Split it into a prep patch and avoid duplicate checks (Christoph)
- Patch 7: Put changes to pass ubuffer as an integer in a separate prep patch and simplify condition checks in nvme (Christoph)
Changes since v9:
- Patch 6: Make blk_rq_map_user_iov() operate on a bvec iterator
(Christoph)
- Patch 7: Change nvme to use the above
Changes since v8:
- Split some patches further; now 7 patches rather than 5 (Christoph)
- Applied a bunch of other suggested cleanups (Christoph)
Changes since v7:
- Patch 3: added many cleanups/refactoring suggested by Christoph
- Patch 4: added copying-pages fallback for bounce-buffer/dma-alignment case
(Christoph)
Changes since v6:
- Patch 1: fix warning for io_uring_cmd_import_fixed (robot)
Changes since v5:
- Patch 4: newly added, to split an nvme function into two
- Patch 3: folded cleanups in bio_map_user_iov (Chaitanya, Pankaj)
- Rebase to latest for-next
Changes since v4:
- Patch 1, 2: folded all review comments of Jens
Changes since v3:
- uring_cmd_flags, change from u16 to u32 (Jens)
- patch 3, add another helper to reduce code-duplication (Jens)
Changes since v2:
- Kill the new opcode, add a flag instead (Pavel)
- Fix standalone build issue with patch 1 (Pavel)
Changes since v1:
- Fix a naming issue for an exported helper
Anuj Gupta (6):
io_uring: add io_uring_cmd_import_fixed
io_uring: introduce fixed buffer support for io_uring_cmd
block: add blk_rq_map_user_io
scsi: Use blk_rq_map_user_io helper
nvme: Use blk_rq_map_user_io helper
block: rename bio_map_put to blk_mq_map_bio_put
Kanchan Joshi (6):
nvme: refactor nvme_add_user_metadata
nvme: refactor nvme_alloc_request
block: factor out blk_rq_map_bio_alloc helper
block: extend functionality to map bvec iterator
nvme: pass ubuffer as an integer
nvme: wire up fixed buffer support for nvme passthrough
block/blk-map.c | 150 ++++++++++++++++++++++++++++++----
drivers/nvme/host/ioctl.c | 144 ++++++++++++++++++--------------
drivers/scsi/scsi_ioctl.c | 22 +----
drivers/scsi/sg.c | 22 +----
include/linux/blk-mq.h | 2 +
include/linux/io_uring.h | 10 ++-
include/uapi/linux/io_uring.h | 9 ++
io_uring/uring_cmd.c | 28 ++++++-
8 files changed, 266 insertions(+), 121 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 16+ messages in thread
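The quoted gain can be checked from the t/io_uring figures above. The following is a minimal sketch, not part of the series; the helper names percent_gain and bandwidth_mib are made up for illustration. It computes the relative IOPS gain and the bandwidth implied by 512-byte I/O (the -b512 option).

```c
#include <assert.h>

/* Relative gain of `after` over `before`, in percent. */
static double percent_gain(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Bandwidth in MiB/s implied by an IOPS rate and a block size in bytes. */
static double bandwidth_mib(double iops, double block_bytes)
{
	return iops * block_bytes / (1024.0 * 1024.0);
}
```

With the reported maxima, percent_gain(2.62e6, 3.17e6) comes out just under 21%, and the implied bandwidths land within a few MiB/s of the BW columns, which is expected since the printed IOPS values are rounded.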
* [PATCH for-next v12 01/12] io_uring: add io_uring_cmd_import_fixed
[not found] ` <CGME20220930063805epcas5p2c8eb80f32507f011baedc6d6b4d3f38d@epcas5p2.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Anuj Gupta, Kanchan Joshi
This is a new helper that callers can use to obtain a bvec iterator for
the previously mapped buffer. This is preparatory work to enable
fixed-buffer support for io_uring_cmd.
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
include/linux/io_uring.h | 8 ++++++++
io_uring/uring_cmd.c | 10 ++++++++++
2 files changed, 18 insertions(+)
diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 58676c0a398f..1dbf51115c30 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -4,6 +4,7 @@
#include <linux/sched.h>
#include <linux/xarray.h>
+#include <uapi/linux/io_uring.h>
enum io_uring_cmd_flags {
IO_URING_F_COMPLETE_DEFER = 1,
@@ -32,6 +33,8 @@ struct io_uring_cmd {
};
#if defined(CONFIG_IO_URING)
+int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
+ struct iov_iter *iter, void *ioucmd);
void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2);
void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
void (*task_work_cb)(struct io_uring_cmd *));
@@ -59,6 +62,11 @@ static inline void io_uring_free(struct task_struct *tsk)
__io_uring_free(tsk);
}
#else
+static inline int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
+ struct iov_iter *iter, void *ioucmd)
+{
+ return -EOPNOTSUPP;
+}
static inline void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret,
ssize_t ret2)
{
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index f3ed61e9bd0f..6a6d69523d75 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -8,6 +8,7 @@
#include <uapi/linux/io_uring.h>
#include "io_uring.h"
+#include "rsrc.h"
#include "uring_cmd.h"
static void io_uring_cmd_work(struct io_kiocb *req, bool *locked)
@@ -129,3 +130,12 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
return IOU_ISSUE_SKIP_COMPLETE;
}
+
+int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
+ struct iov_iter *iter, void *ioucmd)
+{
+ struct io_kiocb *req = cmd_to_io_kiocb(ioucmd);
+
+ return io_import_fixed(rw, iter, req->imu, ubuf, len);
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed);
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
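The helper above hands the (ubuf, len) pair to io_import_fixed together with the registered buffer picked at prep time. The patch does not show io_import_fixed itself, so the following userspace sketch rests on an assumption: that the import amounts to a containment check of the requested range against the registered region, failing with -EFAULT otherwise. The struct model_imu and the function model_import_fixed are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a registered buffer: [ubuf, ubuf_end). */
struct model_imu {
	uint64_t ubuf;
	uint64_t ubuf_end;
};

/*
 * Assumed behaviour: succeed only if [addr, addr + len) lies entirely
 * inside the registered region, and report the offset into it.
 */
static int model_import_fixed(const struct model_imu *imu, uint64_t addr,
			      uint64_t len, uint64_t *offset)
{
	uint64_t end;

	if (len > UINT64_MAX - addr)	/* addr + len would overflow */
		return -EFAULT;
	end = addr + len;
	if (addr < imu->ubuf || end > imu->ubuf_end)
		return -EFAULT;
	*offset = addr - imu->ubuf;
	return 0;
}
```

The point of the sketch is only that the address a driver passes must fall inside the buffer selected by sqe->buf_index; it is not a substitute for reading io_import_fixed in io_uring/rsrc.c.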
* [PATCH for-next v12 02/12] io_uring: introduce fixed buffer support for io_uring_cmd
[not found] ` <CGME20220930063809epcas5p328b9e14ead49e9612b905e6f5b6682f7@epcas5p3.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
2022-09-30 13:42 ` Jens Axboe
0 siblings, 1 reply; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Anuj Gupta, Kanchan Joshi
Add the IORING_URING_CMD_FIXED flag, to be used for sending an io_uring
command with previously registered buffers. User-space passes the buffer
index in sqe->buf_index, the same way as the read/write variants that use
fixed buffers.
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
include/linux/io_uring.h | 2 +-
include/uapi/linux/io_uring.h | 9 +++++++++
io_uring/uring_cmd.c | 18 +++++++++++++++++-
3 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 1dbf51115c30..e10c5cc81082 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -28,7 +28,7 @@ struct io_uring_cmd {
void *cookie;
};
u32 cmd_op;
- u32 pad;
+ u32 flags;
u8 pdu[32]; /* available inline for free use */
};
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 92f29d9505a6..ab7458033ee3 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -56,6 +56,7 @@ struct io_uring_sqe {
__u32 hardlink_flags;
__u32 xattr_flags;
__u32 msg_ring_flags;
+ __u32 uring_cmd_flags;
};
__u64 user_data; /* data to be passed back at completion time */
/* pack this to avoid bogus arm OABI complaints */
@@ -219,6 +220,14 @@ enum io_uring_op {
IORING_OP_LAST,
};
+/*
+ * sqe->uring_cmd_flags
+ * IORING_URING_CMD_FIXED use registered buffer; pass this flag
+ * along with setting sqe->buf_index.
+ */
+#define IORING_URING_CMD_FIXED (1U << 0)
+
+
/*
* sqe->fsync_flags
*/
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 6a6d69523d75..05e8ad8cef87 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -4,6 +4,7 @@
#include <linux/file.h>
#include <linux/io_uring.h>
#include <linux/security.h>
+#include <linux/nospec.h>
#include <uapi/linux/io_uring.h>
@@ -77,7 +78,22 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
- if (sqe->rw_flags || sqe->__pad1)
+ if (sqe->__pad1)
+ return -EINVAL;
+
+ ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags);
+ if (ioucmd->flags & IORING_URING_CMD_FIXED) {
+ struct io_ring_ctx *ctx = req->ctx;
+ u16 index;
+
+ req->buf_index = READ_ONCE(sqe->buf_index);
+ if (unlikely(req->buf_index >= ctx->nr_user_bufs))
+ return -EFAULT;
+ index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
+ req->imu = ctx->user_bufs[index];
+ io_req_set_rsrc_node(req, ctx, 0);
+ }
+ if (ioucmd->flags & ~IORING_URING_CMD_FIXED)
return -EINVAL;
ioucmd->cmd = sqe->cmd;
ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
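The order of checks in the io_uring_cmd_prep hunk above can be modelled in userspace as follows. This is a sketch under stated assumptions: model_ctx, MODEL_CMD_FIXED and model_uring_cmd_prep are hypothetical names, the ring context is reduced to the size of its buffer table, and the speculation barrier (array_index_nospec) and resource-node accounting are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_CMD_FIXED (1U << 0)	/* mirrors IORING_URING_CMD_FIXED */

/* Hypothetical stand-in for the ring context: only the table size matters. */
struct model_ctx {
	unsigned int nr_user_bufs;
};

/*
 * Same ordering as the patch: resolve the fixed buffer first (an
 * out-of-range index gives -EFAULT), then reject unknown flag bits
 * with -EINVAL.
 */
static int model_uring_cmd_prep(const struct model_ctx *ctx, uint32_t flags,
				uint16_t buf_index, uint16_t *resolved)
{
	if (flags & MODEL_CMD_FIXED) {
		if (buf_index >= ctx->nr_user_bufs)
			return -EFAULT;
		*resolved = buf_index;
	}
	if (flags & ~MODEL_CMD_FIXED)
		return -EINVAL;
	return 0;
}
```

Note that buf_index is only looked at when the flag is set, so a stale index in a non-fixed command is ignored rather than rejected.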
* [PATCH for-next v12 03/12] block: add blk_rq_map_user_io
[not found] ` <CGME20220930063811epcas5p43cce58f5e1589c3e3780ce0cfd563986@epcas5p4.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Anuj Gupta
Create a helper blk_rq_map_user_io for mapping of vectored as well as
non-vectored requests. This helps avoid code duplication in a few
places in scsi and nvme.
Signed-off-by: Anuj Gupta <[email protected]>
Suggested-by: Christoph Hellwig <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
block/blk-map.c | 36 ++++++++++++++++++++++++++++++++++++
include/linux/blk-mq.h | 2 ++
2 files changed, 38 insertions(+)
diff --git a/block/blk-map.c b/block/blk-map.c
index 7693f8e3c454..0e37bbedd46c 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -611,6 +611,42 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
}
EXPORT_SYMBOL(blk_rq_map_user);
+int blk_rq_map_user_io(struct request *req, struct rq_map_data *map_data,
+ void __user *ubuf, unsigned long buf_len, gfp_t gfp_mask,
+ bool vec, int iov_count, bool check_iter_count, int rw)
+{
+ int ret = 0;
+
+ if (vec) {
+ struct iovec fast_iov[UIO_FASTIOV];
+ struct iovec *iov = fast_iov;
+ struct iov_iter iter;
+
+ ret = import_iovec(rw, ubuf, iov_count ? iov_count : buf_len,
+ UIO_FASTIOV, &iov, &iter);
+ if (ret < 0)
+ return ret;
+
+ if (iov_count) {
+ /* SG_IO howto says that the shorter of the two wins */
+ iov_iter_truncate(&iter, buf_len);
+ if (check_iter_count && !iov_iter_count(&iter)) {
+ kfree(iov);
+ return -EINVAL;
+ }
+ }
+
+ ret = blk_rq_map_user_iov(req->q, req, map_data, &iter,
+ gfp_mask);
+ kfree(iov);
+ } else if (buf_len) {
+ ret = blk_rq_map_user(req->q, req, map_data, ubuf, buf_len,
+ gfp_mask);
+ }
+ return ret;
+}
+EXPORT_SYMBOL(blk_rq_map_user_io);
+
/**
* blk_rq_unmap_user - unmap a request with user data
* @bio: start of bio list
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 50811d0fb143..ba18e9bdb799 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -985,6 +985,8 @@ struct rq_map_data {
int blk_rq_map_user(struct request_queue *, struct request *,
struct rq_map_data *, void __user *, unsigned long, gfp_t);
+int blk_rq_map_user_io(struct request *, struct rq_map_data *,
+ void __user *, unsigned long, gfp_t, bool, int, bool, int);
int blk_rq_map_user_iov(struct request_queue *, struct request *,
struct rq_map_data *, const struct iov_iter *, gfp_t);
int blk_rq_unmap_user(struct bio *);
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
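For the vectored SG_IO case (iov_count non-zero), the helper above truncates the iterator to buf_len and, when check_iter_count is set, rejects an empty result. A minimal userspace sketch of just that length rule follows; model_vectored_len is a hypothetical name, and the sketch does not cover the nvme case where iov_count is 0 and buf_len carries the segment count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Effective transfer length for a vectored request: the shorter of the
 * summed iovec lengths and buf_len wins. An empty result is an error
 * only when the caller asks for the check (as sg.c does).
 */
static ssize_t model_vectored_len(const struct iovec *iov, int iov_count,
				  size_t buf_len, int check_iter_count)
{
	size_t total = 0;
	int i;

	for (i = 0; i < iov_count; i++)
		total += iov[i].iov_len;
	if (total > buf_len)
		total = buf_len;	/* models iov_iter_truncate() */
	if (check_iter_count && total == 0)
		return -EINVAL;
	return (ssize_t)total;
}
```

This is why scsi_ioctl.c passes 0 and sg.c passes 1 for check_iter_count: only the latter treated a zero-length iterator as -EINVAL before the refactoring.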
* [PATCH for-next v12 04/12] scsi: Use blk_rq_map_user_io helper
[not found] ` <CGME20220930063815epcas5p1e056d6a2a53949296a7657de804fd2ec@epcas5p1.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Anuj Gupta
Use the new blk_rq_map_user_io helper instead of duplicating code at
various places. Additionally this also takes advantage of the on-stack
iov fast path.
Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
drivers/scsi/scsi_ioctl.c | 22 +++-------------------
drivers/scsi/sg.c | 22 ++--------------------
2 files changed, 5 insertions(+), 39 deletions(-)
diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c
index 729e309e6034..2d20da55fb64 100644
--- a/drivers/scsi/scsi_ioctl.c
+++ b/drivers/scsi/scsi_ioctl.c
@@ -449,25 +449,9 @@ static int sg_io(struct scsi_device *sdev, struct sg_io_hdr *hdr, fmode_t mode)
if (ret < 0)
goto out_put_request;
- ret = 0;
- if (hdr->iovec_count && hdr->dxfer_len) {
- struct iov_iter i;
- struct iovec *iov = NULL;
-
- ret = import_iovec(rq_data_dir(rq), hdr->dxferp,
- hdr->iovec_count, 0, &iov, &i);
- if (ret < 0)
- goto out_put_request;
-
- /* SG_IO howto says that the shorter of the two wins */
- iov_iter_truncate(&i, hdr->dxfer_len);
-
- ret = blk_rq_map_user_iov(rq->q, rq, NULL, &i, GFP_KERNEL);
- kfree(iov);
- } else if (hdr->dxfer_len)
- ret = blk_rq_map_user(rq->q, rq, NULL, hdr->dxferp,
- hdr->dxfer_len, GFP_KERNEL);
-
+ ret = blk_rq_map_user_io(rq, NULL, hdr->dxferp, hdr->dxfer_len,
+ GFP_KERNEL, hdr->iovec_count && hdr->dxfer_len,
+ hdr->iovec_count, 0, rq_data_dir(rq));
if (ret)
goto out_put_request;
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 94c5e9a9309c..ce34a8ad53b4 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1804,26 +1804,8 @@ sg_start_req(Sg_request *srp, unsigned char *cmd)
md->from_user = 0;
}
- if (iov_count) {
- struct iovec *iov = NULL;
- struct iov_iter i;
-
- res = import_iovec(rw, hp->dxferp, iov_count, 0, &iov, &i);
- if (res < 0)
- return res;
-
- iov_iter_truncate(&i, hp->dxfer_len);
- if (!iov_iter_count(&i)) {
- kfree(iov);
- return -EINVAL;
- }
-
- res = blk_rq_map_user_iov(q, rq, md, &i, GFP_ATOMIC);
- kfree(iov);
- } else
- res = blk_rq_map_user(q, rq, md, hp->dxferp,
- hp->dxfer_len, GFP_ATOMIC);
-
+ res = blk_rq_map_user_io(rq, md, hp->dxferp, hp->dxfer_len,
+ GFP_ATOMIC, iov_count, iov_count, 1, rw);
if (!res) {
srp->bio = rq->bio;
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH for-next v12 05/12] nvme: Use blk_rq_map_user_io helper
[not found] ` <CGME20220930063818epcas5p4e321f0efa5a53759ea19eb8f1c63deef@epcas5p4.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Anuj Gupta
Use blk_rq_map_user_io instead of duplicating the same code at
different places.
Signed-off-by: Anuj Gupta <[email protected]>
---
drivers/nvme/host/ioctl.c | 18 ++----------------
1 file changed, 2 insertions(+), 16 deletions(-)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 914b142b6f2b..3746a02a88ef 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -88,22 +88,8 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
nvme_req(req)->flags |= NVME_REQ_USERCMD;
if (ubuffer && bufflen) {
- if (!vec)
- ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen,
- GFP_KERNEL);
- else {
- struct iovec fast_iov[UIO_FASTIOV];
- struct iovec *iov = fast_iov;
- struct iov_iter iter;
-
- ret = import_iovec(rq_data_dir(req), ubuffer, bufflen,
- UIO_FASTIOV, &iov, &iter);
- if (ret < 0)
- goto out;
- ret = blk_rq_map_user_iov(q, req, NULL, &iter,
- GFP_KERNEL);
- kfree(iov);
- }
+ ret = blk_rq_map_user_io(req, NULL, ubuffer, bufflen,
+ GFP_KERNEL, vec, 0, 0, rq_data_dir(req));
if (ret)
goto out;
bio = req->bio;
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH for-next v12 06/12] nvme: refactor nvme_add_user_metadata
[not found] ` <CGME20220930063821epcas5p48d4ec5136d487ea779ac74e2c0b740ac@epcas5p4.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Kanchan Joshi
From: Kanchan Joshi <[email protected]>
Pass struct request rather than bio. This removes a parameter and also
cleans up some of the processing.
Signed-off-by: Kanchan Joshi <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
drivers/nvme/host/ioctl.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 3746a02a88ef..bcaa6b3f97ca 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -20,19 +20,20 @@ static void __user *nvme_to_user_ptr(uintptr_t ptrval)
return (void __user *)ptrval;
}
-static void *nvme_add_user_metadata(struct bio *bio, void __user *ubuf,
- unsigned len, u32 seed, bool write)
+static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
+ unsigned len, u32 seed)
{
struct bio_integrity_payload *bip;
int ret = -ENOMEM;
void *buf;
+ struct bio *bio = req->bio;
buf = kmalloc(len, GFP_KERNEL);
if (!buf)
goto out;
ret = -EFAULT;
- if (write && copy_from_user(buf, ubuf, len))
+ if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len))
goto out_free_meta;
bip = bio_integrity_alloc(bio, GFP_KERNEL, 1);
@@ -45,9 +46,13 @@ static void *nvme_add_user_metadata(struct bio *bio, void __user *ubuf,
bip->bip_iter.bi_sector = seed;
ret = bio_integrity_add_page(bio, virt_to_page(buf), len,
offset_in_page(buf));
- if (ret == len)
- return buf;
- ret = -ENOMEM;
+ if (ret != len) {
+ ret = -ENOMEM;
+ goto out_free_meta;
+ }
+
+ req->cmd_flags |= REQ_INTEGRITY;
+ return buf;
out_free_meta:
kfree(buf);
out:
@@ -70,7 +75,6 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
u32 meta_seed, void **metap, unsigned timeout, bool vec,
blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags)
{
- bool write = nvme_is_write(cmd);
struct nvme_ns *ns = q->queuedata;
struct block_device *bdev = ns ? ns->disk->part0 : NULL;
struct request *req;
@@ -96,13 +100,12 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
if (bdev)
bio_set_dev(bio, bdev);
if (bdev && meta_buffer && meta_len) {
- meta = nvme_add_user_metadata(bio, meta_buffer, meta_len,
- meta_seed, write);
+ meta = nvme_add_user_metadata(req, meta_buffer,
+ meta_len, meta_seed);
if (IS_ERR(meta)) {
ret = PTR_ERR(meta);
goto out_unmap;
}
- req->cmd_flags |= REQ_INTEGRITY;
*metap = meta;
}
}
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH for-next v12 07/12] nvme: refactor nvme_alloc_request
[not found] ` <CGME20220930063824epcas5p4f829f3b8673e2603cdc9a799ca44ea6e@epcas5p4.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Kanchan Joshi, Anuj Gupta
From: Kanchan Joshi <[email protected]>
nvme_alloc_request expects a large number of parameters.
Split it into two functions to reduce the number of parameters.
The first one retains the name nvme_alloc_request, while the second one is
named nvme_map_user_request.
Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
drivers/nvme/host/ioctl.c | 90 +++++++++++++++++++++++----------------
1 file changed, 53 insertions(+), 37 deletions(-)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index bcaa6b3f97ca..3f1e7af19716 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -70,54 +70,57 @@ static int nvme_finish_user_metadata(struct request *req, void __user *ubuf,
}
static struct request *nvme_alloc_user_request(struct request_queue *q,
- struct nvme_command *cmd, void __user *ubuffer,
- unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
- u32 meta_seed, void **metap, unsigned timeout, bool vec,
- blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags)
+ struct nvme_command *cmd, blk_opf_t rq_flags,
+ blk_mq_req_flags_t blk_flags)
{
- struct nvme_ns *ns = q->queuedata;
- struct block_device *bdev = ns ? ns->disk->part0 : NULL;
struct request *req;
- struct bio *bio = NULL;
- void *meta = NULL;
- int ret;
req = blk_mq_alloc_request(q, nvme_req_op(cmd) | rq_flags, blk_flags);
if (IS_ERR(req))
return req;
nvme_init_request(req, cmd);
-
- if (timeout)
- req->timeout = timeout;
nvme_req(req)->flags |= NVME_REQ_USERCMD;
+ return req;
+}
- if (ubuffer && bufflen) {
- ret = blk_rq_map_user_io(req, NULL, ubuffer, bufflen,
- GFP_KERNEL, vec, 0, 0, rq_data_dir(req));
- if (ret)
- goto out;
- bio = req->bio;
- if (bdev)
- bio_set_dev(bio, bdev);
- if (bdev && meta_buffer && meta_len) {
- meta = nvme_add_user_metadata(req, meta_buffer,
- meta_len, meta_seed);
- if (IS_ERR(meta)) {
- ret = PTR_ERR(meta);
- goto out_unmap;
- }
- *metap = meta;
+static int nvme_map_user_request(struct request *req, void __user *ubuffer,
+ unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
+ u32 meta_seed, void **metap, bool vec)
+{
+ struct request_queue *q = req->q;
+ struct nvme_ns *ns = q->queuedata;
+ struct block_device *bdev = ns ? ns->disk->part0 : NULL;
+ struct bio *bio = NULL;
+ void *meta = NULL;
+ int ret;
+
+ ret = blk_rq_map_user_io(req, NULL, ubuffer, bufflen, GFP_KERNEL, vec,
+ 0, 0, rq_data_dir(req));
+
+ if (ret)
+ goto out;
+ bio = req->bio;
+ if (bdev)
+ bio_set_dev(bio, bdev);
+
+ if (bdev && meta_buffer && meta_len) {
+ meta = nvme_add_user_metadata(req, meta_buffer, meta_len,
+ meta_seed);
+ if (IS_ERR(meta)) {
+ ret = PTR_ERR(meta);
+ goto out_unmap;
}
+ *metap = meta;
}
- return req;
+ return ret;
out_unmap:
if (bio)
blk_rq_unmap_user(bio);
out:
blk_mq_free_request(req);
- return ERR_PTR(ret);
+ return ret;
}
static int nvme_submit_user_cmd(struct request_queue *q,
@@ -132,11 +135,18 @@ static int nvme_submit_user_cmd(struct request_queue *q,
u32 effects;
int ret;
- req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
- meta_len, meta_seed, &meta, timeout, vec, 0, 0);
+ req = nvme_alloc_user_request(q, cmd, 0, 0);
if (IS_ERR(req))
return PTR_ERR(req);
+ req->timeout = timeout;
+ if (ubuffer && bufflen) {
+ ret = nvme_map_user_request(req, ubuffer, bufflen, meta_buffer,
+ meta_len, meta_seed, &meta, vec);
+ if (ret)
+ return ret;
+ }
+
bio = req->bio;
ctrl = nvme_req(req)->ctrl;
@@ -456,6 +466,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
blk_opf_t rq_flags = 0;
blk_mq_req_flags_t blk_flags = 0;
void *meta = NULL;
+ int ret;
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
@@ -495,13 +506,18 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
rq_flags |= REQ_POLLED;
retry:
- req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr),
- d.data_len, nvme_to_user_ptr(d.metadata),
- d.metadata_len, 0, &meta, d.timeout_ms ?
- msecs_to_jiffies(d.timeout_ms) : 0, vec, rq_flags,
- blk_flags);
+ req = nvme_alloc_user_request(q, &c, rq_flags, blk_flags);
if (IS_ERR(req))
return PTR_ERR(req);
+ req->timeout = d.timeout_ms ? msecs_to_jiffies(d.timeout_ms) : 0;
+
+ if (d.addr && d.data_len) {
+ ret = nvme_map_user_request(req, nvme_to_user_ptr(d.addr),
+ d.data_len, nvme_to_user_ptr(d.metadata),
+ d.metadata_len, 0, &meta, vec);
+ if (ret)
+ return ret;
+ }
if (issue_flags & IO_URING_F_IOPOLL && rq_flags & REQ_POLLED) {
if (unlikely(!req->bio)) {
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH for-next v12 08/12] block: rename bio_map_put to blk_mq_map_bio_put
[not found] ` <CGME20220930063826epcas5p491d9bc62214c1d7c8c24c883299edfb7@epcas5p4.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Anuj Gupta
This patch renames existing bio_map_put function to blk_mq_map_bio_put.
Signed-off-by: Anuj Gupta <[email protected]>
Suggested-by: Christoph Hellwig <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
block/blk-map.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/block/blk-map.c b/block/blk-map.c
index 0e37bbedd46c..84b13a4158b7 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -231,7 +231,7 @@ static int bio_copy_user_iov(struct request *rq, struct rq_map_data *map_data,
return ret;
}
-static void bio_map_put(struct bio *bio)
+static void blk_mq_map_bio_put(struct bio *bio)
{
if (bio->bi_opf & REQ_ALLOC_CACHE) {
bio_put(bio);
@@ -331,7 +331,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
out_unmap:
bio_release_pages(bio, false);
- bio_map_put(bio);
+ blk_mq_map_bio_put(bio);
return ret;
}
@@ -672,7 +672,7 @@ int blk_rq_unmap_user(struct bio *bio)
next_bio = bio;
bio = bio->bi_next;
- bio_map_put(next_bio);
+ blk_mq_map_bio_put(next_bio);
}
return ret;
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH for-next v12 09/12] block: factor out blk_rq_map_bio_alloc helper
[not found] ` <CGME20220930063828epcas5p2bfddb254b0dffde77e99c2acc4440bde@epcas5p2.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Kanchan Joshi
From: Kanchan Joshi <[email protected]>
Move the bio allocation logic from bio_map_user_iov to a new helper,
blk_rq_map_bio_alloc. It is named so because its functionality is the
opposite of what is done inside blk_mq_map_bio_put. This is a prep patch.
Signed-off-by: Kanchan Joshi <[email protected]>
---
block/blk-map.c | 33 ++++++++++++++++++++++-----------
1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/block/blk-map.c b/block/blk-map.c
index 84b13a4158b7..d6ea377394a9 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -241,17 +241,10 @@ static void blk_mq_map_bio_put(struct bio *bio)
}
}
-static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
- gfp_t gfp_mask)
+static struct bio *blk_rq_map_bio_alloc(struct request *rq,
+ unsigned int nr_vecs, gfp_t gfp_mask)
{
- unsigned int max_sectors = queue_max_hw_sectors(rq->q);
- unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
struct bio *bio;
- int ret;
- int j;
-
- if (!iov_iter_count(iter))
- return -EINVAL;
if (rq->cmd_flags & REQ_POLLED) {
blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
@@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
&fs_bio_set);
if (!bio)
- return -ENOMEM;
+ return NULL;
} else {
bio = bio_kmalloc(nr_vecs, gfp_mask);
if (!bio)
- return -ENOMEM;
+ return NULL;
bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
}
+ return bio;
+}
+
+static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
+ gfp_t gfp_mask)
+{
+ unsigned int max_sectors = queue_max_hw_sectors(rq->q);
+ unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
+ struct bio *bio;
+ int ret;
+ int j;
+
+ if (!iov_iter_count(iter))
+ return -EINVAL;
+
+ bio = blk_rq_map_bio_alloc(rq, nr_vecs, gfp_mask);
+ if (bio == NULL)
+ return -ENOMEM;
while (iov_iter_count(iter)) {
struct page **pages, *stack_pages[UIO_FASTIOV];
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH for-next v12 10/12] block: extend functionality to map bvec iterator
[not found] ` <CGME20220930063831epcas5p4b6a8559dedd39ef423a0b9a317163969@epcas5p4.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Kanchan Joshi, Anuj Gupta
From: Kanchan Joshi <[email protected]>
Extend blk_rq_map_user_iov so that it can handle a bvec iterator, using
the new blk_rq_map_user_bvec function. It maps the pages from the bvec
iterator into a bio and places the bio into the request.
This helper will be used by nvme for the uring-passthrough path when IO is
done using pre-mapped buffers.
Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Suggested-by: Christoph Hellwig <[email protected]>
---
block/blk-map.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 71 insertions(+), 4 deletions(-)
diff --git a/block/blk-map.c b/block/blk-map.c
index d6ea377394a9..34735626b00f 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -548,6 +548,62 @@ int blk_rq_append_bio(struct request *rq, struct bio *bio)
}
EXPORT_SYMBOL(blk_rq_append_bio);
+/* Prepare bio for passthrough IO given ITER_BVEC iter */
+static int blk_rq_map_user_bvec(struct request *rq, const struct iov_iter *iter)
+{
+ struct request_queue *q = rq->q;
+ size_t nr_iter = iov_iter_count(iter);
+ size_t nr_segs = iter->nr_segs;
+ struct bio_vec *bvecs, *bvprvp = NULL;
+ struct queue_limits *lim = &q->limits;
+ unsigned int nsegs = 0, bytes = 0;
+ struct bio *bio;
+ size_t i;
+
+ if (!nr_iter || (nr_iter >> SECTOR_SHIFT) > queue_max_hw_sectors(q))
+ return -EINVAL;
+ if (nr_segs > queue_max_segments(q))
+ return -EINVAL;
+
+ /* no iovecs to alloc, as we already have a BVEC iterator */
+ bio = blk_rq_map_bio_alloc(rq, 0, GFP_KERNEL);
+ if (bio == NULL)
+ return -ENOMEM;
+
+ bio_iov_bvec_set(bio, (struct iov_iter *)iter);
+ blk_rq_bio_prep(rq, bio, nr_segs);
+
+ /* loop to perform a bunch of sanity checks */
+ bvecs = (struct bio_vec *)iter->bvec;
+ for (i = 0; i < nr_segs; i++) {
+ struct bio_vec *bv = &bvecs[i];
+
+ /*
+ * If the queue doesn't support SG gaps and adding this
+ * offset would create a gap, fallback to copy.
+ */
+ if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset)) {
+ blk_mq_map_bio_put(bio);
+ return -EREMOTEIO;
+ }
+ /* check full condition */
+ if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
+ goto put_bio;
+ if (bytes + bv->bv_len > nr_iter)
+ goto put_bio;
+ if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
+ goto put_bio;
+
+ nsegs++;
+ bytes += bv->bv_len;
+ bvprvp = bv;
+ }
+ return 0;
+put_bio:
+ blk_mq_map_bio_put(bio);
+ return -EINVAL;
+}
+
/**
* blk_rq_map_user_iov - map user data to a request, for passthrough requests
* @q: request queue where request should be inserted
@@ -567,24 +623,35 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
struct rq_map_data *map_data,
const struct iov_iter *iter, gfp_t gfp_mask)
{
- bool copy = false;
+ bool copy = false, map_bvec = false;
unsigned long align = q->dma_pad_mask | queue_dma_alignment(q);
struct bio *bio = NULL;
struct iov_iter i;
int ret = -EINVAL;
- if (!iter_is_iovec(iter))
- goto fail;
-
if (map_data)
copy = true;
else if (blk_queue_may_bounce(q))
copy = true;
else if (iov_iter_alignment(iter) & align)
copy = true;
+ else if (iov_iter_is_bvec(iter))
+ map_bvec = true;
+ else if (!iter_is_iovec(iter))
+ copy = true;
else if (queue_virt_boundary(q))
copy = queue_virt_boundary(q) & iov_iter_gap_alignment(iter);
+ if (map_bvec) {
+ ret = blk_rq_map_user_bvec(rq, iter);
+ if (!ret)
+ return 0;
+ if (ret != -EREMOTEIO)
+ goto fail;
+ /* fall back to copying the data on limits mismatches */
+ copy = true;
+ }
+
i = *iter;
do {
if (copy)
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
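Editor's sketch, not part of the patch: the per-segment checks in blk_rq_map_user_bvec and the copy fallback in blk_rq_map_user_iov can be modelled in plain user-space C. Everything named model_* below is hypothetical, the page size is assumed to be 4 KiB, and the queue limits are reduced to a single virt-boundary mask; the max-sectors and max-segments checks from the patch are left out. A gap against the previous segment maps to the -EREMOTEIO case (fall back to copying), while the size and page-bound violations map to -EINVAL.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* MODEL_FALLBACK_COPY stands in for -EREMOTEIO, MODEL_REJECT for -EINVAL. */
enum model_result { MODEL_OK, MODEL_REJECT, MODEL_FALLBACK_COPY };

/* Hypothetical stand-in for struct bio_vec: offset and length only. */
struct model_bvec {
	unsigned int offset;
	unsigned int len;
};

/* A zero mask means the queue tolerates gaps between segments. */
static bool model_gap_to_prev(unsigned long virt_mask,
			      const struct model_bvec *prev,
			      unsigned int offset)
{
	if (!virt_mask)
		return false;
	return ((prev->offset + prev->len) & virt_mask) || (offset & virt_mask);
}

/* Walk the segments in order, mirroring the checks made in the patch. */
static enum model_result model_check_bvecs(const struct model_bvec *bv,
					   size_t nr_segs, size_t total,
					   unsigned long virt_mask)
{
	const struct model_bvec *prev = NULL;
	unsigned int bytes = 0;
	size_t i;

	if (!total)
		return MODEL_REJECT;

	for (i = 0; i < nr_segs; i++) {
		/* a gap is not fatal: the caller may copy instead */
		if (prev && model_gap_to_prev(virt_mask, prev, bv[i].offset))
			return MODEL_FALLBACK_COPY;
		/* byte counter must not overflow */
		if (bytes > UINT_MAX - bv[i].len)
			return MODEL_REJECT;
		/* segments must not exceed the iterator's byte count */
		if ((size_t)bytes + bv[i].len > total)
			return MODEL_REJECT;
		/* a segment must stay within one page */
		if ((unsigned long long)bv[i].offset + bv[i].len > MODEL_PAGE_SIZE)
			return MODEL_REJECT;

		bytes += bv[i].len;
		prev = &bv[i];
	}
	return MODEL_OK;
}

/* Caller side: only a gap triggers the copy path, other errors are final. */
static enum model_result model_map(const struct model_bvec *bv, size_t nr_segs,
				   size_t total, unsigned long virt_mask,
				   bool *copied)
{
	enum model_result ret = model_check_bvecs(bv, nr_segs, total, virt_mask);

	if (ret == MODEL_REJECT)
		return MODEL_REJECT;
	*copied = (ret == MODEL_FALLBACK_COPY);
	return MODEL_OK;
}
```

With a 4095 mask, two page-aligned 4 KiB segments pass the checks, while two 512-byte segments within one page are sent to the copy path.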
* [PATCH for-next v12 11/12] nvme: pass ubuffer as an integer
[not found] ` <CGME20220930063833epcas5p40fbff95f9d132f5a42dda80d307426e9@epcas5p4.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Kanchan Joshi, Anuj Gupta
From: Kanchan Joshi <[email protected]>
This is a prep patch. Modify nvme_submit_user_cmd and
nvme_map_user_request to take ubuffer as a plain integer
argument, and do away with the nvme_to_user_ptr conversion in callers.
Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
drivers/nvme/host/ioctl.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 3f1e7af19716..7a41caa9bfd2 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -83,9 +83,10 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
return req;
}
-static int nvme_map_user_request(struct request *req, void __user *ubuffer,
+static int nvme_map_user_request(struct request *req, u64 ubuffer,
unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
- u32 meta_seed, void **metap, bool vec)
+ u32 meta_seed, void **metap, struct io_uring_cmd *ioucmd,
+ bool vec)
{
struct request_queue *q = req->q;
struct nvme_ns *ns = q->queuedata;
@@ -94,8 +95,8 @@ static int nvme_map_user_request(struct request *req, void __user *ubuffer,
void *meta = NULL;
int ret;
- ret = blk_rq_map_user_io(req, NULL, ubuffer, bufflen, GFP_KERNEL, vec,
- 0, 0, rq_data_dir(req));
+ ret = blk_rq_map_user_io(req, NULL, nvme_to_user_ptr(ubuffer), bufflen,
+ GFP_KERNEL, vec, 0, 0, rq_data_dir(req));
if (ret)
goto out;
@@ -124,7 +125,7 @@ static int nvme_map_user_request(struct request *req, void __user *ubuffer,
}
static int nvme_submit_user_cmd(struct request_queue *q,
- struct nvme_command *cmd, void __user *ubuffer,
+ struct nvme_command *cmd, u64 ubuffer,
unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
u32 meta_seed, u64 *result, unsigned timeout, bool vec)
{
@@ -142,7 +143,7 @@ static int nvme_submit_user_cmd(struct request_queue *q,
req->timeout = timeout;
if (ubuffer && bufflen) {
ret = nvme_map_user_request(req, ubuffer, bufflen, meta_buffer,
- meta_len, meta_seed, &meta, vec);
+ meta_len, meta_seed, &meta, NULL, vec);
if (ret)
return ret;
}
@@ -226,7 +227,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
c.rw.appmask = cpu_to_le16(io.appmask);
return nvme_submit_user_cmd(ns->queue, &c,
- nvme_to_user_ptr(io.addr), length,
+ io.addr, length,
metadata, meta_len, lower_32_bits(io.slba), NULL, 0,
false);
}
@@ -280,7 +281,7 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
timeout = msecs_to_jiffies(cmd.timeout_ms);
status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
- nvme_to_user_ptr(cmd.addr), cmd.data_len,
+ cmd.addr, cmd.data_len,
nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
0, &result, timeout, false);
@@ -326,7 +327,7 @@ static int nvme_user_cmd64(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
timeout = msecs_to_jiffies(cmd.timeout_ms);
status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
- nvme_to_user_ptr(cmd.addr), cmd.data_len,
+ cmd.addr, cmd.data_len,
nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
0, &cmd.result, timeout, vec);
@@ -512,9 +513,9 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
req->timeout = d.timeout_ms ? msecs_to_jiffies(d.timeout_ms) : 0;
if (d.addr && d.data_len) {
- ret = nvme_map_user_request(req, nvme_to_user_ptr(d.addr),
+ ret = nvme_map_user_request(req, d.addr,
d.data_len, nvme_to_user_ptr(d.metadata),
- d.metadata_len, 0, &meta, vec);
+ d.metadata_len, 0, &meta, ioucmd, vec);
if (ret)
return ret;
}
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
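Editor's sketch, not part of the patch: the idea of carrying a buffer address as a 64-bit integer and converting it to a pointer only at the one place that needs a pointer can be shown in user-space C. The names model_cmd, model_to_ptr, model_from_ptr and model_fill are hypothetical; the kernel's nvme_to_user_ptr additionally deals with compat syscalls and __user annotations, which this model ignores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command layout: the address travels as an integer. */
struct model_cmd {
	uint64_t addr;
	uint32_t data_len;
};

/* Convert at the point of use, going through uintptr_t for portability. */
static inline void *model_to_ptr(uint64_t addr)
{
	return (void *)(uintptr_t)addr;
}

static inline uint64_t model_from_ptr(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/*
 * Consumer: like the "if (ubuffer && bufflen)" test in the patch, do
 * nothing unless both the address and the length are non-zero.
 */
static size_t model_fill(const struct model_cmd *cmd, unsigned char byte)
{
	if (!cmd->addr || !cmd->data_len)
		return 0;
	memset(model_to_ptr(cmd->addr), byte, cmd->data_len);
	return cmd->data_len;
}
```

Keeping the value as an integer through the call chain lets a later consumer hand it to a function that wants a u64, as the fixed-buffer import in patch 12 does, without converting back and forth.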
* [PATCH for-next v12 12/12] nvme: wire up fixed buffer support for nvme passthrough
[not found] ` <CGME20220930063835epcas5p2812f8e3d0758b19c01198034fcddc019@epcas5p2.samsung.com>
@ 2022-09-30 6:27 ` Anuj Gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw)
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Kanchan Joshi
From: Kanchan Joshi <[email protected]>
If io_uring sends a passthrough command with the IORING_URING_CMD_FIXED flag,
use the pre-registered buffer for IO (non-vectored variant). Pass the
buffer/length to io_uring and get the bvec iterator for the range. Next,
pass this bvec iterator to the block layer and obtain a bio/request for
subsequent processing.
Signed-off-by: Kanchan Joshi <[email protected]>
---
drivers/nvme/host/ioctl.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 7a41caa9bfd2..81f5550b670d 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -95,8 +95,22 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer,
void *meta = NULL;
int ret;
- ret = blk_rq_map_user_io(req, NULL, nvme_to_user_ptr(ubuffer), bufflen,
- GFP_KERNEL, vec, 0, 0, rq_data_dir(req));
+ if (ioucmd && (ioucmd->flags & IORING_URING_CMD_FIXED)) {
+ struct iov_iter iter;
+
+ /* fixedbufs is only for non-vectored io */
+ if (WARN_ON_ONCE(vec))
+ return -EINVAL;
+ ret = io_uring_cmd_import_fixed(ubuffer, bufflen,
+ rq_data_dir(req), &iter, ioucmd);
+ if (ret < 0)
+ goto out;
+ ret = blk_rq_map_user_iov(q, req, NULL, &iter, GFP_KERNEL);
+ } else {
+ ret = blk_rq_map_user_io(req, NULL, nvme_to_user_ptr(ubuffer),
+ bufflen, GFP_KERNEL, vec, 0, 0,
+ rq_data_dir(req));
+ }
if (ret)
goto out;
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
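Editor's sketch, not part of the patch: the branch that nvme_map_user_request takes after this change can be modelled as a small pure function. The names model_ioucmd, model_path and model_pick_path are hypothetical, and MODEL_CMD_FIXED only mirrors the bit position of IORING_URING_CMD_FIXED; no io_uring or block-layer call is made.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CMD_FIXED (1u << 0)	/* same bit as IORING_URING_CMD_FIXED */

enum model_path {
	MODEL_PATH_INVALID,	/* fixed buffer requested for vectored IO */
	MODEL_PATH_FIXED_BVEC,	/* import registered buffer, map the bvec iter */
	MODEL_PATH_USER_MAP,	/* regular user-pointer mapping */
};

/* Hypothetical stand-in for struct io_uring_cmd: only the flags matter. */
struct model_ioucmd {
	uint32_t flags;
};

/*
 * ioucmd is NULL on the ioctl path, so that path can never select the
 * fixed-buffer branch regardless of vec.
 */
static enum model_path model_pick_path(const struct model_ioucmd *ioucmd,
				       bool vec)
{
	if (ioucmd && (ioucmd->flags & MODEL_CMD_FIXED)) {
		if (vec)
			return MODEL_PATH_INVALID;
		return MODEL_PATH_FIXED_BVEC;
	}
	return MODEL_PATH_USER_MAP;
}
```

The model makes one property easy to see: vectored IO is rejected only when the fixed flag is set; without the flag, vectored requests keep using the regular mapping.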
* Re: [PATCH for-next v12 02/12] io_uring: introduce fixed buffer support for io_uring_cmd
2022-09-30 6:27 ` [PATCH for-next v12 02/12] io_uring: introduce fixed buffer support for io_uring_cmd Anuj Gupta
@ 2022-09-30 13:42 ` Jens Axboe
2022-09-30 14:04 ` Anuj gupta
0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2022-09-30 13:42 UTC (permalink / raw)
To: Anuj Gupta, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi,
Kanchan Joshi
On 9/30/22 12:27 AM, Anuj Gupta wrote:
> Add the IORING_URING_CMD_FIXED flag, to be used for sending an io_uring
> command with previously registered buffers. User-space passes the buffer
> index in sqe->buf_index, same as done in the read/write variants that use
> fixed buffers.
>
> Signed-off-by: Anuj Gupta <[email protected]>
> Signed-off-by: Kanchan Joshi <[email protected]>
> ---
> include/linux/io_uring.h | 2 +-
> include/uapi/linux/io_uring.h | 9 +++++++++
> io_uring/uring_cmd.c | 18 +++++++++++++++++-
> 3 files changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
> index 1dbf51115c30..e10c5cc81082 100644
> --- a/include/linux/io_uring.h
> +++ b/include/linux/io_uring.h
> @@ -28,7 +28,7 @@ struct io_uring_cmd {
> void *cookie;
> };
> u32 cmd_op;
> - u32 pad;
> + u32 flags;
> u8 pdu[32]; /* available inline for free use */
> };
>
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index 92f29d9505a6..ab7458033ee3 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -56,6 +56,7 @@ struct io_uring_sqe {
> __u32 hardlink_flags;
> __u32 xattr_flags;
> __u32 msg_ring_flags;
> + __u32 uring_cmd_flags;
> };
> __u64 user_data; /* data to be passed back at completion time */
> /* pack this to avoid bogus arm OABI complaints */
> @@ -219,6 +220,14 @@ enum io_uring_op {
> IORING_OP_LAST,
> };
>
> +/*
> + * sqe->uring_cmd_flags
> + * IORING_URING_CMD_FIXED use registered buffer; pass this flag
> + * along with setting sqe->buf_index.
> + */
> +#define IORING_URING_CMD_FIXED (1U << 0)
> +
> +
> /*
> * sqe->fsync_flags
> */
> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
> index 6a6d69523d75..05e8ad8cef87 100644
> --- a/io_uring/uring_cmd.c
> +++ b/io_uring/uring_cmd.c
> @@ -4,6 +4,7 @@
> #include <linux/file.h>
> #include <linux/io_uring.h>
> #include <linux/security.h>
> +#include <linux/nospec.h>
>
> #include <uapi/linux/io_uring.h>
>
> @@ -77,7 +78,22 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> {
> struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
>
> - if (sqe->rw_flags || sqe->__pad1)
> + if (sqe->__pad1)
> + return -EINVAL;
> +
> + ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags);
> + if (ioucmd->flags & IORING_URING_CMD_FIXED) {
> + struct io_ring_ctx *ctx = req->ctx;
> + u16 index;
> +
> + req->buf_index = READ_ONCE(sqe->buf_index);
> + if (unlikely(req->buf_index >= ctx->nr_user_bufs))
> + return -EFAULT;
> + index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
> + req->imu = ctx->user_bufs[index];
> + io_req_set_rsrc_node(req, ctx, 0);
> + }
> + if (ioucmd->flags & ~IORING_URING_CMD_FIXED)
> return -EINVAL;
Not that it _really_ matters, but why isn't this check the first thing
that is done after reading the flags? No need to respin, I can just move
it myself.
--
Jens Axboe
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH for-next v12 02/12] io_uring: introduce fixed buffer support for io_uring_cmd
2022-09-30 13:42 ` Jens Axboe
@ 2022-09-30 14:04 ` Anuj gupta
0 siblings, 0 replies; 16+ messages in thread
From: Anuj gupta @ 2022-09-30 14:04 UTC (permalink / raw)
To: Jens Axboe
Cc: Anuj Gupta, hch, kbusch, io-uring, linux-nvme, linux-block,
gost.dev, linux-scsi, Kanchan Joshi
On Fri, Sep 30, 2022 at 7:28 PM Jens Axboe <[email protected]> wrote:
>
> On 9/30/22 12:27 AM, Anuj Gupta wrote:
> > Add the IORING_URING_CMD_FIXED flag, to be used for sending an io_uring
> > command with previously registered buffers. User-space passes the buffer
> > index in sqe->buf_index, same as done in the read/write variants that use
> > fixed buffers.
> >
> > Signed-off-by: Anuj Gupta <[email protected]>
> > Signed-off-by: Kanchan Joshi <[email protected]>
> > ---
> > include/linux/io_uring.h | 2 +-
> > include/uapi/linux/io_uring.h | 9 +++++++++
> > io_uring/uring_cmd.c | 18 +++++++++++++++++-
> > 3 files changed, 27 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
> > index 1dbf51115c30..e10c5cc81082 100644
> > --- a/include/linux/io_uring.h
> > +++ b/include/linux/io_uring.h
> > @@ -28,7 +28,7 @@ struct io_uring_cmd {
> > void *cookie;
> > };
> > u32 cmd_op;
> > - u32 pad;
> > + u32 flags;
> > u8 pdu[32]; /* available inline for free use */
> > };
> >
> > diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> > index 92f29d9505a6..ab7458033ee3 100644
> > --- a/include/uapi/linux/io_uring.h
> > +++ b/include/uapi/linux/io_uring.h
> > @@ -56,6 +56,7 @@ struct io_uring_sqe {
> > __u32 hardlink_flags;
> > __u32 xattr_flags;
> > __u32 msg_ring_flags;
> > + __u32 uring_cmd_flags;
> > };
> > __u64 user_data; /* data to be passed back at completion time */
> > /* pack this to avoid bogus arm OABI complaints */
> > @@ -219,6 +220,14 @@ enum io_uring_op {
> > IORING_OP_LAST,
> > };
> >
> > +/*
> > + * sqe->uring_cmd_flags
> > + * IORING_URING_CMD_FIXED use registered buffer; pass this flag
> > + * along with setting sqe->buf_index.
> > + */
> > +#define IORING_URING_CMD_FIXED (1U << 0)
> > +
> > +
> > /*
> > * sqe->fsync_flags
> > */
> > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
> > index 6a6d69523d75..05e8ad8cef87 100644
> > --- a/io_uring/uring_cmd.c
> > +++ b/io_uring/uring_cmd.c
> > @@ -4,6 +4,7 @@
> > #include <linux/file.h>
> > #include <linux/io_uring.h>
> > #include <linux/security.h>
> > +#include <linux/nospec.h>
> >
> > #include <uapi/linux/io_uring.h>
> >
> > @@ -77,7 +78,22 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> > {
> > struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
> >
> > - if (sqe->rw_flags || sqe->__pad1)
> > + if (sqe->__pad1)
> > + return -EINVAL;
> > +
> > + ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags);
> > + if (ioucmd->flags & IORING_URING_CMD_FIXED) {
> > + struct io_ring_ctx *ctx = req->ctx;
> > + u16 index;
> > +
> > + req->buf_index = READ_ONCE(sqe->buf_index);
> > + if (unlikely(req->buf_index >= ctx->nr_user_bufs))
> > + return -EFAULT;
> > + index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
> > + req->imu = ctx->user_bufs[index];
> > + io_req_set_rsrc_node(req, ctx, 0);
> > + }
> > + if (ioucmd->flags & ~IORING_URING_CMD_FIXED)
> > return -EINVAL;
>
> Not that it _really_ matters, but why isn't this check the first thing
> that is done after reading the flags? No need to respin, I can just move
> it myself.
>
Right, checking this condition should have been the first thing to do after
reading the flags. Thanks for taking care of it.
> --
> Jens Axboe
--
Anuj Gupta
^ permalink raw reply [flat|nested] 16+ messages in thread
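Editor's sketch, not part of the thread: the ordering discussed in the two messages above, rejecting unknown flags right after reading them and only then validating the buffer index, can be modelled in user-space C. The names model_sqe and model_prep are hypothetical, the error values are the usual Linux numbers written out as constants, and the speculation barrier (array_index_nospec) and resource-node handling from the patch are not modelled.

```c
#include <stdint.h>

#define MODEL_CMD_FIXED (1u << 0)	/* same bit as IORING_URING_CMD_FIXED */

enum { MODEL_OK = 0, MODEL_EFAULT = -14, MODEL_EINVAL = -22 };

/* Hypothetical subset of the SQE fields read at prep time. */
struct model_sqe {
	uint32_t uring_cmd_flags;
	uint16_t buf_index;
};

static int model_prep(const struct model_sqe *sqe, unsigned int nr_user_bufs,
		      unsigned int *index_out)
{
	uint32_t flags = sqe->uring_cmd_flags;

	/* unknown flags are rejected before anything else is looked at */
	if (flags & ~MODEL_CMD_FIXED)
		return MODEL_EINVAL;

	if (flags & MODEL_CMD_FIXED) {
		/* the index must name one of the registered buffers */
		if (sqe->buf_index >= nr_user_bufs)
			return MODEL_EFAULT;
		*index_out = sqe->buf_index;
	}
	return MODEL_OK;
}
```

With this ordering, a request that carries both an unknown flag and an out-of-range index fails with the invalid-argument error and never reaches the buffer lookup.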
* Re: [PATCH for-next v12 00/12] Fixed-buffer for uring-cmd/passthru
2022-09-30 6:27 ` [PATCH for-next v12 00/12] Fixed-buffer for uring-cmd/passthru Anuj Gupta
` (11 preceding siblings ...)
[not found] ` <CGME20220930063835epcas5p2812f8e3d0758b19c01198034fcddc019@epcas5p2.samsung.com>
@ 2022-09-30 14:04 ` Jens Axboe
12 siblings, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2022-09-30 14:04 UTC (permalink / raw)
To: kbusch, Anuj Gupta, hch
Cc: io-uring, linux-nvme, gost.dev, linux-block, linux-scsi
On Fri, 30 Sep 2022 11:57:37 +0530, Anuj Gupta wrote:
> uring-cmd lacks the ability to leverage the pre-registered buffers.
> This series adds that support in uring-cmd, and plumbs nvme passthrough
> to work with it.
> Patches 3 - 5 carve out a block helper and scsi, nvme then use it to
> avoid duplication of code.
> Patch 6 and 7 contains a bunch of general nvme cleanups, which got added
> along the iterations.
>
> [...]
Applied, thanks!
[01/12] io_uring: add io_uring_cmd_import_fixed
commit: a9216fac3ed8819cbbda5d39dd5fcaa43dfd35d8
[02/12] io_uring: introduce fixed buffer support for io_uring_cmd
commit: 9cda70f622cdcf049521a9c2886e5fd8a90a0591
[03/12] block: add blk_rq_map_user_io
commit: 557654025df5706785d395558244890dc4b2c875
[04/12] scsi: Use blk_rq_map_user_io helper
commit: 6732932c836a4313f471b92b4d90761f31d3fa81
[05/12] nvme: Use blk_rq_map_user_io helper
commit: 7f05635764390d5f811971af9f4c89b794032c80
[06/12] nvme: refactor nvme_add_user_metadata
commit: 38c0ddab7b93daa90c046d0b9064a34fb0e586e5
[07/12] nvme: refactor nvme_alloc_request
commit: 470e900c8036ff1aafeb5f06f3cb7a375a081399
[08/12] block: rename bio_map_put to blk_mq_map_bio_put
commit: 32f1c71b15fc9cb8e964c3d0c15ca99a70cfe8a7
[09/12] block: factor out blk_rq_map_bio_alloc helper
commit: ab89e8e7ca526ca04baaad2aa28172d336425d67
[10/12] block: extend functionality to map bvec iterator
commit: 37987547932c89f15f9b76950040131ddb591a8b
[11/12] nvme: pass ubuffer as an integer
commit: 4d174486820e625fa85bac5d4235d4b4cb659866
[12/12] nvme: wire up fixed buffer support for nvme passthrough
commit: 23fd22e55b767be9c31fda57205afb2023cd6aad
Best regards,
--
Jens Axboe
^ permalink raw reply [flat|nested] 16+ messages in thread