* [PATCH for-next v12 00/12] Fixed-buffer for uring-cmd/passthru [not found] <CGME20220930063754epcas5p2aff33c952032713a39604388eacda910@epcas5p2.samsung.com> @ 2022-09-30 6:27 ` Anuj Gupta [not found] ` <CGME20220930063805epcas5p2c8eb80f32507f011baedc6d6b4d3f38d@epcas5p2.samsung.com> ` (12 more replies) 0 siblings, 13 replies; 16+ messages in thread From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw) To: axboe, hch, kbusch Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Anuj Gupta Hi, uring-cmd lacks the ability to leverage the pre-registered buffers. This series adds that support in uring-cmd, and plumbs nvme passthrough to work with it. Patches 3 - 5 carve out a block helper and scsi, nvme then use it to avoid duplication of code. Patch 6 and 7 contains a bunch of general nvme cleanups, which got added along the iterations. Using registered-buffers showed ~20% IOPS hike from 2.62M to 3.17M in my setup Without fixedbufs ***************** # taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B0 -O0 -n1 -u1 /dev/ng0n1 submitter=0, tid=3623, file=/dev/ng0n1, node=-1 polled=1, fixedbufs=0/0, register_files=1, buffered=1, QD=128 Engine=io_uring, sq_ring=128, cq_ring=128 IOPS=2.62M, BW=1281MiB/s, IOS/call=32/31 IOPS=2.62M, BW=1277MiB/s, IOS/call=32/32 IOPS=2.62M, BW=1277MiB/s, IOS/call=32/32 IOPS=2.61M, BW=1276MiB/s, IOS/call=32/32 ^CExiting on signal Maximum IOPS=2.62M With fixedbufs ************** # taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -O0 -n1 -u1 /dev/ng0n1 submitter=0, tid=3627, file=/dev/ng0n1, node=-1 polled=1, fixedbufs=1/0, register_files=1, buffered=1, QD=128 Engine=io_uring, sq_ring=128, cq_ring=128 IOPS=3.17M, BW=1546MiB/s, IOS/call=32/31 IOPS=3.17M, BW=1546MiB/s, IOS/call=32/31 IOPS=3.17M, BW=1546MiB/s, IOS/call=32/32 IOPS=3.16M, BW=1544MiB/s, IOS/call=32/32 ^CExiting on signal Maximum IOPS=3.17M Changes since v11: Patch 2 - Add a check for flags (Jens) Patch 3 - Moved the refactoring patches to start, before the 
nvme-refactoring patches (Christoph)
Patch 3 - Initialize ret to 0, to prevent uninitialized variable warning
(kernel test robot)
Patch 4 - Added the on-stack advantage part in the commit description
(Christoph)
Patch 7 - Move blk_mq_free_request into nvme_map_user_request to handle
error scenarios, instead of doing it using goto in its callers; this helps
in getting rid of an uninitialized variable warning (kernel test robot)
Patch 10 - Folded it in with the next patch to avoid compiler warning for
unused static functions (Christoph)

Changes since v10:
- Patch 3: Fix overly long line (Christoph)
- Patch 4: create a helper in block-map for vectored and non-vectored-io, to
be used by scsi and nvme (Christoph)
- Patch 5: Rename bio_map_get to blk_rq_map_bio_alloc and bio_map_put to
blk_mq_map_bio_put (Christoph)
- Patch 6: Split it into a prep patch and avoid duplicate checks (Christoph)
- Patch 7: Put changes to pass ubuffer as an integer in a separate prep patch
and simplify condition checks in nvme (Christoph)

Changes since v9:
- Patch 6: Make blk_rq_map_user_iov() operate on bvec iterator (Christoph)
- Patch 7: Change nvme to use the above

Changes since v8:
- Split some patches further; now 7 patches rather than 5 (Christoph)
- Applied a bunch of other suggested cleanups (Christoph)

Changes since v7:
- Patch 3: added many cleanups/refactoring suggested by Christoph
- Patch 4: added copying-pages fallback for bounce-buffer/dma-alignment case
(Christoph)

Changes since v6:
- Patch 1: fix warning for io_uring_cmd_import_fixed (robot)

Changes since v5:
- Patch 4: newly added, to split an nvme function into two
- Patch 3: folded cleanups in bio_map_user_iov (Chaitanya, Pankaj)
- Rebase to latest for-next

Changes since v4:
- Patch 1, 2: folded all review comments of Jens

Changes since v3:
- uring_cmd_flags, change from u16 to u32 (Jens)
- patch 3, add another helper to reduce code-duplication (Jens)

Changes since v2:
- Kill the new opcode, add a flag instead (Pavel)
- Fix
standalone build issue with patch 1 (Pavel) Changes since v1: - Fix a naming issue for an exported helper Anuj Gupta (6): io_uring: add io_uring_cmd_import_fixed io_uring: introduce fixed buffer support for io_uring_cmd block: add blk_rq_map_user_io scsi: Use blk_rq_map_user_io helper nvme: Use blk_rq_map_user_io helper block: rename bio_map_put to blk_mq_map_bio_put Kanchan Joshi (6): nvme: refactor nvme_add_user_metadata nvme: refactor nvme_alloc_request block: factor out blk_rq_map_bio_alloc helper block: extend functionality to map bvec iterator nvme: pass ubuffer as an integer nvme: wire up fixed buffer support for nvme passthrough block/blk-map.c | 150 ++++++++++++++++++++++++++++++---- drivers/nvme/host/ioctl.c | 144 ++++++++++++++++++-------------- drivers/scsi/scsi_ioctl.c | 22 +---- drivers/scsi/sg.c | 22 +---- include/linux/blk-mq.h | 2 + include/linux/io_uring.h | 10 ++- include/uapi/linux/io_uring.h | 9 ++ io_uring/uring_cmd.c | 28 ++++++- 8 files changed, 266 insertions(+), 121 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 16+ messages in thread
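As a quick check of the "~20%" figure quoted in the cover letter, the relative gain can be recomputed from the two reported IOPS values (2.62M without fixed buffers, 3.17M with). The helper name `percent_gain` below is hypothetical and exists only for this illustration.

```c
#include <assert.h>

/* Relative improvement of `after` over `before`, in percent. */
double percent_gain(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

With the numbers from the cover letter, `percent_gain(2.62, 3.17)` comes out just under 21, which is consistent with the rounded "~20%" claim.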
* [PATCH for-next v12 01/12] io_uring: add io_uring_cmd_import_fixed [not found] ` <CGME20220930063805epcas5p2c8eb80f32507f011baedc6d6b4d3f38d@epcas5p2.samsung.com> @ 2022-09-30 6:27 ` Anuj Gupta 0 siblings, 0 replies; 16+ messages in thread From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw) To: axboe, hch, kbusch Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Anuj Gupta, Kanchan Joshi This is a new helper that callers can use to obtain a bvec iterator for the previously mapped buffer. This is preparatory work to enable fixed-buffer support for io_uring_cmd. Signed-off-by: Anuj Gupta <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> --- include/linux/io_uring.h | 8 ++++++++ io_uring/uring_cmd.c | 10 ++++++++++ 2 files changed, 18 insertions(+) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index 58676c0a398f..1dbf51115c30 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -4,6 +4,7 @@ #include <linux/sched.h> #include <linux/xarray.h> +#include <uapi/linux/io_uring.h> enum io_uring_cmd_flags { IO_URING_F_COMPLETE_DEFER = 1, @@ -32,6 +33,8 @@ struct io_uring_cmd { }; #if defined(CONFIG_IO_URING) +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw, + struct iov_iter *iter, void *ioucmd); void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2); void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd, void (*task_work_cb)(struct io_uring_cmd *)); @@ -59,6 +62,11 @@ static inline void io_uring_free(struct task_struct *tsk) __io_uring_free(tsk); } #else +static int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw, + struct iov_iter *iter, void *ioucmd) +{ + return -EOPNOTSUPP; +} static inline void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t ret2) { diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index f3ed61e9bd0f..6a6d69523d75 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -8,6 +8,7 @@ 
#include <uapi/linux/io_uring.h> #include "io_uring.h" +#include "rsrc.h" #include "uring_cmd.h" static void io_uring_cmd_work(struct io_kiocb *req, bool *locked) @@ -129,3 +130,12 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags) return IOU_ISSUE_SKIP_COMPLETE; } + +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw, + struct iov_iter *iter, void *ioucmd) +{ + struct io_kiocb *req = cmd_to_io_kiocb(ioucmd); + + return io_import_fixed(rw, iter, req->imu, ubuf, len); +} +EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed); -- 2.25.1 ^ permalink raw reply related [flat|nested] 16+ messages in thread
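The helper above hands a (ubuf, len) pair to the fixed-buffer import path, which only makes sense if that range falls inside the buffer that was registered earlier. The following is a minimal userspace sketch of that kind of range check, not the kernel's implementation: `struct fixed_buf` and `fixed_range_check` are hypothetical names, and `__builtin_add_overflow` assumes GCC or Clang.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a registered ("fixed") buffer: [start, end). */
struct fixed_buf {
	uint64_t start;
	uint64_t end;
};

/*
 * Return 0 if [ubuf, ubuf + len) lies fully inside the registered buffer,
 * -EFAULT otherwise. The overflow check matters: without it a huge len
 * could wrap around and appear to fit.
 */
int fixed_range_check(const struct fixed_buf *buf, uint64_t ubuf, uint64_t len)
{
	uint64_t range_end;

	assert(buf != NULL);
	if (__builtin_add_overflow(ubuf, len, &range_end))
		return -EFAULT;
	if (ubuf < buf->start || range_end > buf->end)
		return -EFAULT;
	return 0;
}
```

For a buffer registered at [0x1000, 0x3000), a request for 0x2000 bytes at 0x1000 passes, while one that starts below 0x1000 or runs past 0x3000 is rejected.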
* [PATCH for-next v12 02/12] io_uring: introduce fixed buffer support for io_uring_cmd [not found] ` <CGME20220930063809epcas5p328b9e14ead49e9612b905e6f5b6682f7@epcas5p3.samsung.com> @ 2022-09-30 6:27 ` Anuj Gupta 2022-09-30 13:42 ` Jens Axboe 0 siblings, 1 reply; 16+ messages in thread From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw) To: axboe, hch, kbusch Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Anuj Gupta, Kanchan Joshi Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring command with previously registered buffers. User-space passes the buffer index in sqe->buf_index, same as done in read/write variants that uses fixed buffers. Signed-off-by: Anuj Gupta <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> --- include/linux/io_uring.h | 2 +- include/uapi/linux/io_uring.h | 9 +++++++++ io_uring/uring_cmd.c | 18 +++++++++++++++++- 3 files changed, 27 insertions(+), 2 deletions(-) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index 1dbf51115c30..e10c5cc81082 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -28,7 +28,7 @@ struct io_uring_cmd { void *cookie; }; u32 cmd_op; - u32 pad; + u32 flags; u8 pdu[32]; /* available inline for free use */ }; diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 92f29d9505a6..ab7458033ee3 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -56,6 +56,7 @@ struct io_uring_sqe { __u32 hardlink_flags; __u32 xattr_flags; __u32 msg_ring_flags; + __u32 uring_cmd_flags; }; __u64 user_data; /* data to be passed back at completion time */ /* pack this to avoid bogus arm OABI complaints */ @@ -219,6 +220,14 @@ enum io_uring_op { IORING_OP_LAST, }; +/* + * sqe->uring_cmd_flags + * IORING_URING_CMD_FIXED use registered buffer; pass thig flag + * along with setting sqe->buf_index. 
+ */ +#define IORING_URING_CMD_FIXED (1U << 0) + + /* * sqe->fsync_flags */ diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index 6a6d69523d75..05e8ad8cef87 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -4,6 +4,7 @@ #include <linux/file.h> #include <linux/io_uring.h> #include <linux/security.h> +#include <linux/nospec.h> #include <uapi/linux/io_uring.h> @@ -77,7 +78,22 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd); - if (sqe->rw_flags || sqe->__pad1) + if (sqe->__pad1) + return -EINVAL; + + ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags); + if (ioucmd->flags & IORING_URING_CMD_FIXED) { + struct io_ring_ctx *ctx = req->ctx; + u16 index; + + req->buf_index = READ_ONCE(sqe->buf_index); + if (unlikely(req->buf_index >= ctx->nr_user_bufs)) + return -EFAULT; + index = array_index_nospec(req->buf_index, ctx->nr_user_bufs); + req->imu = ctx->user_bufs[index]; + io_req_set_rsrc_node(req, ctx, 0); + } + if (ioucmd->flags & ~IORING_URING_CMD_FIXED) return -EINVAL; ioucmd->cmd = sqe->cmd; ioucmd->cmd_op = READ_ONCE(sqe->cmd_op); -- 2.25.1 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH for-next v12 02/12] io_uring: introduce fixed buffer support for io_uring_cmd 2022-09-30 6:27 ` [PATCH for-next v12 02/12] io_uring: introduce fixed buffer support for io_uring_cmd Anuj Gupta @ 2022-09-30 13:42 ` Jens Axboe 2022-09-30 14:04 ` Anuj gupta 0 siblings, 1 reply; 16+ messages in thread From: Jens Axboe @ 2022-09-30 13:42 UTC (permalink / raw) To: Anuj Gupta, hch, kbusch Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Kanchan Joshi On 9/30/22 12:27 AM, Anuj Gupta wrote: > Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring > command with previously registered buffers. User-space passes the buffer > index in sqe->buf_index, same as done in read/write variants that uses > fixed buffers. > > Signed-off-by: Anuj Gupta <[email protected]> > Signed-off-by: Kanchan Joshi <[email protected]> > --- > include/linux/io_uring.h | 2 +- > include/uapi/linux/io_uring.h | 9 +++++++++ > io_uring/uring_cmd.c | 18 +++++++++++++++++- > 3 files changed, 27 insertions(+), 2 deletions(-) > > diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h > index 1dbf51115c30..e10c5cc81082 100644 > --- a/include/linux/io_uring.h > +++ b/include/linux/io_uring.h > @@ -28,7 +28,7 @@ struct io_uring_cmd { > void *cookie; > }; > u32 cmd_op; > - u32 pad; > + u32 flags; > u8 pdu[32]; /* available inline for free use */ > }; > > diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h > index 92f29d9505a6..ab7458033ee3 100644 > --- a/include/uapi/linux/io_uring.h > +++ b/include/uapi/linux/io_uring.h > @@ -56,6 +56,7 @@ struct io_uring_sqe { > __u32 hardlink_flags; > __u32 xattr_flags; > __u32 msg_ring_flags; > + __u32 uring_cmd_flags; > }; > __u64 user_data; /* data to be passed back at completion time */ > /* pack this to avoid bogus arm OABI complaints */ > @@ -219,6 +220,14 @@ enum io_uring_op { > IORING_OP_LAST, > }; > > +/* > + * sqe->uring_cmd_flags > + * IORING_URING_CMD_FIXED use registered buffer; pass thig flag 
> + * along with setting sqe->buf_index. > + */ > +#define IORING_URING_CMD_FIXED (1U << 0) > + > + > /* > * sqe->fsync_flags > */ > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c > index 6a6d69523d75..05e8ad8cef87 100644 > --- a/io_uring/uring_cmd.c > +++ b/io_uring/uring_cmd.c > @@ -4,6 +4,7 @@ > #include <linux/file.h> > #include <linux/io_uring.h> > #include <linux/security.h> > +#include <linux/nospec.h> > > #include <uapi/linux/io_uring.h> > > @@ -77,7 +78,22 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) > { > struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd); > > - if (sqe->rw_flags || sqe->__pad1) > + if (sqe->__pad1) > + return -EINVAL; > + > + ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags); > + if (ioucmd->flags & IORING_URING_CMD_FIXED) { > + struct io_ring_ctx *ctx = req->ctx; > + u16 index; > + > + req->buf_index = READ_ONCE(sqe->buf_index); > + if (unlikely(req->buf_index >= ctx->nr_user_bufs)) > + return -EFAULT; > + index = array_index_nospec(req->buf_index, ctx->nr_user_bufs); > + req->imu = ctx->user_bufs[index]; > + io_req_set_rsrc_node(req, ctx, 0); > + } > + if (ioucmd->flags & ~IORING_URING_CMD_FIXED) > return -EINVAL; Not that it _really_ matters, but why isn't this check the first thing that is done after reading the flags? No need to respin, I can just move it myself. -- Jens Axboe ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH for-next v12 02/12] io_uring: introduce fixed buffer support for io_uring_cmd 2022-09-30 13:42 ` Jens Axboe @ 2022-09-30 14:04 ` Anuj gupta 0 siblings, 0 replies; 16+ messages in thread From: Anuj gupta @ 2022-09-30 14:04 UTC (permalink / raw) To: Jens Axboe Cc: Anuj Gupta, hch, kbusch, io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Kanchan Joshi On Fri, Sep 30, 2022 at 7:28 PM Jens Axboe <[email protected]> wrote: > > On 9/30/22 12:27 AM, Anuj Gupta wrote: > > Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring > > command with previously registered buffers. User-space passes the buffer > > index in sqe->buf_index, same as done in read/write variants that uses > > fixed buffers. > > > > Signed-off-by: Anuj Gupta <[email protected]> > > Signed-off-by: Kanchan Joshi <[email protected]> > > --- > > include/linux/io_uring.h | 2 +- > > include/uapi/linux/io_uring.h | 9 +++++++++ > > io_uring/uring_cmd.c | 18 +++++++++++++++++- > > 3 files changed, 27 insertions(+), 2 deletions(-) > > > > diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h > > index 1dbf51115c30..e10c5cc81082 100644 > > --- a/include/linux/io_uring.h > > +++ b/include/linux/io_uring.h > > @@ -28,7 +28,7 @@ struct io_uring_cmd { > > void *cookie; > > }; > > u32 cmd_op; > > - u32 pad; > > + u32 flags; > > u8 pdu[32]; /* available inline for free use */ > > }; > > > > diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h > > index 92f29d9505a6..ab7458033ee3 100644 > > --- a/include/uapi/linux/io_uring.h > > +++ b/include/uapi/linux/io_uring.h > > @@ -56,6 +56,7 @@ struct io_uring_sqe { > > __u32 hardlink_flags; > > __u32 xattr_flags; > > __u32 msg_ring_flags; > > + __u32 uring_cmd_flags; > > }; > > __u64 user_data; /* data to be passed back at completion time */ > > /* pack this to avoid bogus arm OABI complaints */ > > @@ -219,6 +220,14 @@ enum io_uring_op { > > IORING_OP_LAST, > > }; > > > > +/* > > + * sqe->uring_cmd_flags > > 
+ * IORING_URING_CMD_FIXED use registered buffer; pass thig flag > > + * along with setting sqe->buf_index. > > + */ > > +#define IORING_URING_CMD_FIXED (1U << 0) > > + > > + > > /* > > * sqe->fsync_flags > > */ > > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c > > index 6a6d69523d75..05e8ad8cef87 100644 > > --- a/io_uring/uring_cmd.c > > +++ b/io_uring/uring_cmd.c > > @@ -4,6 +4,7 @@ > > #include <linux/file.h> > > #include <linux/io_uring.h> > > #include <linux/security.h> > > +#include <linux/nospec.h> > > > > #include <uapi/linux/io_uring.h> > > > > @@ -77,7 +78,22 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) > > { > > struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd); > > > > - if (sqe->rw_flags || sqe->__pad1) > > + if (sqe->__pad1) > > + return -EINVAL; > > + > > + ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags); > > + if (ioucmd->flags & IORING_URING_CMD_FIXED) { > > + struct io_ring_ctx *ctx = req->ctx; > > + u16 index; > > + > > + req->buf_index = READ_ONCE(sqe->buf_index); > > + if (unlikely(req->buf_index >= ctx->nr_user_bufs)) > > + return -EFAULT; > > + index = array_index_nospec(req->buf_index, ctx->nr_user_bufs); > > + req->imu = ctx->user_bufs[index]; > > + io_req_set_rsrc_node(req, ctx, 0); > > + } > > + if (ioucmd->flags & ~IORING_URING_CMD_FIXED) > > return -EINVAL; > > Not that it _really_ matters, but why isn't this check the first thing > that is done after reading the flags? No need to respin, I can just move > it myself. > Right, checking this condition should have been the first thing to do after reading the flags. Thanks for taking care of it. > -- > Jens Axboe -- Anuj Gupta ^ permalink raw reply [flat|nested] 16+ messages in thread
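The ordering discussed above (validate the flag word first, then act on IORING_URING_CMD_FIXED) can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the kernel function: `uring_cmd_prep_check` and `URING_CMD_FIXED` are hypothetical names, and the buffer lookup itself is reduced to a bounds check.

```c
#include <assert.h>	/* used by the usage checks described below */
#include <errno.h>
#include <stdint.h>

#define URING_CMD_FIXED (1U << 0)	/* mirrors IORING_URING_CMD_FIXED */

/* Hypothetical, simplified model of the prep-time checks. */
int uring_cmd_prep_check(uint32_t flags, uint16_t buf_index,
			 unsigned int nr_user_bufs)
{
	/* Reject unknown flag bits before acting on any known one. */
	if (flags & ~URING_CMD_FIXED)
		return -EINVAL;

	if (flags & URING_CMD_FIXED) {
		/* The index must name one of the registered buffers. */
		if (buf_index >= nr_user_bufs)
			return -EFAULT;
	}
	return 0;
}
```

With this ordering, a request carrying both an unknown flag bit and an out-of-range index fails with -EINVAL rather than -EFAULT, and no buffer state is touched for a request that is going to be rejected anyway.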
* [PATCH for-next v12 03/12] block: add blk_rq_map_user_io
       [not found]   ` <CGME20220930063811epcas5p43cce58f5e1589c3e3780ce0cfd563986@epcas5p4.samsung.com>
@ 2022-09-30  6:27     ` Anuj Gupta
  0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30  6:27 UTC (permalink / raw)
  To: axboe, hch, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Anuj Gupta

Create a helper blk_rq_map_user_io for mapping of vectored as well as
non-vectored requests. This will help in saving duplication of code at a
few places in scsi and nvme.

Signed-off-by: Anuj Gupta <[email protected]>
Suggested-by: Christoph Hellwig <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
 block/blk-map.c        | 36 ++++++++++++++++++++++++++++++++++++
 include/linux/blk-mq.h |  2 ++
 2 files changed, 38 insertions(+)

diff --git a/block/blk-map.c b/block/blk-map.c
index 7693f8e3c454..0e37bbedd46c 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -611,6 +611,42 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
 }
 EXPORT_SYMBOL(blk_rq_map_user);
 
+int blk_rq_map_user_io(struct request *req, struct rq_map_data *map_data,
+		void __user *ubuf, unsigned long buf_len, gfp_t gfp_mask,
+		bool vec, int iov_count, bool check_iter_count, int rw)
+{
+	int ret = 0;
+
+	if (vec) {
+		struct iovec fast_iov[UIO_FASTIOV];
+		struct iovec *iov = fast_iov;
+		struct iov_iter iter;
+
+		ret = import_iovec(rw, ubuf, iov_count ? iov_count : buf_len,
+				UIO_FASTIOV, &iov, &iter);
+		if (ret < 0)
+			return ret;
+
+		if (iov_count) {
+			/* SG_IO howto says that the shorter of the two wins */
+			iov_iter_truncate(&iter, buf_len);
+			if (check_iter_count && !iov_iter_count(&iter)) {
+				kfree(iov);
+				return -EINVAL;
+			}
+		}
+
+		ret = blk_rq_map_user_iov(req->q, req, map_data, &iter,
+				gfp_mask);
+		kfree(iov);
+	} else if (buf_len) {
+		ret = blk_rq_map_user(req->q, req, map_data, ubuf, buf_len,
+				gfp_mask);
+	}
+	return ret;
+}
+EXPORT_SYMBOL(blk_rq_map_user_io);
+
 /**
  * blk_rq_unmap_user - unmap a request with user data
  * @bio:	start of bio list
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 50811d0fb143..ba18e9bdb799 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -985,6 +985,8 @@ struct rq_map_data {
 
 int blk_rq_map_user(struct request_queue *, struct request *,
 		struct rq_map_data *, void __user *, unsigned long, gfp_t);
+int blk_rq_map_user_io(struct request *, struct rq_map_data *,
+		void __user *, unsigned long, gfp_t, bool, int, bool, int);
 int blk_rq_map_user_iov(struct request_queue *, struct request *,
 		struct rq_map_data *, const struct iov_iter *, gfp_t);
 int blk_rq_unmap_user(struct bio *);
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread
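The "shorter of the two wins" rule in the vectored branch above can be modelled in userspace to make the truncation and the optional empty-transfer check concrete. This is only a sketch of the length arithmetic, assuming a plain `struct iovec` array; `clamp_iov_bytes` and its `check_count` parameter are hypothetical names that stand in for iov_iter_truncate() plus the check_iter_count test.

```c
#include <assert.h>	/* used by the usage checks described below */
#include <errno.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical userspace model: clamp the bytes described by an iovec
 * array to buf_len. Returns the number of bytes that would be mapped, or
 * -EINVAL when check_count is set and nothing is left to transfer.
 */
long clamp_iov_bytes(const struct iovec *iov, int iov_count,
		     size_t buf_len, int check_count)
{
	size_t total = 0;
	int i;

	for (i = 0; i < iov_count; i++)
		total += iov[i].iov_len;

	/* The shorter of the iovec total and buf_len wins. */
	if (total > buf_len)
		total = buf_len;
	if (check_count && total == 0)
		return -EINVAL;
	return (long)total;
}
```

Two segments of 512 and 1024 bytes with a buf_len of 1000 yield 1000 bytes. With a buf_len of 0, the result is -EINVAL when the empty-transfer check is requested and 0 when it is not.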
* [PATCH for-next v12 04/12] scsi: Use blk_rq_map_user_io helper [not found] ` <CGME20220930063815epcas5p1e056d6a2a53949296a7657de804fd2ec@epcas5p1.samsung.com> @ 2022-09-30 6:27 ` Anuj Gupta 0 siblings, 0 replies; 16+ messages in thread From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw) To: axboe, hch, kbusch Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Anuj Gupta Use the new blk_rq_map_user_io helper instead of duplicating code at various places. Additionally this also takes advantage of the on-stack iov fast path. Signed-off-by: Anuj Gupta <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> --- drivers/scsi/scsi_ioctl.c | 22 +++------------------- drivers/scsi/sg.c | 22 ++-------------------- 2 files changed, 5 insertions(+), 39 deletions(-) diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c index 729e309e6034..2d20da55fb64 100644 --- a/drivers/scsi/scsi_ioctl.c +++ b/drivers/scsi/scsi_ioctl.c @@ -449,25 +449,9 @@ static int sg_io(struct scsi_device *sdev, struct sg_io_hdr *hdr, fmode_t mode) if (ret < 0) goto out_put_request; - ret = 0; - if (hdr->iovec_count && hdr->dxfer_len) { - struct iov_iter i; - struct iovec *iov = NULL; - - ret = import_iovec(rq_data_dir(rq), hdr->dxferp, - hdr->iovec_count, 0, &iov, &i); - if (ret < 0) - goto out_put_request; - - /* SG_IO howto says that the shorter of the two wins */ - iov_iter_truncate(&i, hdr->dxfer_len); - - ret = blk_rq_map_user_iov(rq->q, rq, NULL, &i, GFP_KERNEL); - kfree(iov); - } else if (hdr->dxfer_len) - ret = blk_rq_map_user(rq->q, rq, NULL, hdr->dxferp, - hdr->dxfer_len, GFP_KERNEL); - + ret = blk_rq_map_user_io(rq, NULL, hdr->dxferp, hdr->dxfer_len, + GFP_KERNEL, hdr->iovec_count && hdr->dxfer_len, + hdr->iovec_count, 0, rq_data_dir(rq)); if (ret) goto out_put_request; diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 94c5e9a9309c..ce34a8ad53b4 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1804,26 +1804,8 @@ 
sg_start_req(Sg_request *srp, unsigned char *cmd) md->from_user = 0; } - if (iov_count) { - struct iovec *iov = NULL; - struct iov_iter i; - - res = import_iovec(rw, hp->dxferp, iov_count, 0, &iov, &i); - if (res < 0) - return res; - - iov_iter_truncate(&i, hp->dxfer_len); - if (!iov_iter_count(&i)) { - kfree(iov); - return -EINVAL; - } - - res = blk_rq_map_user_iov(q, rq, md, &i, GFP_ATOMIC); - kfree(iov); - } else - res = blk_rq_map_user(q, rq, md, hp->dxferp, - hp->dxfer_len, GFP_ATOMIC); - + res = blk_rq_map_user_io(rq, md, hp->dxferp, hp->dxfer_len, + GFP_ATOMIC, iov_count, iov_count, 1, rw); if (!res) { srp->bio = rq->bio; -- 2.25.1 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH for-next v12 05/12] nvme: Use blk_rq_map_user_io helper
       [not found]   ` <CGME20220930063818epcas5p4e321f0efa5a53759ea19eb8f1c63deef@epcas5p4.samsung.com>
@ 2022-09-30  6:27     ` Anuj Gupta
  0 siblings, 0 replies; 16+ messages in thread
From: Anuj Gupta @ 2022-09-30  6:27 UTC (permalink / raw)
  To: axboe, hch, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Anuj Gupta

Use blk_rq_map_user_io instead of duplicating the same code at different
places.

Signed-off-by: Anuj Gupta <[email protected]>
---
 drivers/nvme/host/ioctl.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 914b142b6f2b..3746a02a88ef 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -88,22 +88,8 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
 	nvme_req(req)->flags |= NVME_REQ_USERCMD;
 
 	if (ubuffer && bufflen) {
-		if (!vec)
-			ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen,
-				GFP_KERNEL);
-		else {
-			struct iovec fast_iov[UIO_FASTIOV];
-			struct iovec *iov = fast_iov;
-			struct iov_iter iter;
-
-			ret = import_iovec(rq_data_dir(req), ubuffer, bufflen,
-					UIO_FASTIOV, &iov, &iter);
-			if (ret < 0)
-				goto out;
-			ret = blk_rq_map_user_iov(q, req, NULL, &iter,
-					GFP_KERNEL);
-			kfree(iov);
-		}
+		ret = blk_rq_map_user_io(req, NULL, ubuffer, bufflen,
+				GFP_KERNEL, vec, 0, 0, rq_data_dir(req));
 		if (ret)
 			goto out;
 		bio = req->bio;
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread
* [PATCH for-next v12 06/12] nvme: refactor nvme_add_user_metadata [not found] ` <CGME20220930063821epcas5p48d4ec5136d487ea779ac74e2c0b740ac@epcas5p4.samsung.com> @ 2022-09-30 6:27 ` Anuj Gupta 0 siblings, 0 replies; 16+ messages in thread From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw) To: axboe, hch, kbusch Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Kanchan Joshi From: Kanchan Joshi <[email protected]> Pass struct request rather than bio. It helps to kill a parameter, and some processing clean-up too. Signed-off-by: Kanchan Joshi <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> --- drivers/nvme/host/ioctl.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 3746a02a88ef..bcaa6b3f97ca 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -20,19 +20,20 @@ static void __user *nvme_to_user_ptr(uintptr_t ptrval) return (void __user *)ptrval; } -static void *nvme_add_user_metadata(struct bio *bio, void __user *ubuf, - unsigned len, u32 seed, bool write) +static void *nvme_add_user_metadata(struct request *req, void __user *ubuf, + unsigned len, u32 seed) { struct bio_integrity_payload *bip; int ret = -ENOMEM; void *buf; + struct bio *bio = req->bio; buf = kmalloc(len, GFP_KERNEL); if (!buf) goto out; ret = -EFAULT; - if (write && copy_from_user(buf, ubuf, len)) + if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len)) goto out_free_meta; bip = bio_integrity_alloc(bio, GFP_KERNEL, 1); @@ -45,9 +46,13 @@ static void *nvme_add_user_metadata(struct bio *bio, void __user *ubuf, bip->bip_iter.bi_sector = seed; ret = bio_integrity_add_page(bio, virt_to_page(buf), len, offset_in_page(buf)); - if (ret == len) - return buf; - ret = -ENOMEM; + if (ret != len) { + ret = -ENOMEM; + goto out_free_meta; + } + + req->cmd_flags |= REQ_INTEGRITY; + return buf; out_free_meta: kfree(buf); out: @@ -70,7 
+75,6 @@ static struct request *nvme_alloc_user_request(struct request_queue *q, u32 meta_seed, void **metap, unsigned timeout, bool vec, blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags) { - bool write = nvme_is_write(cmd); struct nvme_ns *ns = q->queuedata; struct block_device *bdev = ns ? ns->disk->part0 : NULL; struct request *req; @@ -96,13 +100,12 @@ static struct request *nvme_alloc_user_request(struct request_queue *q, if (bdev) bio_set_dev(bio, bdev); if (bdev && meta_buffer && meta_len) { - meta = nvme_add_user_metadata(bio, meta_buffer, meta_len, - meta_seed, write); + meta = nvme_add_user_metadata(req, meta_buffer, + meta_len, meta_seed); if (IS_ERR(meta)) { ret = PTR_ERR(meta); goto out_unmap; } - req->cmd_flags |= REQ_INTEGRITY; *metap = meta; } } -- 2.25.1 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH for-next v12 07/12] nvme: refactor nvme_alloc_request [not found] ` <CGME20220930063824epcas5p4f829f3b8673e2603cdc9a799ca44ea6e@epcas5p4.samsung.com> @ 2022-09-30 6:27 ` Anuj Gupta 0 siblings, 0 replies; 16+ messages in thread From: Anuj Gupta @ 2022-09-30 6:27 UTC (permalink / raw) To: axboe, hch, kbusch Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Kanchan Joshi, Anuj Gupta From: Kanchan Joshi <[email protected]> nvme_alloc_request expects a large number of parameters. Split this out into two functions to reduce number of parameters. First one retains the name nvme_alloc_request, while second one is named nvme_map_user_request. Signed-off-by: Kanchan Joshi <[email protected]> Signed-off-by: Anuj Gupta <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> --- drivers/nvme/host/ioctl.c | 90 +++++++++++++++++++++++---------------- 1 file changed, 53 insertions(+), 37 deletions(-) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index bcaa6b3f97ca..3f1e7af19716 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -70,54 +70,57 @@ static int nvme_finish_user_metadata(struct request *req, void __user *ubuf, } static struct request *nvme_alloc_user_request(struct request_queue *q, - struct nvme_command *cmd, void __user *ubuffer, - unsigned bufflen, void __user *meta_buffer, unsigned meta_len, - u32 meta_seed, void **metap, unsigned timeout, bool vec, - blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags) + struct nvme_command *cmd, blk_opf_t rq_flags, + blk_mq_req_flags_t blk_flags) { - struct nvme_ns *ns = q->queuedata; - struct block_device *bdev = ns ? 
ns->disk->part0 : NULL;
 	struct request *req;
-	struct bio *bio = NULL;
-	void *meta = NULL;
-	int ret;
 
 	req = blk_mq_alloc_request(q, nvme_req_op(cmd) | rq_flags, blk_flags);
 	if (IS_ERR(req))
 		return req;
 	nvme_init_request(req, cmd);
-
-	if (timeout)
-		req->timeout = timeout;
 	nvme_req(req)->flags |= NVME_REQ_USERCMD;
+	return req;
+}
 
-	if (ubuffer && bufflen) {
-		ret = blk_rq_map_user_io(req, NULL, ubuffer, bufflen,
-				GFP_KERNEL, vec, 0, 0, rq_data_dir(req));
-		if (ret)
-			goto out;
-		bio = req->bio;
-		if (bdev)
-			bio_set_dev(bio, bdev);
-		if (bdev && meta_buffer && meta_len) {
-			meta = nvme_add_user_metadata(req, meta_buffer,
-					meta_len, meta_seed);
-			if (IS_ERR(meta)) {
-				ret = PTR_ERR(meta);
-				goto out_unmap;
-			}
-			*metap = meta;
+static int nvme_map_user_request(struct request *req, void __user *ubuffer,
+		unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
+		u32 meta_seed, void **metap, bool vec)
+{
+	struct request_queue *q = req->q;
+	struct nvme_ns *ns = q->queuedata;
+	struct block_device *bdev = ns ? ns->disk->part0 : NULL;
+	struct bio *bio = NULL;
+	void *meta = NULL;
+	int ret;
+
+	ret = blk_rq_map_user_io(req, NULL, ubuffer, bufflen, GFP_KERNEL, vec,
+			0, 0, rq_data_dir(req));
+
+	if (ret)
+		goto out;
+	bio = req->bio;
+	if (bdev)
+		bio_set_dev(bio, bdev);
+
+	if (bdev && meta_buffer && meta_len) {
+		meta = nvme_add_user_metadata(req, meta_buffer, meta_len,
+				meta_seed);
+		if (IS_ERR(meta)) {
+			ret = PTR_ERR(meta);
+			goto out_unmap;
 		}
+		*metap = meta;
 	}
 
-	return req;
+	return ret;
 
 out_unmap:
 	if (bio)
 		blk_rq_unmap_user(bio);
 out:
 	blk_mq_free_request(req);
-	return ERR_PTR(ret);
+	return ret;
 }
 
 static int nvme_submit_user_cmd(struct request_queue *q,
@@ -132,11 +135,18 @@ static int nvme_submit_user_cmd(struct request_queue *q,
 	u32 effects;
 	int ret;
 
-	req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
-			meta_len, meta_seed, &meta, timeout, vec, 0, 0);
+	req = nvme_alloc_user_request(q, cmd, 0, 0);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
+	req->timeout = timeout;
+	if (ubuffer && bufflen) {
+		ret = nvme_map_user_request(req, ubuffer, bufflen, meta_buffer,
+				meta_len, meta_seed, &meta, vec);
+		if (ret)
+			return ret;
+	}
+
 	bio = req->bio;
 	ctrl = nvme_req(req)->ctrl;
 
@@ -456,6 +466,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 	blk_opf_t rq_flags = 0;
 	blk_mq_req_flags_t blk_flags = 0;
 	void *meta = NULL;
+	int ret;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
@@ -495,13 +506,18 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		rq_flags |= REQ_POLLED;
 
 retry:
-	req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr),
-			d.data_len, nvme_to_user_ptr(d.metadata),
-			d.metadata_len, 0, &meta, d.timeout_ms ?
-			msecs_to_jiffies(d.timeout_ms) : 0, vec, rq_flags,
-			blk_flags);
+	req = nvme_alloc_user_request(q, &c, rq_flags, blk_flags);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
+	req->timeout = d.timeout_ms ? msecs_to_jiffies(d.timeout_ms) : 0;
+
+	if (d.addr && d.data_len) {
+		ret = nvme_map_user_request(req, nvme_to_user_ptr(d.addr),
+			d.data_len, nvme_to_user_ptr(d.metadata),
+			d.metadata_len, 0, &meta, vec);
+		if (ret)
+			return ret;
+	}
 
 	if (issue_flags & IO_URING_F_IOPOLL && rq_flags & REQ_POLLED) {
 		if (unlikely(!req->bio)) {
-- 
2.25.1
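The split in patch 07 moves the error cleanup: nvme_map_user_request() frees the request itself when mapping fails, so both callers simply return the error. A minimal userspace sketch of that ownership rule is given here. Every sketch_* name is hypothetical and only stands in for the kernel code; the fail flag is an assumption added to force the error path.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct request; not the kernel type. */
struct sketch_req {
	unsigned int timeout;
	bool mapped;
};

/* Counts frees so the single-free rule can be checked. */
static int sketch_frees;

static struct sketch_req *sketch_alloc(void)
{
	return calloc(1, sizeof(struct sketch_req));
}

static void sketch_free(struct sketch_req *req)
{
	sketch_frees++;
	free(req);
}

/*
 * Same contract as nvme_map_user_request() in the patch: on failure the
 * request is freed here, so the caller must not touch it again.
 */
static int sketch_map(struct sketch_req *req, const void *buf, size_t len,
		      bool fail)
{
	(void)buf;
	(void)len;
	if (fail) {
		sketch_free(req);
		return -EFAULT;
	}
	req->mapped = true;
	return 0;
}

/* Caller shape after the refactor: allocate, set timeout, map only if needed. */
static int sketch_submit(const void *buf, size_t len, unsigned int timeout,
			 bool fail_map)
{
	struct sketch_req *req = sketch_alloc();
	int ret;

	if (!req)
		return -ENOMEM;
	req->timeout = timeout;
	if (buf && len) {
		ret = sketch_map(req, buf, len, fail_map);
		if (ret)
			return ret;	/* already freed by sketch_map() */
	}
	/* the real code would execute the request here */
	sketch_free(req);
	return 0;
}
```

Keeping the free inside the mapping helper is what lets the callers drop their goto-based cleanup, which is the change the v12 changelog attributes to the kernel test robot warning.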
* [PATCH for-next v12 08/12] block: rename bio_map_put to blk_mq_map_bio_put
From: Anuj Gupta @ 2022-09-30 6:27 UTC
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Anuj Gupta

This patch renames the existing bio_map_put function to blk_mq_map_bio_put.

Signed-off-by: Anuj Gupta <[email protected]>
Suggested-by: Christoph Hellwig <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
 block/blk-map.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 0e37bbedd46c..84b13a4158b7 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -231,7 +231,7 @@ static int bio_copy_user_iov(struct request *rq, struct rq_map_data *map_data,
 	return ret;
 }
 
-static void bio_map_put(struct bio *bio)
+static void blk_mq_map_bio_put(struct bio *bio)
 {
 	if (bio->bi_opf & REQ_ALLOC_CACHE) {
 		bio_put(bio);
@@ -331,7 +331,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 
 out_unmap:
 	bio_release_pages(bio, false);
-	bio_map_put(bio);
+	blk_mq_map_bio_put(bio);
 	return ret;
 }
 
@@ -672,7 +672,7 @@ int blk_rq_unmap_user(struct bio *bio)
 
 		next_bio = bio;
 		bio = bio->bi_next;
-		bio_map_put(next_bio);
+		blk_mq_map_bio_put(next_bio);
 	}
 
 	return ret;
-- 
2.25.1
* [PATCH for-next v12 09/12] block: factor out blk_rq_map_bio_alloc helper
From: Anuj Gupta @ 2022-09-30 6:27 UTC
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Kanchan Joshi

From: Kanchan Joshi <[email protected]>

Move bio allocation logic from bio_map_user_iov to a new helper
blk_rq_map_bio_alloc. It is named so because its functionality is the
opposite of what is done inside blk_mq_map_bio_put. This is a prep patch.

Signed-off-by: Kanchan Joshi <[email protected]>
---
 block/blk-map.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 84b13a4158b7..d6ea377394a9 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -241,17 +241,10 @@ static void blk_mq_map_bio_put(struct bio *bio)
 	}
 }
 
-static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
-		gfp_t gfp_mask)
+static struct bio *blk_rq_map_bio_alloc(struct request *rq,
+		unsigned int nr_vecs, gfp_t gfp_mask)
 {
-	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
-	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
 	struct bio *bio;
-	int ret;
-	int j;
-
-	if (!iov_iter_count(iter))
-		return -EINVAL;
 
 	if (rq->cmd_flags & REQ_POLLED) {
 		blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
@@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
 					&fs_bio_set);
 		if (!bio)
-			return -ENOMEM;
+			return NULL;
 	} else {
 		bio = bio_kmalloc(nr_vecs, gfp_mask);
 		if (!bio)
-			return -ENOMEM;
+			return NULL;
 		bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
 	}
+	return bio;
+}
+
+static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
+		gfp_t gfp_mask)
+{
+	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
+	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
+	struct bio *bio;
+	int ret;
+	int j;
+
+	if (!iov_iter_count(iter))
+		return -EINVAL;
+
+	bio = blk_rq_map_bio_alloc(rq, nr_vecs, gfp_mask);
+	if (bio == NULL)
+		return -ENOMEM;
 
 	while (iov_iter_count(iter)) {
 		struct page **pages, *stack_pages[UIO_FASTIOV];
-- 
2.25.1
* [PATCH for-next v12 10/12] block: extend functionality to map bvec iterator
From: Anuj Gupta @ 2022-09-30 6:27 UTC
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Kanchan Joshi, Anuj Gupta

From: Kanchan Joshi <[email protected]>

Extend blk_rq_map_user_iov so that it can handle a bvec iterator, using
the new blk_rq_map_user_bvec function. It maps the pages from the bvec
iterator into a bio and places the bio into the request.

This helper will be used by nvme for the uring-passthrough path when IO
is done using pre-mapped buffers.

Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Suggested-by: Christoph Hellwig <[email protected]>
---
 block/blk-map.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 71 insertions(+), 4 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index d6ea377394a9..34735626b00f 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -548,6 +548,62 @@ int blk_rq_append_bio(struct request *rq, struct bio *bio)
 }
 EXPORT_SYMBOL(blk_rq_append_bio);
 
+/* Prepare bio for passthrough IO given ITER_BVEC iter */
+static int blk_rq_map_user_bvec(struct request *rq, const struct iov_iter *iter)
+{
+	struct request_queue *q = rq->q;
+	size_t nr_iter = iov_iter_count(iter);
+	size_t nr_segs = iter->nr_segs;
+	struct bio_vec *bvecs, *bvprvp = NULL;
+	struct queue_limits *lim = &q->limits;
+	unsigned int nsegs = 0, bytes = 0;
+	struct bio *bio;
+	size_t i;
+
+	if (!nr_iter || (nr_iter >> SECTOR_SHIFT) > queue_max_hw_sectors(q))
+		return -EINVAL;
+	if (nr_segs > queue_max_segments(q))
+		return -EINVAL;
+
+	/* no iovecs to alloc, as we already have a BVEC iterator */
+	bio = blk_rq_map_bio_alloc(rq, 0, GFP_KERNEL);
+	if (bio == NULL)
+		return -ENOMEM;
+
+	bio_iov_bvec_set(bio, (struct iov_iter *)iter);
+	blk_rq_bio_prep(rq, bio, nr_segs);
+
+	/* loop to perform a bunch of sanity checks */
+	bvecs = (struct bio_vec *)iter->bvec;
+	for (i = 0; i < nr_segs; i++) {
+		struct bio_vec *bv = &bvecs[i];
+
+		/*
+		 * If the queue doesn't support SG gaps and adding this
+		 * offset would create a gap, fallback to copy.
+		 */
+		if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset)) {
+			blk_mq_map_bio_put(bio);
+			return -EREMOTEIO;
+		}
+		/* check full condition */
+		if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
+			goto put_bio;
+		if (bytes + bv->bv_len > nr_iter)
+			goto put_bio;
+		if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
+			goto put_bio;
+
+		nsegs++;
+		bytes += bv->bv_len;
+		bvprvp = bv;
+	}
+	return 0;
+put_bio:
+	blk_mq_map_bio_put(bio);
+	return -EINVAL;
+}
+
 /**
  * blk_rq_map_user_iov - map user data to a request, for passthrough requests
  * @q:		request queue where request should be inserted
@@ -567,24 +623,35 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
 			struct rq_map_data *map_data,
 			const struct iov_iter *iter, gfp_t gfp_mask)
 {
-	bool copy = false;
+	bool copy = false, map_bvec = false;
 	unsigned long align = q->dma_pad_mask | queue_dma_alignment(q);
 	struct bio *bio = NULL;
 	struct iov_iter i;
 	int ret = -EINVAL;
 
-	if (!iter_is_iovec(iter))
-		goto fail;
-
 	if (map_data)
 		copy = true;
 	else if (blk_queue_may_bounce(q))
 		copy = true;
 	else if (iov_iter_alignment(iter) & align)
 		copy = true;
+	else if (iov_iter_is_bvec(iter))
+		map_bvec = true;
+	else if (!iter_is_iovec(iter))
+		copy = true;
 	else if (queue_virt_boundary(q))
 		copy = queue_virt_boundary(q) & iov_iter_gap_alignment(iter);
 
+	if (map_bvec) {
+		ret = blk_rq_map_user_bvec(rq, iter);
+		if (!ret)
+			return 0;
+		if (ret != -EREMOTEIO)
+			goto fail;
+		/* fall back to copying the data on limits mismatches */
+		copy = true;
+	}
+
 	i = *iter;
 	do {
 		if (copy)
-- 
2.25.1
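The per-segment checks in blk_rq_map_user_bvec() separate two outcomes: a gap against the queue's virtual boundary returns -EREMOTEIO, which the caller treats as "fall back to copying", while any size inconsistency returns -EINVAL. The following userspace sketch models only that loop. The sketch_* names, the 4 KiB page size, and the gap test (written from my understanding of bvec_gap_to_prev()) are assumptions; the hardware sector and segment-count limits from the patch are left out.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121	/* Linux value, for platforms that lack it */
#endif

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct bio_vec without the page pointer. */
struct sketch_bvec {
	unsigned int len;
	unsigned int offset;
};

/* Assumed model of bvec_gap_to_prev(): no boundary mask means no gaps. */
static bool sketch_gap_to_prev(unsigned long mask,
			       const struct sketch_bvec *prev,
			       unsigned int offset)
{
	if (!mask)
		return false;
	return (offset & mask) || ((prev->offset + prev->len) & mask);
}

/*
 * Returns 0 if the segments can be used directly, -EREMOTEIO if the caller
 * should copy instead, and -EINVAL if the segment list is inconsistent.
 */
static int sketch_check_bvecs(const struct sketch_bvec *bv, size_t nr_segs,
			      size_t total_bytes, unsigned long mask)
{
	const struct sketch_bvec *prev = NULL;
	unsigned int bytes = 0;
	size_t i;

	if (!total_bytes)
		return -EINVAL;

	for (i = 0; i < nr_segs; i++) {
		if (prev && sketch_gap_to_prev(mask, prev, bv[i].offset))
			return -EREMOTEIO;
		/* running byte count must not overflow */
		if (bytes > UINT_MAX - bv[i].len)
			return -EINVAL;
		/* segments must not describe more than the iterator holds */
		if ((size_t)bytes + bv[i].len > total_bytes)
			return -EINVAL;
		/* a segment must not cross a page */
		if ((unsigned long long)bv[i].offset + bv[i].len >
		    SKETCH_PAGE_SIZE)
			return -EINVAL;
		bytes += bv[i].len;
		prev = &bv[i];
	}
	return 0;
}
```

For example, two page-aligned 4096-byte segments pass with a 4095 mask, whereas a second segment starting at offset 512 reports -EREMOTEIO under the same mask and passes when the mask is 0.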
* [PATCH for-next v12 11/12] nvme: pass ubuffer as an integer
From: Anuj Gupta @ 2022-09-30 6:27 UTC
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Kanchan Joshi, Anuj Gupta

From: Kanchan Joshi <[email protected]>

This is a prep patch. Modify nvme_submit_user_cmd and
nvme_map_user_request to take ubuffer as a plain integer argument, and
do away with the nvme_to_user_ptr conversion in callers.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
 drivers/nvme/host/ioctl.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 3f1e7af19716..7a41caa9bfd2 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -83,9 +83,10 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
 	return req;
 }
 
-static int nvme_map_user_request(struct request *req, void __user *ubuffer,
+static int nvme_map_user_request(struct request *req, u64 ubuffer,
 		unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
-		u32 meta_seed, void **metap, bool vec)
+		u32 meta_seed, void **metap, struct io_uring_cmd *ioucmd,
+		bool vec)
 {
 	struct request_queue *q = req->q;
 	struct nvme_ns *ns = q->queuedata;
@@ -94,8 +95,8 @@ static int nvme_map_user_request(struct request *req, void __user *ubuffer,
 	void *meta = NULL;
 	int ret;
 
-	ret = blk_rq_map_user_io(req, NULL, ubuffer, bufflen, GFP_KERNEL, vec,
-			0, 0, rq_data_dir(req));
+	ret = blk_rq_map_user_io(req, NULL, nvme_to_user_ptr(ubuffer), bufflen,
+			GFP_KERNEL, vec, 0, 0, rq_data_dir(req));
 
 	if (ret)
 		goto out;
@@ -124,7 +125,7 @@ static int nvme_map_user_request(struct request *req, void __user *ubuffer,
 }
 
 static int nvme_submit_user_cmd(struct request_queue *q,
-		struct nvme_command *cmd, void __user *ubuffer,
+		struct nvme_command *cmd, u64 ubuffer,
 		unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
 		u32 meta_seed, u64 *result, unsigned timeout, bool vec)
 {
@@ -142,7 +143,7 @@ static int nvme_submit_user_cmd(struct request_queue *q,
 	req->timeout = timeout;
 	if (ubuffer && bufflen) {
 		ret = nvme_map_user_request(req, ubuffer, bufflen, meta_buffer,
-				meta_len, meta_seed, &meta, vec);
+				meta_len, meta_seed, &meta, NULL, vec);
 		if (ret)
 			return ret;
 	}
@@ -226,7 +227,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
 	c.rw.appmask = cpu_to_le16(io.appmask);
 
 	return nvme_submit_user_cmd(ns->queue, &c,
-			nvme_to_user_ptr(io.addr), length,
+			io.addr, length,
 			metadata, meta_len, lower_32_bits(io.slba), NULL, 0,
 			false);
 }
@@ -280,7 +281,7 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		timeout = msecs_to_jiffies(cmd.timeout_ms);
 
 	status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
-			nvme_to_user_ptr(cmd.addr), cmd.data_len,
+			cmd.addr, cmd.data_len,
 			nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
 			0, &result, timeout, false);
 
@@ -326,7 +327,7 @@ static int nvme_user_cmd64(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		timeout = msecs_to_jiffies(cmd.timeout_ms);
 
 	status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
-			nvme_to_user_ptr(cmd.addr), cmd.data_len,
+			cmd.addr, cmd.data_len,
 			nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
 			0, &cmd.result, timeout, vec);
 
@@ -512,9 +513,9 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 	req->timeout = d.timeout_ms ? msecs_to_jiffies(d.timeout_ms) : 0;
 
 	if (d.addr && d.data_len) {
-		ret = nvme_map_user_request(req, nvme_to_user_ptr(d.addr),
+		ret = nvme_map_user_request(req, d.addr,
 			d.data_len, nvme_to_user_ptr(d.metadata),
-			d.metadata_len, 0, &meta, vec);
+			d.metadata_len, 0, &meta, ioucmd, vec);
 		if (ret)
 			return ret;
 	}
-- 
2.25.1
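Carrying ubuffer as a u64 means the address stays an integer until the one place that needs a pointer, which is what lets a later path hand the same value to a helper that expects an integer. A small userspace sketch of that round trip is shown here; both helper names are hypothetical and this does not reproduce nvme_to_user_ptr().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen a pointer into a 64-bit command field. */
static uint64_t sketch_ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}

/* Hypothetical helper: narrow back only where a pointer is required. */
static void *sketch_u64_to_ptr(uint64_t addr)
{
	return (void *)(uintptr_t)addr;
}
```

Going through uintptr_t keeps the casts well defined on both 32-bit and 64-bit builds, since uint64_t is at least as wide as a pointer on the platforms this sketch assumes.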
* [PATCH for-next v12 12/12] nvme: wire up fixed buffer support for nvme passthrough
From: Anuj Gupta @ 2022-09-30 6:27 UTC
To: axboe, hch, kbusch
Cc: io-uring, linux-nvme, linux-block, gost.dev, linux-scsi, Kanchan Joshi

From: Kanchan Joshi <[email protected]>

If io_uring sends a passthrough command with the IORING_URING_CMD_FIXED
flag, use the pre-registered buffer for IO (non-vectored variant). Pass
the buffer/length to io_uring and get the bvec iterator for the range.
Next, pass this bvec to the block layer and obtain a bio/request for
subsequent processing.

Signed-off-by: Kanchan Joshi <[email protected]>
---
 drivers/nvme/host/ioctl.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 7a41caa9bfd2..81f5550b670d 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -95,8 +95,22 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer,
 	void *meta = NULL;
 	int ret;
 
-	ret = blk_rq_map_user_io(req, NULL, nvme_to_user_ptr(ubuffer), bufflen,
-			GFP_KERNEL, vec, 0, 0, rq_data_dir(req));
+	if (ioucmd && (ioucmd->flags & IORING_URING_CMD_FIXED)) {
+		struct iov_iter iter;
+
+		/* fixedbufs is only for non-vectored io */
+		if (WARN_ON_ONCE(vec))
+			return -EINVAL;
+		ret = io_uring_cmd_import_fixed(ubuffer, bufflen,
+				rq_data_dir(req), &iter, ioucmd);
+		if (ret < 0)
+			goto out;
+		ret = blk_rq_map_user_iov(q, req, NULL, &iter, GFP_KERNEL);
+	} else {
+		ret = blk_rq_map_user_io(req, NULL, nvme_to_user_ptr(ubuffer),
+				bufflen, GFP_KERNEL, vec, 0, 0,
+				rq_data_dir(req));
+	}
 
 	if (ret)
 		goto out;
-- 
2.25.1
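The branch added by patch 12 can be read as a three-way decision: reject vectored IO when the fixed flag is set, otherwise import the range from the registered buffer, otherwise map the user pointer as before. The sketch below models that decision in userspace. All sketch_* names and SKETCH_CMD_FIXED are hypothetical, and the range check inside sketch_import_fixed() is my assumption about what importing from a registered buffer has to verify, not code from this series.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CMD_FIXED (1u << 0)	/* stand-in for IORING_URING_CMD_FIXED */

enum sketch_path {
	SKETCH_PATH_NONE,
	SKETCH_PATH_FIXED,	/* would map a bvec iterator */
	SKETCH_PATH_USER,	/* would pin or copy user pages */
};

/* Hypothetical description of one pre-registered buffer. */
struct sketch_regbuf {
	uint64_t start;
	uint64_t len;
};

/* Assumed check: the requested range must lie inside the registration. */
static int sketch_import_fixed(const struct sketch_regbuf *rb,
			       uint64_t addr, uint64_t len)
{
	if (len > UINT64_MAX - addr)
		return -EFAULT;
	if (addr < rb->start || addr + len > rb->start + rb->len)
		return -EFAULT;
	return 0;
}

/* Mirrors the shape of the new branch in nvme_map_user_request(). */
static int sketch_choose_path(uint32_t flags, const struct sketch_regbuf *rb,
			      uint64_t addr, uint64_t len, bool vec,
			      enum sketch_path *path)
{
	int ret;

	*path = SKETCH_PATH_NONE;
	if (rb && (flags & SKETCH_CMD_FIXED)) {
		/* fixed buffers are only for non-vectored IO */
		if (vec)
			return -EINVAL;
		ret = sketch_import_fixed(rb, addr, len);
		if (ret < 0)
			return ret;
		*path = SKETCH_PATH_FIXED;
		return 0;
	}
	*path = SKETCH_PATH_USER;
	return 0;
}
```

Passing a NULL registration plays the role of the ioctl callers, which pass a NULL ioucmd in patch 11 and therefore always take the ordinary user-pointer path.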
* Re: [PATCH for-next v12 00/12] Fixed-buffer for uring-cmd/passthru
From: Jens Axboe @ 2022-09-30 14:04 UTC
To: kbusch, Anuj Gupta, hch
Cc: io-uring, linux-nvme, gost.dev, linux-block, linux-scsi

On Fri, 30 Sep 2022 11:57:37 +0530, Anuj Gupta wrote:
> uring-cmd lacks the ability to leverage the pre-registered buffers.
> This series adds that support in uring-cmd, and plumbs nvme passthrough
> to work with it.
> Patches 3 - 5 carve out a block helper and scsi, nvme then use it to
> avoid duplication of code.
> Patch 6 and 7 contains a bunch of general nvme cleanups, which got added
> along the iterations.
>
> [...]

Applied, thanks!

[01/12] io_uring: add io_uring_cmd_import_fixed
        commit: a9216fac3ed8819cbbda5d39dd5fcaa43dfd35d8
[02/12] io_uring: introduce fixed buffer support for io_uring_cmd
        commit: 9cda70f622cdcf049521a9c2886e5fd8a90a0591
[03/12] block: add blk_rq_map_user_io
        commit: 557654025df5706785d395558244890dc4b2c875
[04/12] scsi: Use blk_rq_map_user_io helper
        commit: 6732932c836a4313f471b92b4d90761f31d3fa81
[05/12] nvme: Use blk_rq_map_user_io helper
        commit: 7f05635764390d5f811971af9f4c89b794032c80
[06/12] nvme: refactor nvme_add_user_metadata
        commit: 38c0ddab7b93daa90c046d0b9064a34fb0e586e5
[07/12] nvme: refactor nvme_alloc_request
        commit: 470e900c8036ff1aafeb5f06f3cb7a375a081399
[08/12] block: rename bio_map_put to blk_mq_map_bio_put
        commit: 32f1c71b15fc9cb8e964c3d0c15ca99a70cfe8a7
[09/12] block: factor out blk_rq_map_bio_alloc helper
        commit: ab89e8e7ca526ca04baaad2aa28172d336425d67
[10/12] block: extend functionality to map bvec iterator
        commit: 37987547932c89f15f9b76950040131ddb591a8b
[11/12] nvme: pass ubuffer as an integer
        commit: 4d174486820e625fa85bac5d4235d4b4cb659866
[12/12] nvme: wire up fixed buffer support for nvme passthrough
        commit: 23fd22e55b767be9c31fda57205afb2023cd6aad

Best regards,
-- 
Jens Axboe