* [PATCH for-next v4 0/4] fixed-buffer for uring-cmd/passthru [not found] <CGME20220905135842epcas5p4835d74beb091f5f50490714d93fc58f2@epcas5p4.samsung.com> @ 2022-09-05 13:48 ` Kanchan Joshi [not found] ` <CGME20220905135846epcas5p4fde0fc96442adc3cf11319375ba2596b@epcas5p4.samsung.com> ` (3 more replies) 0 siblings, 4 replies; 13+ messages in thread From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw) To: axboe, hch, kbusch, asml.silence Cc: io-uring, linux-nvme, linux-block, gost.dev, Kanchan Joshi Hi, Currently uring-cmd lacks the ability to leverage the pre-registered buffers. This series adds that support in uring-cmd, and plumbs nvme passthrough to work with it. Using registered-buffers showed IOPS hike from 1.9M to 2.2M to me. Patch 1, 3 = prep/infrastructure Patch 2 = expand io_uring command to use registered-buffers Patch 4 = expand nvme passthrough to use registered-buffers Changes since v3: - uring_cmd_flags, change from u16 to u32 (Jens) - patch 3, add another helper to reduce code-duplication (Jens) Changes since v2: - Kill the new opcode, add a flag instead (Pavel) - Fix standalone build issue with patch 1 (Pavel) Changes since v1: - Fix a naming issue for an exported helper Anuj Gupta (2): io_uring: introduce io_uring_cmd_import_fixed io_uring: introduce fixed buffer support for io_uring_cmd Kanchan Joshi (2): block: add helper to map bvec iterator for passthrough nvme: wire up fixed buffer support for nvme passthrough block/blk-map.c | 94 +++++++++++++++++++++++++++++++---- drivers/nvme/host/ioctl.c | 38 +++++++++----- include/linux/blk-mq.h | 1 + include/linux/io_uring.h | 11 +++- include/uapi/linux/io_uring.h | 9 ++++ io_uring/uring_cmd.c | 29 ++++++++++- 6 files changed, 158 insertions(+), 24 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CGME20220905135846epcas5p4fde0fc96442adc3cf11319375ba2596b@epcas5p4.samsung.com>]
* [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed [not found] ` <CGME20220905135846epcas5p4fde0fc96442adc3cf11319375ba2596b@epcas5p4.samsung.com> @ 2022-09-05 13:48 ` Kanchan Joshi 2022-09-05 17:50 ` Jens Axboe 2022-09-05 17:53 ` Jens Axboe 0 siblings, 2 replies; 13+ messages in thread From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw) To: axboe, hch, kbusch, asml.silence Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta, Kanchan Joshi From: Anuj Gupta <[email protected]> This is a new helper that callers can use to obtain a bvec iterator for the previously mapped buffer. This is preparatory work to enable fixed-buffer support for io_uring_cmd. Signed-off-by: Anuj Gupta <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> --- include/linux/io_uring.h | 8 ++++++++ io_uring/uring_cmd.c | 11 +++++++++++ 2 files changed, 19 insertions(+) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index 58676c0a398f..dba6fb47aa6c 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -4,6 +4,7 @@ #include <linux/sched.h> #include <linux/xarray.h> +#include <uapi/linux/io_uring.h> enum io_uring_cmd_flags { IO_URING_F_COMPLETE_DEFER = 1, @@ -32,6 +33,8 @@ struct io_uring_cmd { }; #if defined(CONFIG_IO_URING) +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw, + struct iov_iter *iter, void *ioucmd); void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2); void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd, void (*task_work_cb)(struct io_uring_cmd *)); @@ -59,6 +62,11 @@ static inline void io_uring_free(struct task_struct *tsk) __io_uring_free(tsk); } #else +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw, + struct iov_iter *iter, void *ioucmd) +{ + return -1; +} static inline void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t ret2) { diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index 6f99dbd5d550..8cddd18ad10b 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -7,6 +7,7 @@ #include <uapi/linux/io_uring.h> #include "io_uring.h" +#include "rsrc.h" #include "uring_cmd.h" static void io_uring_cmd_work(struct io_kiocb *req, bool *locked) @@ -124,3 +125,13 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags) return IOU_ISSUE_SKIP_COMPLETE; } + +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, + int rw, struct iov_iter *iter, void *ioucmd) +{ + struct io_kiocb *req = cmd_to_io_kiocb(ioucmd); + struct io_mapped_ubuf *imu = req->imu; + + return io_import_fixed(rw, iter, imu, ubuf, len); +} +EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed); -- 2.25.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed 2022-09-05 13:48 ` [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed Kanchan Joshi @ 2022-09-05 17:50 ` Jens Axboe 2022-09-05 17:53 ` Jens Axboe 1 sibling, 0 replies; 13+ messages in thread From: Jens Axboe @ 2022-09-05 17:50 UTC (permalink / raw) To: Kanchan Joshi, hch, kbusch, asml.silence Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta On 9/5/22 7:48 AM, Kanchan Joshi wrote: > From: Anuj Gupta <[email protected]> > > This is a new helper that callers can use to obtain a bvec iterator for > the previously mapped buffer. This is preparatory work to enable > fixed-buffer support for io_uring_cmd. > > Signed-off-by: Anuj Gupta <[email protected]> > Signed-off-by: Kanchan Joshi <[email protected]> > --- > include/linux/io_uring.h | 8 ++++++++ > io_uring/uring_cmd.c | 11 +++++++++++ > 2 files changed, 19 insertions(+) > > diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h > index 58676c0a398f..dba6fb47aa6c 100644 > --- a/include/linux/io_uring.h > +++ b/include/linux/io_uring.h > @@ -4,6 +4,7 @@ > > #include <linux/sched.h> > #include <linux/xarray.h> > +#include <uapi/linux/io_uring.h> > > enum io_uring_cmd_flags { > IO_URING_F_COMPLETE_DEFER = 1, > @@ -32,6 +33,8 @@ struct io_uring_cmd { > }; > > #if defined(CONFIG_IO_URING) > +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw, > + struct iov_iter *iter, void *ioucmd); > void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2); > void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd, > void (*task_work_cb)(struct io_uring_cmd *)); > @@ -59,6 +62,11 @@ static inline void io_uring_free(struct task_struct *tsk) > __io_uring_free(tsk); > } > #else > +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw, > + struct iov_iter *iter, void *ioucmd) > +{ > + return -1; > +} Is this right? Shouldn't it return -EOPNOTSUPP or another suitable actual error value? Apart from that, I think the patchset looks fine now. -- Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed 2022-09-05 13:48 ` [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed Kanchan Joshi 2022-09-05 17:50 ` Jens Axboe @ 2022-09-05 17:53 ` Jens Axboe 1 sibling, 0 replies; 13+ messages in thread From: Jens Axboe @ 2022-09-05 17:53 UTC (permalink / raw) To: Kanchan Joshi, hch, kbusch, asml.silence Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta On 9/5/22 7:48 AM, Kanchan Joshi wrote: > @@ -124,3 +125,13 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags) > > return IOU_ISSUE_SKIP_COMPLETE; > } > + > +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, > + int rw, struct iov_iter *iter, void *ioucmd) > +{ > + struct io_kiocb *req = cmd_to_io_kiocb(ioucmd); > + struct io_mapped_ubuf *imu = req->imu; > + > + return io_import_fixed(rw, iter, imu, ubuf, len); > +} > +EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed); Oh, and since we're probably respinning this one anyway, I'd do: int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw, struct iov_iter *iter, void *ioucmd) { struct io_kiocb *req = cmd_to_io_kiocb(ioucmd); return io_import_fixed(rw, iter, req->imu, ubuf, len); } EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed); to both fix the indentation and get rid of the 'imu' variable that isn't really necessary. -- Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CGME20220905135848epcas5p445275a3af56a26a036878fe8a8bcb55f@epcas5p4.samsung.com>]
* [PATCH for-next v4 2/4] io_uring: introduce fixed buffer support for io_uring_cmd [not found] ` <CGME20220905135848epcas5p445275a3af56a26a036878fe8a8bcb55f@epcas5p4.samsung.com> @ 2022-09-05 13:48 ` Kanchan Joshi 2022-09-05 17:54 ` Jens Axboe 0 siblings, 1 reply; 13+ messages in thread From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw) To: axboe, hch, kbusch, asml.silence Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta, Kanchan Joshi From: Anuj Gupta <[email protected]> Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring command with previously registered buffers. User-space passes the buffer index in sqe->buf_index, same as done in read/write variants that uses fixed buffers. Signed-off-by: Anuj Gupta <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> --- include/linux/io_uring.h | 3 ++- include/uapi/linux/io_uring.h | 9 +++++++++ io_uring/uring_cmd.c | 18 +++++++++++++++++- 3 files changed, 28 insertions(+), 2 deletions(-) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index dba6fb47aa6c..6ca633b88816 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -16,6 +16,7 @@ enum io_uring_cmd_flags { IO_URING_F_SQE128 = 4, IO_URING_F_CQE32 = 8, IO_URING_F_IOPOLL = 16, + IO_URING_F_FIXEDBUFS = 32, }; struct io_uring_cmd { @@ -28,7 +29,7 @@ struct io_uring_cmd { void *cookie; }; u32 cmd_op; - u32 pad; + u32 flags; u8 pdu[32]; /* available inline for free use */ }; diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 48e5c70e0baf..34be8dd31f17 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -56,6 +56,7 @@ struct io_uring_sqe { __u32 hardlink_flags; __u32 xattr_flags; __u32 msg_ring_flags; + __u32 uring_cmd_flags; }; __u64 user_data; /* data to be passed back at completion time */ /* pack this to avoid bogus arm OABI complaints */ @@ -219,6 +220,14 @@ enum io_uring_op { IORING_OP_LAST, }; +/* + * sqe->uring_cmd_flags + * IORING_URING_CMD_FIXED use registered buffer; pass thig flag + * along with setting sqe->buf_index. + */ +#define IORING_URING_CMD_FIXED (1U << 0) + + /* * sqe->fsync_flags */ diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index 8cddd18ad10b..ea989a348d98 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -3,6 +3,7 @@ #include <linux/errno.h> #include <linux/file.h> #include <linux/io_uring.h> +#include <linux/nospec.h> #include <uapi/linux/io_uring.h> @@ -76,8 +77,21 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd); - if (sqe->rw_flags || sqe->__pad1) + if (sqe->__pad1) return -EINVAL; + + ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags); + req->buf_index = READ_ONCE(sqe->buf_index); + if (ioucmd->flags & IORING_URING_CMD_FIXED) { + struct io_ring_ctx *ctx = req->ctx; + u16 index; + + if (unlikely(req->buf_index >= ctx->nr_user_bufs)) + return -EFAULT; + index = array_index_nospec(req->buf_index, ctx->nr_user_bufs); + req->imu = ctx->user_bufs[index]; + io_req_set_rsrc_node(req, ctx, 0); + } ioucmd->cmd = sqe->cmd; ioucmd->cmd_op = READ_ONCE(sqe->cmd_op); return 0; @@ -102,6 +116,8 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags) req->iopoll_completed = 0; WRITE_ONCE(ioucmd->cookie, NULL); } + if (ioucmd->flags & IORING_URING_CMD_FIXED) + issue_flags |= IO_URING_F_FIXEDBUFS; if (req_has_async_data(req)) ioucmd->cmd = req->async_data; -- 2.25.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH for-next v4 2/4] io_uring: introduce fixed buffer support for io_uring_cmd 2022-09-05 13:48 ` [PATCH for-next v4 2/4] io_uring: introduce fixed buffer support for io_uring_cmd Kanchan Joshi @ 2022-09-05 17:54 ` Jens Axboe 0 siblings, 0 replies; 13+ messages in thread From: Jens Axboe @ 2022-09-05 17:54 UTC (permalink / raw) To: Kanchan Joshi, hch, kbusch, asml.silence Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta On 9/5/22 7:48 AM, Kanchan Joshi wrote: > @@ -76,8 +77,21 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) > { > struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd); > > - if (sqe->rw_flags || sqe->__pad1) > + if (sqe->__pad1) > return -EINVAL; > + > + ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags); > + req->buf_index = READ_ONCE(sqe->buf_index); > + if (ioucmd->flags & IORING_URING_CMD_FIXED) { > + struct io_ring_ctx *ctx = req->ctx; > + u16 index; > + > + if (unlikely(req->buf_index >= ctx->nr_user_bufs)) > + return -EFAULT; > + index = array_index_nospec(req->buf_index, ctx->nr_user_bufs); > + req->imu = ctx->user_bufs[index]; > + io_req_set_rsrc_node(req, ctx, 0); > + } Should that buf_index read and assignment be inside the IORING_URING_CMD_FIXED section? -- Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CGME20220905135851epcas5p3d107b140fd6cba1feb338c1a31c4feb1@epcas5p3.samsung.com>]
* [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough [not found] ` <CGME20220905135851epcas5p3d107b140fd6cba1feb338c1a31c4feb1@epcas5p3.samsung.com> @ 2022-09-05 13:48 ` Kanchan Joshi 2022-09-06 6:25 ` Christoph Hellwig 0 siblings, 1 reply; 13+ messages in thread From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw) To: axboe, hch, kbusch, asml.silence Cc: io-uring, linux-nvme, linux-block, gost.dev, Kanchan Joshi, Anuj Gupta Add blk_rq_map_user_bvec which maps the bvec iterator into a bio and places that into the request. This helper will be used in nvme for uring-passthrough with fixed-buffer. While at it, create another helper bio_map_get to reduce the code duplication. Signed-off-by: Kanchan Joshi <[email protected]> Signed-off-by: Anuj Gupta <[email protected]> --- block/blk-map.c | 94 +++++++++++++++++++++++++++++++++++++----- include/linux/blk-mq.h | 1 + 2 files changed, 85 insertions(+), 10 deletions(-) diff --git a/block/blk-map.c b/block/blk-map.c index f3768876d618..e2f268167342 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -241,17 +241,10 @@ static void bio_map_put(struct bio *bio) } } -static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs, gfp_t gfp_mask) { - unsigned int max_sectors = queue_max_hw_sectors(rq->q); - unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS); struct bio *bio; - int ret; - int j; - - if (!iov_iter_count(iter)) - return -EINVAL; if (rq->cmd_flags & REQ_POLLED) { blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE; @@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask, &fs_bio_set); if (!bio) - return -ENOMEM; + return NULL; } else { bio = bio_kmalloc(nr_vecs, gfp_mask); if (!bio) - return -ENOMEM; + return NULL; bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq)); } + return bio; +} + +static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, + gfp_t gfp_mask) +{ + unsigned int max_sectors = queue_max_hw_sectors(rq->q); + unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS); + struct bio *bio; + int ret; + int j; + + if (!iov_iter_count(iter)) + return -EINVAL; + + bio = bio_map_get(rq, nr_vecs, gfp_mask); + if (bio == NULL) + return -ENOMEM; while (iov_iter_count(iter)) { struct page **pages, *stack_pages[UIO_FASTIOV]; @@ -612,6 +623,69 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, } EXPORT_SYMBOL(blk_rq_map_user); +/* Prepare bio for passthrough IO given an existing bvec iter */ +int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter) +{ + struct request_queue *q = rq->q; + size_t iter_count, nr_segs; + struct bio *bio; + struct bio_vec *bv, *bvec_arr, *bvprvp = NULL; + struct queue_limits *lim = &q->limits; + unsigned int nsegs = 0, bytes = 0; + int ret, i; + + iter_count = iov_iter_count(iter); + nr_segs = iter->nr_segs; + + if (!iter_count || (iter_count >> 9) > queue_max_hw_sectors(q)) + return -EINVAL; + if (nr_segs > queue_max_segments(q)) + return -EINVAL; + + /* no iovecs to alloc, as we already have a BVEC iterator */ + bio = bio_map_get(rq, 0, GFP_KERNEL); + if (bio == NULL) + return -ENOMEM; + + bio_iov_bvec_set(bio, iter); + blk_rq_bio_prep(rq, bio, nr_segs); + + /* loop to perform a bunch of sanity checks */ + bvec_arr = (struct bio_vec *)iter->bvec; + for (i = 0; i < nr_segs; i++) { + bv = &bvec_arr[i]; + /* + * If the queue doesn't support SG gaps and adding this + * offset would create a gap, disallow it. + */ + if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset)) { + ret = -EINVAL; + goto out_free; + } + + /* check full condition */ + if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len) { + ret = -EINVAL; + goto out_free; + } + + if (bytes + bv->bv_len <= iter_count && + bv->bv_offset + bv->bv_len <= PAGE_SIZE) { + nsegs++; + bytes += bv->bv_len; + } else { + ret = -EINVAL; + goto out_free; + } + bvprvp = bv; + } + return 0; +out_free: + bio_map_put(bio); + return ret; +} +EXPORT_SYMBOL(blk_rq_map_user_bvec); + /** * blk_rq_unmap_user - unmap a request with user data * @bio: start of bio list diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index b43c81d91892..83bef362f0f9 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -970,6 +970,7 @@ struct rq_map_data { bool from_user; }; +int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter); int blk_rq_map_user(struct request_queue *, struct request *, struct rq_map_data *, void __user *, unsigned long, gfp_t); int blk_rq_map_user_iov(struct request_queue *, struct request *, -- 2.25.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough 2022-09-05 13:48 ` [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough Kanchan Joshi @ 2022-09-06 6:25 ` Christoph Hellwig 2022-09-06 6:33 ` Kanchan Joshi 0 siblings, 1 reply; 13+ messages in thread From: Christoph Hellwig @ 2022-09-06 6:25 UTC (permalink / raw) To: Kanchan Joshi Cc: axboe, hch, kbusch, asml.silence, io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta On Mon, Sep 05, 2022 at 07:18:32PM +0530, Kanchan Joshi wrote: > +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs, > gfp_t gfp_mask) > { > @@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, > bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask, > &fs_bio_set); > if (!bio) > - return -ENOMEM; > + return NULL; This context looks weird? That bio_alloc_bioset should not be there, as biosets are only used for file system I/O, which this is not. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough 2022-09-06 6:25 ` Christoph Hellwig @ 2022-09-06 6:33 ` Kanchan Joshi 2022-09-06 6:47 ` Kanchan Joshi 2022-09-06 6:51 ` Christoph Hellwig 0 siblings, 2 replies; 13+ messages in thread From: Kanchan Joshi @ 2022-09-06 6:33 UTC (permalink / raw) To: Christoph Hellwig Cc: axboe, kbusch, asml.silence, io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta [-- Attachment #1: Type: text/plain, Size: 775 bytes --] On Tue, Sep 06, 2022 at 08:25:22AM +0200, Christoph Hellwig wrote: >On Mon, Sep 05, 2022 at 07:18:32PM +0530, Kanchan Joshi wrote: >> +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs, >> gfp_t gfp_mask) >> { >> @@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, >> bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask, >> &fs_bio_set); >> if (!bio) >> - return -ENOMEM; >> + return NULL; > >This context looks weird? That bio_alloc_bioset should not be there, >as biosets are only used for file system I/O, which this is not. if you think it's a deal-breaker, maybe I can add a new bioset in nvme and pass that as argument to this helper. Would you prefer that over the current approach. [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough 2022-09-06 6:33 ` Kanchan Joshi @ 2022-09-06 6:47 ` Kanchan Joshi 2022-09-06 6:51 ` Christoph Hellwig 1 sibling, 0 replies; 13+ messages in thread From: Kanchan Joshi @ 2022-09-06 6:47 UTC (permalink / raw) To: Christoph Hellwig Cc: axboe, kbusch, asml.silence, io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta [-- Attachment #1: Type: text/plain, Size: 1196 bytes --] On Tue, Sep 06, 2022 at 12:03:29PM +0530, Kanchan Joshi wrote: >On Tue, Sep 06, 2022 at 08:25:22AM +0200, Christoph Hellwig wrote: >>On Mon, Sep 05, 2022 at 07:18:32PM +0530, Kanchan Joshi wrote: >>>+static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs, >>> gfp_t gfp_mask) >>> { >>>@@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, >>> bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask, >>> &fs_bio_set); >>> if (!bio) >>>- return -ENOMEM; >>>+ return NULL; >> >>This context looks weird? That bio_alloc_bioset should not be there, >>as biosets are only used for file system I/O, which this is not. > >if you think it's a deal-breaker, maybe I can add a new bioset in nvme and >pass that as argument to this helper. Would you prefer that over the >current approach. seems I responded without looking carefully. The bioset addition is not part of this series. It got added earlier [1] [2], as part of optimizing polled-io on file/passthru path. [1] https://lore.kernel.org/linux-block/[email protected]/ [2] https://lore.kernel.org/linux-block/[email protected]/ [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough 2022-09-06 6:33 ` Kanchan Joshi 2022-09-06 6:47 ` Kanchan Joshi @ 2022-09-06 6:51 ` Christoph Hellwig 2022-09-06 13:06 ` Jens Axboe 1 sibling, 1 reply; 13+ messages in thread From: Christoph Hellwig @ 2022-09-06 6:51 UTC (permalink / raw) To: Kanchan Joshi Cc: Christoph Hellwig, axboe, kbusch, asml.silence, io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta On Tue, Sep 06, 2022 at 12:03:29PM +0530, Kanchan Joshi wrote: >> This context looks weird? That bio_alloc_bioset should not be there, >> as biosets are only used for file system I/O, which this is not. > > if you think it's a deal-breaker, maybe I can add a new bioset in nvme and > pass that as argument to this helper. Would you prefer that over the > current approach. The whole point is that biosets exist to allow for forward progress guarantees required for file system I/O. For passthrough I/O bio_kmalloc is perfectly fine and much simpler. Adding yet another bio_set just makes things even worse. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough 2022-09-06 6:51 ` Christoph Hellwig @ 2022-09-06 13:06 ` Jens Axboe 0 siblings, 0 replies; 13+ messages in thread From: Jens Axboe @ 2022-09-06 13:06 UTC (permalink / raw) To: Christoph Hellwig, Kanchan Joshi Cc: kbusch, asml.silence, io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta On 9/6/22 12:51 AM, Christoph Hellwig wrote: > On Tue, Sep 06, 2022 at 12:03:29PM +0530, Kanchan Joshi wrote: >>> This context looks weird? That bio_alloc_bioset should not be there, >>> as biosets are only used for file system I/O, which this is not. >> >> if you think it's a deal-breaker, maybe I can add a new bioset in nvme and >> pass that as argument to this helper. Would you prefer that over the >> current approach. > > The whole point is that biosets exist to allow for forward progress > guarantees required for file system I/O. For passthrough I/O > bio_kmalloc is perfectly fine and much simpler. Adding yet another > bio_set just makes things even worse. It's a performance concern too, efficiency is much worse by using kmalloc+kfree for passthrough. You don't get bio caching that way. -- Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CGME20220905135854epcas5p2256848a964afec41f46502b0114698e2@epcas5p2.samsung.com>]
* [PATCH for-next v4 4/4] nvme: wire up fixed buffer support for nvme passthrough [not found] ` <CGME20220905135854epcas5p2256848a964afec41f46502b0114698e2@epcas5p2.samsung.com> @ 2022-09-05 13:48 ` Kanchan Joshi 0 siblings, 0 replies; 13+ messages in thread From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw) To: axboe, hch, kbusch, asml.silence Cc: io-uring, linux-nvme, linux-block, gost.dev, Kanchan Joshi, Anuj Gupta if io_uring sends passthrough command with IO_URING_F_FIXEDBUFS flag, use the pre-registered buffer to form the bio. While at it modify nvme_submit_user_cmd to take ubuffer as plain integer argument, and do away with nvme_to_user_ptr conversion in callers. Signed-off-by: Anuj Gupta <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> --- drivers/nvme/host/ioctl.c | 38 ++++++++++++++++++++++++++------------ 1 file changed, 26 insertions(+), 12 deletions(-) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 548aca8b5b9f..4341d758d6b9 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -65,10 +65,11 @@ static int nvme_finish_user_metadata(struct request *req, void __user *ubuf, } static struct request *nvme_alloc_user_request(struct request_queue *q, - struct nvme_command *cmd, void __user *ubuffer, + struct nvme_command *cmd, u64 ubuffer, unsigned bufflen, void __user *meta_buffer, unsigned meta_len, u32 meta_seed, void **metap, unsigned timeout, bool vec, - blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags) + blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags, + struct io_uring_cmd *ioucmd, bool fixedbufs) { bool write = nvme_is_write(cmd); struct nvme_ns *ns = q->queuedata; @@ -89,14 +90,27 @@ static struct request *nvme_alloc_user_request(struct request_queue *q, if (ubuffer && bufflen) { if (!vec) - ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen, - GFP_KERNEL); + if (fixedbufs) { + struct iov_iter iter; + + ret = io_uring_cmd_import_fixed(ubuffer, + bufflen, rq_data_dir(req), + &iter, ioucmd); + if (ret < 0) + goto out; + ret = blk_rq_map_user_bvec(req, &iter); + } else { + ret = blk_rq_map_user(q, req, NULL, + nvme_to_user_ptr(ubuffer), + bufflen, GFP_KERNEL); + } else { struct iovec fast_iov[UIO_FASTIOV]; struct iovec *iov = fast_iov; struct iov_iter iter; - ret = import_iovec(rq_data_dir(req), ubuffer, bufflen, + ret = import_iovec(rq_data_dir(req), + nvme_to_user_ptr(ubuffer), bufflen, UIO_FASTIOV, &iov, &iter); if (ret < 0) goto out; @@ -132,7 +146,7 @@ static struct request *nvme_alloc_user_request(struct request_queue *q, } static int nvme_submit_user_cmd(struct request_queue *q, - struct nvme_command *cmd, void __user *ubuffer, + struct nvme_command *cmd, u64 ubuffer, unsigned bufflen, void __user *meta_buffer, unsigned meta_len, u32 meta_seed, u64 *result, unsigned timeout, bool vec) { @@ -142,7 +156,7 @@ static int nvme_submit_user_cmd(struct request_queue *q, int ret; req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer, - meta_len, meta_seed, &meta, timeout, vec, 0, 0); + meta_len, meta_seed, &meta, timeout, vec, 0, 0, NULL, 0); if (IS_ERR(req)) return PTR_ERR(req); @@ -220,7 +234,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio) c.rw.appmask = cpu_to_le16(io.appmask); return nvme_submit_user_cmd(ns->queue, &c, - nvme_to_user_ptr(io.addr), length, + io.addr, length, metadata, meta_len, lower_32_bits(io.slba), NULL, 0, false); } @@ -274,7 +288,7 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns, timeout = msecs_to_jiffies(cmd.timeout_ms); status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c, - nvme_to_user_ptr(cmd.addr), cmd.data_len, + cmd.addr, cmd.data_len, nvme_to_user_ptr(cmd.metadata), cmd.metadata_len, 0, &result, timeout, false); @@ -320,7 +334,7 @@ static int nvme_user_cmd64(struct nvme_ctrl *ctrl, struct nvme_ns *ns, timeout = msecs_to_jiffies(cmd.timeout_ms); status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c, - nvme_to_user_ptr(cmd.addr), cmd.data_len, + cmd.addr, cmd.data_len, nvme_to_user_ptr(cmd.metadata), cmd.metadata_len, 0, &cmd.result, timeout, vec); @@ -457,11 +471,11 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, rq_flags |= REQ_POLLED; retry: - req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr), + req = nvme_alloc_user_request(q, &c, d.addr, d.data_len, nvme_to_user_ptr(d.metadata), d.metadata_len, 0, &meta, d.timeout_ms ? msecs_to_jiffies(d.timeout_ms) : 0, vec, rq_flags, - blk_flags); + blk_flags, ioucmd, issue_flags & IO_URING_F_FIXEDBUFS); if (IS_ERR(req)) return PTR_ERR(req); req->end_io = nvme_uring_cmd_end_io; -- 2.25.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2022-09-06 13:06 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <CGME20220905135842epcas5p4835d74beb091f5f50490714d93fc58f2@epcas5p4.samsung.com> 2022-09-05 13:48 ` [PATCH for-next v4 0/4] fixed-buffer for uring-cmd/passthru Kanchan Joshi [not found] ` <CGME20220905135846epcas5p4fde0fc96442adc3cf11319375ba2596b@epcas5p4.samsung.com> 2022-09-05 13:48 ` [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed Kanchan Joshi 2022-09-05 17:50 ` Jens Axboe 2022-09-05 17:53 ` Jens Axboe [not found] ` <CGME20220905135848epcas5p445275a3af56a26a036878fe8a8bcb55f@epcas5p4.samsung.com> 2022-09-05 13:48 ` [PATCH for-next v4 2/4] io_uring: introduce fixed buffer support for io_uring_cmd Kanchan Joshi 2022-09-05 17:54 ` Jens Axboe [not found] ` <CGME20220905135851epcas5p3d107b140fd6cba1feb338c1a31c4feb1@epcas5p3.samsung.com> 2022-09-05 13:48 ` [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough Kanchan Joshi 2022-09-06 6:25 ` Christoph Hellwig 2022-09-06 6:33 ` Kanchan Joshi 2022-09-06 6:47 ` Kanchan Joshi 2022-09-06 6:51 ` Christoph Hellwig 2022-09-06 13:06 ` Jens Axboe [not found] ` <CGME20220905135854epcas5p2256848a964afec41f46502b0114698e2@epcas5p2.samsung.com> 2022-09-05 13:48 ` [PATCH for-next v4 4/4] nvme: wire up fixed buffer support for nvme passthrough Kanchan Joshi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox