public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH for-next v4 0/4] fixed-buffer for uring-cmd/passthru
       [not found] <CGME20220905135842epcas5p4835d74beb091f5f50490714d93fc58f2@epcas5p4.samsung.com>
@ 2022-09-05 13:48 ` Kanchan Joshi
       [not found]   ` <CGME20220905135846epcas5p4fde0fc96442adc3cf11319375ba2596b@epcas5p4.samsung.com>
                     ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Kanchan Joshi

Hi,

Currently uring-cmd lacks the ability to leverage the pre-registered
buffers. This series adds that support in uring-cmd, and plumbs
nvme passthrough to work with it.

Using registered-buffers showed IOPS hike from 1.9M to 2.2M to me.

Patch 1, 3 = prep/infrastructure
Patch 2 = expand io_uring command to use registered-buffers
Patch 4 = expand nvme passthrough to use registered-buffers

Changes since v3:
- uring_cmd_flags, change from u16 to u32 (Jens)
- patch 3, add another helper to reduce code-duplication (Jens)

Changes since v2:
- Kill the new opcode, add a flag instead (Pavel)
- Fix standalone build issue with patch 1 (Pavel)

Changes since v1:
- Fix a naming issue for an exported helper

Anuj Gupta (2):
  io_uring: introduce io_uring_cmd_import_fixed
  io_uring: introduce fixed buffer support for io_uring_cmd

Kanchan Joshi (2):
  block: add helper to map bvec iterator for passthrough
  nvme: wire up fixed buffer support for nvme passthrough

 block/blk-map.c               | 94 +++++++++++++++++++++++++++++++----
 drivers/nvme/host/ioctl.c     | 38 +++++++++-----
 include/linux/blk-mq.h        |  1 +
 include/linux/io_uring.h      | 11 +++-
 include/uapi/linux/io_uring.h |  9 ++++
 io_uring/uring_cmd.c          | 29 ++++++++++-
 6 files changed, 158 insertions(+), 24 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed
       [not found]   ` <CGME20220905135846epcas5p4fde0fc96442adc3cf11319375ba2596b@epcas5p4.samsung.com>
@ 2022-09-05 13:48     ` Kanchan Joshi
  2022-09-05 17:50       ` Jens Axboe
  2022-09-05 17:53       ` Jens Axboe
  0 siblings, 2 replies; 13+ messages in thread
From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta,
	Kanchan Joshi

From: Anuj Gupta <[email protected]>

This is a new helper that callers can use to obtain a bvec iterator for
the previously mapped buffer. This is preparatory work to enable
fixed-buffer support for io_uring_cmd.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 include/linux/io_uring.h |  8 ++++++++
 io_uring/uring_cmd.c     | 11 +++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 58676c0a398f..dba6fb47aa6c 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -4,6 +4,7 @@
 
 #include <linux/sched.h>
 #include <linux/xarray.h>
+#include <uapi/linux/io_uring.h>
 
 enum io_uring_cmd_flags {
 	IO_URING_F_COMPLETE_DEFER	= 1,
@@ -32,6 +33,8 @@ struct io_uring_cmd {
 };
 
 #if defined(CONFIG_IO_URING)
+int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
+		struct iov_iter *iter, void *ioucmd);
 void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2);
 void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
 			void (*task_work_cb)(struct io_uring_cmd *));
@@ -59,6 +62,11 @@ static inline void io_uring_free(struct task_struct *tsk)
 		__io_uring_free(tsk);
 }
 #else
+int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
+		struct iov_iter *iter, void *ioucmd)
+{
+	return -1;
+}
 static inline void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret,
 		ssize_t ret2)
 {
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 6f99dbd5d550..8cddd18ad10b 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -7,6 +7,7 @@
 #include <uapi/linux/io_uring.h>
 
 #include "io_uring.h"
+#include "rsrc.h"
 #include "uring_cmd.h"
 
 static void io_uring_cmd_work(struct io_kiocb *req, bool *locked)
@@ -124,3 +125,13 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
 
 	return IOU_ISSUE_SKIP_COMPLETE;
 }
+
+int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len,
+		int rw, struct iov_iter *iter, void *ioucmd)
+{
+	struct io_kiocb *req = cmd_to_io_kiocb(ioucmd);
+	struct io_mapped_ubuf *imu = req->imu;
+
+	return io_import_fixed(rw, iter, imu, ubuf, len);
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 2/4] io_uring: introduce fixed buffer support for io_uring_cmd
       [not found]   ` <CGME20220905135848epcas5p445275a3af56a26a036878fe8a8bcb55f@epcas5p4.samsung.com>
@ 2022-09-05 13:48     ` Kanchan Joshi
  2022-09-05 17:54       ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta,
	Kanchan Joshi

From: Anuj Gupta <[email protected]>

Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring
command with previously registered buffers. User-space passes the buffer
index in sqe->buf_index, same as done in read/write variants that uses
fixed buffers.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 include/linux/io_uring.h      |  3 ++-
 include/uapi/linux/io_uring.h |  9 +++++++++
 io_uring/uring_cmd.c          | 18 +++++++++++++++++-
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index dba6fb47aa6c..6ca633b88816 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -16,6 +16,7 @@ enum io_uring_cmd_flags {
 	IO_URING_F_SQE128		= 4,
 	IO_URING_F_CQE32		= 8,
 	IO_URING_F_IOPOLL		= 16,
+	IO_URING_F_FIXEDBUFS		= 32,
 };
 
 struct io_uring_cmd {
@@ -28,7 +29,7 @@ struct io_uring_cmd {
 		void *cookie;
 	};
 	u32		cmd_op;
-	u32		pad;
+	u32		flags;
 	u8		pdu[32]; /* available inline for free use */
 };
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 48e5c70e0baf..34be8dd31f17 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -56,6 +56,7 @@ struct io_uring_sqe {
 		__u32		hardlink_flags;
 		__u32		xattr_flags;
 		__u32		msg_ring_flags;
+		__u32		uring_cmd_flags;
 	};
 	__u64	user_data;	/* data to be passed back at completion time */
 	/* pack this to avoid bogus arm OABI complaints */
@@ -219,6 +220,14 @@ enum io_uring_op {
 	IORING_OP_LAST,
 };
 
+/*
+ * sqe->uring_cmd_flags
+ * IORING_URING_CMD_FIXED	use registered buffer; pass thig flag
+ *				along with setting sqe->buf_index.
+ */
+#define IORING_URING_CMD_FIXED	(1U << 0)
+
+
 /*
  * sqe->fsync_flags
  */
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 8cddd18ad10b..ea989a348d98 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -3,6 +3,7 @@
 #include <linux/errno.h>
 #include <linux/file.h>
 #include <linux/io_uring.h>
+#include <linux/nospec.h>
 
 #include <uapi/linux/io_uring.h>
 
@@ -76,8 +77,21 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
 
-	if (sqe->rw_flags || sqe->__pad1)
+	if (sqe->__pad1)
 		return -EINVAL;
+
+	ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags);
+	req->buf_index = READ_ONCE(sqe->buf_index);
+	if (ioucmd->flags & IORING_URING_CMD_FIXED) {
+		struct io_ring_ctx *ctx = req->ctx;
+		u16 index;
+
+		if (unlikely(req->buf_index >= ctx->nr_user_bufs))
+			return -EFAULT;
+		index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
+		req->imu = ctx->user_bufs[index];
+		io_req_set_rsrc_node(req, ctx, 0);
+	}
 	ioucmd->cmd = sqe->cmd;
 	ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
 	return 0;
@@ -102,6 +116,8 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
 		req->iopoll_completed = 0;
 		WRITE_ONCE(ioucmd->cookie, NULL);
 	}
+	if (ioucmd->flags & IORING_URING_CMD_FIXED)
+		issue_flags |= IO_URING_F_FIXEDBUFS;
 
 	if (req_has_async_data(req))
 		ioucmd->cmd = req->async_data;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough
       [not found]   ` <CGME20220905135851epcas5p3d107b140fd6cba1feb338c1a31c4feb1@epcas5p3.samsung.com>
@ 2022-09-05 13:48     ` Kanchan Joshi
  2022-09-06  6:25       ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Kanchan Joshi,
	Anuj Gupta

Add blk_rq_map_user_bvec which maps the bvec iterator into a bio and
places that into the request. This helper will be used in nvme for
uring-passthrough with fixed-buffer.
While at it, create another helper bio_map_get to reduce the code
duplication.

Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
---
 block/blk-map.c        | 94 +++++++++++++++++++++++++++++++++++++-----
 include/linux/blk-mq.h |  1 +
 2 files changed, 85 insertions(+), 10 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index f3768876d618..e2f268167342 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -241,17 +241,10 @@ static void bio_map_put(struct bio *bio)
 	}
 }
 
-static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
+static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
 		gfp_t gfp_mask)
 {
-	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
-	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
 	struct bio *bio;
-	int ret;
-	int j;
-
-	if (!iov_iter_count(iter))
-		return -EINVAL;
 
 	if (rq->cmd_flags & REQ_POLLED) {
 		blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
@@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
 					&fs_bio_set);
 		if (!bio)
-			return -ENOMEM;
+			return NULL;
 	} else {
 		bio = bio_kmalloc(nr_vecs, gfp_mask);
 		if (!bio)
-			return -ENOMEM;
+			return NULL;
 		bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
 	}
+	return bio;
+}
+
+static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
+		gfp_t gfp_mask)
+{
+	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
+	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
+	struct bio *bio;
+	int ret;
+	int j;
+
+	if (!iov_iter_count(iter))
+		return -EINVAL;
+
+	bio = bio_map_get(rq, nr_vecs, gfp_mask);
+	if (bio == NULL)
+		return -ENOMEM;
 
 	while (iov_iter_count(iter)) {
 		struct page **pages, *stack_pages[UIO_FASTIOV];
@@ -612,6 +623,69 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
 }
 EXPORT_SYMBOL(blk_rq_map_user);
 
+/* Prepare bio for passthrough IO given an existing bvec iter */
+int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter)
+{
+	struct request_queue *q = rq->q;
+	size_t iter_count, nr_segs;
+	struct bio *bio;
+	struct bio_vec *bv, *bvec_arr, *bvprvp = NULL;
+	struct queue_limits *lim = &q->limits;
+	unsigned int nsegs = 0, bytes = 0;
+	int ret, i;
+
+	iter_count = iov_iter_count(iter);
+	nr_segs = iter->nr_segs;
+
+	if (!iter_count || (iter_count >> 9) > queue_max_hw_sectors(q))
+		return -EINVAL;
+	if (nr_segs > queue_max_segments(q))
+		return -EINVAL;
+
+	/* no iovecs to alloc, as we already have a BVEC iterator */
+	bio = bio_map_get(rq, 0, GFP_KERNEL);
+	if (bio == NULL)
+		return -ENOMEM;
+
+	bio_iov_bvec_set(bio, iter);
+	blk_rq_bio_prep(rq, bio, nr_segs);
+
+	/* loop to perform a bunch of sanity checks */
+	bvec_arr = (struct bio_vec *)iter->bvec;
+	for (i = 0; i < nr_segs; i++) {
+		bv = &bvec_arr[i];
+		/*
+		 * If the queue doesn't support SG gaps and adding this
+		 * offset would create a gap, disallow it.
+		 */
+		if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset)) {
+			ret = -EINVAL;
+			goto out_free;
+		}
+
+		/* check full condition */
+		if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len) {
+			ret = -EINVAL;
+			goto out_free;
+		}
+
+		if (bytes + bv->bv_len <= iter_count &&
+				bv->bv_offset + bv->bv_len <= PAGE_SIZE) {
+			nsegs++;
+			bytes += bv->bv_len;
+		} else {
+			ret = -EINVAL;
+			goto out_free;
+		}
+		bvprvp = bv;
+	}
+	return 0;
+out_free:
+	bio_map_put(bio);
+	return ret;
+}
+EXPORT_SYMBOL(blk_rq_map_user_bvec);
+
 /**
  * blk_rq_unmap_user - unmap a request with user data
  * @bio:	       start of bio list
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index b43c81d91892..83bef362f0f9 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -970,6 +970,7 @@ struct rq_map_data {
 	bool from_user;
 };
 
+int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter);
 int blk_rq_map_user(struct request_queue *, struct request *,
 		struct rq_map_data *, void __user *, unsigned long, gfp_t);
 int blk_rq_map_user_iov(struct request_queue *, struct request *,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 4/4] nvme: wire up fixed buffer support for nvme passthrough
       [not found]   ` <CGME20220905135854epcas5p2256848a964afec41f46502b0114698e2@epcas5p2.samsung.com>
@ 2022-09-05 13:48     ` Kanchan Joshi
  0 siblings, 0 replies; 13+ messages in thread
From: Kanchan Joshi @ 2022-09-05 13:48 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Kanchan Joshi,
	Anuj Gupta

if io_uring sends passthrough command with IO_URING_F_FIXEDBUFS flag,
use the pre-registered buffer to form the bio.
While at it modify nvme_submit_user_cmd to take ubuffer as plain integer
argument, and do away with nvme_to_user_ptr conversion in callers.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 drivers/nvme/host/ioctl.c | 38 ++++++++++++++++++++++++++------------
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 548aca8b5b9f..4341d758d6b9 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -65,10 +65,11 @@ static int nvme_finish_user_metadata(struct request *req, void __user *ubuf,
 }
 
 static struct request *nvme_alloc_user_request(struct request_queue *q,
-		struct nvme_command *cmd, void __user *ubuffer,
+		struct nvme_command *cmd, u64 ubuffer,
 		unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
 		u32 meta_seed, void **metap, unsigned timeout, bool vec,
-		blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags)
+		blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags,
+		struct io_uring_cmd *ioucmd, bool fixedbufs)
 {
 	bool write = nvme_is_write(cmd);
 	struct nvme_ns *ns = q->queuedata;
@@ -89,14 +90,27 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
 
 	if (ubuffer && bufflen) {
 		if (!vec)
-			ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen,
-				GFP_KERNEL);
+			if (fixedbufs) {
+				struct iov_iter iter;
+
+				ret = io_uring_cmd_import_fixed(ubuffer,
+						bufflen, rq_data_dir(req),
+						&iter, ioucmd);
+				if (ret < 0)
+					goto out;
+				ret = blk_rq_map_user_bvec(req, &iter);
+			} else {
+				ret = blk_rq_map_user(q, req, NULL,
+						nvme_to_user_ptr(ubuffer),
+						bufflen, GFP_KERNEL);
+			}
 		else {
 			struct iovec fast_iov[UIO_FASTIOV];
 			struct iovec *iov = fast_iov;
 			struct iov_iter iter;
 
-			ret = import_iovec(rq_data_dir(req), ubuffer, bufflen,
+			ret = import_iovec(rq_data_dir(req),
+					nvme_to_user_ptr(ubuffer), bufflen,
 					UIO_FASTIOV, &iov, &iter);
 			if (ret < 0)
 				goto out;
@@ -132,7 +146,7 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
 }
 
 static int nvme_submit_user_cmd(struct request_queue *q,
-		struct nvme_command *cmd, void __user *ubuffer,
+		struct nvme_command *cmd, u64 ubuffer,
 		unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
 		u32 meta_seed, u64 *result, unsigned timeout, bool vec)
 {
@@ -142,7 +156,7 @@ static int nvme_submit_user_cmd(struct request_queue *q,
 	int ret;
 
 	req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
-			meta_len, meta_seed, &meta, timeout, vec, 0, 0);
+			meta_len, meta_seed, &meta, timeout, vec, 0, 0, NULL, 0);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
@@ -220,7 +234,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
 	c.rw.appmask = cpu_to_le16(io.appmask);
 
 	return nvme_submit_user_cmd(ns->queue, &c,
-			nvme_to_user_ptr(io.addr), length,
+			io.addr, length,
 			metadata, meta_len, lower_32_bits(io.slba), NULL, 0,
 			false);
 }
@@ -274,7 +288,7 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		timeout = msecs_to_jiffies(cmd.timeout_ms);
 
 	status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
-			nvme_to_user_ptr(cmd.addr), cmd.data_len,
+			cmd.addr, cmd.data_len,
 			nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
 			0, &result, timeout, false);
 
@@ -320,7 +334,7 @@ static int nvme_user_cmd64(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		timeout = msecs_to_jiffies(cmd.timeout_ms);
 
 	status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
-			nvme_to_user_ptr(cmd.addr), cmd.data_len,
+			cmd.addr, cmd.data_len,
 			nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
 			0, &cmd.result, timeout, vec);
 
@@ -457,11 +471,11 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		rq_flags |= REQ_POLLED;
 
 retry:
-	req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr),
+	req = nvme_alloc_user_request(q, &c, d.addr,
 			d.data_len, nvme_to_user_ptr(d.metadata),
 			d.metadata_len, 0, &meta, d.timeout_ms ?
 			msecs_to_jiffies(d.timeout_ms) : 0, vec, rq_flags,
-			blk_flags);
+			blk_flags, ioucmd, issue_flags & IO_URING_F_FIXEDBUFS);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 	req->end_io = nvme_uring_cmd_end_io;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed
  2022-09-05 13:48     ` [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed Kanchan Joshi
@ 2022-09-05 17:50       ` Jens Axboe
  2022-09-05 17:53       ` Jens Axboe
  1 sibling, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2022-09-05 17:50 UTC (permalink / raw)
  To: Kanchan Joshi, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta

On 9/5/22 7:48 AM, Kanchan Joshi wrote:
> From: Anuj Gupta <[email protected]>
> 
> This is a new helper that callers can use to obtain a bvec iterator for
> the previously mapped buffer. This is preparatory work to enable
> fixed-buffer support for io_uring_cmd.
> 
> Signed-off-by: Anuj Gupta <[email protected]>
> Signed-off-by: Kanchan Joshi <[email protected]>
> ---
>  include/linux/io_uring.h |  8 ++++++++
>  io_uring/uring_cmd.c     | 11 +++++++++++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
> index 58676c0a398f..dba6fb47aa6c 100644
> --- a/include/linux/io_uring.h
> +++ b/include/linux/io_uring.h
> @@ -4,6 +4,7 @@
>  
>  #include <linux/sched.h>
>  #include <linux/xarray.h>
> +#include <uapi/linux/io_uring.h>
>  
>  enum io_uring_cmd_flags {
>  	IO_URING_F_COMPLETE_DEFER	= 1,
> @@ -32,6 +33,8 @@ struct io_uring_cmd {
>  };
>  
>  #if defined(CONFIG_IO_URING)
> +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
> +		struct iov_iter *iter, void *ioucmd);
>  void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2);
>  void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
>  			void (*task_work_cb)(struct io_uring_cmd *));
> @@ -59,6 +62,11 @@ static inline void io_uring_free(struct task_struct *tsk)
>  		__io_uring_free(tsk);
>  }
>  #else
> +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
> +		struct iov_iter *iter, void *ioucmd)
> +{
> +	return -1;
> +}

Is this right? Shouldn't it return -EOPNOTSUPP or another suitable actual
error value?

Apart from that, I think the patchset looks fine now.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed
  2022-09-05 13:48     ` [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed Kanchan Joshi
  2022-09-05 17:50       ` Jens Axboe
@ 2022-09-05 17:53       ` Jens Axboe
  1 sibling, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2022-09-05 17:53 UTC (permalink / raw)
  To: Kanchan Joshi, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta

On 9/5/22 7:48 AM, Kanchan Joshi wrote:
> @@ -124,3 +125,13 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
>  
>  	return IOU_ISSUE_SKIP_COMPLETE;
>  }
> +
> +int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len,
> +		int rw, struct iov_iter *iter, void *ioucmd)
> +{
> +	struct io_kiocb *req = cmd_to_io_kiocb(ioucmd);
> +	struct io_mapped_ubuf *imu = req->imu;
> +
> +	return io_import_fixed(rw, iter, imu, ubuf, len);
> +}
> +EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed);

Oh, and since we're probably respinning this one anyway, I'd do:

int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
			      struct iov_iter *iter, void *ioucmd)
{
	struct io_kiocb *req = cmd_to_io_kiocb(ioucmd);

	return io_import_fixed(rw, iter, req->imu, ubuf, len);
}
EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed);

to both fix the indentation and get rid of the 'imu' variable that isn't
really necessary.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 2/4] io_uring: introduce fixed buffer support for io_uring_cmd
  2022-09-05 13:48     ` [PATCH for-next v4 2/4] io_uring: introduce fixed buffer support for io_uring_cmd Kanchan Joshi
@ 2022-09-05 17:54       ` Jens Axboe
  0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2022-09-05 17:54 UTC (permalink / raw)
  To: Kanchan Joshi, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta

On 9/5/22 7:48 AM, Kanchan Joshi wrote:
> @@ -76,8 +77,21 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  {
>  	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
>  
> -	if (sqe->rw_flags || sqe->__pad1)
> +	if (sqe->__pad1)
>  		return -EINVAL;
> +
> +	ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags);
> +	req->buf_index = READ_ONCE(sqe->buf_index);
> +	if (ioucmd->flags & IORING_URING_CMD_FIXED) {
> +		struct io_ring_ctx *ctx = req->ctx;
> +		u16 index;
> +
> +		if (unlikely(req->buf_index >= ctx->nr_user_bufs))
> +			return -EFAULT;
> +		index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
> +		req->imu = ctx->user_bufs[index];
> +		io_req_set_rsrc_node(req, ctx, 0);
> +	}

Should that buf_index read and assignment be inside the
IORING_URING_CMD_FIXED section?

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough
  2022-09-05 13:48     ` [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough Kanchan Joshi
@ 2022-09-06  6:25       ` Christoph Hellwig
  2022-09-06  6:33         ` Kanchan Joshi
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2022-09-06  6:25 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: axboe, hch, kbusch, asml.silence, io-uring, linux-nvme,
	linux-block, gost.dev, Anuj Gupta

On Mon, Sep 05, 2022 at 07:18:32PM +0530, Kanchan Joshi wrote:
> +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
>  		gfp_t gfp_mask)
>  {
> @@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>  		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
>  					&fs_bio_set);
>  		if (!bio)
> -			return -ENOMEM;
> +			return NULL;

This context looks weird?  That bio_alloc_bioset should not be there,
as biosets are only used for file system I/O, which this is not.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough
  2022-09-06  6:25       ` Christoph Hellwig
@ 2022-09-06  6:33         ` Kanchan Joshi
  2022-09-06  6:47           ` Kanchan Joshi
  2022-09-06  6:51           ` Christoph Hellwig
  0 siblings, 2 replies; 13+ messages in thread
From: Kanchan Joshi @ 2022-09-06  6:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, kbusch, asml.silence, io-uring, linux-nvme, linux-block,
	gost.dev, Anuj Gupta

[-- Attachment #1: Type: text/plain, Size: 775 bytes --]

On Tue, Sep 06, 2022 at 08:25:22AM +0200, Christoph Hellwig wrote:
>On Mon, Sep 05, 2022 at 07:18:32PM +0530, Kanchan Joshi wrote:
>> +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
>>  		gfp_t gfp_mask)
>>  {
>> @@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>>  		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
>>  					&fs_bio_set);
>>  		if (!bio)
>> -			return -ENOMEM;
>> +			return NULL;
>
>This context looks weird?  That bio_alloc_bioset should not be there,
>as biosets are only used for file system I/O, which this is not.

if you think it's a deal-breaker, maybe I can add a new bioset in nvme and
pass that as argument to this helper. Would you prefer that over the
current approach.

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough
  2022-09-06  6:33         ` Kanchan Joshi
@ 2022-09-06  6:47           ` Kanchan Joshi
  2022-09-06  6:51           ` Christoph Hellwig
  1 sibling, 0 replies; 13+ messages in thread
From: Kanchan Joshi @ 2022-09-06  6:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, kbusch, asml.silence, io-uring, linux-nvme, linux-block,
	gost.dev, Anuj Gupta

[-- Attachment #1: Type: text/plain, Size: 1196 bytes --]

On Tue, Sep 06, 2022 at 12:03:29PM +0530, Kanchan Joshi wrote:
>On Tue, Sep 06, 2022 at 08:25:22AM +0200, Christoph Hellwig wrote:
>>On Mon, Sep 05, 2022 at 07:18:32PM +0530, Kanchan Joshi wrote:
>>>+static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
>>> 		gfp_t gfp_mask)
>>> {
>>>@@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>>> 		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
>>> 					&fs_bio_set);
>>> 		if (!bio)
>>>-			return -ENOMEM;
>>>+			return NULL;
>>
>>This context looks weird?  That bio_alloc_bioset should not be there,
>>as biosets are only used for file system I/O, which this is not.
>
>if you think it's a deal-breaker, maybe I can add a new bioset in nvme and
>pass that as argument to this helper. Would you prefer that over the
>current approach.

seems I responded without looking carefully. The bioset addition is not
part of this series. It got added earlier [1] [2], as part of optimizing
polled-io on file/passthru path.

[1] https://lore.kernel.org/linux-block/[email protected]/
[2] https://lore.kernel.org/linux-block/[email protected]/

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough
  2022-09-06  6:33         ` Kanchan Joshi
  2022-09-06  6:47           ` Kanchan Joshi
@ 2022-09-06  6:51           ` Christoph Hellwig
  2022-09-06 13:06             ` Jens Axboe
  1 sibling, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2022-09-06  6:51 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: Christoph Hellwig, axboe, kbusch, asml.silence, io-uring,
	linux-nvme, linux-block, gost.dev, Anuj Gupta

On Tue, Sep 06, 2022 at 12:03:29PM +0530, Kanchan Joshi wrote:
>> This context looks weird?  That bio_alloc_bioset should not be there,
>> as biosets are only used for file system I/O, which this is not.
>
> if you think it's a deal-breaker, maybe I can add a new bioset in nvme and
> pass that as argument to this helper. Would you prefer that over the
> current approach.

The whole point is that biosets exist to allow for forward progress
guarantees required for file system I/O.  For passthrough I/O
bio_kmalloc is perfectly fine and much simpler.  Adding yet another
bio_set just makes things even worse.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough
  2022-09-06  6:51           ` Christoph Hellwig
@ 2022-09-06 13:06             ` Jens Axboe
  0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2022-09-06 13:06 UTC (permalink / raw)
  To: Christoph Hellwig, Kanchan Joshi
  Cc: kbusch, asml.silence, io-uring, linux-nvme, linux-block, gost.dev,
	Anuj Gupta

On 9/6/22 12:51 AM, Christoph Hellwig wrote:
> On Tue, Sep 06, 2022 at 12:03:29PM +0530, Kanchan Joshi wrote:
>>> This context looks weird?  That bio_alloc_bioset should not be there,
>>> as biosets are only used for file system I/O, which this is not.
>>
>> if you think it's a deal-breaker, maybe I can add a new bioset in nvme and
>> pass that as argument to this helper. Would you prefer that over the
>> current approach.
> 
> The whole point is that biosets exist to allow for forward progress
> guarantees required for file system I/O.  For passthrough I/O
> bio_kmalloc is perfectly fine and much simpler.  Adding yet another
> bio_set just makes things even worse.

It's a performance concern too, efficiency is much worse by using
kmalloc+kfree for passthrough. You don't get bio caching that way.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-09-06 13:06 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CGME20220905135842epcas5p4835d74beb091f5f50490714d93fc58f2@epcas5p4.samsung.com>
2022-09-05 13:48 ` [PATCH for-next v4 0/4] fixed-buffer for uring-cmd/passthru Kanchan Joshi
     [not found]   ` <CGME20220905135846epcas5p4fde0fc96442adc3cf11319375ba2596b@epcas5p4.samsung.com>
2022-09-05 13:48     ` [PATCH for-next v4 1/4] io_uring: introduce io_uring_cmd_import_fixed Kanchan Joshi
2022-09-05 17:50       ` Jens Axboe
2022-09-05 17:53       ` Jens Axboe
     [not found]   ` <CGME20220905135848epcas5p445275a3af56a26a036878fe8a8bcb55f@epcas5p4.samsung.com>
2022-09-05 13:48     ` [PATCH for-next v4 2/4] io_uring: introduce fixed buffer support for io_uring_cmd Kanchan Joshi
2022-09-05 17:54       ` Jens Axboe
     [not found]   ` <CGME20220905135851epcas5p3d107b140fd6cba1feb338c1a31c4feb1@epcas5p3.samsung.com>
2022-09-05 13:48     ` [PATCH for-next v4 3/4] block: add helper to map bvec iterator for passthrough Kanchan Joshi
2022-09-06  6:25       ` Christoph Hellwig
2022-09-06  6:33         ` Kanchan Joshi
2022-09-06  6:47           ` Kanchan Joshi
2022-09-06  6:51           ` Christoph Hellwig
2022-09-06 13:06             ` Jens Axboe
     [not found]   ` <CGME20220905135854epcas5p2256848a964afec41f46502b0114698e2@epcas5p2.samsung.com>
2022-09-05 13:48     ` [PATCH for-next v4 4/4] nvme: wire up fixed buffer support for nvme passthrough Kanchan Joshi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox