public inbox for [email protected]
* [PATCH for-next v5 0/4] fixed-buffer for uring-cmd/passthru
       [not found] <CGME20220906063719epcas5p3157e79583a5412a3be81f3d96f8aaadd@epcas5p3.samsung.com>
@ 2022-09-06  6:27 ` Kanchan Joshi
       [not found]   ` <CGME20220906063723epcas5p23946fd33031aee591210af1c3cd2d574@epcas5p2.samsung.com>
                     ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Kanchan Joshi @ 2022-09-06  6:27 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Kanchan Joshi

Hi,

Currently uring-cmd lacks the ability to leverage pre-registered
buffers. This series adds that support to uring-cmd and plumbs
nvme passthrough to work with it.

In my testing, using registered buffers lifted IOPS from 1.9M to 2.2M.

Patch 1, 3 = prep/infrastructure
Patch 2 = expand io_uring command to use registered-buffers
Patch 4 = expand nvme passthrough to use registered-buffers

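For anyone who wants to try this out, here is a rough user-space sketch of
the intended usage (liburing style, one NVMe read through the generic char
device, error handling omitted). It is illustrative only: the device path,
namespace id and LBA size below are assumptions, and building it needs uapi
headers that carry this series.

#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>
#include <liburing.h>
#include <linux/nvme_ioctl.h>

#define BUF_SZ	4096

int main(void)
{
	static char buf[BUF_SZ] __attribute__((aligned(4096)));
	struct iovec iov = { .iov_base = buf, .iov_len = BUF_SZ };
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	struct nvme_uring_cmd *cmd;
	int fd = open("/dev/ng0n1", O_RDWR);
	int ret;

	/* passthrough commands need the big SQE/CQE variants */
	io_uring_queue_init(8, &ring, IORING_SETUP_SQE128 | IORING_SETUP_CQE32);
	/* register the buffer once; submissions refer to it by index */
	io_uring_register_buffers(&ring, &iov, 1);

	sqe = io_uring_get_sqe(&ring);
	memset(sqe, 0, 2 * sizeof(*sqe));
	sqe->opcode = IORING_OP_URING_CMD;
	sqe->fd = fd;
	sqe->cmd_op = NVME_URING_CMD_IO;
	sqe->uring_cmd_flags = IORING_URING_CMD_FIXED;	/* added by patch 2 */
	sqe->buf_index = 0;				/* slot registered above */

	cmd = (struct nvme_uring_cmd *)sqe->cmd;
	cmd->opcode = 0x02;			/* NVMe read */
	cmd->nsid = 1;				/* assumes namespace 1 */
	cmd->addr = (__u64)(uintptr_t)buf;	/* must lie inside registered buffer 0 */
	cmd->data_len = BUF_SZ;
	cmd->cdw10 = 0;				/* starting LBA (lower 32 bits) */
	cmd->cdw12 = (BUF_SZ / 512) - 1;	/* 0-based block count, assumes 512B LBAs */

	io_uring_submit(&ring);
	io_uring_wait_cqe(&ring, &cqe);
	ret = cqe->res;				/* 0 on success */
	io_uring_cqe_seen(&ring, cqe);
	return ret;
}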

Changes since v4:
- Patch 1, 2: folded all review comments of Jens

Changes since v3:
- uring_cmd_flags, change from u16 to u32 (Jens)
- patch 3, add another helper to reduce code-duplication (Jens)

Changes since v2:
- Kill the new opcode, add a flag instead (Pavel)
- Fix standalone build issue with patch 1 (Pavel)

Changes since v1:
- Fix a naming issue for an exported helper

Anuj Gupta (2):
  io_uring: introduce io_uring_cmd_import_fixed
  io_uring: introduce fixed buffer support for io_uring_cmd

Kanchan Joshi (2):
  block: add helper to map bvec iterator for passthrough
  nvme: wire up fixed buffer support for nvme passthrough

 block/blk-map.c               | 94 +++++++++++++++++++++++++++++++----
 drivers/nvme/host/ioctl.c     | 38 +++++++++-----
 include/linux/blk-mq.h        |  1 +
 include/linux/io_uring.h      | 11 +++-
 include/uapi/linux/io_uring.h |  9 ++++
 io_uring/uring_cmd.c          | 28 ++++++++++-
 6 files changed, 157 insertions(+), 24 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH for-next v5 1/4] io_uring: introduce io_uring_cmd_import_fixed
       [not found]   ` <CGME20220906063723epcas5p23946fd33031aee591210af1c3cd2d574@epcas5p2.samsung.com>
@ 2022-09-06  6:27     ` Kanchan Joshi
  0 siblings, 0 replies; 12+ messages in thread
From: Kanchan Joshi @ 2022-09-06  6:27 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta,
	Kanchan Joshi

From: Anuj Gupta <[email protected]>

This adds a new helper that callers can use to obtain a bvec iterator for
a previously registered buffer. This is preparatory work to enable
fixed-buffer support for io_uring_cmd.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 include/linux/io_uring.h |  8 ++++++++
 io_uring/uring_cmd.c     | 10 ++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 58676c0a398f..202d90bc2c88 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -4,6 +4,7 @@
 
 #include <linux/sched.h>
 #include <linux/xarray.h>
+#include <uapi/linux/io_uring.h>
 
 enum io_uring_cmd_flags {
 	IO_URING_F_COMPLETE_DEFER	= 1,
@@ -32,6 +33,8 @@ struct io_uring_cmd {
 };
 
 #if defined(CONFIG_IO_URING)
+int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
+			      struct iov_iter *iter, void *ioucmd);
 void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2);
 void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
 			void (*task_work_cb)(struct io_uring_cmd *));
@@ -59,6 +62,11 @@ static inline void io_uring_free(struct task_struct *tsk)
 		__io_uring_free(tsk);
 }
 #else
+static inline int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
+			      struct iov_iter *iter, void *ioucmd)
+{
+	return -EOPNOTSUPP;
+}
 static inline void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret,
 		ssize_t ret2)
 {
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 6f99dbd5d550..b8f4dc84c403 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -7,6 +7,7 @@
 #include <uapi/linux/io_uring.h>
 
 #include "io_uring.h"
+#include "rsrc.h"
 #include "uring_cmd.h"
 
 static void io_uring_cmd_work(struct io_kiocb *req, bool *locked)
@@ -124,3 +125,12 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
 
 	return IOU_ISSUE_SKIP_COMPLETE;
 }
+
+int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
+			      struct iov_iter *iter, void *ioucmd)
+{
+	struct io_kiocb *req = cmd_to_io_kiocb(ioucmd);
+
+	return io_import_fixed(rw, iter, req->imu, ubuf, len);
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH for-next v5 2/4] io_uring: introduce fixed buffer support for io_uring_cmd
       [not found]   ` <CGME20220906063726epcas5p42f764b4c01b841dd1fc34abebcab02e6@epcas5p4.samsung.com>
@ 2022-09-06  6:27     ` Kanchan Joshi
  0 siblings, 0 replies; 12+ messages in thread
From: Kanchan Joshi @ 2022-09-06  6:27 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Anuj Gupta,
	Kanchan Joshi

From: Anuj Gupta <[email protected]>

Add the IORING_URING_CMD_FIXED flag, to be used when sending an io_uring
command with a previously registered buffer. User-space passes the buffer
index in sqe->buf_index, the same as is done in the read/write variants
that use fixed buffers.
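From the submission side this amounts to roughly the following sketch, where
idx is whatever slot was registered earlier via IORING_REGISTER_BUFFERS:

	sqe->opcode = IORING_OP_URING_CMD;
	sqe->uring_cmd_flags = IORING_URING_CMD_FIXED;
	sqe->buf_index = idx;	/* pre-registered buffer to use */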

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 include/linux/io_uring.h      |  3 ++-
 include/uapi/linux/io_uring.h |  9 +++++++++
 io_uring/uring_cmd.c          | 18 +++++++++++++++++-
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 202d90bc2c88..621b2c5469ed 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -16,6 +16,7 @@ enum io_uring_cmd_flags {
 	IO_URING_F_SQE128		= 4,
 	IO_URING_F_CQE32		= 8,
 	IO_URING_F_IOPOLL		= 16,
+	IO_URING_F_FIXEDBUFS		= 32,
 };
 
 struct io_uring_cmd {
@@ -28,7 +29,7 @@ struct io_uring_cmd {
 		void *cookie;
 	};
 	u32		cmd_op;
-	u32		pad;
+	u32		flags;
 	u8		pdu[32]; /* available inline for free use */
 };
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 48e5c70e0baf..34be8dd31f17 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -56,6 +56,7 @@ struct io_uring_sqe {
 		__u32		hardlink_flags;
 		__u32		xattr_flags;
 		__u32		msg_ring_flags;
+		__u32		uring_cmd_flags;
 	};
 	__u64	user_data;	/* data to be passed back at completion time */
 	/* pack this to avoid bogus arm OABI complaints */
@@ -219,6 +220,14 @@ enum io_uring_op {
 	IORING_OP_LAST,
 };
 
+/*
+ * sqe->uring_cmd_flags
+ * IORING_URING_CMD_FIXED	use registered buffer; pass this flag
+ *				along with setting sqe->buf_index.
+ */
+#define IORING_URING_CMD_FIXED	(1U << 0)
+
+
 /*
  * sqe->fsync_flags
  */
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index b8f4dc84c403..f9189885edec 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -3,6 +3,7 @@
 #include <linux/errno.h>
 #include <linux/file.h>
 #include <linux/io_uring.h>
+#include <linux/nospec.h>
 
 #include <uapi/linux/io_uring.h>
 
@@ -76,8 +77,21 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
 
-	if (sqe->rw_flags || sqe->__pad1)
+	if (sqe->__pad1)
 		return -EINVAL;
+
+	ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags);
+	if (ioucmd->flags & IORING_URING_CMD_FIXED) {
+		struct io_ring_ctx *ctx = req->ctx;
+		u16 index;
+
+		req->buf_index = READ_ONCE(sqe->buf_index);
+		if (unlikely(req->buf_index >= ctx->nr_user_bufs))
+			return -EFAULT;
+		index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
+		req->imu = ctx->user_bufs[index];
+		io_req_set_rsrc_node(req, ctx, 0);
+	}
 	ioucmd->cmd = sqe->cmd;
 	ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
 	return 0;
@@ -102,6 +116,8 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
 		req->iopoll_completed = 0;
 		WRITE_ONCE(ioucmd->cookie, NULL);
 	}
+	if (ioucmd->flags & IORING_URING_CMD_FIXED)
+		issue_flags |= IO_URING_F_FIXEDBUFS;
 
 	if (req_has_async_data(req))
 		ioucmd->cmd = req->async_data;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH for-next v5 3/4] block: add helper to map bvec iterator for passthrough
       [not found]   ` <CGME20220906063729epcas5p1bf05e6873de0f7246234380d66c21fb9@epcas5p1.samsung.com>
@ 2022-09-06  6:27     ` Kanchan Joshi
  2022-09-07 15:32       ` Chaitanya Kulkarni
  0 siblings, 1 reply; 12+ messages in thread
From: Kanchan Joshi @ 2022-09-06  6:27 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Kanchan Joshi,
	Anuj Gupta

Add blk_rq_map_user_bvec, which maps a bvec iterator into a bio and
places that bio into the request. This helper will be used by nvme for
uring passthrough with fixed buffers.
While at it, create another helper, bio_map_get, to reduce code
duplication.
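For context, the nvme call-site (wired up in the next patch) is expected to
look roughly like the sketch below, with the iov_iter obtained from
io_uring_cmd_import_fixed():

	struct iov_iter iter;

	ret = io_uring_cmd_import_fixed(ubuffer, bufflen, rq_data_dir(req),
					&iter, ioucmd);
	if (ret < 0)
		goto out;
	ret = blk_rq_map_user_bvec(req, &iter);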

Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
---
 block/blk-map.c        | 94 +++++++++++++++++++++++++++++++++++++-----
 include/linux/blk-mq.h |  1 +
 2 files changed, 85 insertions(+), 10 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index f3768876d618..e2f268167342 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -241,17 +241,10 @@ static void bio_map_put(struct bio *bio)
 	}
 }
 
-static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
+static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
 		gfp_t gfp_mask)
 {
-	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
-	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
 	struct bio *bio;
-	int ret;
-	int j;
-
-	if (!iov_iter_count(iter))
-		return -EINVAL;
 
 	if (rq->cmd_flags & REQ_POLLED) {
 		blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
@@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
 					&fs_bio_set);
 		if (!bio)
-			return -ENOMEM;
+			return NULL;
 	} else {
 		bio = bio_kmalloc(nr_vecs, gfp_mask);
 		if (!bio)
-			return -ENOMEM;
+			return NULL;
 		bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
 	}
+	return bio;
+}
+
+static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
+		gfp_t gfp_mask)
+{
+	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
+	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
+	struct bio *bio;
+	int ret;
+	int j;
+
+	if (!iov_iter_count(iter))
+		return -EINVAL;
+
+	bio = bio_map_get(rq, nr_vecs, gfp_mask);
+	if (bio == NULL)
+		return -ENOMEM;
 
 	while (iov_iter_count(iter)) {
 		struct page **pages, *stack_pages[UIO_FASTIOV];
@@ -612,6 +623,69 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
 }
 EXPORT_SYMBOL(blk_rq_map_user);
 
+/* Prepare bio for passthrough IO given an existing bvec iter */
+int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter)
+{
+	struct request_queue *q = rq->q;
+	size_t iter_count, nr_segs;
+	struct bio *bio;
+	struct bio_vec *bv, *bvec_arr, *bvprvp = NULL;
+	struct queue_limits *lim = &q->limits;
+	unsigned int nsegs = 0, bytes = 0;
+	int ret, i;
+
+	iter_count = iov_iter_count(iter);
+	nr_segs = iter->nr_segs;
+
+	if (!iter_count || (iter_count >> 9) > queue_max_hw_sectors(q))
+		return -EINVAL;
+	if (nr_segs > queue_max_segments(q))
+		return -EINVAL;
+
+	/* no iovecs to alloc, as we already have a BVEC iterator */
+	bio = bio_map_get(rq, 0, GFP_KERNEL);
+	if (bio == NULL)
+		return -ENOMEM;
+
+	bio_iov_bvec_set(bio, iter);
+	blk_rq_bio_prep(rq, bio, nr_segs);
+
+	/* loop to perform a bunch of sanity checks */
+	bvec_arr = (struct bio_vec *)iter->bvec;
+	for (i = 0; i < nr_segs; i++) {
+		bv = &bvec_arr[i];
+		/*
+		 * If the queue doesn't support SG gaps and adding this
+		 * offset would create a gap, disallow it.
+		 */
+		if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset)) {
+			ret = -EINVAL;
+			goto out_free;
+		}
+
+		/* check full condition */
+		if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len) {
+			ret = -EINVAL;
+			goto out_free;
+		}
+
+		if (bytes + bv->bv_len <= iter_count &&
+				bv->bv_offset + bv->bv_len <= PAGE_SIZE) {
+			nsegs++;
+			bytes += bv->bv_len;
+		} else {
+			ret = -EINVAL;
+			goto out_free;
+		}
+		bvprvp = bv;
+	}
+	return 0;
+out_free:
+	bio_map_put(bio);
+	return ret;
+}
+EXPORT_SYMBOL(blk_rq_map_user_bvec);
+
 /**
  * blk_rq_unmap_user - unmap a request with user data
  * @bio:	       start of bio list
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index b43c81d91892..83bef362f0f9 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -970,6 +970,7 @@ struct rq_map_data {
 	bool from_user;
 };
 
+int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter);
 int blk_rq_map_user(struct request_queue *, struct request *,
 		struct rq_map_data *, void __user *, unsigned long, gfp_t);
 int blk_rq_map_user_iov(struct request_queue *, struct request *,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH for-next v5 4/4] nvme: wire up fixed buffer support for nvme passthrough
       [not found]   ` <CGME20220906063733epcas5p22984174bd6dbb2571152fea18af90924@epcas5p2.samsung.com>
@ 2022-09-06  6:27     ` Kanchan Joshi
  2022-09-07 14:51       ` Chaitanya Kulkarni
  0 siblings, 1 reply; 12+ messages in thread
From: Kanchan Joshi @ 2022-09-06  6:27 UTC (permalink / raw)
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, gost.dev, Kanchan Joshi,
	Anuj Gupta

If io_uring sends a passthrough command with the IO_URING_F_FIXEDBUFS flag,
use the pre-registered buffer to form the bio.
While at it, modify nvme_submit_user_cmd to take ubuffer as a plain integer
argument, and do away with the nvme_to_user_ptr conversion in the callers.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
---
 drivers/nvme/host/ioctl.c | 38 ++++++++++++++++++++++++++------------
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 548aca8b5b9f..4341d758d6b9 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -65,10 +65,11 @@ static int nvme_finish_user_metadata(struct request *req, void __user *ubuf,
 }
 
 static struct request *nvme_alloc_user_request(struct request_queue *q,
-		struct nvme_command *cmd, void __user *ubuffer,
+		struct nvme_command *cmd, u64 ubuffer,
 		unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
 		u32 meta_seed, void **metap, unsigned timeout, bool vec,
-		blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags)
+		blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags,
+		struct io_uring_cmd *ioucmd, bool fixedbufs)
 {
 	bool write = nvme_is_write(cmd);
 	struct nvme_ns *ns = q->queuedata;
@@ -89,14 +90,27 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
 
 	if (ubuffer && bufflen) {
 		if (!vec)
-			ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen,
-				GFP_KERNEL);
+			if (fixedbufs) {
+				struct iov_iter iter;
+
+				ret = io_uring_cmd_import_fixed(ubuffer,
+						bufflen, rq_data_dir(req),
+						&iter, ioucmd);
+				if (ret < 0)
+					goto out;
+				ret = blk_rq_map_user_bvec(req, &iter);
+			} else {
+				ret = blk_rq_map_user(q, req, NULL,
+						nvme_to_user_ptr(ubuffer),
+						bufflen, GFP_KERNEL);
+			}
 		else {
 			struct iovec fast_iov[UIO_FASTIOV];
 			struct iovec *iov = fast_iov;
 			struct iov_iter iter;
 
-			ret = import_iovec(rq_data_dir(req), ubuffer, bufflen,
+			ret = import_iovec(rq_data_dir(req),
+					nvme_to_user_ptr(ubuffer), bufflen,
 					UIO_FASTIOV, &iov, &iter);
 			if (ret < 0)
 				goto out;
@@ -132,7 +146,7 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
 }
 
 static int nvme_submit_user_cmd(struct request_queue *q,
-		struct nvme_command *cmd, void __user *ubuffer,
+		struct nvme_command *cmd, u64 ubuffer,
 		unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
 		u32 meta_seed, u64 *result, unsigned timeout, bool vec)
 {
@@ -142,7 +156,7 @@ static int nvme_submit_user_cmd(struct request_queue *q,
 	int ret;
 
 	req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
-			meta_len, meta_seed, &meta, timeout, vec, 0, 0);
+			meta_len, meta_seed, &meta, timeout, vec, 0, 0, NULL, 0);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
@@ -220,7 +234,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
 	c.rw.appmask = cpu_to_le16(io.appmask);
 
 	return nvme_submit_user_cmd(ns->queue, &c,
-			nvme_to_user_ptr(io.addr), length,
+			io.addr, length,
 			metadata, meta_len, lower_32_bits(io.slba), NULL, 0,
 			false);
 }
@@ -274,7 +288,7 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		timeout = msecs_to_jiffies(cmd.timeout_ms);
 
 	status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
-			nvme_to_user_ptr(cmd.addr), cmd.data_len,
+			cmd.addr, cmd.data_len,
 			nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
 			0, &result, timeout, false);
 
@@ -320,7 +334,7 @@ static int nvme_user_cmd64(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		timeout = msecs_to_jiffies(cmd.timeout_ms);
 
 	status = nvme_submit_user_cmd(ns ? ns->queue : ctrl->admin_q, &c,
-			nvme_to_user_ptr(cmd.addr), cmd.data_len,
+			cmd.addr, cmd.data_len,
 			nvme_to_user_ptr(cmd.metadata), cmd.metadata_len,
 			0, &cmd.result, timeout, vec);
 
@@ -457,11 +471,11 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		rq_flags |= REQ_POLLED;
 
 retry:
-	req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr),
+	req = nvme_alloc_user_request(q, &c, d.addr,
 			d.data_len, nvme_to_user_ptr(d.metadata),
 			d.metadata_len, 0, &meta, d.timeout_ms ?
 			msecs_to_jiffies(d.timeout_ms) : 0, vec, rq_flags,
-			blk_flags);
+			blk_flags, ioucmd, issue_flags & IO_URING_F_FIXEDBUFS);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 	req->end_io = nvme_uring_cmd_end_io;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v5 4/4] nvme: wire up fixed buffer support for nvme passthrough
  2022-09-06  6:27     ` [PATCH for-next v5 4/4] nvme: wire up fixed buffer support for nvme passthrough Kanchan Joshi
@ 2022-09-07 14:51       ` Chaitanya Kulkarni
  2022-09-08 10:47         ` Kanchan Joshi
  0 siblings, 1 reply; 12+ messages in thread
From: Chaitanya Kulkarni @ 2022-09-07 14:51 UTC (permalink / raw)
  To: Kanchan Joshi, [email protected], [email protected], [email protected]
  Cc: [email protected], [email protected],
	[email protected]


>   	req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
> -			meta_len, meta_seed, &meta, timeout, vec, 0, 0);
> +			meta_len, meta_seed, &meta, timeout, vec, 0, 0, NULL, 0);
>   	if (IS_ERR(req))
>   		return PTR_ERR(req);

14 arguments to the function!

Kanchan, I'm not pointing at this patch specifically; it has happened
over the years. I think it is high time we find a way to trim this
down.

The least we can do is pass a structure rather than 14 different
arguments. Is everyone okay with that?

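Something along these lines is what I have in mind (purely illustrative;
the struct name and exact members below are made up here, not a proposal
for the final shape):

struct nvme_user_cmd_args {
	u64			ubuffer;
	unsigned		bufflen;
	void __user		*meta_buffer;
	unsigned		meta_len;
	u32			meta_seed;
	void			**metap;
	unsigned		timeout;
	bool			vec;
	blk_opf_t		rq_flags;
	blk_mq_req_flags_t	blk_flags;
	struct io_uring_cmd	*ioucmd;
	bool			fixedbufs;
};

and then something like:

	req = nvme_alloc_user_request(q, cmd, &args);
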
-ck


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v5 3/4] block: add helper to map bvec iterator for passthrough
  2022-09-06  6:27     ` [PATCH for-next v5 3/4] block: add helper to map bvec iterator for passthrough Kanchan Joshi
@ 2022-09-07 15:32       ` Chaitanya Kulkarni
  2022-09-08 10:52         ` Kanchan Joshi
  0 siblings, 1 reply; 12+ messages in thread
From: Chaitanya Kulkarni @ 2022-09-07 15:32 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: [email protected], [email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected], Anuj Gupta

On 9/5/22 23:27, Kanchan Joshi wrote:
> Add blk_rq_map_user_bvec which maps the bvec iterator into a bio and
> places that into the request. This helper will be used in nvme for
> uring-passthrough with fixed-buffer.
> While at it, create another helper bio_map_get to reduce the code
> duplication.
> 
> Signed-off-by: Kanchan Joshi <[email protected]>
> Signed-off-by: Anuj Gupta <[email protected]>
> ---
>   block/blk-map.c        | 94 +++++++++++++++++++++++++++++++++++++-----
>   include/linux/blk-mq.h |  1 +
>   2 files changed, 85 insertions(+), 10 deletions(-)
> 
> diff --git a/block/blk-map.c b/block/blk-map.c
> index f3768876d618..e2f268167342 100644
> --- a/block/blk-map.c
> +++ b/block/blk-map.c
> @@ -241,17 +241,10 @@ static void bio_map_put(struct bio *bio)
>   	}
>   }
>   
> -static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
> +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
>   		gfp_t gfp_mask)
>   {
> -	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
> -	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
>   	struct bio *bio;
> -	int ret;
> -	int j;
> -
> -	if (!iov_iter_count(iter))
> -		return -EINVAL;
>   
>   	if (rq->cmd_flags & REQ_POLLED) {
>   		blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
> @@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>   		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
>   					&fs_bio_set);
>   		if (!bio)
> -			return -ENOMEM;
> +			return NULL;
>   	} else {
>   		bio = bio_kmalloc(nr_vecs, gfp_mask);
>   		if (!bio)
> -			return -ENOMEM;
> +			return NULL;
>   		bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
>   	}
> +	return bio;
> +}
> +
> +static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
> +		gfp_t gfp_mask)
> +{
> +	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
> +	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
> +	struct bio *bio;
> +	int ret;
> +	int j;
> +
> +	if (!iov_iter_count(iter))
> +		return -EINVAL;
> +
> +	bio = bio_map_get(rq, nr_vecs, gfp_mask);
> +	if (bio == NULL)
> +		return -ENOMEM;
>   
>   	while (iov_iter_count(iter)) {
>   		struct page **pages, *stack_pages[UIO_FASTIOV];
> @@ -612,6 +623,69 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
>   }
>   EXPORT_SYMBOL(blk_rq_map_user);
>   
> +/* Prepare bio for passthrough IO given an existing bvec iter */
> +int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter)
> +{
> +	struct request_queue *q = rq->q;
> +	size_t iter_count, nr_segs;
> +	struct bio *bio;
> +	struct bio_vec *bv, *bvec_arr, *bvprvp = NULL;
> +	struct queue_limits *lim = &q->limits;
> +	unsigned int nsegs = 0, bytes = 0;
> +	int ret, i;
> +

Consider this (untested); it also gives the loop variable i the same
data type as the variable it is compared against in the loop, nr_segs,
i.e. size_t:

+       struct bio_vec *bv, *bvec_arr, *bvprvp = NULL;
+       struct request_queue *q = rq->q;
+       struct queue_limits *lim = &q->limits;
+       unsigned int nsegs = 0, bytes = 0;
+       size_t iter_count, nr_segs, i;
+       struct bio *bio;
+       int ret;


> +	iter_count = iov_iter_count(iter);
> +	nr_segs = iter->nr_segs;
> +
> +	if (!iter_count || (iter_count >> 9) > queue_max_hw_sectors(q))
> +		return -EINVAL;

Can we remove the parentheses around iter_count >> 9 without impacting
the intended functionality?

> +	if (nr_segs > queue_max_segments(q))
> +		return -EINVAL;
> +
> +	/* no iovecs to alloc, as we already have a BVEC iterator */
> +	bio = bio_map_get(rq, 0, GFP_KERNEL);
> +	if (bio == NULL)
> +		return -ENOMEM;
> +
> +	bio_iov_bvec_set(bio, iter);
> +	blk_rq_bio_prep(rq, bio, nr_segs);
> +
> +	/* loop to perform a bunch of sanity checks */
> +	bvec_arr = (struct bio_vec *)iter->bvec;
> +	for (i = 0; i < nr_segs; i++) {
> +		bv = &bvec_arr[i];

I'd just call it bvecs instead of bvec_arr; just a personal preference.

> +		/*
> +		 * If the queue doesn't support SG gaps and adding this
> +		 * offset would create a gap, disallow it.
> +		 */
> +		if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset)) {
> +			ret = -EINVAL;
> +			goto out_free;
> +		}
> +
> +		/* check full condition */
> +		if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len) {
> +			ret = -EINVAL;
> +			goto out_free;
> +		}
> +
> +		if (bytes + bv->bv_len <= iter_count &&
> +				bv->bv_offset + bv->bv_len <= PAGE_SIZE) {
> +			nsegs++;
> +			bytes += bv->bv_len;
> +		} else {
> +			ret = -EINVAL;
> +			goto out_free;

You are only calling goto out_free; from the loop, always with ret = -EINVAL.
You can drop the redundant ret = -EINVAL assignments in the loop and
return -EINVAL directly from the out_free: label instead of return ret.
That will also allow us to remove braces in the loop.

This will also allow us to remove the ret variable from the stack,
since I guess we are in the fast path.

> +		}
> +		bvprvp = bv;
> +	}
> +	return 0;
> +out_free:
> +	bio_map_put(bio);
> +	return ret;
> +}
> +EXPORT_SYMBOL(blk_rq_map_user_bvec);

Why not use EXPORT_SYMBOL_GPL() for the new addition?

I'm aware that blk-map.c only uses EXPORT_SYMBOL().

> +
>   /**
>    * blk_rq_unmap_user - unmap a request with user data
>    * @bio:	       start of bio list
> diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
> index b43c81d91892..83bef362f0f9 100644
> --- a/include/linux/blk-mq.h
> +++ b/include/linux/blk-mq.h
> @@ -970,6 +970,7 @@ struct rq_map_data {
>   	bool from_user;
>   };
>   
> +int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter);
>   int blk_rq_map_user(struct request_queue *, struct request *,
>   		struct rq_map_data *, void __user *, unsigned long, gfp_t);
>   int blk_rq_map_user_iov(struct request_queue *, struct request *,


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v5 4/4] nvme: wire up fixed buffer support for nvme passthrough
  2022-09-07 14:51       ` Chaitanya Kulkarni
@ 2022-09-08 10:47         ` Kanchan Joshi
  2022-09-08 14:50           ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Kanchan Joshi @ 2022-09-08 10:47 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: [email protected], [email protected], [email protected],
	[email protected], [email protected],
	[email protected]


On Wed, Sep 07, 2022 at 02:51:31PM +0000, Chaitanya Kulkarni wrote:
>
>>   	req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
>> -			meta_len, meta_seed, &meta, timeout, vec, 0, 0);
>> +			meta_len, meta_seed, &meta, timeout, vec, 0, 0, NULL, 0);
>>   	if (IS_ERR(req))
>>   		return PTR_ERR(req);
>
>14 Arguments to the function!
>
>Kanchan, I'm not pointing out to this patch it has happened over
>the years, I think it is high time we find a way to trim this
>down.
>
>Least we can do is to pass a structure member than 14 different
>arguments, is everyone okay with it ?
>
Maybe it's just me, but there is something (an unrelatedness) about these
fields which makes packing them all into a single container feel a bit
unnatural (or do you already have a good name for such a struct?).

So how about we split nvme_alloc_user_request into two instead?
That will also serve the purpose.
Here is a patch that creates:
- new nvme_alloc_user_request with 5 parameters
- new nvme_map_user_request with 8 parameters


diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 548aca8b5b9f..cb2fa4db50dd 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -65,18 +65,10 @@ static int nvme_finish_user_metadata(struct request *req, void __user *ubuf,
 }

 static struct request *nvme_alloc_user_request(struct request_queue *q,
-               struct nvme_command *cmd, void __user *ubuffer,
-               unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
-               u32 meta_seed, void **metap, unsigned timeout, bool vec,
+               struct nvme_command *cmd, unsigned timeout,
                blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags)
 {
-       bool write = nvme_is_write(cmd);
-       struct nvme_ns *ns = q->queuedata;
-       struct block_device *bdev = ns ? ns->disk->part0 : NULL;
        struct request *req;
-       struct bio *bio = NULL;
-       void *meta = NULL;
-       int ret;

        req = blk_mq_alloc_request(q, nvme_req_op(cmd) | rq_flags, blk_flags);
        if (IS_ERR(req))
@@ -86,49 +78,61 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
        if (timeout)
                req->timeout = timeout;
        nvme_req(req)->flags |= NVME_REQ_USERCMD;
+       return req;
+}

-       if (ubuffer && bufflen) {
-               if (!vec)
-                       ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen,
-                               GFP_KERNEL);
-               else {
-                       struct iovec fast_iov[UIO_FASTIOV];
-                       struct iovec *iov = fast_iov;
-                       struct iov_iter iter;
-
-                       ret = import_iovec(rq_data_dir(req), ubuffer, bufflen,
-                                       UIO_FASTIOV, &iov, &iter);
-                       if (ret < 0)
-                               goto out;
-                       ret = blk_rq_map_user_iov(q, req, NULL, &iter,
-                                       GFP_KERNEL);
-                       kfree(iov);
-               }
-               if (ret)
+static int nvme_map_user_request(struct request *req, void __user *ubuffer,
+               unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
+               u32 meta_seed, void **metap, bool vec)
+{
+       struct request_queue *q = req->q;
+       struct nvme_ns *ns = q->queuedata;
+       struct block_device *bdev = ns ? ns->disk->part0 : NULL;
+       struct bio *bio = NULL;
+       void *meta = NULL;
+       int ret;
+
+       if (!ubuffer || !bufflen)
+               return 0;
+
+       if (!vec)
+               ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen,
+                       GFP_KERNEL);
+       else {
+               struct iovec fast_iov[UIO_FASTIOV];
+               struct iovec *iov = fast_iov;
+               struct iov_iter iter;
+
+               ret = import_iovec(rq_data_dir(req), ubuffer, bufflen,
+                               UIO_FASTIOV, &iov, &iter);
+               if (ret < 0)
                        goto out;
-               bio = req->bio;
-               if (bdev)
-                       bio_set_dev(bio, bdev);
-               if (bdev && meta_buffer && meta_len) {
-                       meta = nvme_add_user_metadata(bio, meta_buffer, meta_len,
-                                       meta_seed, write);
-                       if (IS_ERR(meta)) {
-                               ret = PTR_ERR(meta);
-                               goto out_unmap;
-                       }
-                       req->cmd_flags |= REQ_INTEGRITY;
-                       *metap = meta;
+               ret = blk_rq_map_user_iov(q, req, NULL, &iter, GFP_KERNEL);
+               kfree(iov);
+       }
+       bio = req->bio;
+       if (ret)
+               goto out_unmap;
+       if (bdev)
+               bio_set_dev(bio, bdev);
+       if (bdev && meta_buffer && meta_len) {
+               meta = nvme_add_user_metadata(bio, meta_buffer, meta_len,
+                               meta_seed, req_op(req) == REQ_OP_DRV_OUT);
+               if (IS_ERR(meta)) {
+                       ret = PTR_ERR(meta);
+                       goto out_unmap;
                }
+               req->cmd_flags |= REQ_INTEGRITY;
+               *metap = meta;
        }

-       return req;
+       return ret;

 out_unmap:
        if (bio)
                blk_rq_unmap_user(bio);
 out:
-       blk_mq_free_request(req);
-       return ERR_PTR(ret);
+       return ret;
 }

 static int nvme_submit_user_cmd(struct request_queue *q,
@@ -141,13 +145,16 @@ static int nvme_submit_user_cmd(struct request_queue *q,
        struct bio *bio;
        int ret;

-       req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
-                       meta_len, meta_seed, &meta, timeout, vec, 0, 0);
+       req = nvme_alloc_user_request(q, cmd, timeout, 0, 0);
        if (IS_ERR(req))
                return PTR_ERR(req);

-       bio = req->bio;
+       ret = nvme_map_user_request(req, ubuffer, bufflen, meta_buffer,
+                       meta_len, meta_seed, &meta, vec);
+       if (ret)
+               goto out;

+       bio = req->bio;
        ret = nvme_execute_passthru_rq(req);

        if (result)
@@ -157,6 +164,7 @@ static int nvme_submit_user_cmd(struct request_queue *q,
                                                meta_len, ret);
        if (bio)
                blk_rq_unmap_user(bio);
+out:
        blk_mq_free_request(req);
        return ret;
 }
@@ -418,6 +426,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
        blk_opf_t rq_flags = 0;
        blk_mq_req_flags_t blk_flags = 0;
        void *meta = NULL;
+       int ret;

        if (!capable(CAP_SYS_ADMIN))
                return -EACCES;
@@ -457,13 +466,17 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
                rq_flags |= REQ_POLLED;

 retry:
-       req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr),
-                       d.data_len, nvme_to_user_ptr(d.metadata),
-                       d.metadata_len, 0, &meta, d.timeout_ms ?
-                       msecs_to_jiffies(d.timeout_ms) : 0, vec, rq_flags,
-                       blk_flags);
+       req = nvme_alloc_user_request(q, &c,
+                       d.timeout_ms ? msecs_to_jiffies(d.timeout_ms) : 0,
+                       rq_flags, blk_flags);
        if (IS_ERR(req))
                return PTR_ERR(req);
+
+       ret = nvme_map_user_request(req, nvme_to_user_ptr(d.addr),
+                       d.data_len, nvme_to_user_ptr(d.metadata),
+                       d.metadata_len, 0, &meta, vec);
+       if (ret)
+               goto out_err;
        req->end_io = nvme_uring_cmd_end_io;
        req->end_io_data = ioucmd;

@@ -486,6 +499,9 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,

        blk_execute_rq_nowait(req, false);
        return -EIOCBQUEUED;
+out_err:
+       blk_mq_free_request(req);
+       return ret;
 }

 static bool is_ctrl_ioctl(unsigned int cmd)





^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v5 3/4] block: add helper to map bvec iterator for passthrough
  2022-09-07 15:32       ` Chaitanya Kulkarni
@ 2022-09-08 10:52         ` Kanchan Joshi
  2022-09-08 14:46           ` Jens Axboe
  2022-09-08 15:11           ` Pankaj Raghav
  0 siblings, 2 replies; 12+ messages in thread
From: Kanchan Joshi @ 2022-09-08 10:52 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: [email protected], [email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected], Anuj Gupta


On Wed, Sep 07, 2022 at 03:32:26PM +0000, Chaitanya Kulkarni wrote:
>On 9/5/22 23:27, Kanchan Joshi wrote:
>> Add blk_rq_map_user_bvec which maps the bvec iterator into a bio and
>> places that into the request. This helper will be used in nvme for
>> uring-passthrough with fixed-buffer.
>> While at it, create another helper bio_map_get to reduce the code
>> duplication.
>>
>> Signed-off-by: Kanchan Joshi <[email protected]>
>> Signed-off-by: Anuj Gupta <[email protected]>
>> ---
>>   block/blk-map.c        | 94 +++++++++++++++++++++++++++++++++++++-----
>>   include/linux/blk-mq.h |  1 +
>>   2 files changed, 85 insertions(+), 10 deletions(-)
>>
>> diff --git a/block/blk-map.c b/block/blk-map.c
>> index f3768876d618..e2f268167342 100644
>> --- a/block/blk-map.c
>> +++ b/block/blk-map.c
>> @@ -241,17 +241,10 @@ static void bio_map_put(struct bio *bio)
>>   	}
>>   }
>>
>> -static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>> +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
>>   		gfp_t gfp_mask)
>>   {
>> -	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
>> -	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
>>   	struct bio *bio;
>> -	int ret;
>> -	int j;
>> -
>> -	if (!iov_iter_count(iter))
>> -		return -EINVAL;
>>
>>   	if (rq->cmd_flags & REQ_POLLED) {
>>   		blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
>> @@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>>   		bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
>>   					&fs_bio_set);
>>   		if (!bio)
>> -			return -ENOMEM;
>> +			return NULL;
>>   	} else {
>>   		bio = bio_kmalloc(nr_vecs, gfp_mask);
>>   		if (!bio)
>> -			return -ENOMEM;
>> +			return NULL;
>>   		bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
>>   	}
>> +	return bio;
>> +}
>> +
>> +static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>> +		gfp_t gfp_mask)
>> +{
>> +	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
>> +	unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
>> +	struct bio *bio;
>> +	int ret;
>> +	int j;
>> +
>> +	if (!iov_iter_count(iter))
>> +		return -EINVAL;
>> +
>> +	bio = bio_map_get(rq, nr_vecs, gfp_mask);
>> +	if (bio == NULL)
>> +		return -ENOMEM;
>>
>>   	while (iov_iter_count(iter)) {
>>   		struct page **pages, *stack_pages[UIO_FASTIOV];
>> @@ -612,6 +623,69 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
>>   }
>>   EXPORT_SYMBOL(blk_rq_map_user);
>>
>> +/* Prepare bio for passthrough IO given an existing bvec iter */
>> +int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter)
>> +{
>> +	struct request_queue *q = rq->q;
>> +	size_t iter_count, nr_segs;
>> +	struct bio *bio;
>> +	struct bio_vec *bv, *bvec_arr, *bvprvp = NULL;
>> +	struct queue_limits *lim = &q->limits;
>> +	unsigned int nsegs = 0, bytes = 0;
>> +	int ret, i;
>> +
>
>consider this (untested), it also sets the variable i data type same
>as it comparison variable in nr_segs the loop i.e. size_t :-
>
>+       struct bio_vec *bv, *bvec_arr, *bvprvp = NULL;
>+       struct request_queue *q = rq->q;
>+       struct queue_limits *lim = &q->limits;
>+       unsigned int nsegs = 0, bytes = 0;
>+       size_t iter_count, nr_segs, i;
>+       struct bio *bio;
>+       int ret;
>
>
>> +	iter_count = iov_iter_count(iter);
>> +	nr_segs = iter->nr_segs;
>> +
>> +	if (!iter_count || (iter_count >> 9) > queue_max_hw_sectors(q))
>> +		return -EINVAL;
>
>can we remove braces for iter_count >> 9 without impacting the intended
>functionality?

I think removing that makes it harder to read.
I will fold all the other changes you mentioned into v6.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v5 3/4] block: add helper to map bvec iterator for passthrough
  2022-09-08 10:52         ` Kanchan Joshi
@ 2022-09-08 14:46           ` Jens Axboe
  2022-09-08 15:11           ` Pankaj Raghav
  1 sibling, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2022-09-08 14:46 UTC (permalink / raw)
  To: Kanchan Joshi, Chaitanya Kulkarni
  Cc: [email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected], Anuj Gupta

On 9/8/22 4:52 AM, Kanchan Joshi wrote:
> On Wed, Sep 07, 2022 at 03:32:26PM +0000, Chaitanya Kulkarni wrote:
>> On 9/5/22 23:27, Kanchan Joshi wrote:
>>> Add blk_rq_map_user_bvec which maps the bvec iterator into a bio and
>>> places that into the request. This helper will be used in nvme for
>>> uring-passthrough with fixed-buffer.
>>> While at it, create another helper bio_map_get to reduce the code
>>> duplication.
>>>
>>> Signed-off-by: Kanchan Joshi <[email protected]>
>>> Signed-off-by: Anuj Gupta <[email protected]>
>>> ---
>>>   block/blk-map.c        | 94 +++++++++++++++++++++++++++++++++++++-----
>>>   include/linux/blk-mq.h |  1 +
>>>   2 files changed, 85 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/block/blk-map.c b/block/blk-map.c
>>> index f3768876d618..e2f268167342 100644
>>> --- a/block/blk-map.c
>>> +++ b/block/blk-map.c
>>> @@ -241,17 +241,10 @@ static void bio_map_put(struct bio *bio)
>>>       }
>>>   }
>>>
>>> -static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>>> +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
>>>           gfp_t gfp_mask)
>>>   {
>>> -    unsigned int max_sectors = queue_max_hw_sectors(rq->q);
>>> -    unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
>>>       struct bio *bio;
>>> -    int ret;
>>> -    int j;
>>> -
>>> -    if (!iov_iter_count(iter))
>>> -        return -EINVAL;
>>>
>>>       if (rq->cmd_flags & REQ_POLLED) {
>>>           blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;
>>> @@ -259,13 +252,31 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>>>           bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
>>>                       &fs_bio_set);
>>>           if (!bio)
>>> -            return -ENOMEM;
>>> +            return NULL;
>>>       } else {
>>>           bio = bio_kmalloc(nr_vecs, gfp_mask);
>>>           if (!bio)
>>> -            return -ENOMEM;
>>> +            return NULL;
>>>           bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq));
>>>       }
>>> +    return bio;
>>> +}
>>> +
>>> +static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>>> +        gfp_t gfp_mask)
>>> +{
>>> +    unsigned int max_sectors = queue_max_hw_sectors(rq->q);
>>> +    unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS);
>>> +    struct bio *bio;
>>> +    int ret;
>>> +    int j;
>>> +
>>> +    if (!iov_iter_count(iter))
>>> +        return -EINVAL;
>>> +
>>> +    bio = bio_map_get(rq, nr_vecs, gfp_mask);
>>> +    if (bio == NULL)
>>> +        return -ENOMEM;
>>>
>>>       while (iov_iter_count(iter)) {
>>>           struct page **pages, *stack_pages[UIO_FASTIOV];
>>> @@ -612,6 +623,69 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
>>>   }
>>>   EXPORT_SYMBOL(blk_rq_map_user);
>>>
>>> +/* Prepare bio for passthrough IO given an existing bvec iter */
>>> +int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter)
>>> +{
>>> +    struct request_queue *q = rq->q;
>>> +    size_t iter_count, nr_segs;
>>> +    struct bio *bio;
>>> +    struct bio_vec *bv, *bvec_arr, *bvprvp = NULL;
>>> +    struct queue_limits *lim = &q->limits;
>>> +    unsigned int nsegs = 0, bytes = 0;
>>> +    int ret, i;
>>> +
>>
>> consider this (untested), it also sets the variable i data type same
>> as it comparison variable in nr_segs the loop i.e. size_t :-
>>
>> +       struct bio_vec *bv, *bvec_arr, *bvprvp = NULL;
>> +       struct request_queue *q = rq->q;
>> +       struct queue_limits *lim = &q->limits;
>> +       unsigned int nsegs = 0, bytes = 0;
>> +       size_t iter_count, nr_segs, i;
>> +       struct bio *bio;
>> +       int ret;
>>
>>
>>> +    iter_count = iov_iter_count(iter);
>>> +    nr_segs = iter->nr_segs;
>>> +
>>> +    if (!iter_count || (iter_count >> 9) > queue_max_hw_sectors(q))
>>> +        return -EINVAL;
>>
>> can we remove braces for iter_count >> 9 without impacting the intended
>> functionality?
> 
> I think removing that make it hard to read.
> I will fold all other changes you mentioned in v6.

Agree - if you have to think about operator precedence, then that's a
sign that the code is less readable and more fragile.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v5 4/4] nvme: wire up fixed buffer support for nvme passthrough
  2022-09-08 10:47         ` Kanchan Joshi
@ 2022-09-08 14:50           ` Jens Axboe
  0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2022-09-08 14:50 UTC (permalink / raw)
  To: Kanchan Joshi, Chaitanya Kulkarni
  Cc: [email protected], [email protected], [email protected],
	[email protected], [email protected]

On 9/8/22 4:47 AM, Kanchan Joshi wrote:
> On Wed, Sep 07, 2022 at 02:51:31PM +0000, Chaitanya Kulkarni wrote:
>>
>>> 	req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
>>> -			meta_len, meta_seed, &meta, timeout, vec, 0, 0);
>>> +			meta_len, meta_seed, &meta, timeout, vec, 0, 0, NULL, 0);
>>> 	if (IS_ERR(req))
>>> 		return PTR_ERR(req);
>>
>> 14 Arguments to the function!
>>
>> Kanchan, I'm not pointing out to this patch it has happened over
>> the years, I think it is high time we find a way to trim this
>> down.
>>
>> Least we can do is to pass a structure member than 14 different
>> arguments, is everyone okay with it ?
>>
> Maybe it's just me, but there is something (unrelatedness) about these
> fields which makes packing all these into a single container feel bit
> unnatural (or do you already have a good name for such struct?).

I think the bigger question here would be "does it generate better
code?". Because it doesn't make the code any better at all, it just
potentially makes it more fragile. Packing into a struct is just a
work-around for the interface being pretty horrible, and it'd be a much
better idea to split it into separate functions rather than have this
behemoth of a function that does it all.

In any case, I think that's a separate cleanup that should be done; it
should not gate this change. It's already horrible.

> So how about we split the nvme_alloc_user_request into two.
> That will also serve the purpose. Here is a patch that creates
> - new nvme_alloc_user_request with 5 parameters
> - new nvme_map_user_request with 8 parameters

This is a good start though.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v5 3/4] block: add helper to map bvec iterator for passthrough
  2022-09-08 10:52         ` Kanchan Joshi
  2022-09-08 14:46           ` Jens Axboe
@ 2022-09-08 15:11           ` Pankaj Raghav
  1 sibling, 0 replies; 12+ messages in thread
From: Pankaj Raghav @ 2022-09-08 15:11 UTC (permalink / raw)
  To: Kanchan Joshi, Chaitanya Kulkarni
  Cc: [email protected], [email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected], Anuj Gupta

>>
>> consider this (untested), it also sets the variable i data type same
>> as it comparison variable in nr_segs the loop i.e. size_t :-
>>
>> +       struct bio_vec *bv, *bvec_arr, *bvprvp = NULL;
>> +       struct request_queue *q = rq->q;
>> +       struct queue_limits *lim = &q->limits;
>> +       unsigned int nsegs = 0, bytes = 0;
>> +       size_t iter_count, nr_segs, i;
>> +       struct bio *bio;
>> +       int ret;
>>
>>
>>> +    iter_count = iov_iter_count(iter);
>>> +    nr_segs = iter->nr_segs;
>>> +
>>> +    if (!iter_count || (iter_count >> 9) > queue_max_hw_sectors(q))
>>> +        return -EINVAL;
>>
>> can we remove braces for iter_count >> 9 without impacting the intended
>> functionality?
> 
You can also use the SECTOR_SHIFT macro instead of hard-coding 9.
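E.g. (sketch):

	if (!iter_count || (iter_count >> SECTOR_SHIFT) > queue_max_hw_sectors(q))
		return -EINVAL;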

> I think removing that make it hard to read.
> I will fold all other changes you mentioned in v6.
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-09-08 15:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20220906063719epcas5p3157e79583a5412a3be81f3d96f8aaadd@epcas5p3.samsung.com>
2022-09-06  6:27 ` [PATCH for-next v5 0/4] fixed-buffer for uring-cmd/passthru Kanchan Joshi
     [not found]   ` <CGME20220906063723epcas5p23946fd33031aee591210af1c3cd2d574@epcas5p2.samsung.com>
2022-09-06  6:27     ` [PATCH for-next v5 1/4] io_uring: introduce io_uring_cmd_import_fixed Kanchan Joshi
     [not found]   ` <CGME20220906063726epcas5p42f764b4c01b841dd1fc34abebcab02e6@epcas5p4.samsung.com>
2022-09-06  6:27     ` [PATCH for-next v5 2/4] io_uring: introduce fixed buffer support for io_uring_cmd Kanchan Joshi
     [not found]   ` <CGME20220906063729epcas5p1bf05e6873de0f7246234380d66c21fb9@epcas5p1.samsung.com>
2022-09-06  6:27     ` [PATCH for-next v5 3/4] block: add helper to map bvec iterator for passthrough Kanchan Joshi
2022-09-07 15:32       ` Chaitanya Kulkarni
2022-09-08 10:52         ` Kanchan Joshi
2022-09-08 14:46           ` Jens Axboe
2022-09-08 15:11           ` Pankaj Raghav
     [not found]   ` <CGME20220906063733epcas5p22984174bd6dbb2571152fea18af90924@epcas5p2.samsung.com>
2022-09-06  6:27     ` [PATCH for-next v5 4/4] nvme: wire up fixed buffer support for nvme passthrough Kanchan Joshi
2022-09-07 14:51       ` Chaitanya Kulkarni
2022-09-08 10:47         ` Kanchan Joshi
2022-09-08 14:50           ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox