public inbox for [email protected]
* [PATCH for-next v3 0/4] iopoll support for io_uring/nvme
From: Kanchan Joshi @ 2022-08-23 16:14 UTC
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, ming.lei, gost.dev,
	Kanchan Joshi

This series enables async polling on io_uring commands, and NVMe
passthrough (for I/O commands) is wired up to leverage it.
This gives a nice IOPS boost (as much as 100% at times), particularly
at lower queue depths.
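
For reference, here is a rough userspace sketch of driving this
(editorial addition, not part of the series; it assumes a liburing
with IORING_OP_URING_CMD support, an NVMe generic char device such as
/dev/ng0n1 with nsid 1 and a 512-byte LBA format, and the
struct nvme_uring_cmd uapi from <linux/nvme_ioctl.h>):

#include <liburing.h>
#include <linux/nvme_ioctl.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	struct nvme_uring_cmd *cmd;
	void *buf;
	int fd, ret;

	/* iopoll needs IORING_SETUP_IOPOLL; passthrough needs big SQE/CQE */
	ret = io_uring_queue_init(8, &ring, IORING_SETUP_IOPOLL |
				  IORING_SETUP_SQE128 | IORING_SETUP_CQE32);
	if (ret)
		return 1;
	fd = open("/dev/ng0n1", O_RDONLY);
	if (fd < 0 || posix_memalign(&buf, 4096, 4096))
		return 1;

	sqe = io_uring_get_sqe(&ring);
	memset(sqe, 0, 2 * sizeof(*sqe));	/* SQE128 slot is 128 bytes */
	sqe->opcode = IORING_OP_URING_CMD;
	sqe->fd = fd;
	sqe->cmd_op = NVME_URING_CMD_IO;

	cmd = (struct nvme_uring_cmd *)sqe->cmd;
	cmd->opcode = 0x02;			/* nvme_cmd_read */
	cmd->nsid = 1;
	cmd->addr = (uint64_t)(uintptr_t)buf;
	cmd->data_len = 4096;
	cmd->cdw12 = 7;				/* 8 LBAs, 0-based count */

	io_uring_submit(&ring);
	/* with IOPOLL, reaping spins in the kernel via ->uring_cmd_iopoll */
	ret = io_uring_wait_cqe(&ring, &cqe);
	if (!ret)
		io_uring_cqe_seen(&ring, cqe);
	return ret ? 1 : 0;
}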

Changes since v2:
- rebase against for-next (which now has bio-cache that can operate on
  polled passthrough IO)
- bit of rewording in commit description

Changes since v1:
- corrected variable name (Jens)
- fix for a warning (test-robot)

Kanchan Joshi (4):
  fs: add file_operations->uring_cmd_iopoll
  io_uring: add iopoll infrastructure for io_uring_cmd
  block: export blk_rq_is_poll
  nvme: wire up async polling for io passthrough commands

 block/blk-mq.c                |  3 +-
 drivers/nvme/host/core.c      |  1 +
 drivers/nvme/host/ioctl.c     | 73 ++++++++++++++++++++++++++++++++---
 drivers/nvme/host/multipath.c |  1 +
 drivers/nvme/host/nvme.h      |  2 +
 include/linux/blk-mq.h        |  1 +
 include/linux/fs.h            |  1 +
 include/linux/io_uring.h      |  8 +++-
 io_uring/io_uring.c           |  6 +++
 io_uring/opdef.c              |  1 +
 io_uring/rw.c                 |  8 +++-
 io_uring/uring_cmd.c          | 11 +++++-
 12 files changed, 105 insertions(+), 11 deletions(-)


base-commit: 15e543410e9ba86d36a0410bdaf0c02f59fb8936
-- 
2.25.1


* [PATCH for-next v3 1/4] fs: add file_operations->uring_cmd_iopoll
From: Kanchan Joshi @ 2022-08-23 16:14 UTC
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, ming.lei, gost.dev,
	Kanchan Joshi

io_uring will invoke this to do completion polling on uring-cmd
operations.
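
As an illustrative sketch (editorial, not part of this patch), a
driver implementation of the hook might look as follows; "mydrv" is
hypothetical, the sketch assumes the cookie field that patch 2 adds
and mirrors what patch 4 does for NVMe:

	/*
	 * Poll for completion of a previously submitted uring-cmd.
	 * Returns > 0 when progress was made, 0 when nothing has
	 * completed yet, or a negative error code.
	 */
	static int mydrv_uring_cmd_iopoll(struct io_uring_cmd *ioucmd)
	{
		struct bio *bio = READ_ONCE(ioucmd->cookie);

		if (bio && bio->bi_bdev)
			return bio_poll(bio, NULL, 0);
		return 0;
	}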

Signed-off-by: Kanchan Joshi <[email protected]>
---
 include/linux/fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9eced4cc286e..d6badd19784f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2132,6 +2132,7 @@ struct file_operations {
 				   loff_t len, unsigned int remap_flags);
 	int (*fadvise)(struct file *, loff_t, loff_t, int);
 	int (*uring_cmd)(struct io_uring_cmd *ioucmd, unsigned int issue_flags);
+	int (*uring_cmd_iopoll)(struct io_uring_cmd *ioucmd);
 } __randomize_layout;
 
 struct inode_operations {
-- 
2.25.1


* [PATCH for-next v3 2/4] io_uring: add iopoll infrastructure for io_uring_cmd
From: Kanchan Joshi @ 2022-08-23 16:14 UTC
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, ming.lei, gost.dev,
	Kanchan Joshi, Pankaj Raghav

Wire this up the same way iopoll is done for regular read/write IO.
Make room to store a cookie in struct io_uring_cmd at submission time.
Perform the completion using the ->uring_cmd_iopoll handler.
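
To summarize the flow that the hunks below establish (editorial
sketch; the identifiers are taken from the diff):

	/* submission side, io_uring_cmd() with IORING_SETUP_IOPOLL: */
	req->iopoll_completed = 0;
	WRITE_ONCE(ioucmd->cookie, NULL);	/* the driver may fill it in */

	/* poll side, io_do_iopoll(): */
	ret = req->file->f_op->uring_cmd_iopoll(ioucmd);

	/* completion side, io_uring_cmd_done() under IOPOLL: */
	smp_store_release(&req->iopoll_completed, 1);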

Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Pankaj Raghav <[email protected]>
---
 include/linux/io_uring.h |  8 ++++++--
 io_uring/io_uring.c      |  6 ++++++
 io_uring/opdef.c         |  1 +
 io_uring/rw.c            |  8 +++++++-
 io_uring/uring_cmd.c     | 11 +++++++++--
 5 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 4a2f6cc5a492..58676c0a398f 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -20,8 +20,12 @@ enum io_uring_cmd_flags {
 struct io_uring_cmd {
 	struct file	*file;
 	const void	*cmd;
-	/* callback to defer completions to task context */
-	void (*task_work_cb)(struct io_uring_cmd *cmd);
+	union {
+		/* callback to defer completions to task context */
+		void (*task_work_cb)(struct io_uring_cmd *cmd);
+		/* used for polled completion */
+		void *cookie;
+	};
 	u32		cmd_op;
 	u32		pad;
 	u8		pdu[32]; /* available inline for free use */
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ebfdb2212ec2..04abcc67648e 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1296,6 +1296,12 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
 			    wq_list_empty(&ctx->iopoll_list))
 				break;
 		}
+
+		if (task_work_pending(current)) {
+			mutex_unlock(&ctx->uring_lock);
+			io_run_task_work();
+			mutex_lock(&ctx->uring_lock);
+		}
 		ret = io_do_iopoll(ctx, !min);
 		if (ret < 0)
 			break;
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 72dd2b2d8a9d..9a0df19306fe 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -466,6 +466,7 @@ const struct io_op_def io_op_defs[] = {
 		.needs_file		= 1,
 		.plug			= 1,
 		.name			= "URING_CMD",
+		.iopoll			= 1,
 		.async_size		= uring_cmd_pdu_size(1),
 		.prep			= io_uring_cmd_prep,
 		.issue			= io_uring_cmd,
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 1babd77da79c..9698a789b3d5 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -1005,7 +1005,13 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 		if (READ_ONCE(req->iopoll_completed))
 			break;
 
-		ret = rw->kiocb.ki_filp->f_op->iopoll(&rw->kiocb, &iob, poll_flags);
+		if (req->opcode == IORING_OP_URING_CMD) {
+			struct io_uring_cmd *ioucmd = (struct io_uring_cmd *)rw;
+
+			ret = req->file->f_op->uring_cmd_iopoll(ioucmd);
+		} else
+			ret = rw->kiocb.ki_filp->f_op->iopoll(&rw->kiocb, &iob,
+							poll_flags);
 		if (unlikely(ret < 0))
 			return ret;
 		else if (ret)
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 8e0cc2d9205e..b0e7feeed365 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -49,7 +49,11 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, ssize_t res2)
 	io_req_set_res(req, ret, 0);
 	if (req->ctx->flags & IORING_SETUP_CQE32)
 		io_req_set_cqe32_extra(req, res2, 0);
-	__io_req_complete(req, 0);
+	if (req->ctx->flags & IORING_SETUP_IOPOLL)
+	/* order with io_iopoll_req_issued() checking ->iopoll_completed */
+		smp_store_release(&req->iopoll_completed, 1);
+	else
+		__io_req_complete(req, 0);
 }
 EXPORT_SYMBOL_GPL(io_uring_cmd_done);
 
@@ -92,8 +96,11 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
 		issue_flags |= IO_URING_F_SQE128;
 	if (ctx->flags & IORING_SETUP_CQE32)
 		issue_flags |= IO_URING_F_CQE32;
-	if (ctx->flags & IORING_SETUP_IOPOLL)
+	if (ctx->flags & IORING_SETUP_IOPOLL) {
 		issue_flags |= IO_URING_F_IOPOLL;
+		req->iopoll_completed = 0;
+		WRITE_ONCE(ioucmd->cookie, NULL);
+	}
 
 	if (req_has_async_data(req))
 		ioucmd->cmd = req->async_data;
-- 
2.25.1


* [PATCH for-next v3 3/4] block: export blk_rq_is_poll
From: Kanchan Joshi @ 2022-08-23 16:14 UTC
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, ming.lei, gost.dev,
	Kanchan Joshi

This is in preparation for supporting iopoll for NVMe passthrough.

Signed-off-by: Kanchan Joshi <[email protected]>
---
 block/blk-mq.c         | 3 ++-
 include/linux/blk-mq.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4b90d2d8cfb0..4a07cab7dfb8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1233,7 +1233,7 @@ static void blk_end_sync_rq(struct request *rq, blk_status_t ret)
 	complete(&wait->done);
 }
 
-static bool blk_rq_is_poll(struct request *rq)
+bool blk_rq_is_poll(struct request *rq)
 {
 	if (!rq->mq_hctx)
 		return false;
@@ -1243,6 +1243,7 @@ static bool blk_rq_is_poll(struct request *rq)
 		return false;
 	return true;
 }
+EXPORT_SYMBOL_GPL(blk_rq_is_poll);
 
 static void blk_rq_poll_completion(struct request *rq, struct completion *wait)
 {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 74b99d716b0b..b43c81d91892 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -980,6 +980,7 @@ int blk_rq_map_kern(struct request_queue *, struct request *, void *,
 int blk_rq_append_bio(struct request *rq, struct bio *bio);
 void blk_execute_rq_nowait(struct request *rq, bool at_head);
 blk_status_t blk_execute_rq(struct request *rq, bool at_head);
+bool blk_rq_is_poll(struct request *rq);
 
 struct req_iterator {
 	struct bvec_iter iter;
-- 
2.25.1


* [PATCH for-next v3 4/4] nvme: wire up async polling for io passthrough commands
From: Kanchan Joshi @ 2022-08-23 16:14 UTC
  To: axboe, hch, kbusch, asml.silence
  Cc: io-uring, linux-nvme, linux-block, ming.lei, gost.dev,
	Kanchan Joshi, Anuj Gupta

Store a cookie during submission, and use that to implement
completion-polling inside the ->uring_cmd_iopoll handler.
This handler makes use of the existing bio poll facility.

Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
---
 drivers/nvme/host/core.c      |  1 +
 drivers/nvme/host/ioctl.c     | 73 ++++++++++++++++++++++++++++++++---
 drivers/nvme/host/multipath.c |  1 +
 drivers/nvme/host/nvme.h      |  2 +
 4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index af367b22871b..7ac0deb8bbf8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3976,6 +3976,7 @@ static const struct file_operations nvme_ns_chr_fops = {
 	.unlocked_ioctl	= nvme_ns_chr_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
 	.uring_cmd	= nvme_ns_chr_uring_cmd,
+	.uring_cmd_iopoll = nvme_ns_chr_uring_cmd_iopoll,
 };
 
 static int nvme_add_ns_cdev(struct nvme_ns *ns)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 27614bee7380..7756b439a688 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -391,11 +391,19 @@ static void nvme_uring_cmd_end_io(struct request *req, blk_status_t err)
 	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
 	/* extract bio before reusing the same field for request */
 	struct bio *bio = pdu->bio;
+	void *cookie = READ_ONCE(ioucmd->cookie);
 
 	pdu->req = req;
 	req->bio = bio;
-	/* this takes care of moving rest of completion-work to task context */
-	io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb);
+
+	/*
+	 * For iopoll, complete it directly.
+	 * Otherwise, move the completion to task work.
+	 */
+	if (cookie != NULL && blk_rq_is_poll(req))
+		nvme_uring_task_cb(ioucmd);
+	else
+		io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb);
 }
 
 static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
@@ -445,7 +453,10 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		rq_flags = REQ_NOWAIT;
 		blk_flags = BLK_MQ_REQ_NOWAIT;
 	}
+	if (issue_flags & IO_URING_F_IOPOLL)
+		rq_flags |= REQ_POLLED;
 
+retry:
 	req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr),
 			d.data_len, nvme_to_user_ptr(d.metadata),
 			d.metadata_len, 0, &meta, d.timeout_ms ?
@@ -456,6 +467,17 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 	req->end_io = nvme_uring_cmd_end_io;
 	req->end_io_data = ioucmd;
 
+	if (issue_flags & IO_URING_F_IOPOLL && rq_flags & REQ_POLLED) {
+		if (unlikely(!req->bio)) {
+			/* we can't poll this, so alloc regular req instead */
+			blk_mq_free_request(req);
+			rq_flags &= ~REQ_POLLED;
+			goto retry;
+		} else {
+			WRITE_ONCE(ioucmd->cookie, req->bio);
+			req->bio->bi_opf |= REQ_POLLED;
+		}
+	}
 	/* to free bio on completion, as req->bio will be null at that time */
 	pdu->bio = req->bio;
 	pdu->meta = meta;
@@ -559,9 +581,6 @@ long nvme_ns_chr_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 
 static int nvme_uring_cmd_checks(unsigned int issue_flags)
 {
-	/* IOPOLL not supported yet */
-	if (issue_flags & IO_URING_F_IOPOLL)
-		return -EOPNOTSUPP;
 
 	/* NVMe passthrough requires big SQE/CQE support */
 	if ((issue_flags & (IO_URING_F_SQE128|IO_URING_F_CQE32)) !=
@@ -604,6 +623,23 @@ int nvme_ns_chr_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags)
 	return nvme_ns_uring_cmd(ns, ioucmd, issue_flags);
 }
 
+int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd)
+{
+	struct bio *bio;
+	int ret = 0;
+	struct nvme_ns *ns;
+	struct request_queue *q;
+
+	rcu_read_lock();
+	bio = READ_ONCE(ioucmd->cookie);
+	ns = container_of(file_inode(ioucmd->file)->i_cdev,
+			struct nvme_ns, cdev);
+	q = ns->queue;
+	if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev)
+		ret = bio_poll(bio, NULL, 0);
+	rcu_read_unlock();
+	return ret;
+}
 #ifdef CONFIG_NVME_MULTIPATH
 static int nvme_ns_head_ctrl_ioctl(struct nvme_ns *ns, unsigned int cmd,
 		void __user *argp, struct nvme_ns_head *head, int srcu_idx)
@@ -685,6 +721,29 @@ int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd,
 	srcu_read_unlock(&head->srcu, srcu_idx);
 	return ret;
 }
+
+int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd)
+{
+	struct cdev *cdev = file_inode(ioucmd->file)->i_cdev;
+	struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev);
+	int srcu_idx = srcu_read_lock(&head->srcu);
+	struct nvme_ns *ns = nvme_find_path(head);
+	struct bio *bio;
+	int ret = 0;
+	struct request_queue *q;
+
+	if (ns) {
+		rcu_read_lock();
+		bio = READ_ONCE(ioucmd->cookie);
+		q = ns->queue;
+		if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio
+				&& bio->bi_bdev)
+			ret = bio_poll(bio, NULL, 0);
+		rcu_read_unlock();
+	}
+	srcu_read_unlock(&head->srcu, srcu_idx);
+	return ret;
+}
 #endif /* CONFIG_NVME_MULTIPATH */
 
 int nvme_dev_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags)
@@ -692,6 +751,10 @@ int nvme_dev_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags)
 	struct nvme_ctrl *ctrl = ioucmd->file->private_data;
 	int ret;
 
+	/* IOPOLL not supported yet */
+	if (issue_flags & IO_URING_F_IOPOLL)
+		return -EOPNOTSUPP;
+
 	ret = nvme_uring_cmd_checks(issue_flags);
 	if (ret)
 		return ret;
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 6ef497c75a16..00f2f81e20fa 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -439,6 +439,7 @@ static const struct file_operations nvme_ns_head_chr_fops = {
 	.unlocked_ioctl	= nvme_ns_head_chr_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
 	.uring_cmd	= nvme_ns_head_chr_uring_cmd,
+	.uring_cmd_iopoll = nvme_ns_head_chr_uring_cmd_iopoll,
 };
 
 static int nvme_add_ns_head_cdev(struct nvme_ns_head *head)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 1bdf714dcd9e..fdcbc93dea21 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -821,6 +821,8 @@ long nvme_ns_head_chr_ioctl(struct file *file, unsigned int cmd,
 		unsigned long arg);
 long nvme_dev_ioctl(struct file *file, unsigned int cmd,
 		unsigned long arg);
+int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd);
+int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd);
 int nvme_ns_chr_uring_cmd(struct io_uring_cmd *ioucmd,
 		unsigned int issue_flags);
 int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd,
-- 
2.25.1


* Re: [PATCH for-next v3 0/4] iopoll support for io_uring/nvme
From: Jens Axboe @ 2022-09-02 15:35 UTC
  To: kbusch, Kanchan Joshi, asml.silence, hch
  Cc: gost.dev, io-uring, linux-nvme, ming.lei, linux-block

On Tue, 23 Aug 2022 21:44:39 +0530, Kanchan Joshi wrote:
> This series enables async polling on io_uring commands, and NVMe
> passthrough (for I/O commands) is wired up to leverage it.
> This gives a nice IOPS boost (as much as 100% at times), particularly
> at lower queue depths.
> 
> Changes since v2:
> - rebase against for-next (which now has bio-cache that can operate on
>   polled passthrough IO)
> - bit of rewording in commit description
> 
> [...]

Applied, thanks!

[1/4] fs: add file_operations->uring_cmd_iopoll
      commit: acdb4c6b62aa229c14a57422e4effab233b2c455
[2/4] io_uring: add iopoll infrastructure for io_uring_cmd
      commit: 585cee108ddaa8893611f95611e32ed605fc6936
[3/4] block: export blk_rq_is_poll
      commit: 2dd9d642edf01043885cefeaaa06d9c8e1aa4503
[4/4] nvme: wire up async polling for io passthrough commands
      commit: 5760ebead11801cdb73cdf3841e73231968f6af4

Best regards,
-- 
Jens Axboe



* Re: [PATCH for-next v3 4/4] nvme: wire up async polling for io passthrough commands
From: Ming Lei @ 2023-08-09  1:15 UTC
  To: Kanchan Joshi
  Cc: axboe, hch, kbusch, asml.silence, io-uring, linux-nvme,
	linux-block, gost.dev, Anuj Gupta, ming.lei

Hi Kanchan,

On Tue, Aug 23, 2022 at 09:44:43PM +0530, Kanchan Joshi wrote:
> Store a cookie during submission, and use that to implement
> completion-polling inside the ->uring_cmd_iopoll handler.
> This handler makes use of the existing bio poll facility.
> 
> Signed-off-by: Kanchan Joshi <[email protected]>
> Signed-off-by: Anuj Gupta <[email protected]>
> ---

...

>  
> +int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd)
> +{
> +	struct bio *bio;
> +	int ret = 0;
> +	struct nvme_ns *ns;
> +	struct request_queue *q;
> +
> +	rcu_read_lock();
> +	bio = READ_ONCE(ioucmd->cookie);
> +	ns = container_of(file_inode(ioucmd->file)->i_cdev,
> +			struct nvme_ns, cdev);
> +	q = ns->queue;
> +	if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev)
> +		ret = bio_poll(bio, NULL, 0);
> +	rcu_read_unlock();
> +	return ret;
> +}

It does not look good to call bio_poll() while holding the RCU read
lock, since set_page_dirty_lock() may sleep in the end_io code path:

blk_rq_unmap_user
	bio_release_pages
		__bio_release_pages
			set_page_dirty_lock
				lock_page

You probably need to move the page dirtying into workqueue context,
e.g. via bio_check_pages_dirty(); I guess passthrough io-poll
performance may drop then.

Maybe we need to investigate how to remove the RCU read lock here.
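
(Editorial sketch of the direction suggested above; hypothetical and
untested, not something posted in this thread. bio_check_pages_dirty()
consumes the bio: pages that are still dirty are released inline, and
any page that would need set_page_dirty_lock() is deferred to a
workqueue, so the polling context never sleeps.)

	static void pt_put_user_bio(struct bio *bio)
	{
		if (bio_data_dir(bio) == READ) {
			/* never sleeps; re-dirtying deferred to a workqueue */
			bio_check_pages_dirty(bio);
		} else {
			/* writes never need re-dirtying */
			bio_release_pages(bio, false);
			bio_put(bio);
		}
	}

As noted above, deferring the dirtying this way is exactly what may
cost passthrough io-poll performance.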


>  #ifdef CONFIG_NVME_MULTIPATH
>  static int nvme_ns_head_ctrl_ioctl(struct nvme_ns *ns, unsigned int cmd,
>  		void __user *argp, struct nvme_ns_head *head, int srcu_idx)
> @@ -685,6 +721,29 @@ int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd,
>  	srcu_read_unlock(&head->srcu, srcu_idx);
>  	return ret;
>  }
> +
> +int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd)
> +{
> +	struct cdev *cdev = file_inode(ioucmd->file)->i_cdev;
> +	struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev);
> +	int srcu_idx = srcu_read_lock(&head->srcu);
> +	struct nvme_ns *ns = nvme_find_path(head);
> +	struct bio *bio;
> +	int ret = 0;
> +	struct request_queue *q;
> +
> +	if (ns) {
> +		rcu_read_lock();
> +		bio = READ_ONCE(ioucmd->cookie);
> +		q = ns->queue;
> +		if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio
> +				&& bio->bi_bdev)
> +			ret = bio_poll(bio, NULL, 0);
> +		rcu_read_unlock();

Same with above.


thanks,
Ming

