* [PATCH 0/4] iopoll support for io_uring/nvme passthrough [not found] <CGME20220805155300epcas5p1b98722e20990d0095238964e2be9db34@epcas5p1.samsung.com> @ 2022-08-05 15:42 ` Kanchan Joshi [not found] ` <CGME20220805155304epcas5p1bb687a8f9b25317af39def01696626e8@epcas5p1.samsung.com> ` (4 more replies) 0 siblings, 5 replies; 20+ messages in thread From: Kanchan Joshi @ 2022-08-05 15:42 UTC (permalink / raw) To: axboe, hch Cc: io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Kanchan Joshi Hi, Series enables async polling on io_uring command, and nvme passthrough (for io-commands) is wired up to leverage that. 512b randread performance (KIOP) below: QD_batch block passthru passthru-poll block-poll 1_1 80 81 158 157 8_2 406 470 680 700 16_4 620 656 931 920 128_32 879 1056 1120 1132 Polling is giving the clear win here. Upstream fio is used for testing. passthru command line: fio -iodepth=64 -rw=randread -ioengine=io_uring_cmd -bs=512 -numjobs=1 -runtime=60 -group_reporting -iodepth_batch_submit=16 -iodepth_batch_complete_min=1 -iodepth_batch_complete_max=16 -cmd_type=nvme -hipri=0 -filename=/dev/ng1n1 -name=io_uring_cmd_64 block command line: fio -direct=1 -iodepth=64 -rw=randread -ioengine=io_uring -bs=512 -numjobs=1 -runtime=60 -group_reporting -iodepth_batch_submit=16 -iodepth_batch_complete_min=1 -iodepth_batch_complete_max=16 -hipri=0 -filename=/dev/nvme1n1 name=io_uring_64 Bit of code went into non-passthrough path for io_uring (patch 2) but I do not see that causing any performance regression. peak-perf test showed 2.3M IOPS with or without this series for block-io. io_uring: Running taskset -c 0,12 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -n2 /dev/nvme0n1 submitter=0, tid=3089, file=/dev/nvme0n1, node=-1 submitter=1, tid=3090, file=/dev/nvme0n1, node=-1 polled=1, fixedbufs=1/0, register_files=1, buffered=0, QD=128 Engine=io_uring, sq_ring=128, cq_ring=128 polled=1, fixedbufs=1/0, register_files=1, buffered=0, QD=128 Engine=io_uring, sq_ring=128, cq_ring=128 IOPS=2.31M, BW=1126MiB/s, IOS/call=31/31 IOPS=2.30M, BW=1124MiB/s, IOS/call=32/31 IOPS=2.30M, BW=1123MiB/s, IOS/call=32/32 Kanchan Joshi (4): fs: add file_operations->uring_cmd_iopoll io_uring: add iopoll infrastructure for io_uring_cmd block: export blk_rq_is_poll nvme: wire up async polling for io passthrough commands block/blk-mq.c | 3 +- drivers/nvme/host/core.c | 1 + drivers/nvme/host/ioctl.c | 73 ++++++++++++++++++++++++++++++++--- drivers/nvme/host/multipath.c | 1 + drivers/nvme/host/nvme.h | 2 + include/linux/blk-mq.h | 1 + include/linux/fs.h | 1 + include/linux/io_uring.h | 8 +++- io_uring/io_uring.c | 6 +++ io_uring/opdef.c | 1 + io_uring/rw.c | 8 +++- io_uring/uring_cmd.c | 11 +++++- 12 files changed, 105 insertions(+), 11 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <CGME20220805155304epcas5p1bb687a8f9b25317af39def01696626e8@epcas5p1.samsung.com>]
* [PATCH 1/4] fs: add file_operations->uring_cmd_iopoll [not found] ` <CGME20220805155304epcas5p1bb687a8f9b25317af39def01696626e8@epcas5p1.samsung.com> @ 2022-08-05 15:42 ` Kanchan Joshi 0 siblings, 0 replies; 20+ messages in thread From: Kanchan Joshi @ 2022-08-05 15:42 UTC (permalink / raw) To: axboe, hch Cc: io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Kanchan Joshi io_uring will trigger this to do completion polling on uring-cmd operations. Signed-off-by: Kanchan Joshi <[email protected]> --- include/linux/fs.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/fs.h b/include/linux/fs.h index 9f131e559d05..449941f99f50 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2134,6 +2134,7 @@ struct file_operations { loff_t len, unsigned int remap_flags); int (*fadvise)(struct file *, loff_t, loff_t, int); int (*uring_cmd)(struct io_uring_cmd *ioucmd, unsigned int issue_flags); + int (*uring_cmd_iopoll)(struct io_uring_cmd *ioucmd); } __randomize_layout; struct inode_operations { -- 2.25.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
[parent not found: <CGME20220805155307epcas5p4bab3f05dc13d8fc2f03c7a26e9bd8c7c@epcas5p4.samsung.com>]
* [PATCH 2/4] io_uring: add iopoll infrastructure for io_uring_cmd [not found] ` <CGME20220805155307epcas5p4bab3f05dc13d8fc2f03c7a26e9bd8c7c@epcas5p4.samsung.com> @ 2022-08-05 15:42 ` Kanchan Joshi 0 siblings, 0 replies; 20+ messages in thread From: Kanchan Joshi @ 2022-08-05 15:42 UTC (permalink / raw) To: axboe, hch Cc: io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Kanchan Joshi, Pankaj Raghav Put this up in the same way as iopoll is done for regular read/write IO. Make place for storing a cookie into struct io_uring_cmd on its submission. Perform the completion using the ->uring_cmd_iopoll handler. Signed-off-by: Kanchan Joshi <[email protected]> Signed-off-by: Pankaj Raghav <[email protected]> --- include/linux/io_uring.h | 8 ++++++-- io_uring/io_uring.c | 6 ++++++ io_uring/opdef.c | 1 + io_uring/rw.c | 8 +++++++- io_uring/uring_cmd.c | 11 +++++++++-- 5 files changed, 29 insertions(+), 5 deletions(-) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index 4a2f6cc5a492..58676c0a398f 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -20,8 +20,12 @@ enum io_uring_cmd_flags { struct io_uring_cmd { struct file *file; const void *cmd; - /* callback to defer completions to task context */ - void (*task_work_cb)(struct io_uring_cmd *cmd); + union { + /* callback to defer completions to task context */ + void (*task_work_cb)(struct io_uring_cmd *cmd); + /* used for polled completion */ + void *cookie; + }; u32 cmd_op; u32 pad; u8 pdu[32]; /* available inline for free use */ diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index b54218da075c..48a430a86b50 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -1296,6 +1296,12 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, long min) wq_list_empty(&ctx->iopoll_list)) break; } + + if (task_work_pending(current)) { + mutex_unlock(&ctx->uring_lock); + io_run_task_work(); + mutex_lock(&ctx->uring_lock); + } ret = io_do_iopoll(ctx, !min); if (ret < 0) break; diff --git a/io_uring/opdef.c b/io_uring/opdef.c index 72dd2b2d8a9d..9a0df19306fe 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -466,6 +466,7 @@ const struct io_op_def io_op_defs[] = { .needs_file = 1, .plug = 1, .name = "URING_CMD", + .iopoll = 1, .async_size = uring_cmd_pdu_size(1), .prep = io_uring_cmd_prep, .issue = io_uring_cmd, diff --git a/io_uring/rw.c b/io_uring/rw.c index 2b784795103c..1a4fb8a44b9a 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -1005,7 +1005,13 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin) if (READ_ONCE(req->iopoll_completed)) break; - ret = rw->kiocb.ki_filp->f_op->iopoll(&rw->kiocb, &iob, poll_flags); + if (req->opcode == IORING_OP_URING_CMD) { + struct io_uring_cmd *ioucmd = (struct io_uring_cmd *)rw; + + ret = req->file->f_op->uring_cmd_iopoll(ioucmd); + } else + ret = rw->kiocb.ki_filp->f_op->iopoll(&rw->kiocb, &iob, + poll_flags); if (unlikely(ret < 0)) return ret; else if (ret) diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index 0a421ed51e7e..5cc339fba8b8 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -49,7 +49,11 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, ssize_t res2) io_req_set_res(req, 0, ret); if (req->ctx->flags & IORING_SETUP_CQE32) io_req_set_cqe32_extra(req, res2, 0); - __io_req_complete(req, 0); + if (req->ctx->flags & IORING_SETUP_IOPOLL) + /* order with io_iopoll_req_issued() checking ->iopoll_completed */ + smp_store_release(&req->iopoll_completed, 1); + else + __io_req_complete(req, 0); } EXPORT_SYMBOL_GPL(io_uring_cmd_done); @@ -89,8 +93,11 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags) issue_flags |= IO_URING_F_SQE128; if (ctx->flags & IORING_SETUP_CQE32) issue_flags |= IO_URING_F_CQE32; - if (ctx->flags & IORING_SETUP_IOPOLL) + if (ctx->flags & IORING_SETUP_IOPOLL) { issue_flags |= IO_URING_F_IOPOLL; + req->iopoll_completed = 0; + WRITE_ONCE(ioucmd->cookie, NULL); + } if (req_has_async_data(req)) ioucmd->cmd = req->async_data; -- 2.25.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
[parent not found: <CGME20220805155310epcas5p2bd7ec5b9bee73893958f4bc84038eca0@epcas5p2.samsung.com>]
* [PATCH 3/4] block: export blk_rq_is_poll [not found] ` <CGME20220805155310epcas5p2bd7ec5b9bee73893958f4bc84038eca0@epcas5p2.samsung.com> @ 2022-08-05 15:42 ` Kanchan Joshi 0 siblings, 0 replies; 20+ messages in thread From: Kanchan Joshi @ 2022-08-05 15:42 UTC (permalink / raw) To: axboe, hch Cc: io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Kanchan Joshi This is being done as preparation to support iopoll for nvme passthrough Signed-off-by: Kanchan Joshi <[email protected]> --- block/blk-mq.c | 3 ++- include/linux/blk-mq.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 5ee62b95f3e5..de42f7237bad 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1233,7 +1233,7 @@ static void blk_end_sync_rq(struct request *rq, blk_status_t ret) complete(&wait->done); } -static bool blk_rq_is_poll(struct request *rq) +bool blk_rq_is_poll(struct request *rq) { if (!rq->mq_hctx) return false; @@ -1243,6 +1243,7 @@ static bool blk_rq_is_poll(struct request *rq) return false; return true; } +EXPORT_SYMBOL_GPL(blk_rq_is_poll); static void blk_rq_poll_completion(struct request *rq, struct completion *wait) { diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index effee1dc715a..8f841caaa4cb 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -981,6 +981,7 @@ int blk_rq_map_kern(struct request_queue *, struct request *, void *, int blk_rq_append_bio(struct request *rq, struct bio *bio); void blk_execute_rq_nowait(struct request *rq, bool at_head); blk_status_t blk_execute_rq(struct request *rq, bool at_head); +bool blk_rq_is_poll(struct request *rq); struct req_iterator { struct bvec_iter iter; -- 2.25.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
[parent not found: <CGME20220805155313epcas5p2d35d22831bd07ef33fbdc28bd99ae1d0@epcas5p2.samsung.com>]
* [PATCH 4/4] nvme: wire up async polling for io passthrough commands [not found] ` <CGME20220805155313epcas5p2d35d22831bd07ef33fbdc28bd99ae1d0@epcas5p2.samsung.com> @ 2022-08-05 15:42 ` Kanchan Joshi 2022-08-05 17:03 ` Jens Axboe ` (4 more replies) 0 siblings, 5 replies; 20+ messages in thread From: Kanchan Joshi @ 2022-08-05 15:42 UTC (permalink / raw) To: axboe, hch Cc: io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Kanchan Joshi, Anuj Gupta Store a cookie during submission, and use that to implement completion-polling inside the ->uring_cmd_iopoll handler. This handler makes use of existing bio poll facility. Signed-off-by: Kanchan Joshi <[email protected]> Signed-off-by: Anuj Gupta <[email protected]> --- drivers/nvme/host/core.c | 1 + drivers/nvme/host/ioctl.c | 73 ++++++++++++++++++++++++++++++++--- drivers/nvme/host/multipath.c | 1 + drivers/nvme/host/nvme.h | 2 + 4 files changed, 72 insertions(+), 5 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 2429b11eb9a8..77b6c2882afd 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -3976,6 +3976,7 @@ static const struct file_operations nvme_ns_chr_fops = { .unlocked_ioctl = nvme_ns_chr_ioctl, .compat_ioctl = compat_ptr_ioctl, .uring_cmd = nvme_ns_chr_uring_cmd, + .uring_cmd_iopoll = nvme_ns_chr_uring_cmd_iopoll, }; static int nvme_add_ns_cdev(struct nvme_ns *ns) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 27614bee7380..136f0fd25710 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -391,11 +391,19 @@ static void nvme_uring_cmd_end_io(struct request *req, blk_status_t err) struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); /* extract bio before reusing the same field for request */ struct bio *bio = pdu->bio; + void *cookie = READ_ONCE(ioucmd->cookie); pdu->req = req; req->bio = bio; - /* this takes care of moving rest of completion-work to task context */ - io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb); + + /* + * For iopoll, complete it directly. + * Otherwise, move the completion to task work. + */ + if (cookie != NULL && blk_rq_is_poll(req)) + nvme_uring_task_cb(ioucmd); + else + io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb); } static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, @@ -445,7 +453,10 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, rq_flags = REQ_NOWAIT; blk_flags = BLK_MQ_REQ_NOWAIT; } + if (issue_flags & IO_URING_F_IOPOLL) + rq_flags |= REQ_POLLED; +retry: req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr), d.data_len, nvme_to_user_ptr(d.metadata), d.metadata_len, 0, &meta, d.timeout_ms ? @@ -456,6 +467,17 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, req->end_io = nvme_uring_cmd_end_io; req->end_io_data = ioucmd; + if (issue_flags & IO_URING_F_IOPOLL && rq_flags & REQ_POLLED) { + if (unlikely(!req->bio)) { + /* we can't poll this, so alloc regular req instead */ + blk_mq_free_request(req); + rq_flags &= ~REQ_POLLED; + goto retry; + } else { + WRITE_ONCE(ioucmd->cookie, req->bio); + req->bio->bi_opf |= REQ_POLLED; + } + } /* to free bio on completion, as req->bio will be null at that time */ pdu->bio = req->bio; pdu->meta = meta; @@ -559,9 +581,6 @@ long nvme_ns_chr_ioctl(struct file *file, unsigned int cmd, unsigned long arg) static int nvme_uring_cmd_checks(unsigned int issue_flags) { - /* IOPOLL not supported yet */ - if (issue_flags & IO_URING_F_IOPOLL) - return -EOPNOTSUPP; /* NVMe passthrough requires big SQE/CQE support */ if ((issue_flags & (IO_URING_F_SQE128|IO_URING_F_CQE32)) != @@ -604,6 +623,23 @@ int nvme_ns_chr_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags) return nvme_ns_uring_cmd(ns, ioucmd, issue_flags); } +int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd) +{ + struct bio *bio; + int ret; + struct nvme_ns *ns; + struct request_queue *q; + + rcu_read_lock(); + bio = READ_ONCE(ioucmd->cookie); + ns = container_of(file_inode(ioucmd->file)->i_cdev, + struct nvme_ns, cdev); + q = ns->queue; + if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) + ret = bio_poll(bio, 0, 0); + rcu_read_unlock(); + return ret; +} #ifdef CONFIG_NVME_MULTIPATH static int nvme_ns_head_ctrl_ioctl(struct nvme_ns *ns, unsigned int cmd, void __user *argp, struct nvme_ns_head *head, int srcu_idx) @@ -685,6 +721,29 @@ int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd, srcu_read_unlock(&head->srcu, srcu_idx); return ret; } + +int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd) +{ + struct cdev *cdev = file_inode(ioucmd->file)->i_cdev; + struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev); + int srcu_idx = srcu_read_lock(&head->srcu); + struct nvme_ns *ns = nvme_find_path(head); + struct bio *bio; + int ret = 0; + struct request_queue *q; + + if (ns) { + rcu_read_lock(); + bio = READ_ONCE(ioucmd->private); + q = ns->queue; + if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio + && bio->bi_bdev) + ret = bio_poll(bio, 0, 0); + rcu_read_unlock(); + } + srcu_read_unlock(&head->srcu, srcu_idx); + return ret; +} #endif /* CONFIG_NVME_MULTIPATH */ int nvme_dev_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags) @@ -692,6 +751,10 @@ int nvme_dev_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags) struct nvme_ctrl *ctrl = ioucmd->file->private_data; int ret; + /* IOPOLL not supported yet */ + if (issue_flags & IO_URING_F_IOPOLL) + return -EOPNOTSUPP; + ret = nvme_uring_cmd_checks(issue_flags); if (ret) return ret; diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 6ef497c75a16..00f2f81e20fa 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -439,6 +439,7 @@ static const struct file_operations nvme_ns_head_chr_fops = { .unlocked_ioctl = nvme_ns_head_chr_ioctl, .compat_ioctl = compat_ptr_ioctl, .uring_cmd = nvme_ns_head_chr_uring_cmd, + .uring_cmd_iopoll = nvme_ns_head_chr_uring_cmd_iopoll, }; static int nvme_add_ns_head_cdev(struct nvme_ns_head *head) diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index bdc0ff7ed9ab..3f2d3dda6e6c 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -821,6 +821,8 @@ long nvme_ns_head_chr_ioctl(struct file *file, unsigned int cmd, unsigned long arg); long nvme_dev_ioctl(struct file *file, unsigned int cmd, unsigned long arg); +int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd); +int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd); int nvme_ns_chr_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags); int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd, -- 2.25.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] nvme: wire up async polling for io passthrough commands 2022-08-05 15:42 ` [PATCH 4/4] nvme: wire up async polling for io passthrough commands Kanchan Joshi @ 2022-08-05 17:03 ` Jens Axboe 2022-08-05 17:07 ` Kanchan Joshi 2022-08-05 21:22 ` kernel test robot ` (3 subsequent siblings) 4 siblings, 1 reply; 20+ messages in thread From: Jens Axboe @ 2022-08-05 17:03 UTC (permalink / raw) To: Kanchan Joshi, hch Cc: io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Anuj Gupta On 8/5/22 9:42 AM, Kanchan Joshi wrote: > @@ -685,6 +721,29 @@ int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd, > srcu_read_unlock(&head->srcu, srcu_idx); > return ret; > } > + > +int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd) > +{ > + struct cdev *cdev = file_inode(ioucmd->file)->i_cdev; > + struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev); > + int srcu_idx = srcu_read_lock(&head->srcu); > + struct nvme_ns *ns = nvme_find_path(head); > + struct bio *bio; > + int ret = 0; > + struct request_queue *q; > + > + if (ns) { > + rcu_read_lock(); > + bio = READ_ONCE(ioucmd->private); > + q = ns->queue; > + if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio > + && bio->bi_bdev) > + ret = bio_poll(bio, 0, 0); > + rcu_read_unlock(); > + } > + srcu_read_unlock(&head->srcu, srcu_idx); > + return ret; > +} > #endif /* CONFIG_NVME_MULTIPATH */ Looks like that READ_ONCE() should be: bio = READ_ONCE(ioucmd->cookie); ? -- Jens Axboe ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] nvme: wire up async polling for io passthrough commands 2022-08-05 17:03 ` Jens Axboe @ 2022-08-05 17:07 ` Kanchan Joshi 0 siblings, 0 replies; 20+ messages in thread From: Kanchan Joshi @ 2022-08-05 17:07 UTC (permalink / raw) To: Jens Axboe Cc: hch, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Anuj Gupta [-- Attachment #1: Type: text/plain, Size: 1202 bytes --] On Fri, Aug 05, 2022 at 11:03:55AM -0600, Jens Axboe wrote: >On 8/5/22 9:42 AM, Kanchan Joshi wrote: >> @@ -685,6 +721,29 @@ int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd, >> srcu_read_unlock(&head->srcu, srcu_idx); >> return ret; >> } >> + >> +int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd) >> +{ >> + struct cdev *cdev = file_inode(ioucmd->file)->i_cdev; >> + struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev); >> + int srcu_idx = srcu_read_lock(&head->srcu); >> + struct nvme_ns *ns = nvme_find_path(head); >> + struct bio *bio; >> + int ret = 0; >> + struct request_queue *q; >> + >> + if (ns) { >> + rcu_read_lock(); >> + bio = READ_ONCE(ioucmd->private); >> + q = ns->queue; >> + if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio >> + && bio->bi_bdev) >> + ret = bio_poll(bio, 0, 0); >> + rcu_read_unlock(); >> + } >> + srcu_read_unlock(&head->srcu, srcu_idx); >> + return ret; >> +} >> #endif /* CONFIG_NVME_MULTIPATH */ > >Looks like that READ_ONCE() should be: > > bio = READ_ONCE(ioucmd->cookie); > >? Damn, indeed. Would have caught if I had compiled this with NVME_MULTIPATH config enabled. Thanks for the catch. [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] nvme: wire up async polling for io passthrough commands 2022-08-05 15:42 ` [PATCH 4/4] nvme: wire up async polling for io passthrough commands Kanchan Joshi 2022-08-05 17:03 ` Jens Axboe @ 2022-08-05 21:22 ` kernel test robot 2022-08-06 0:06 ` kernel test robot ` (2 subsequent siblings) 4 siblings, 0 replies; 20+ messages in thread From: kernel test robot @ 2022-08-05 21:22 UTC (permalink / raw) To: Kanchan Joshi, axboe, hch Cc: kbuild-all, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Kanchan Joshi, Anuj Gupta Hi Kanchan, Thank you for the patch! Yet something to improve: [auto build test ERROR on axboe-block/for-next] [also build test ERROR on linus/master] [cannot apply to hch-configfs/for-next v5.19] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/fs-add-file_operations-uring_cmd_iopoll/20220806-004320 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next config: i386-allyesconfig (https://download.01.org/0day-ci/archive/20220806/[email protected]/config) compiler: gcc-11 (Debian 11.3.0-3) 11.3.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/0964795577fbf09d8b315269504b5e87b5ac492b git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Kanchan-Joshi/fs-add-file_operations-uring_cmd_iopoll/20220806-004320 git checkout 0964795577fbf09d8b315269504b5e87b5ac492b # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot <[email protected]> All errors (new ones prefixed by >>): In file included from <command-line>: drivers/nvme/host/ioctl.c: In function 'nvme_ns_head_chr_uring_cmd_iopoll': >> drivers/nvme/host/ioctl.c:737:39: error: 'struct io_uring_cmd' has no member named 'private' 737 | bio = READ_ONCE(ioucmd->private); | ^~ include/linux/compiler_types.h:334:23: note: in definition of macro '__compiletime_assert' 334 | if (!(condition)) \ | ^~~~~~~~~ include/linux/compiler_types.h:354:9: note: in expansion of macro '_compiletime_assert' 354 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert' 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | ^~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:36:28: note: in expansion of macro '__native_word' 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | ^~~~~~~~~~~~~ include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type' 49 | compiletime_assert_rwonce_type(x); \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:737:23: note: in expansion of macro 'READ_ONCE' 737 | bio = READ_ONCE(ioucmd->private); | ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:39: error: 'struct io_uring_cmd' has no member named 'private' 737 | bio = READ_ONCE(ioucmd->private); | ^~ include/linux/compiler_types.h:334:23: note: in definition of macro '__compiletime_assert' 334 | if (!(condition)) \ | ^~~~~~~~~ include/linux/compiler_types.h:354:9: note: in expansion of macro '_compiletime_assert' 354 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert' 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | ^~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:36:28: note: in expansion of macro '__native_word' 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | ^~~~~~~~~~~~~ include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type' 49 | compiletime_assert_rwonce_type(x); \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:737:23: note: in expansion of macro 'READ_ONCE' 737 | bio = READ_ONCE(ioucmd->private); | ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:39: error: 'struct io_uring_cmd' has no member named 'private' 737 | bio = READ_ONCE(ioucmd->private); | ^~ include/linux/compiler_types.h:334:23: note: in definition of macro '__compiletime_assert' 334 | if (!(condition)) \ | ^~~~~~~~~ include/linux/compiler_types.h:354:9: note: in expansion of macro '_compiletime_assert' 354 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert' 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | ^~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:36:28: note: in expansion of macro '__native_word' 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | ^~~~~~~~~~~~~ include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type' 49 | compiletime_assert_rwonce_type(x); \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:737:23: note: in expansion of macro 'READ_ONCE' 737 | bio = READ_ONCE(ioucmd->private); | ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:39: error: 'struct io_uring_cmd' has no member named 'private' 737 | bio = READ_ONCE(ioucmd->private); | ^~ include/linux/compiler_types.h:334:23: note: in definition of macro '__compiletime_assert' 334 | if (!(condition)) \ | ^~~~~~~~~ include/linux/compiler_types.h:354:9: note: in expansion of macro '_compiletime_assert' 354 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert' 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | ^~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:36:28: note: in expansion of macro '__native_word' 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | ^~~~~~~~~~~~~ include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type' 49 | compiletime_assert_rwonce_type(x); \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:737:23: note: in expansion of macro 'READ_ONCE' 737 | bio = READ_ONCE(ioucmd->private); | ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:39: error: 'struct io_uring_cmd' has no member named 'private' 737 | bio = READ_ONCE(ioucmd->private); | ^~ include/linux/compiler_types.h:334:23: note: in definition of macro '__compiletime_assert' 334 | if (!(condition)) \ | ^~~~~~~~~ include/linux/compiler_types.h:354:9: note: in expansion of macro '_compiletime_assert' 354 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert' 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | ^~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type' 49 | compiletime_assert_rwonce_type(x); \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:737:23: note: in expansion of macro 'READ_ONCE' 737 | bio = READ_ONCE(ioucmd->private); | ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:39: error: 'struct io_uring_cmd' has no member named 'private' 737 | bio = READ_ONCE(ioucmd->private); | ^~ include/linux/compiler_types.h:310:27: note: in definition of macro '__unqual_scalar_typeof' 310 | _Generic((x), \ | ^ include/asm-generic/rwonce.h:50:9: note: in expansion of macro '__READ_ONCE' 50 | __READ_ONCE(x); \ | ^~~~~~~~~~~ drivers/nvme/host/ioctl.c:737:23: note: in expansion of macro 'READ_ONCE' 737 | bio = READ_ONCE(ioucmd->private); | ^~~~~~~~~ In file included from ./arch/x86/include/generated/asm/rwonce.h:1, from include/linux/compiler.h:248, from include/linux/ptrace.h:5, from drivers/nvme/host/ioctl.c:6: >> drivers/nvme/host/ioctl.c:737:39: error: 'struct io_uring_cmd' has no member named 'private' 737 | bio = READ_ONCE(ioucmd->private); | ^~ include/asm-generic/rwonce.h:44:73: note: in definition of macro '__READ_ONCE' 44 | #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x)) | ^ drivers/nvme/host/ioctl.c:737:23: note: in expansion of macro 'READ_ONCE' 737 | bio = READ_ONCE(ioucmd->private); | ^~~~~~~~~ vim +737 drivers/nvme/host/ioctl.c 724 725 int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd) 726 { 727 struct cdev *cdev = file_inode(ioucmd->file)->i_cdev; 728 struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev); 729 int srcu_idx = srcu_read_lock(&head->srcu); 730 struct nvme_ns *ns = nvme_find_path(head); 731 struct bio *bio; 732 int ret = 0; 733 struct request_queue *q; 734 735 if (ns) { 736 rcu_read_lock(); > 737 bio = READ_ONCE(ioucmd->private); 738 q = ns->queue; 739 if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio 740 && bio->bi_bdev) 741 ret = bio_poll(bio, 0, 0); 742 rcu_read_unlock(); 743 } 744 srcu_read_unlock(&head->srcu, srcu_idx); 745 return ret; 746 } 747 #endif /* CONFIG_NVME_MULTIPATH */ 748 -- 0-DAY CI Kernel Test Service https://01.org/lkp ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] nvme: wire up async polling for io passthrough commands 2022-08-05 15:42 ` [PATCH 4/4] nvme: wire up async polling for io passthrough commands Kanchan Joshi 2022-08-05 17:03 ` Jens Axboe 2022-08-05 21:22 ` kernel test robot @ 2022-08-06 0:06 ` kernel test robot 2022-08-06 1:38 ` kernel test robot 2022-08-07 12:25 ` kernel test robot 4 siblings, 0 replies; 20+ messages in thread From: kernel test robot @ 2022-08-06 0:06 UTC (permalink / raw) To: Kanchan Joshi, axboe, hch Cc: llvm, kbuild-all, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Kanchan Joshi, Anuj Gupta Hi Kanchan, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on axboe-block/for-next] [also build test WARNING on linus/master next-20220805] [cannot apply to hch-configfs/for-next v5.19] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/fs-add-file_operations-uring_cmd_iopoll/20220806-004320 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next config: i386-randconfig-a002 (https://download.01.org/0day-ci/archive/20220806/[email protected]/config) compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 26dd42705c2af0b8f6e5d6cdb32c9bd5ed9524eb) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/0964795577fbf09d8b315269504b5e87b5ac492b git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Kanchan-Joshi/fs-add-file_operations-uring_cmd_iopoll/20220806-004320 git checkout 0964795577fbf09d8b315269504b5e87b5ac492b # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/nvme/host/ If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot <[email protected]> All warnings (new ones prefixed by >>): >> drivers/nvme/host/ioctl.c:638:6: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:641:9: note: uninitialized use occurs here return ret; ^~~ drivers/nvme/host/ioctl.c:638:2: note: remove the 'if' if its condition is always true if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/nvme/host/ioctl.c:638:6: warning: variable 'ret' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized] if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:641:9: note: uninitialized use occurs here return ret; ^~~ drivers/nvme/host/ioctl.c:638:6: note: remove the '&&' if its condition is always true if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/nvme/host/ioctl.c:638:6: warning: variable 'ret' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized] if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:641:9: note: uninitialized use occurs here return ret; ^~~ drivers/nvme/host/ioctl.c:638:6: note: remove the '&&' if its condition is always true if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:629:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 3 warnings generated. vim +638 drivers/nvme/host/ioctl.c 625 626 int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd) 627 { 628 struct bio *bio; 629 int ret; 630 struct nvme_ns *ns; 631 struct request_queue *q; 632 633 rcu_read_lock(); 634 bio = READ_ONCE(ioucmd->cookie); 635 ns = container_of(file_inode(ioucmd->file)->i_cdev, 636 struct nvme_ns, cdev); 637 q = ns->queue; > 638 if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) 639 ret = bio_poll(bio, 0, 0); 640 rcu_read_unlock(); 641 return ret; 642 } 643 #ifdef CONFIG_NVME_MULTIPATH 644 static int nvme_ns_head_ctrl_ioctl(struct nvme_ns *ns, unsigned int cmd, 645 void __user *argp, struct nvme_ns_head *head, int srcu_idx) 646 __releases(&head->srcu) 647 { 648 struct nvme_ctrl *ctrl = ns->ctrl; 649 int ret; 650 651 nvme_get_ctrl(ns->ctrl); 652 srcu_read_unlock(&head->srcu, srcu_idx); 653 ret = nvme_ctrl_ioctl(ns->ctrl, cmd, argp); 654 655 nvme_put_ctrl(ctrl); 656 return ret; 657 } 658 -- 0-DAY CI Kernel Test Service https://01.org/lkp ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] nvme: wire up async polling for io passthrough commands 2022-08-05 15:42 ` [PATCH 4/4] nvme: wire up async polling for io passthrough commands Kanchan Joshi ` (2 preceding siblings ...) 2022-08-06 0:06 ` kernel test robot @ 2022-08-06 1:38 ` kernel test robot 2022-08-07 12:25 ` kernel test robot 4 siblings, 0 replies; 20+ messages in thread From: kernel test robot @ 2022-08-06 1:38 UTC (permalink / raw) To: Kanchan Joshi, axboe, hch Cc: llvm, kbuild-all, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Kanchan Joshi, Anuj Gupta Hi Kanchan, Thank you for the patch! Yet something to improve: [auto build test ERROR on axboe-block/for-next] [also build test ERROR on linus/master] [cannot apply to hch-configfs/for-next v5.19] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/fs-add-file_operations-uring_cmd_iopoll/20220806-004320 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next config: hexagon-randconfig-r015-20220805 (https://download.01.org/0day-ci/archive/20220806/[email protected]/config) compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 5f1c7e2cc5a3c07cbc2412e851a7283c1841f520) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/0964795577fbf09d8b315269504b5e87b5ac492b git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Kanchan-Joshi/fs-add-file_operations-uring_cmd_iopoll/20220806-004320 git checkout 0964795577fbf09d8b315269504b5e87b5ac492b # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash drivers/nvme/host/ If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot <[email protected]> All error/warnings (new ones prefixed by >>): >> drivers/nvme/host/ioctl.c:638:6: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/hexagon/include/asm/bitops.h:175:28: note: expanded from macro 'test_bit' #define test_bit(nr, addr) __test_bit(nr, addr) ^ drivers/nvme/host/ioctl.c:641:9: note: uninitialized use occurs here return ret; ^~~ drivers/nvme/host/ioctl.c:638:2: note: remove the 'if' if its condition is always true if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/nvme/host/ioctl.c:638:6: warning: variable 'ret' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized] if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/hexagon/include/asm/bitops.h:175:28: note: expanded from macro 'test_bit' #define test_bit(nr, addr) __test_bit(nr, addr) ^ drivers/nvme/host/ioctl.c:641:9: note: uninitialized use occurs here return ret; ^~~ drivers/nvme/host/ioctl.c:638:6: note: remove the '&&' if its condition is always true if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/hexagon/include/asm/bitops.h:175:28: note: expanded from macro 'test_bit' #define test_bit(nr, addr) __test_bit(nr, addr) ^ >> drivers/nvme/host/ioctl.c:638:6: warning: variable 'ret' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized] if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/hexagon/include/asm/bitops.h:175:28: note: expanded from macro 'test_bit' #define test_bit(nr, addr) __test_bit(nr, addr) ^~~~~~~~~~~~~~~~~~~~ drivers/nvme/host/ioctl.c:641:9: note: uninitialized use occurs here return ret; ^~~ drivers/nvme/host/ioctl.c:638:6: note: remove the '&&' if its condition is always true if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/hexagon/include/asm/bitops.h:175:28: note: expanded from macro 'test_bit' #define test_bit(nr, addr) __test_bit(nr, addr) ^ drivers/nvme/host/ioctl.c:629:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 >> drivers/nvme/host/ioctl.c:737:27: error: no member named 'private' in 'struct io_uring_cmd' bio = READ_ONCE(ioucmd->private); ~~~~~~ ^ include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE' compiletime_assert_rwonce_type(x); \ ^ include/asm-generic/rwonce.h:36:35: note: expanded from macro 'compiletime_assert_rwonce_type' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ include/linux/compiler_types.h:321:10: note: expanded from macro '__native_word' (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \ ^ include/linux/compiler_types.h:354:22: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^~~~~~~~~ include/linux/compiler_types.h:342:23: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^~~~~~~~~ include/linux/compiler_types.h:334:9: note: expanded from macro '__compiletime_assert' if (!(condition)) \ ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:27: error: no member named 'private' in 'struct io_uring_cmd' bio = READ_ONCE(ioucmd->private); ~~~~~~ ^ include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE' compiletime_assert_rwonce_type(x); \ ^ include/asm-generic/rwonce.h:36:35: note: expanded from macro 'compiletime_assert_rwonce_type' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ include/linux/compiler_types.h:321:39: note: expanded from macro '__native_word' (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \ ^ include/linux/compiler_types.h:354:22: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^~~~~~~~~ include/linux/compiler_types.h:342:23: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^~~~~~~~~ include/linux/compiler_types.h:334:9: note: expanded from macro '__compiletime_assert' if (!(condition)) \ ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:27: error: no member named 'private' in 'struct io_uring_cmd' bio = READ_ONCE(ioucmd->private); ~~~~~~ ^ include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE' compiletime_assert_rwonce_type(x); \ ^ include/asm-generic/rwonce.h:36:35: note: expanded from macro 'compiletime_assert_rwonce_type' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ include/linux/compiler_types.h:322:10: note: expanded from macro '__native_word' sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long)) ^ include/linux/compiler_types.h:354:22: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^~~~~~~~~ include/linux/compiler_types.h:342:23: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^~~~~~~~~ include/linux/compiler_types.h:334:9: note: expanded from macro '__compiletime_assert' if (!(condition)) \ ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:27: error: no member named 'private' in 'struct io_uring_cmd' bio = READ_ONCE(ioucmd->private); ~~~~~~ ^ include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE' compiletime_assert_rwonce_type(x); \ ^ include/asm-generic/rwonce.h:36:35: note: expanded from macro 'compiletime_assert_rwonce_type' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ include/linux/compiler_types.h:322:38: note: expanded from macro '__native_word' sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long)) ^ include/linux/compiler_types.h:354:22: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^~~~~~~~~ include/linux/compiler_types.h:342:23: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^~~~~~~~~ include/linux/compiler_types.h:334:9: note: expanded from macro '__compiletime_assert' if (!(condition)) \ ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:27: error: no member named 'private' in 'struct io_uring_cmd' bio = READ_ONCE(ioucmd->private); ~~~~~~ ^ include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE' compiletime_assert_rwonce_type(x); \ ^ include/asm-generic/rwonce.h:36:48: note: expanded from macro 'compiletime_assert_rwonce_type' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ include/linux/compiler_types.h:354:22: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^~~~~~~~~ include/linux/compiler_types.h:342:23: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^~~~~~~~~ include/linux/compiler_types.h:334:9: note: expanded from macro '__compiletime_assert' if (!(condition)) \ ^~~~~~~~~ >> drivers/nvme/host/ioctl.c:737:27: error: no member named 'private' in 'struct io_uring_cmd' bio = READ_ONCE(ioucmd->private); ~~~~~~ ^ include/asm-generic/rwonce.h:50:14: note: expanded from macro 'READ_ONCE' __READ_ONCE(x); \ ^ include/asm-generic/rwonce.h:44:65: note: expanded from macro '__READ_ONCE' #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x)) ^ include/linux/compiler_types.h:310:13: note: expanded from macro '__unqual_scalar_typeof' _Generic((x), \ ^ >> drivers/nvme/host/ioctl.c:737:27: error: no member named 'private' in 'struct io_uring_cmd' bio = READ_ONCE(ioucmd->private); ~~~~~~ ^ include/asm-generic/rwonce.h:50:14: note: expanded from macro 'READ_ONCE' __READ_ONCE(x); \ ^ include/asm-generic/rwonce.h:44:65: note: expanded from macro '__READ_ONCE' #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x)) ^ include/linux/compiler_types.h:317:15: note: expanded from macro '__unqual_scalar_typeof' default: (x))) ^ >> drivers/nvme/host/ioctl.c:737:27: error: no member named 'private' in 'struct io_uring_cmd' bio = READ_ONCE(ioucmd->private); ~~~~~~ ^ include/asm-generic/rwonce.h:50:14: note: expanded from macro 'READ_ONCE' __READ_ONCE(x); \ ^ include/asm-generic/rwonce.h:44:72: note: expanded from macro '__READ_ONCE' #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x)) ^ >> drivers/nvme/host/ioctl.c:737:7: error: assigning to 'struct bio *' from incompatible type 'void' bio = READ_ONCE(ioucmd->private); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~ 3 warnings and 9 errors generated. vim +737 drivers/nvme/host/ioctl.c 625 626 int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd) 627 { 628 struct bio *bio; 629 int ret; 630 struct nvme_ns *ns; 631 struct request_queue *q; 632 633 rcu_read_lock(); 634 bio = READ_ONCE(ioucmd->cookie); 635 ns = container_of(file_inode(ioucmd->file)->i_cdev, 636 struct nvme_ns, cdev); 637 q = ns->queue; > 638 if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) 639 ret = bio_poll(bio, 0, 0); 640 rcu_read_unlock(); 641 return ret; 642 } 643 #ifdef CONFIG_NVME_MULTIPATH 644 static int nvme_ns_head_ctrl_ioctl(struct nvme_ns *ns, unsigned int cmd, 645 void __user *argp, struct nvme_ns_head *head, int srcu_idx) 646 __releases(&head->srcu) 647 { 648 struct nvme_ctrl *ctrl = ns->ctrl; 649 int ret; 650 651 nvme_get_ctrl(ns->ctrl); 652 srcu_read_unlock(&head->srcu, srcu_idx); 653 ret = nvme_ctrl_ioctl(ns->ctrl, cmd, argp); 654 655 nvme_put_ctrl(ctrl); 656 return ret; 657 } 658 659 int nvme_ns_head_ioctl(struct block_device *bdev, fmode_t mode, 660 unsigned int cmd, unsigned long arg) 661 { 662 struct nvme_ns_head *head = bdev->bd_disk->private_data; 663 void __user *argp = (void __user *)arg; 664 struct nvme_ns *ns; 665 int srcu_idx, ret = -EWOULDBLOCK; 666 667 srcu_idx = srcu_read_lock(&head->srcu); 668 ns = nvme_find_path(head); 669 if (!ns) 670 goto out_unlock; 671 672 /* 673 * Handle ioctls that apply to the controller instead of the namespace 674 * seperately and drop the ns SRCU reference early. This avoids a 675 * deadlock when deleting namespaces using the passthrough interface. 676 */ 677 if (is_ctrl_ioctl(cmd)) 678 return nvme_ns_head_ctrl_ioctl(ns, cmd, argp, head, srcu_idx); 679 680 ret = nvme_ns_ioctl(ns, cmd, argp); 681 out_unlock: 682 srcu_read_unlock(&head->srcu, srcu_idx); 683 return ret; 684 } 685 686 long nvme_ns_head_chr_ioctl(struct file *file, unsigned int cmd, 687 unsigned long arg) 688 { 689 struct cdev *cdev = file_inode(file)->i_cdev; 690 struct nvme_ns_head *head = 691 container_of(cdev, struct nvme_ns_head, cdev); 692 void __user *argp = (void __user *)arg; 693 struct nvme_ns *ns; 694 int srcu_idx, ret = -EWOULDBLOCK; 695 696 srcu_idx = srcu_read_lock(&head->srcu); 697 ns = nvme_find_path(head); 698 if (!ns) 699 goto out_unlock; 700 701 if (is_ctrl_ioctl(cmd)) 702 return nvme_ns_head_ctrl_ioctl(ns, cmd, argp, head, srcu_idx); 703 704 ret = nvme_ns_ioctl(ns, cmd, argp); 705 out_unlock: 706 srcu_read_unlock(&head->srcu, srcu_idx); 707 return ret; 708 } 709 710 int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd, 711 unsigned int issue_flags) 712 { 713 struct cdev *cdev = file_inode(ioucmd->file)->i_cdev; 714 struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev); 715 int srcu_idx = srcu_read_lock(&head->srcu); 716 struct nvme_ns *ns = nvme_find_path(head); 717 int ret = -EINVAL; 718 719 if (ns) 720 ret = nvme_ns_uring_cmd(ns, ioucmd, issue_flags); 721 srcu_read_unlock(&head->srcu, srcu_idx); 722 return ret; 723 } 724 725 int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd) 726 { 727 struct cdev *cdev = file_inode(ioucmd->file)->i_cdev; 728 struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev); 729 int srcu_idx = srcu_read_lock(&head->srcu); 730 struct nvme_ns *ns = nvme_find_path(head); 731 struct bio *bio; 732 int ret = 0; 733 struct request_queue *q; 734 735 if (ns) { 736 rcu_read_lock(); > 737 bio = READ_ONCE(ioucmd->private); 738 q = ns->queue; 739 if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio 740 && bio->bi_bdev) 741 ret = bio_poll(bio, 0, 0); 742 rcu_read_unlock(); 743 } 744 srcu_read_unlock(&head->srcu, srcu_idx); 745 return ret; 746 } 747 #endif /* CONFIG_NVME_MULTIPATH */ 748 -- 0-DAY CI Kernel Test Service https://01.org/lkp ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] nvme: wire up async polling for io passthrough commands 2022-08-05 15:42 ` [PATCH 4/4] nvme: wire up async polling for io passthrough commands Kanchan Joshi ` (3 preceding siblings ...) 2022-08-06 1:38 ` kernel test robot @ 2022-08-07 12:25 ` kernel test robot 4 siblings, 0 replies; 20+ messages in thread From: kernel test robot @ 2022-08-07 12:25 UTC (permalink / raw) To: Kanchan Joshi, axboe, hch Cc: kbuild-all, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Kanchan Joshi, Anuj Gupta Hi Kanchan, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on axboe-block/for-next] [also build test WARNING on linus/master next-20220805] [cannot apply to hch-configfs/for-next v5.19] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/fs-add-file_operations-uring_cmd_iopoll/20220806-004320 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next config: ia64-randconfig-s032-20220807 (https://download.01.org/0day-ci/archive/20220807/[email protected]/config) compiler: ia64-linux-gcc (GCC) 12.1.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.4-39-gce1a6720-dirty # https://github.com/intel-lab-lkp/linux/commit/0964795577fbf09d8b315269504b5e87b5ac492b git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Kanchan-Joshi/fs-add-file_operations-uring_cmd_iopoll/20220806-004320 git checkout 0964795577fbf09d8b315269504b5e87b5ac492b # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=ia64 SHELL=/bin/bash drivers/nvme/host/ If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot <[email protected]> sparse warnings: (new ones prefixed by >>) >> drivers/nvme/host/ioctl.c:639:37: sparse: sparse: Using plain integer as NULL pointer vim +639 drivers/nvme/host/ioctl.c 625 626 int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd) 627 { 628 struct bio *bio; 629 int ret; 630 struct nvme_ns *ns; 631 struct request_queue *q; 632 633 rcu_read_lock(); 634 bio = READ_ONCE(ioucmd->cookie); 635 ns = container_of(file_inode(ioucmd->file)->i_cdev, 636 struct nvme_ns, cdev); 637 q = ns->queue; 638 if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev) > 639 ret = bio_poll(bio, 0, 0); 640 rcu_read_unlock(); 641 return ret; 642 } 643 #ifdef CONFIG_NVME_MULTIPATH 644 static int nvme_ns_head_ctrl_ioctl(struct nvme_ns *ns, unsigned int cmd, 645 void __user *argp, struct nvme_ns_head *head, int srcu_idx) 646 __releases(&head->srcu) 647 { 648 struct nvme_ctrl *ctrl = ns->ctrl; 649 int ret; 650 651 nvme_get_ctrl(ns->ctrl); 652 srcu_read_unlock(&head->srcu, srcu_idx); 653 ret = nvme_ctrl_ioctl(ns->ctrl, cmd, argp); 654 655 nvme_put_ctrl(ctrl); 656 return ret; 657 } 658 -- 0-DAY CI Kernel Test Service https://01.org/lkp ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/4] iopoll support for io_uring/nvme passthrough 2022-08-05 15:42 ` [PATCH 0/4] iopoll support for io_uring/nvme passthrough Kanchan Joshi ` (3 preceding siblings ...) [not found] ` <CGME20220805155313epcas5p2d35d22831bd07ef33fbdc28bd99ae1d0@epcas5p2.samsung.com> @ 2022-08-05 17:04 ` Jens Axboe 2022-08-05 17:13 ` Kanchan Joshi 2022-08-05 17:18 ` Jens Axboe 4 siblings, 2 replies; 20+ messages in thread From: Jens Axboe @ 2022-08-05 17:04 UTC (permalink / raw) To: Kanchan Joshi, hch Cc: io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev On 8/5/22 9:42 AM, Kanchan Joshi wrote: > Hi, > > Series enables async polling on io_uring command, and nvme passthrough > (for io-commands) is wired up to leverage that. > > 512b randread performance (KIOP) below: > > QD_batch block passthru passthru-poll block-poll > 1_1 80 81 158 157 > 8_2 406 470 680 700 > 16_4 620 656 931 920 > 128_32 879 1056 1120 1132 Curious on why passthru is slower than block-poll? Are we missing something here? -- Jens Axboe ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/4] iopoll support for io_uring/nvme passthrough 2022-08-05 17:04 ` [PATCH 0/4] iopoll support for io_uring/nvme passthrough Jens Axboe @ 2022-08-05 17:13 ` Kanchan Joshi 2022-08-05 17:27 ` Jens Axboe 2022-08-05 17:18 ` Jens Axboe 1 sibling, 1 reply; 20+ messages in thread From: Kanchan Joshi @ 2022-08-05 17:13 UTC (permalink / raw) To: Jens Axboe Cc: hch, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev [-- Attachment #1: Type: text/plain, Size: 866 bytes --] On Fri, Aug 05, 2022 at 11:04:22AM -0600, Jens Axboe wrote: >On 8/5/22 9:42 AM, Kanchan Joshi wrote: >> Hi, >> >> Series enables async polling on io_uring command, and nvme passthrough >> (for io-commands) is wired up to leverage that. >> >> 512b randread performance (KIOP) below: >> >> QD_batch block passthru passthru-poll block-poll >> 1_1 80 81 158 157 >> 8_2 406 470 680 700 >> 16_4 620 656 931 920 >> 128_32 879 1056 1120 1132 > >Curious on why passthru is slower than block-poll? Are we missing >something here? passthru-poll vs block-poll you mean? passthru does not have bio-cache, while block path is running with that. Maybe completion-batching is also playing some role, not too sure about that at the moment. [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/4] iopoll support for io_uring/nvme passthrough 2022-08-05 17:13 ` Kanchan Joshi @ 2022-08-05 17:27 ` Jens Axboe 0 siblings, 0 replies; 20+ messages in thread From: Jens Axboe @ 2022-08-05 17:27 UTC (permalink / raw) To: Kanchan Joshi Cc: hch, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev On 8/5/22 11:13 AM, Kanchan Joshi wrote: > On Fri, Aug 05, 2022 at 11:04:22AM -0600, Jens Axboe wrote: >> On 8/5/22 9:42 AM, Kanchan Joshi wrote: >>> Hi, >>> >>> Series enables async polling on io_uring command, and nvme passthrough >>> (for io-commands) is wired up to leverage that. >>> >>> 512b randread performance (KIOP) below: >>> >>> QD_batch block passthru passthru-poll block-poll >>> 1_1 80 81 158 157 >>> 8_2 406 470 680 700 >>> 16_4 620 656 931 920 >>> 128_32 879 1056 1120 1132 >> >> Curious on why passthru is slower than block-poll? Are we missing >> something here? > passthru-poll vs block-poll you mean? > passthru does not have bio-cache, while block path is running with that. > Maybe completion-batching is also playing some role, not too sure about that > at the moment. Yeah, see other email on a quick rundown. We should make bio_map_user_iov() use the bio caching, that'd make a big difference. Won't fully close the gap, but will be close if we exclude the lack of fixedbufs. -- Jens Axboe ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/4] iopoll support for io_uring/nvme passthrough 2022-08-05 17:04 ` [PATCH 0/4] iopoll support for io_uring/nvme passthrough Jens Axboe 2022-08-05 17:13 ` Kanchan Joshi @ 2022-08-05 17:18 ` Jens Axboe 2022-08-05 18:11 ` Keith Busch 2022-08-05 18:21 ` Jens Axboe 1 sibling, 2 replies; 20+ messages in thread From: Jens Axboe @ 2022-08-05 17:18 UTC (permalink / raw) To: Kanchan Joshi, hch Cc: io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev On 8/5/22 11:04 AM, Jens Axboe wrote: > On 8/5/22 9:42 AM, Kanchan Joshi wrote: >> Hi, >> >> Series enables async polling on io_uring command, and nvme passthrough >> (for io-commands) is wired up to leverage that. >> >> 512b randread performance (KIOP) below: >> >> QD_batch block passthru passthru-poll block-poll >> 1_1 80 81 158 157 >> 8_2 406 470 680 700 >> 16_4 620 656 931 920 >> 128_32 879 1056 1120 1132 > > Curious on why passthru is slower than block-poll? Are we missing > something here? I took a quick peek, running it here. List of items making it slower: - No fixedbufs support for passthru, each each request will go through get_user_pages() and put_pages() on completion. This is about a 10% change for me, by itself. - nvme_uring_cmd_io() -> nvme_alloc_user_request() -> blk_rq_map_user() -> blk_rq_map_user_iov() -> memset() is another ~4% for me. - The kmalloc+kfree per command is roughly 9% extra slowdown. There are other little things, but the above are the main ones. Even if I disable fixedbufs for non-passthru, passthru is about ~24% slower here using a single device and a single core, which is mostly the above mentioned items. This isn't specific to the iopoll support, that's obviously faster than IRQ driven for this test case. This is just comparing passthru with the regular block path for doing random 512b reads. -- Jens Axboe ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/4] iopoll support for io_uring/nvme passthrough 2022-08-05 17:18 ` Jens Axboe @ 2022-08-05 18:11 ` Keith Busch 2022-08-05 18:15 ` Jens Axboe 2022-08-05 18:21 ` Jens Axboe 1 sibling, 1 reply; 20+ messages in thread From: Keith Busch @ 2022-08-05 18:11 UTC (permalink / raw) To: Jens Axboe Cc: Kanchan Joshi, hch, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev On Fri, Aug 05, 2022 at 11:18:38AM -0600, Jens Axboe wrote: > On 8/5/22 11:04 AM, Jens Axboe wrote: > > On 8/5/22 9:42 AM, Kanchan Joshi wrote: > >> Hi, > >> > >> Series enables async polling on io_uring command, and nvme passthrough > >> (for io-commands) is wired up to leverage that. > >> > >> 512b randread performance (KIOP) below: > >> > >> QD_batch block passthru passthru-poll block-poll > >> 1_1 80 81 158 157 > >> 8_2 406 470 680 700 > >> 16_4 620 656 931 920 > >> 128_32 879 1056 1120 1132 > > > > Curious on why passthru is slower than block-poll? Are we missing > > something here? > > I took a quick peek, running it here. List of items making it slower: > > - No fixedbufs support for passthru, each each request will go through > get_user_pages() and put_pages() on completion. This is about a 10% > change for me, by itself. Enabling fixed buffer support through here looks like it will take a little bit of work. The driver needs an opcode or flag to tell it the user address is a fixed buffer, and io_uring needs to export its registered buffer for a driver like nvme to get to. > - nvme_uring_cmd_io() -> nvme_alloc_user_request() -> blk_rq_map_user() > -> blk_rq_map_user_iov() -> memset() is another ~4% for me. Where's the memset() coming from? That should only happen if we need to bounce, right? This type of request shouldn't need that unless you're using odd user address alignment. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/4] iopoll support for io_uring/nvme passthrough 2022-08-05 18:11 ` Keith Busch @ 2022-08-05 18:15 ` Jens Axboe 2022-08-07 17:58 ` Kanchan Joshi 0 siblings, 1 reply; 20+ messages in thread From: Jens Axboe @ 2022-08-05 18:15 UTC (permalink / raw) To: Keith Busch Cc: Kanchan Joshi, hch, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev On 8/5/22 12:11 PM, Keith Busch wrote: > On Fri, Aug 05, 2022 at 11:18:38AM -0600, Jens Axboe wrote: >> On 8/5/22 11:04 AM, Jens Axboe wrote: >>> On 8/5/22 9:42 AM, Kanchan Joshi wrote: >>>> Hi, >>>> >>>> Series enables async polling on io_uring command, and nvme passthrough >>>> (for io-commands) is wired up to leverage that. >>>> >>>> 512b randread performance (KIOP) below: >>>> >>>> QD_batch block passthru passthru-poll block-poll >>>> 1_1 80 81 158 157 >>>> 8_2 406 470 680 700 >>>> 16_4 620 656 931 920 >>>> 128_32 879 1056 1120 1132 >>> >>> Curious on why passthru is slower than block-poll? Are we missing >>> something here? >> >> I took a quick peek, running it here. List of items making it slower: >> >> - No fixedbufs support for passthru, each each request will go through >> get_user_pages() and put_pages() on completion. This is about a 10% >> change for me, by itself. > > Enabling fixed buffer support through here looks like it will take a > little bit of work. The driver needs an opcode or flag to tell it the > user address is a fixed buffer, and io_uring needs to export its > registered buffer for a driver like nvme to get to. Yeah, it's not a straight forward thing. But if this will be used with recycled buffers, then it'll definitely be worthwhile to look into. >> - nvme_uring_cmd_io() -> nvme_alloc_user_request() -> blk_rq_map_user() >> -> blk_rq_map_user_iov() -> memset() is another ~4% for me. > > Where's the memset() coming from? That should only happen if we need > to bounce, right? This type of request shouldn't need that unless > you're using odd user address alignment. Not sure, need to double check! Hacking up a patch to get rid of the frivolous alloc+free, we'll see how it stands after that and I'll find it. -- Jens Axboe ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/4] iopoll support for io_uring/nvme passthrough 2022-08-05 18:15 ` Jens Axboe @ 2022-08-07 17:58 ` Kanchan Joshi 2022-08-07 18:46 ` Jens Axboe 0 siblings, 1 reply; 20+ messages in thread From: Kanchan Joshi @ 2022-08-07 17:58 UTC (permalink / raw) To: Jens Axboe Cc: Keith Busch, hch, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev [-- Attachment #1: Type: text/plain, Size: 1858 bytes --] On Fri, Aug 05, 2022 at 12:15:24PM -0600, Jens Axboe wrote: >On 8/5/22 12:11 PM, Keith Busch wrote: >> On Fri, Aug 05, 2022 at 11:18:38AM -0600, Jens Axboe wrote: >>> On 8/5/22 11:04 AM, Jens Axboe wrote: >>>> On 8/5/22 9:42 AM, Kanchan Joshi wrote: >>>>> Hi, >>>>> >>>>> Series enables async polling on io_uring command, and nvme passthrough >>>>> (for io-commands) is wired up to leverage that. >>>>> >>>>> 512b randread performance (KIOP) below: >>>>> >>>>> QD_batch block passthru passthru-poll block-poll >>>>> 1_1 80 81 158 157 >>>>> 8_2 406 470 680 700 >>>>> 16_4 620 656 931 920 >>>>> 128_32 879 1056 1120 1132 >>>> >>>> Curious on why passthru is slower than block-poll? Are we missing >>>> something here? >>> >>> I took a quick peek, running it here. List of items making it slower: >>> >>> - No fixedbufs support for passthru, each each request will go through >>> get_user_pages() and put_pages() on completion. This is about a 10% >>> change for me, by itself. >> >> Enabling fixed buffer support through here looks like it will take a >> little bit of work. The driver needs an opcode or flag to tell it the >> user address is a fixed buffer, and io_uring needs to export its >> registered buffer for a driver like nvme to get to. > >Yeah, it's not a straight forward thing. But if this will be used with >recycled buffers, then it'll definitely be worthwhile to look into. Had posted bio-cache and fixedbufs in the initial round but retracted to get the foundation settled first. https://lore.kernel.org/linux-nvme/[email protected]/ I see that you brought back bio-cache already. I can refresh fixedbufs. Completion-batching seems too tightly coupled to block-path. [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/4] iopoll support for io_uring/nvme passthrough 2022-08-07 17:58 ` Kanchan Joshi @ 2022-08-07 18:46 ` Jens Axboe 0 siblings, 0 replies; 20+ messages in thread From: Jens Axboe @ 2022-08-07 18:46 UTC (permalink / raw) To: Kanchan Joshi Cc: Keith Busch, hch, io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev On 8/7/22 11:58 AM, Kanchan Joshi wrote: > On Fri, Aug 05, 2022 at 12:15:24PM -0600, Jens Axboe wrote: >> On 8/5/22 12:11 PM, Keith Busch wrote: >>> On Fri, Aug 05, 2022 at 11:18:38AM -0600, Jens Axboe wrote: >>>> On 8/5/22 11:04 AM, Jens Axboe wrote: >>>>> On 8/5/22 9:42 AM, Kanchan Joshi wrote: >>>>>> Hi, >>>>>> >>>>>> Series enables async polling on io_uring command, and nvme passthrough >>>>>> (for io-commands) is wired up to leverage that. >>>>>> >>>>>> 512b randread performance (KIOP) below: >>>>>> >>>>>> QD_batch block passthru passthru-poll block-poll >>>>>> 1_1 80 81 158 157 >>>>>> 8_2 406 470 680 700 >>>>>> 16_4 620 656 931 920 >>>>>> 128_32 879 1056 1120 1132 >>>>> >>>>> Curious on why passthru is slower than block-poll? Are we missing >>>>> something here? >>>> >>>> I took a quick peek, running it here. List of items making it slower: >>>> >>>> - No fixedbufs support for passthru, each each request will go through >>>> get_user_pages() and put_pages() on completion. This is about a 10% >>>> change for me, by itself. >>> >>> Enabling fixed buffer support through here looks like it will take a >>> little bit of work. The driver needs an opcode or flag to tell it the >>> user address is a fixed buffer, and io_uring needs to export its >>> registered buffer for a driver like nvme to get to. >> >> Yeah, it's not a straight forward thing. But if this will be used with >> recycled buffers, then it'll definitely be worthwhile to look into. > > Had posted bio-cache and fixedbufs in the initial round but retracted > to get the foundation settled first. > https://lore.kernel.org/linux-nvme/[email protected]/ > > I see that you brought back bio-cache already. I can refresh fixedbufs. Excellent, yes please bring back the fixedbufs. It's a 5-10% win, nothing to sneeze at. > Completion-batching seems too tightly coupled to block-path. It's really not, in fact it'd be even simpler to do for passthru. The rq->end_io handler just needs to know about it. -- Jens Axboe ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/4] iopoll support for io_uring/nvme passthrough 2022-08-05 17:18 ` Jens Axboe 2022-08-05 18:11 ` Keith Busch @ 2022-08-05 18:21 ` Jens Axboe 1 sibling, 0 replies; 20+ messages in thread From: Jens Axboe @ 2022-08-05 18:21 UTC (permalink / raw) To: Kanchan Joshi, hch Cc: io-uring, linux-nvme, linux-block, ming.lei, joshiiitr, gost.dev, Keith Busch On 8/5/22 11:18 AM, Jens Axboe wrote: > On 8/5/22 11:04 AM, Jens Axboe wrote: >> On 8/5/22 9:42 AM, Kanchan Joshi wrote: >>> Hi, >>> >>> Series enables async polling on io_uring command, and nvme passthrough >>> (for io-commands) is wired up to leverage that. >>> >>> 512b randread performance (KIOP) below: >>> >>> QD_batch block passthru passthru-poll block-poll >>> 1_1 80 81 158 157 >>> 8_2 406 470 680 700 >>> 16_4 620 656 931 920 >>> 128_32 879 1056 1120 1132 >> >> Curious on why passthru is slower than block-poll? Are we missing >> something here? > > I took a quick peek, running it here. List of items making it slower: > > - No fixedbufs support for passthru, each each request will go through > get_user_pages() and put_pages() on completion. This is about a 10% > change for me, by itself. > > - nvme_uring_cmd_io() -> nvme_alloc_user_request() -> blk_rq_map_user() > -> blk_rq_map_user_iov() -> memset() is another ~4% for me. > > - The kmalloc+kfree per command is roughly 9% extra slowdown. > > There are other little things, but the above are the main ones. Even if > I disable fixedbufs for non-passthru, passthru is about ~24% slower > here using a single device and a single core, which is mostly the above > mentioned items. > > This isn't specific to the iopoll support, that's obviously faster than > IRQ driven for this test case. This is just comparing passthru with > the regular block path for doing random 512b reads. Here's a hack that gets rid of the page array alloc+free for smaller vec ranges, and uses the bio cache for polled IO too. This reclaims about 14% of the 24% compared to block-iopoll, in particular for this run it brings IOPS from ~1815K to 2110K for passthru-polled. We also don't seem to be taking advantage of request completion batching, and tag batch allocations. Outside of that, looks like just some generic block bits that need fixing up (and the attached patch that needs some cleanup gets us most of the way there), so nothing we can't get sorted out. Keith, the memset() seems to be tied to the allocations fixed in this patch, it's gone now as well. diff --git a/block/blk-map.c b/block/blk-map.c index df8b066cd548..8861b89e15a8 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -157,10 +157,8 @@ static int bio_copy_user_iov(struct request *rq, struct rq_map_data *map_data, goto out_bmd; bio_init(bio, NULL, bio->bi_inline_vecs, nr_pages, req_op(rq)); - if (map_data) { - nr_pages = 1 << map_data->page_order; + if (map_data) i = map_data->offset / PAGE_SIZE; - } while (len) { unsigned int bytes = PAGE_SIZE; @@ -232,7 +230,7 @@ static int bio_copy_user_iov(struct request *rq, struct rq_map_data *map_data, } static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, - gfp_t gfp_mask) + gfp_t gfp_mask) { unsigned int max_sectors = queue_max_hw_sectors(rq->q); unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS); @@ -243,18 +241,34 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, if (!iov_iter_count(iter)) return -EINVAL; - bio = bio_kmalloc(nr_vecs, gfp_mask); - if (!bio) - return -ENOMEM; - bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq)); + if (rq->cmd_flags & REQ_POLLED) { + blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE; + + bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask, + &fs_bio_set); + if (!bio) + return -ENOMEM; + } else { + bio = bio_kmalloc(nr_vecs, gfp_mask); + if (!bio) + return -ENOMEM; + bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs, req_op(rq)); + } while (iov_iter_count(iter)) { - struct page **pages; + struct page **pages, *stack_pages[8]; ssize_t bytes; size_t offs, added = 0; int npages; - bytes = iov_iter_get_pages_alloc(iter, &pages, LONG_MAX, &offs); + if (nr_vecs < ARRAY_SIZE(stack_pages)) { + pages = stack_pages; + bytes = iov_iter_get_pages(iter, pages, LONG_MAX, + nr_vecs, &offs); + } else { + bytes = iov_iter_get_pages_alloc(iter, &pages, LONG_MAX, + &offs); + } if (unlikely(bytes <= 0)) { ret = bytes ? bytes : -EFAULT; goto out_unmap; @@ -291,7 +305,8 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, */ while (j < npages) put_page(pages[j++]); - kvfree(pages); + if (pages != stack_pages) + kvfree(pages); /* couldn't stuff something into bio? */ if (bytes) break; @@ -304,8 +319,12 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, out_unmap: bio_release_pages(bio, false); - bio_uninit(bio); - kfree(bio); + if (bio->bi_opf & REQ_ALLOC_CACHE) { + bio_put(bio); + } else { + bio_uninit(bio); + kfree(bio); + } return ret; } @@ -325,8 +344,12 @@ static void bio_invalidate_vmalloc_pages(struct bio *bio) static void bio_map_kern_endio(struct bio *bio) { bio_invalidate_vmalloc_pages(bio); - bio_uninit(bio); - kfree(bio); + if (bio->bi_opf & REQ_ALLOC_CACHE) { + bio_put(bio); + } else { + bio_uninit(bio); + kfree(bio); + } } /** @@ -610,8 +633,12 @@ int blk_rq_unmap_user(struct bio *bio) next_bio = bio; bio = bio->bi_next; - bio_uninit(next_bio); - kfree(next_bio); + if (next_bio->bi_opf & REQ_ALLOC_CACHE) { + bio_put(next_bio); + } else { + bio_uninit(next_bio); + kfree(next_bio); + } } return ret; diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 8f841caaa4cb..ce2f4b8dc0d9 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -964,11 +964,10 @@ blk_status_t blk_insert_cloned_request(struct request *rq); struct rq_map_data { struct page **pages; - int page_order; int nr_entries; unsigned long offset; - int null_mapped; - int from_user; + bool null_mapped; + bool from_user; }; int blk_rq_map_user(struct request_queue *, struct request *, -- Jens Axboe ^ permalink raw reply related [flat|nested] 20+ messages in thread
end of thread, other threads:[~2022-08-07 18:46 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <CGME20220805155300epcas5p1b98722e20990d0095238964e2be9db34@epcas5p1.samsung.com> 2022-08-05 15:42 ` [PATCH 0/4] iopoll support for io_uring/nvme passthrough Kanchan Joshi [not found] ` <CGME20220805155304epcas5p1bb687a8f9b25317af39def01696626e8@epcas5p1.samsung.com> 2022-08-05 15:42 ` [PATCH 1/4] fs: add file_operations->uring_cmd_iopoll Kanchan Joshi [not found] ` <CGME20220805155307epcas5p4bab3f05dc13d8fc2f03c7a26e9bd8c7c@epcas5p4.samsung.com> 2022-08-05 15:42 ` [PATCH 2/4] io_uring: add iopoll infrastructure for io_uring_cmd Kanchan Joshi [not found] ` <CGME20220805155310epcas5p2bd7ec5b9bee73893958f4bc84038eca0@epcas5p2.samsung.com> 2022-08-05 15:42 ` [PATCH 3/4] block: export blk_rq_is_poll Kanchan Joshi [not found] ` <CGME20220805155313epcas5p2d35d22831bd07ef33fbdc28bd99ae1d0@epcas5p2.samsung.com> 2022-08-05 15:42 ` [PATCH 4/4] nvme: wire up async polling for io passthrough commands Kanchan Joshi 2022-08-05 17:03 ` Jens Axboe 2022-08-05 17:07 ` Kanchan Joshi 2022-08-05 21:22 ` kernel test robot 2022-08-06 0:06 ` kernel test robot 2022-08-06 1:38 ` kernel test robot 2022-08-07 12:25 ` kernel test robot 2022-08-05 17:04 ` [PATCH 0/4] iopoll support for io_uring/nvme passthrough Jens Axboe 2022-08-05 17:13 ` Kanchan Joshi 2022-08-05 17:27 ` Jens Axboe 2022-08-05 17:18 ` Jens Axboe 2022-08-05 18:11 ` Keith Busch 2022-08-05 18:15 ` Jens Axboe 2022-08-07 17:58 ` Kanchan Joshi 2022-08-07 18:46 ` Jens Axboe 2022-08-05 18:21 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox