* [PATCH v5 0/6] io_uring passthrough for nvme
[not found] <CGME20220511055306epcas5p3bf3b4c1e32d2bb43db12785bd7caf5da@epcas5p3.samsung.com>
@ 2022-05-11 5:47 ` Kanchan Joshi
[not found] ` <CGME20220511055308epcas5p3627bcb0ec10d7a2222e701898e9ad0db@epcas5p3.samsung.com>
` (7 more replies)
0 siblings, 8 replies; 15+ messages in thread
From: Kanchan Joshi @ 2022-05-11 5:47 UTC (permalink / raw)
To: axboe, hch
Cc: io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
This series is against the "for-5.19/io_uring-passthrough" branch (linux-block).
The patches are to be refreshed on top of 2bb04df7c ("io_uring: support CQE32").
uring-cmd is the facility that enables io_uring capabilities (async submission
being one of them) for any arbitrary command (ioctl, fsctl or whatever else)
exposed by command providers (driver, fs, etc.). The series introduces
uring-cmd and connects nvme passthrough (over the generic device /dev/ngXnY)
to it.
uring-cmd is specified by IORING_OP_URING_CMD. The storage for the
command is provided in the SQE itself. On a regular ring, 16 bytes of
space is available, accessible via "sqe->cmd".
Alternatively, the application can set up the ring with the flag
IORING_SETUP_SQE128. In that case each SQE of the ring is 128 bytes in
size and provides 80 bytes of storage space for placing the command.
nvme io-passthrough is specified by the new operation NVME_URING_CMD_IO.
This operates on a new structure, nvme_uring_cmd, which is 72 bytes in size.
nvme passthrough requires two results to be returned to user-space, so the
ring needs to be set up with the flag IORING_SETUP_CQE32. When this flag is
specified, each CQE of the ring is 32 bytes in size. The uring-cmd
infrastructure exports helpers so that the additional result is collected
from the provider and placed into the CQE.
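For illustration, here is a minimal userspace sketch (not part of the series)
of issuing a single passthrough read this way. It assumes liburing and uapi
headers that already carry the SQE128/CQE32 and nvme_uring_cmd additions from
this series; the namespace id, 512-byte LBA size and queue depth are arbitrary
assumptions, and error handling is omitted.

#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <liburing.h>
#include <linux/nvme_ioctl.h>

/* Submit one NVMe read via IORING_OP_URING_CMD (needs CAP_SYS_ADMIN). */
static int passthru_read(const char *ngdev, void *buf, __u32 len, __u64 slba)
{
        struct io_uring_params p = {
                .flags = IORING_SETUP_SQE128 | IORING_SETUP_CQE32,
        };
        struct nvme_uring_cmd ncmd = {
                .opcode   = 0x02,                       /* nvme read */
                .nsid     = 1,                          /* assumption: namespace 1 */
                .addr     = (__u64)(uintptr_t)buf,
                .data_len = len,
                .cdw10    = (__u32)slba,
                .cdw11    = (__u32)(slba >> 32),
                .cdw12    = len / 512 - 1,              /* 0-based count, 512B LBA assumed */
        };
        struct io_uring ring;
        struct io_uring_sqe *sqe;
        struct io_uring_cqe *cqe;
        int fd = open(ngdev, O_RDWR), ret;

        io_uring_queue_init_params(8, &ring, &p);

        sqe = io_uring_get_sqe(&ring);
        memset(sqe, 0, 2 * sizeof(*sqe));               /* whole 128-byte SQE */
        sqe->opcode = IORING_OP_URING_CMD;
        sqe->fd = fd;
        sqe->cmd_op = NVME_URING_CMD_IO;
        memcpy(sqe->cmd, &ncmd, sizeof(ncmd));          /* 72b command in the 80b area */

        io_uring_submit(&ring);
        io_uring_wait_cqe(&ring, &cqe);
        ret = cqe->res;                                 /* big_cqe[0] holds the nvme result */
        io_uring_cqe_seen(&ring, cqe);
        io_uring_queue_exit(&ring);
        close(fd);
        return ret;
}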
Testing is done using this custom fio branch:
https://github.com/joshkan/fio/tree/big-cqe-pt.v4
Regular io_uring I/O (read/write) is turned into passthrough I/O when the
"-uring_cmd=1" option is supplied.
Example command line:
fio -iodepth=1 -rw=randread -ioengine=io_uring -bs=4k -numjobs=1 -size=4k -group_reporting -filename=/dev/ng0n1 -name=io_uring_1 -uring_cmd=1
Changes since v4:
https://lore.kernel.org/linux-nvme/[email protected]/
- Allow uring-cmd to operate on regular ring too
- Move big-sqe/big-cqe requirement to nvme
- Add support for cases when uring-cmd needs deferral
- Redone Patch 3
- In nvme, use READ_ONCE while reading cmd fields from SQE
- Refactored Patch 4 based on Christoph's feedback
Changes since v3:
https://lore.kernel.org/linux-nvme/[email protected]/
- Cleaned up placements of new fields in sqe and io_uring_cmd
- Removed packed-attribute from nvme_uring_cmd struct
- Applied all of Christoph's other feedback too
- Applied Jens' feedback
- Collected reviewed-by
Changes since v2:
https://lore.kernel.org/linux-nvme/[email protected]/
- Rewire uring-cmd infrastructure on top of new big CQE
- Prep patch (refactored) and feedback from Christoph
- Add new opcode and structure in nvme for uring-cmd
- Enable vectored-io
Changes since v1:
https://lore.kernel.org/linux-nvme/[email protected]/
- Trim down by removing patches for polling, fixed-buffer and bio-cache
- Add big CQE and move uring-cmd to use that
- Remove indirect (pointer) submission
Anuj Gupta (1):
nvme: add vectored-io support for uring-cmd
Christoph Hellwig (1):
nvme: refactor nvme_submit_user_cmd()
Jens Axboe (3):
fs,io_uring: add infrastructure for uring-cmd
block: wire-up support for passthrough plugging
io_uring: finish IOPOLL/ioprio prep handler removal
Kanchan Joshi (1):
nvme: wire-up uring-cmd support for io-passthru on char-device.
block/blk-mq.c | 73 +++++-----
drivers/nvme/host/core.c | 1 +
drivers/nvme/host/ioctl.c | 247 ++++++++++++++++++++++++++++++--
drivers/nvme/host/multipath.c | 1 +
drivers/nvme/host/nvme.h | 4 +
fs/io_uring.c | 135 ++++++++++++++---
include/linux/fs.h | 2 +
include/linux/io_uring.h | 33 +++++
include/uapi/linux/io_uring.h | 21 +--
include/uapi/linux/nvme_ioctl.h | 26 ++++
10 files changed, 471 insertions(+), 72 deletions(-)
--
2.25.1
* [PATCH v5 1/6] fs,io_uring: add infrastructure for uring-cmd
[not found] ` <CGME20220511055308epcas5p3627bcb0ec10d7a2222e701898e9ad0db@epcas5p3.samsung.com>
@ 2022-05-11 5:47 ` Kanchan Joshi
2022-05-11 12:42 ` Christoph Hellwig
0 siblings, 1 reply; 15+ messages in thread
From: Kanchan Joshi @ 2022-05-11 5:47 UTC (permalink / raw)
To: axboe, hch
Cc: io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
From: Jens Axboe <[email protected]>
file_operations->uring_cmd is a file-private handler.
This is somewhat similar to ioctl, but hopefully a lot more sane and
useful, as it can be used to enable many io_uring capabilities for the
underlying operation.
IORING_OP_URING_CMD is a file-private kind of request. io_uring doesn't
know what is in this command type; it's for the provider of ->uring_cmd()
to deal with.
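As an illustration of the contract (not part of this patch), a provider's
handler might look like the sketch below; all foo_* names are hypothetical.
Returning anything other than -EIOCBQUEUED completes the request inline with
that value as the result, while -EIOCBQUEUED means the provider will complete
it later via io_uring_cmd_done().

static int foo_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags)
{
        /* command payload as placed by userspace in sqe->cmd (16 or 80 bytes) */
        const struct foo_cmd *cmd = ioucmd->cmd;
        int ret;

        switch (ioucmd->cmd_op) {
        case FOO_CMD_SYNC:
                /* complete inline; the return value ends up in cqe->res */
                return foo_handle_sync(cmd);
        case FOO_CMD_ASYNC:
                /* start async work which calls io_uring_cmd_done() when done */
                ret = foo_submit_async(cmd, ioucmd,
                                       issue_flags & IO_URING_F_NONBLOCK);
                if (ret)
                        return ret;     /* e.g. -EAGAIN to let io_uring retry */
                return -EIOCBQUEUED;
        default:
                return -ENOTTY;
        }
}

static const struct file_operations foo_fops = {
        .owner          = THIS_MODULE,
        .uring_cmd      = foo_uring_cmd,
        /* ... */
};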
Co-developed-by: Kanchan Joshi <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 123 ++++++++++++++++++++++++++++++++--
include/linux/fs.h | 2 +
include/linux/io_uring.h | 33 +++++++++
include/uapi/linux/io_uring.h | 21 +++---
4 files changed, 164 insertions(+), 15 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index ceaf7826ed71..592be5b89add 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -202,13 +202,6 @@ struct io_rings {
struct io_uring_cqe cqes[] ____cacheline_aligned_in_smp;
};
-enum io_uring_cmd_flags {
- IO_URING_F_COMPLETE_DEFER = 1,
- IO_URING_F_UNLOCKED = 2,
- /* int's last bit, sign checks are usually faster than a bit test */
- IO_URING_F_NONBLOCK = INT_MIN,
-};
-
struct io_mapped_ubuf {
u64 ubuf;
u64 ubuf_end;
@@ -972,6 +965,7 @@ struct io_kiocb {
struct io_xattr xattr;
struct io_socket sock;
struct io_nop nop;
+ struct io_uring_cmd uring_cmd;
};
u8 opcode;
@@ -1050,6 +1044,14 @@ struct io_cancel_data {
int seq;
};
+/*
+ * The URING_CMD payload starts at 'cmd' in the first sqe, and continues into
+ * the following sqe if SQE128 is used.
+ */
+#define uring_cmd_pdu_size(is_sqe128) \
+ ((1 + !!(is_sqe128)) * sizeof(struct io_uring_sqe) - \
+ offsetof(struct io_uring_sqe, cmd))
+
struct io_op_def {
/* needs req->file assigned */
unsigned needs_file : 1;
@@ -1289,6 +1291,12 @@ static const struct io_op_def io_op_defs[] = {
[IORING_OP_SOCKET] = {
.audit_skip = 1,
},
+ [IORING_OP_URING_CMD] = {
+ .needs_file = 1,
+ .plug = 1,
+ .needs_async_setup = 1,
+ .async_size = uring_cmd_pdu_size(1),
+ },
};
/* requests with any of those set should undergo io_disarm_next() */
@@ -1428,6 +1436,8 @@ const char *io_uring_get_opcode(u8 opcode)
return "GETXATTR";
case IORING_OP_SOCKET:
return "SOCKET";
+ case IORING_OP_URING_CMD:
+ return "URING_CMD";
case IORING_OP_LAST:
return "INVALID";
}
@@ -4910,6 +4920,96 @@ static int io_linkat(struct io_kiocb *req, unsigned int issue_flags)
return 0;
}
+static void io_uring_cmd_work(struct io_kiocb *req, bool *locked)
+{
+ req->uring_cmd.task_work_cb(&req->uring_cmd);
+}
+
+void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
+ void (*task_work_cb)(struct io_uring_cmd *))
+{
+ struct io_kiocb *req = container_of(ioucmd, struct io_kiocb, uring_cmd);
+
+ req->uring_cmd.task_work_cb = task_work_cb;
+ req->io_task_work.func = io_uring_cmd_work;
+ io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_complete_in_task);
+
+/*
+ * Called by consumers of io_uring_cmd, if they originally returned
+ * -EIOCBQUEUED upon receiving the command.
+ */
+void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, ssize_t res2)
+{
+ struct io_kiocb *req = container_of(ioucmd, struct io_kiocb, uring_cmd);
+
+ if (ret < 0)
+ req_set_fail(req);
+ if (req->ctx->flags & IORING_SETUP_CQE32)
+ __io_req_complete32(req, 0, ret, 0, res2, 0);
+ else
+ io_req_complete(req, ret);
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_done);
+
+static int io_uring_cmd_prep_async(struct io_kiocb *req)
+{
+ size_t cmd_size;
+
+ cmd_size = uring_cmd_pdu_size(req->ctx->flags & IORING_SETUP_SQE128);
+
+ memcpy(req->async_data, req->uring_cmd.cmd, cmd_size);
+ return 0;
+}
+
+static int io_uring_cmd_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_uring_cmd *ioucmd = &req->uring_cmd;
+
+ if (sqe->ioprio || sqe->rw_flags)
+ return -EINVAL;
+ ioucmd->cmd = sqe->cmd;
+ ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
+ return 0;
+}
+
+static int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_uring_cmd *ioucmd = &req->uring_cmd;
+ struct io_ring_ctx *ctx = req->ctx;
+ struct file *file = req->file;
+ int ret;
+
+ if (!req->file->f_op->uring_cmd)
+ return -EOPNOTSUPP;
+
+ if (ctx->flags & IORING_SETUP_SQE128)
+ issue_flags |= IO_URING_F_SQE128;
+ if (ctx->flags & IORING_SETUP_CQE32)
+ issue_flags |= IO_URING_F_CQE32;
+ if (ctx->flags & IORING_SETUP_IOPOLL)
+ issue_flags |= IO_URING_F_IOPOLL;
+
+ if (req_has_async_data(req))
+ ioucmd->cmd = req->async_data;
+
+ ret = file->f_op->uring_cmd(ioucmd, issue_flags);
+ if (ret == -EAGAIN) {
+ if (!req_has_async_data(req)) {
+ if (io_alloc_async_data(req))
+ return -ENOMEM;
+ io_uring_cmd_prep_async(req);
+ }
+ return -EAGAIN;
+ }
+
+ if (ret != -EIOCBQUEUED)
+ io_uring_cmd_done(ioucmd, ret, 0);
+ return 0;
+}
+
static int io_shutdown_prep(struct io_kiocb *req,
const struct io_uring_sqe *sqe)
{
@@ -7755,6 +7855,8 @@ static int io_req_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
return io_getxattr_prep(req, sqe);
case IORING_OP_SOCKET:
return io_socket_prep(req, sqe);
+ case IORING_OP_URING_CMD:
+ return io_uring_cmd_prep(req, sqe);
}
printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
@@ -7787,6 +7889,8 @@ static int io_req_prep_async(struct io_kiocb *req)
return io_recvmsg_prep_async(req);
case IORING_OP_CONNECT:
return io_connect_prep_async(req);
+ case IORING_OP_URING_CMD:
+ return io_uring_cmd_prep_async(req);
}
printk_once(KERN_WARNING "io_uring: prep_async() bad opcode %d\n",
req->opcode);
@@ -8081,6 +8185,9 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
case IORING_OP_SOCKET:
ret = io_socket(req, issue_flags);
break;
+ case IORING_OP_URING_CMD:
+ ret = io_uring_cmd(req, issue_flags);
+ break;
default:
ret = -EINVAL;
break;
@@ -12699,6 +12806,8 @@ static int __init io_uring_init(void)
BUILD_BUG_ON(sizeof(atomic_t) != sizeof(u32));
+ BUILD_BUG_ON(sizeof(struct io_uring_cmd) > 64);
+
req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
SLAB_ACCOUNT);
return 0;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index bbde95387a23..87b5af1d9fbe 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1953,6 +1953,7 @@ struct dir_context {
#define REMAP_FILE_ADVISORY (REMAP_FILE_CAN_SHORTEN)
struct iov_iter;
+struct io_uring_cmd;
struct file_operations {
struct module *owner;
@@ -1995,6 +1996,7 @@ struct file_operations {
struct file *file_out, loff_t pos_out,
loff_t len, unsigned int remap_flags);
int (*fadvise)(struct file *, loff_t, loff_t, int);
+ int (*uring_cmd)(struct io_uring_cmd *ioucmd, unsigned int issue_flags);
} __randomize_layout;
struct inode_operations {
diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 24651c229ed2..4a2f6cc5a492 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -5,7 +5,32 @@
#include <linux/sched.h>
#include <linux/xarray.h>
+enum io_uring_cmd_flags {
+ IO_URING_F_COMPLETE_DEFER = 1,
+ IO_URING_F_UNLOCKED = 2,
+ /* int's last bit, sign checks are usually faster than a bit test */
+ IO_URING_F_NONBLOCK = INT_MIN,
+
+ /* ctx state flags, for URING_CMD */
+ IO_URING_F_SQE128 = 4,
+ IO_URING_F_CQE32 = 8,
+ IO_URING_F_IOPOLL = 16,
+};
+
+struct io_uring_cmd {
+ struct file *file;
+ const void *cmd;
+ /* callback to defer completions to task context */
+ void (*task_work_cb)(struct io_uring_cmd *cmd);
+ u32 cmd_op;
+ u32 pad;
+ u8 pdu[32]; /* available inline for free use */
+};
+
#if defined(CONFIG_IO_URING)
+void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2);
+void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
+ void (*task_work_cb)(struct io_uring_cmd *));
struct sock *io_uring_get_socket(struct file *file);
void __io_uring_cancel(bool cancel_all);
void __io_uring_free(struct task_struct *tsk);
@@ -30,6 +55,14 @@ static inline void io_uring_free(struct task_struct *tsk)
__io_uring_free(tsk);
}
#else
+static inline void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret,
+ ssize_t ret2)
+{
+}
+static inline void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
+ void (*task_work_cb)(struct io_uring_cmd *))
+{
+}
static inline struct sock *io_uring_get_socket(struct file *file)
{
return NULL;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index ac2d90d669c3..23618be55dd2 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -22,6 +22,7 @@ struct io_uring_sqe {
union {
__u64 off; /* offset into file */
__u64 addr2;
+ __u32 cmd_op;
};
union {
__u64 addr; /* pointer to buffer or iovecs */
@@ -61,14 +62,17 @@ struct io_uring_sqe {
__s32 splice_fd_in;
__u32 file_index;
};
- __u64 addr3;
- __u64 __pad2[1];
-
- /*
- * If the ring is initialized with IORING_SETUP_SQE128, then this field
- * contains 64-bytes of padding, doubling the size of the SQE.
- */
- __u64 __big_sqe_pad[0];
+ union {
+ struct {
+ __u64 addr3;
+ __u64 __pad2[1];
+ };
+ /*
+ * If the ring is initialized with IORING_SETUP_SQE128, then
+ * this field is used for 80 bytes of arbitrary command data
+ */
+ __u8 cmd[0];
+ };
};
enum {
@@ -175,6 +179,7 @@ enum io_uring_op {
IORING_OP_FGETXATTR,
IORING_OP_GETXATTR,
IORING_OP_SOCKET,
+ IORING_OP_URING_CMD,
/* this goes last, obviously */
IORING_OP_LAST,
--
2.25.1
* [PATCH v5 2/6] block: wire-up support for passthrough plugging
[not found] ` <CGME20220511055310epcas5p46650f5b6fe963279f686b8f50a98a286@epcas5p4.samsung.com>
@ 2022-05-11 5:47 ` Kanchan Joshi
0 siblings, 0 replies; 15+ messages in thread
From: Kanchan Joshi @ 2022-05-11 5:47 UTC (permalink / raw)
To: axboe, hch
Cc: io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
From: Jens Axboe <[email protected]>
Add support for plugging in the passthrough path. When plugging is enabled,
requests are added to a plug instead of being dispatched to the driver
right away. When the plug is finished, the whole batch is dispatched via
->queue_rqs, which turns out to be more efficient than the previous
behaviour of dispatching via ->queue_rq, one request at a time.
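A minimal sketch (hypothetical caller, not part of this patch) of what this
buys a submitter of passthrough requests:

/*
 * Sketch: with this patch, each blk_execute_rq_nowait() call inside a
 * plugged section queues the request on current->plug; blk_finish_plug()
 * then dispatches the whole batch via ->queue_rqs where possible.
 */
static void submit_passthru_batch(struct request **rqs, int nr,
                                  rq_end_io_fn *done)
{
        struct blk_plug plug;
        int i;

        blk_start_plug(&plug);
        for (i = 0; i < nr; i++)
                blk_execute_rq_nowait(rqs[i], false, done);
        blk_finish_plug(&plug);
}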
Signed-off-by: Jens Axboe <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
block/blk-mq.c | 73 +++++++++++++++++++++++++++-----------------------
1 file changed, 39 insertions(+), 34 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 84d749511f55..2cf011b57cf9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2340,6 +2340,40 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
blk_mq_hctx_mark_pending(hctx, ctx);
}
+/*
+ * Allow 2x BLK_MAX_REQUEST_COUNT requests on plug queue for multiple
+ * queues. This is important for md arrays to benefit from merging
+ * requests.
+ */
+static inline unsigned short blk_plug_max_rq_count(struct blk_plug *plug)
+{
+ if (plug->multiple_queues)
+ return BLK_MAX_REQUEST_COUNT * 2;
+ return BLK_MAX_REQUEST_COUNT;
+}
+
+static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
+{
+ struct request *last = rq_list_peek(&plug->mq_list);
+
+ if (!plug->rq_count) {
+ trace_block_plug(rq->q);
+ } else if (plug->rq_count >= blk_plug_max_rq_count(plug) ||
+ (!blk_queue_nomerges(rq->q) &&
+ blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) {
+ blk_mq_flush_plug_list(plug, false);
+ trace_block_plug(rq->q);
+ }
+
+ if (!plug->multiple_queues && last && last->q != rq->q)
+ plug->multiple_queues = true;
+ if (!plug->has_elevator && (rq->rq_flags & RQF_ELV))
+ plug->has_elevator = true;
+ rq->rq_next = NULL;
+ rq_list_add(&plug->mq_list, rq);
+ plug->rq_count++;
+}
+
/**
* blk_mq_request_bypass_insert - Insert a request at dispatch list.
* @rq: Pointer to request to be inserted.
@@ -2353,7 +2387,12 @@ void blk_mq_request_bypass_insert(struct request *rq, bool at_head,
bool run_queue)
{
struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
+ struct blk_plug *plug = current->plug;
+ if (plug) {
+ blk_add_rq_to_plug(plug, rq);
+ return;
+ }
spin_lock(&hctx->lock);
if (at_head)
list_add(&rq->queuelist, &hctx->dispatch);
@@ -2676,40 +2715,6 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
hctx->queue->mq_ops->commit_rqs(hctx);
}
-/*
- * Allow 2x BLK_MAX_REQUEST_COUNT requests on plug queue for multiple
- * queues. This is important for md arrays to benefit from merging
- * requests.
- */
-static inline unsigned short blk_plug_max_rq_count(struct blk_plug *plug)
-{
- if (plug->multiple_queues)
- return BLK_MAX_REQUEST_COUNT * 2;
- return BLK_MAX_REQUEST_COUNT;
-}
-
-static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
-{
- struct request *last = rq_list_peek(&plug->mq_list);
-
- if (!plug->rq_count) {
- trace_block_plug(rq->q);
- } else if (plug->rq_count >= blk_plug_max_rq_count(plug) ||
- (!blk_queue_nomerges(rq->q) &&
- blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) {
- blk_mq_flush_plug_list(plug, false);
- trace_block_plug(rq->q);
- }
-
- if (!plug->multiple_queues && last && last->q != rq->q)
- plug->multiple_queues = true;
- if (!plug->has_elevator && (rq->rq_flags & RQF_ELV))
- plug->has_elevator = true;
- rq->rq_next = NULL;
- rq_list_add(&plug->mq_list, rq);
- plug->rq_count++;
-}
-
static bool blk_mq_attempt_bio_merge(struct request_queue *q,
struct bio *bio, unsigned int nr_segs)
{
--
2.25.1
* [PATCH v5 3/6] nvme: refactor nvme_submit_user_cmd()
[not found] ` <CGME20220511055312epcas5p3b1e9989a32cb1a79f8a941476fc433d1@epcas5p3.samsung.com>
@ 2022-05-11 5:47 ` Kanchan Joshi
0 siblings, 0 replies; 15+ messages in thread
From: Kanchan Joshi @ 2022-05-11 5:47 UTC (permalink / raw)
To: axboe, hch
Cc: io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
From: Christoph Hellwig <[email protected]>
Divide the work into two helpers, namely nvme_alloc_user_request and
nvme_execute_user_rq. This is a prep patch to help wire up uring-cmd
support in nvme.
Signed-off-by: Christoph Hellwig <[email protected]>
[axboe: fold in fix for assuming bio is non-NULL]
Signed-off-by: Jens Axboe <[email protected]>
---
drivers/nvme/host/ioctl.c | 56 +++++++++++++++++++++++++++++++--------
1 file changed, 45 insertions(+), 11 deletions(-)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 554566371ffa..8d2569b656cc 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -53,10 +53,20 @@ static void *nvme_add_user_metadata(struct bio *bio, void __user *ubuf,
return ERR_PTR(ret);
}
-static int nvme_submit_user_cmd(struct request_queue *q,
+static int nvme_finish_user_metadata(struct request *req, void __user *ubuf,
+ void *meta, unsigned len, int ret)
+{
+ if (!ret && req_op(req) == REQ_OP_DRV_IN &&
+ copy_to_user(ubuf, meta, len))
+ ret = -EFAULT;
+ kfree(meta);
+ return ret;
+}
+
+static struct request *nvme_alloc_user_request(struct request_queue *q,
struct nvme_command *cmd, void __user *ubuffer,
unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
- u32 meta_seed, u64 *result, unsigned timeout, bool vec)
+ u32 meta_seed, void **metap, unsigned timeout, bool vec)
{
bool write = nvme_is_write(cmd);
struct nvme_ns *ns = q->queuedata;
@@ -68,7 +78,7 @@ static int nvme_submit_user_cmd(struct request_queue *q,
req = blk_mq_alloc_request(q, nvme_req_op(cmd), 0);
if (IS_ERR(req))
- return PTR_ERR(req);
+ return req;
nvme_init_request(req, cmd);
if (timeout)
@@ -105,26 +115,50 @@ static int nvme_submit_user_cmd(struct request_queue *q,
goto out_unmap;
}
req->cmd_flags |= REQ_INTEGRITY;
+ *metap = meta;
}
}
+ return req;
+
+out_unmap:
+ if (bio)
+ blk_rq_unmap_user(bio);
+out:
+ blk_mq_free_request(req);
+ return ERR_PTR(ret);
+}
+
+static int nvme_submit_user_cmd(struct request_queue *q,
+ struct nvme_command *cmd, void __user *ubuffer,
+ unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
+ u32 meta_seed, u64 *result, unsigned timeout, bool vec)
+{
+ struct request *req;
+ void *meta = NULL;
+ struct bio *bio;
+ int ret;
+
+ req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
+ meta_len, meta_seed, &meta, timeout, vec);
+ if (IS_ERR(req))
+ return PTR_ERR(req);
+
+ bio = req->bio;
+
ret = nvme_execute_passthru_rq(req);
+
if (result)
*result = le64_to_cpu(nvme_req(req)->result.u64);
- if (meta && !ret && !write) {
- if (copy_to_user(meta_buffer, meta, meta_len))
- ret = -EFAULT;
- }
- kfree(meta);
- out_unmap:
+ if (meta)
+ ret = nvme_finish_user_metadata(req, meta_buffer, meta,
+ meta_len, ret);
if (bio)
blk_rq_unmap_user(bio);
- out:
blk_mq_free_request(req);
return ret;
}
-
static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
{
struct nvme_user_io io;
--
2.25.1
* [PATCH v5 4/6] nvme: wire-up uring-cmd support for io-passthru on char-device.
[not found] ` <CGME20220511055314epcas5p42ddbc17608f2677814c07b0d790e1318@epcas5p4.samsung.com>
@ 2022-05-11 5:47 ` Kanchan Joshi
0 siblings, 0 replies; 15+ messages in thread
From: Kanchan Joshi @ 2022-05-11 5:47 UTC (permalink / raw)
To: axboe, hch
Cc: io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
Introduce a handler for fops->uring_cmd(), implementing async passthru
on the char device (/dev/ngX). The handler supports the newly introduced
operation NVME_URING_CMD_IO. This operates on a new structure,
nvme_uring_cmd, which is similar to struct nvme_passthru_cmd64 but
without the embedded 8-byte result field. That field is not needed since
uring-cmd allows the additional result to be returned via the big CQE.
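For illustration, here is a hedged userspace sketch (not part of the patch)
of how a caller picks up both values from a 32-byte CQE; the big_cqe[] field
name comes from the CQE32 uapi this series builds on:

#include <liburing.h>

/* Reap one uring-cmd completion on a ring set up with IORING_SETUP_CQE32. */
static int reap_one_passthru(struct io_uring *ring, __u64 *nvme_result)
{
        struct io_uring_cqe *cqe;
        int ret = io_uring_wait_cqe(ring, &cqe);

        if (ret)
                return ret;
        ret = cqe->res;                 /* status (negative errno or nvme status) */
        *nvme_result = cqe->big_cqe[0]; /* result passed to io_uring_cmd_done() */
        io_uring_cqe_seen(ring, cqe);
        return ret;
}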
Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
drivers/nvme/host/core.c | 1 +
drivers/nvme/host/ioctl.c | 192 +++++++++++++++++++++++++++++++-
drivers/nvme/host/multipath.c | 1 +
drivers/nvme/host/nvme.h | 4 +
include/uapi/linux/nvme_ioctl.h | 25 +++++
5 files changed, 220 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e1846d04817f..682df98db341 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3699,6 +3699,7 @@ static const struct file_operations nvme_ns_chr_fops = {
.release = nvme_ns_chr_release,
.unlocked_ioctl = nvme_ns_chr_ioctl,
.compat_ioctl = compat_ptr_ioctl,
+ .uring_cmd = nvme_ns_chr_uring_cmd,
};
static int nvme_add_ns_cdev(struct nvme_ns *ns)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 8d2569b656cc..92d695262d8f 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -5,6 +5,7 @@
*/
#include <linux/ptrace.h> /* for force_successful_syscall_return */
#include <linux/nvme_ioctl.h>
+#include <linux/io_uring.h>
#include "nvme.h"
/*
@@ -66,7 +67,8 @@ static int nvme_finish_user_metadata(struct request *req, void __user *ubuf,
static struct request *nvme_alloc_user_request(struct request_queue *q,
struct nvme_command *cmd, void __user *ubuffer,
unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
- u32 meta_seed, void **metap, unsigned timeout, bool vec)
+ u32 meta_seed, void **metap, unsigned timeout, bool vec,
+ unsigned int rq_flags, blk_mq_req_flags_t blk_flags)
{
bool write = nvme_is_write(cmd);
struct nvme_ns *ns = q->queuedata;
@@ -76,7 +78,7 @@ static struct request *nvme_alloc_user_request(struct request_queue *q,
void *meta = NULL;
int ret;
- req = blk_mq_alloc_request(q, nvme_req_op(cmd), 0);
+ req = blk_mq_alloc_request(q, nvme_req_op(cmd) | rq_flags, blk_flags);
if (IS_ERR(req))
return req;
nvme_init_request(req, cmd);
@@ -140,7 +142,7 @@ static int nvme_submit_user_cmd(struct request_queue *q,
int ret;
req = nvme_alloc_user_request(q, cmd, ubuffer, bufflen, meta_buffer,
- meta_len, meta_seed, &meta, timeout, vec);
+ meta_len, meta_seed, &meta, timeout, vec, 0, 0);
if (IS_ERR(req))
return PTR_ERR(req);
@@ -330,6 +332,139 @@ static int nvme_user_cmd64(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
return status;
}
+struct nvme_uring_data {
+ __u64 metadata;
+ __u64 addr;
+ __u32 data_len;
+ __u32 metadata_len;
+ __u32 timeout_ms;
+};
+
+/*
+ * This overlays struct io_uring_cmd pdu.
+ * Expect build errors if this grows larger than that.
+ */
+struct nvme_uring_cmd_pdu {
+ union {
+ struct bio *bio;
+ struct request *req;
+ };
+ void *meta; /* kernel-resident buffer */
+ void __user *meta_buffer;
+ u32 meta_len;
+};
+
+static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu(
+ struct io_uring_cmd *ioucmd)
+{
+ return (struct nvme_uring_cmd_pdu *)&ioucmd->pdu;
+}
+
+static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd)
+{
+ struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
+ struct request *req = pdu->req;
+ struct bio *bio = req->bio;
+ int status;
+ u64 result;
+
+ if (nvme_req(req)->flags & NVME_REQ_CANCELLED)
+ status = -EINTR;
+ else
+ status = nvme_req(req)->status;
+
+ result = le64_to_cpu(nvme_req(req)->result.u64);
+
+ if (pdu->meta)
+ status = nvme_finish_user_metadata(req, pdu->meta_buffer,
+ pdu->meta, pdu->meta_len, status);
+ if (bio)
+ blk_rq_unmap_user(bio);
+ blk_mq_free_request(req);
+
+ io_uring_cmd_done(ioucmd, status, result);
+}
+
+static void nvme_uring_cmd_end_io(struct request *req, blk_status_t err)
+{
+ struct io_uring_cmd *ioucmd = req->end_io_data;
+ struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
+ /* extract bio before reusing the same field for request */
+ struct bio *bio = pdu->bio;
+
+ pdu->req = req;
+ req->bio = bio;
+ /* this takes care of moving rest of completion-work to task context */
+ io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb);
+}
+
+static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
+ struct io_uring_cmd *ioucmd, unsigned int issue_flags)
+{
+ struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
+ const struct nvme_uring_cmd *cmd = ioucmd->cmd;
+ struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
+ struct nvme_uring_data d;
+ struct nvme_command c;
+ struct request *req;
+ unsigned int rq_flags = 0;
+ blk_mq_req_flags_t blk_flags = 0;
+ void *meta = NULL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
+
+ c.common.opcode = READ_ONCE(cmd->opcode);
+ c.common.flags = READ_ONCE(cmd->flags);
+ if (c.common.flags)
+ return -EINVAL;
+
+ c.common.command_id = 0;
+ c.common.nsid = cpu_to_le32(cmd->nsid);
+ if (!nvme_validate_passthru_nsid(ctrl, ns, le32_to_cpu(c.common.nsid)))
+ return -EINVAL;
+
+ c.common.cdw2[0] = cpu_to_le32(READ_ONCE(cmd->cdw2));
+ c.common.cdw2[1] = cpu_to_le32(READ_ONCE(cmd->cdw3));
+ c.common.metadata = 0;
+ c.common.dptr.prp1 = c.common.dptr.prp2 = 0;
+ c.common.cdw10 = cpu_to_le32(READ_ONCE(cmd->cdw10));
+ c.common.cdw11 = cpu_to_le32(READ_ONCE(cmd->cdw11));
+ c.common.cdw12 = cpu_to_le32(READ_ONCE(cmd->cdw12));
+ c.common.cdw13 = cpu_to_le32(READ_ONCE(cmd->cdw13));
+ c.common.cdw14 = cpu_to_le32(READ_ONCE(cmd->cdw14));
+ c.common.cdw15 = cpu_to_le32(READ_ONCE(cmd->cdw15));
+
+ d.metadata = READ_ONCE(cmd->metadata);
+ d.addr = READ_ONCE(cmd->addr);
+ d.data_len = READ_ONCE(cmd->data_len);
+ d.metadata_len = READ_ONCE(cmd->metadata_len);
+ d.timeout_ms = READ_ONCE(cmd->timeout_ms);
+
+ if (issue_flags & IO_URING_F_NONBLOCK) {
+ rq_flags = REQ_NOWAIT;
+ blk_flags = BLK_MQ_REQ_NOWAIT;
+ }
+
+ req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr),
+ d.data_len, nvme_to_user_ptr(d.metadata),
+ d.metadata_len, 0, &meta, d.timeout_ms ?
+ msecs_to_jiffies(d.timeout_ms) : 0, 0, rq_flags,
+ blk_flags);
+ if (IS_ERR(req))
+ return PTR_ERR(req);
+ req->end_io_data = ioucmd;
+
+ /* to free bio on completion, as req->bio will be null at that time */
+ pdu->bio = req->bio;
+ pdu->meta = meta;
+ pdu->meta_buffer = nvme_to_user_ptr(d.metadata);
+ pdu->meta_len = d.metadata_len;
+
+ blk_execute_rq_nowait(req, 0, nvme_uring_cmd_end_io);
+ return -EIOCBQUEUED;
+}
+
static bool is_ctrl_ioctl(unsigned int cmd)
{
if (cmd == NVME_IOCTL_ADMIN_CMD || cmd == NVME_IOCTL_ADMIN64_CMD)
@@ -421,6 +556,42 @@ long nvme_ns_chr_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
return __nvme_ioctl(ns, cmd, (void __user *)arg);
}
+static int nvme_ns_uring_cmd(struct nvme_ns *ns, struct io_uring_cmd *ioucmd,
+ unsigned int issue_flags)
+{
+ struct nvme_ctrl *ctrl = ns->ctrl;
+ int ret;
+
+ BUILD_BUG_ON(sizeof(struct nvme_uring_cmd_pdu) > sizeof(ioucmd->pdu));
+
+ /* IOPOLL not supported yet */
+ if (issue_flags & IO_URING_F_IOPOLL)
+ return -EOPNOTSUPP;
+
+ /* NVMe passthrough requires bit SQE/CQE support */
+ if ((issue_flags & (IO_URING_F_SQE128|IO_URING_F_CQE32)) !=
+ (IO_URING_F_SQE128|IO_URING_F_CQE32))
+ return -EOPNOTSUPP;
+
+ switch (ioucmd->cmd_op) {
+ case NVME_URING_CMD_IO:
+ ret = nvme_uring_cmd_io(ctrl, ns, ioucmd, issue_flags);
+ break;
+ default:
+ ret = -ENOTTY;
+ }
+
+ return ret;
+}
+
+int nvme_ns_chr_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags)
+{
+ struct nvme_ns *ns = container_of(file_inode(ioucmd->file)->i_cdev,
+ struct nvme_ns, cdev);
+
+ return nvme_ns_uring_cmd(ns, ioucmd, issue_flags);
+}
+
#ifdef CONFIG_NVME_MULTIPATH
static int nvme_ns_head_ctrl_ioctl(struct nvme_ns *ns, unsigned int cmd,
void __user *argp, struct nvme_ns_head *head, int srcu_idx)
@@ -487,6 +658,21 @@ long nvme_ns_head_chr_ioctl(struct file *file, unsigned int cmd,
srcu_read_unlock(&head->srcu, srcu_idx);
return ret;
}
+
+int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd,
+ unsigned int issue_flags)
+{
+ struct cdev *cdev = file_inode(ioucmd->file)->i_cdev;
+ struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev);
+ int srcu_idx = srcu_read_lock(&head->srcu);
+ struct nvme_ns *ns = nvme_find_path(head);
+ int ret = -EINVAL;
+
+ if (ns)
+ ret = nvme_ns_uring_cmd(ns, ioucmd, issue_flags);
+ srcu_read_unlock(&head->srcu, srcu_idx);
+ return ret;
+}
#endif /* CONFIG_NVME_MULTIPATH */
static int nvme_dev_user_cmd(struct nvme_ctrl *ctrl, void __user *argp)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index d464fdf978fb..d3e2440d8abb 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -437,6 +437,7 @@ static const struct file_operations nvme_ns_head_chr_fops = {
.release = nvme_ns_head_chr_release,
.unlocked_ioctl = nvme_ns_head_chr_ioctl,
.compat_ioctl = compat_ptr_ioctl,
+ .uring_cmd = nvme_ns_head_chr_uring_cmd,
};
static int nvme_add_ns_head_cdev(struct nvme_ns_head *head)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a2b53ca63335..086ccbdd7003 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -782,6 +782,10 @@ long nvme_ns_head_chr_ioctl(struct file *file, unsigned int cmd,
unsigned long arg);
long nvme_dev_ioctl(struct file *file, unsigned int cmd,
unsigned long arg);
+int nvme_ns_chr_uring_cmd(struct io_uring_cmd *ioucmd,
+ unsigned int issue_flags);
+int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd,
+ unsigned int issue_flags);
int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo);
extern const struct attribute_group *nvme_ns_id_attr_groups[];
diff --git a/include/uapi/linux/nvme_ioctl.h b/include/uapi/linux/nvme_ioctl.h
index b2e43185e3b5..04e458c649ab 100644
--- a/include/uapi/linux/nvme_ioctl.h
+++ b/include/uapi/linux/nvme_ioctl.h
@@ -70,6 +70,28 @@ struct nvme_passthru_cmd64 {
__u64 result;
};
+/* same as struct nvme_passthru_cmd64, minus the 8b result field */
+struct nvme_uring_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd1;
+ __u32 nsid;
+ __u32 cdw2;
+ __u32 cdw3;
+ __u64 metadata;
+ __u64 addr;
+ __u32 metadata_len;
+ __u32 data_len;
+ __u32 cdw10;
+ __u32 cdw11;
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 cdw15;
+ __u32 timeout_ms;
+ __u32 rsvd2;
+};
+
#define nvme_admin_cmd nvme_passthru_cmd
#define NVME_IOCTL_ID _IO('N', 0x40)
@@ -83,4 +105,7 @@ struct nvme_passthru_cmd64 {
#define NVME_IOCTL_IO64_CMD _IOWR('N', 0x48, struct nvme_passthru_cmd64)
#define NVME_IOCTL_IO64_CMD_VEC _IOWR('N', 0x49, struct nvme_passthru_cmd64)
+/* io_uring async commands: */
+#define NVME_URING_CMD_IO _IOWR('N', 0x80, struct nvme_uring_cmd)
+
#endif /* _UAPI_LINUX_NVME_IOCTL_H */
--
2.25.1
* [PATCH v5 5/6] nvme: add vectored-io support for uring-cmd
[not found] ` <CGME20220511055319epcas5p1d78cb03bd1b145a6d58c8e616795af14@epcas5p1.samsung.com>
@ 2022-05-11 5:47 ` Kanchan Joshi
0 siblings, 0 replies; 15+ messages in thread
From: Kanchan Joshi @ 2022-05-11 5:47 UTC (permalink / raw)
To: axboe, hch
Cc: io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
From: Anuj Gupta <[email protected]>
Wire up support for async passthru that takes an array of buffers (using
iovec). This is exposed via a new op, NVME_URING_CMD_IO_VEC. The same
'struct nvme_uring_cmd' is to be used, with:
1. cmd.addr as the base address of the user iovec array
2. cmd.data_len as the count of iovec array elements
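An illustrative userspace sketch (not part of the patch; buffer sizes, nsid
and opcode are arbitrary assumptions) of filling the command for the vectored
case:

#include <stdint.h>
#include <sys/uio.h>
#include <linux/nvme_ioctl.h>

static char buf0[4096], buf1[4096];
static struct iovec iov[2] = {
        { .iov_base = buf0, .iov_len = sizeof(buf0) },
        { .iov_base = buf1, .iov_len = sizeof(buf1) },
};

/* addr points at the iovec array; data_len is the iovec count, not bytes. */
static void fill_vectored_cmd(struct nvme_uring_cmd *ncmd)
{
        ncmd->opcode   = 0x02;                          /* nvme read, for example */
        ncmd->nsid     = 1;                             /* assumption: namespace 1 */
        ncmd->addr     = (__u64)(uintptr_t)iov;         /* base of the iovec array */
        ncmd->data_len = 2;                             /* number of iovec elements */
        /* cdw10..cdw12 describe the LBA range as usual; submit with
         * sqe->cmd_op = NVME_URING_CMD_IO_VEC, otherwise as in the
         * non-vectored case */
}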
Signed-off-by: Kanchan Joshi <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
drivers/nvme/host/ioctl.c | 9 ++++++---
include/uapi/linux/nvme_ioctl.h | 1 +
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 92d695262d8f..7b0e2c9cdcae 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -399,7 +399,7 @@ static void nvme_uring_cmd_end_io(struct request *req, blk_status_t err)
}
static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
- struct io_uring_cmd *ioucmd, unsigned int issue_flags)
+ struct io_uring_cmd *ioucmd, unsigned int issue_flags, bool vec)
{
struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
const struct nvme_uring_cmd *cmd = ioucmd->cmd;
@@ -449,7 +449,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
req = nvme_alloc_user_request(q, &c, nvme_to_user_ptr(d.addr),
d.data_len, nvme_to_user_ptr(d.metadata),
d.metadata_len, 0, &meta, d.timeout_ms ?
- msecs_to_jiffies(d.timeout_ms) : 0, 0, rq_flags,
+ msecs_to_jiffies(d.timeout_ms) : 0, vec, rq_flags,
blk_flags);
if (IS_ERR(req))
return PTR_ERR(req);
@@ -575,7 +575,10 @@ static int nvme_ns_uring_cmd(struct nvme_ns *ns, struct io_uring_cmd *ioucmd,
switch (ioucmd->cmd_op) {
case NVME_URING_CMD_IO:
- ret = nvme_uring_cmd_io(ctrl, ns, ioucmd, issue_flags);
+ ret = nvme_uring_cmd_io(ctrl, ns, ioucmd, issue_flags, false);
+ break;
+ case NVME_URING_CMD_IO_VEC:
+ ret = nvme_uring_cmd_io(ctrl, ns, ioucmd, issue_flags, true);
break;
default:
ret = -ENOTTY;
diff --git a/include/uapi/linux/nvme_ioctl.h b/include/uapi/linux/nvme_ioctl.h
index 04e458c649ab..0b1876aa5a59 100644
--- a/include/uapi/linux/nvme_ioctl.h
+++ b/include/uapi/linux/nvme_ioctl.h
@@ -107,5 +107,6 @@ struct nvme_uring_cmd {
/* io_uring async commands: */
#define NVME_URING_CMD_IO _IOWR('N', 0x80, struct nvme_uring_cmd)
+#define NVME_URING_CMD_IO_VEC _IOWR('N', 0x81, struct nvme_uring_cmd)
#endif /* _UAPI_LINUX_NVME_IOCTL_H */
--
2.25.1
* [PATCH v5 6/6] io_uring: finish IOPOLL/ioprio prep handler removal
[not found] ` <CGME20220511055320epcas5p2457aaf8e7405387282f60831f5a4eca9@epcas5p2.samsung.com>
@ 2022-05-11 5:47 ` Kanchan Joshi
2022-05-11 6:54 ` Christoph Hellwig
0 siblings, 1 reply; 15+ messages in thread
From: Kanchan Joshi @ 2022-05-11 5:47 UTC (permalink / raw)
To: axboe, hch
Cc: io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
From: Jens Axboe <[email protected]>
Commit 73911426aaaa unified the checking of this, but that was done before
we had a few new opcode handlers.
Finish the job and apply it to the xattr/socket/uring_cmd handlers as
well, so we check and use this fully consistently.
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 592be5b89add..44c57dca358d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4517,10 +4517,6 @@ static int __io_getxattr_prep(struct io_kiocb *req,
const char __user *name;
int ret;
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(sqe->ioprio))
- return -EINVAL;
if (unlikely(req->flags & REQ_F_FIXED_FILE))
return -EBADF;
@@ -4630,10 +4626,6 @@ static int __io_setxattr_prep(struct io_kiocb *req,
const char __user *name;
int ret;
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(sqe->ioprio))
- return -EINVAL;
if (unlikely(req->flags & REQ_F_FIXED_FILE))
return -EBADF;
@@ -4968,7 +4960,7 @@ static int io_uring_cmd_prep(struct io_kiocb *req,
{
struct io_uring_cmd *ioucmd = &req->uring_cmd;
- if (sqe->ioprio || sqe->rw_flags)
+ if (sqe->rw_flags)
return -EINVAL;
ioucmd->cmd = sqe->cmd;
ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
@@ -6405,9 +6397,7 @@ static int io_socket_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
struct io_socket *sock = &req->sock;
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->addr || sqe->rw_flags || sqe->buf_index)
+ if (sqe->addr || sqe->rw_flags || sqe->buf_index)
return -EINVAL;
sock->domain = READ_ONCE(sqe->fd);
--
2.25.1
* Re: [PATCH v5 6/6] io_uring: finish IOPOLL/ioprio prep handler removal
2022-05-11 5:47 ` [PATCH v5 6/6] io_uring: finish IOPOLL/ioprio prep handler removal Kanchan Joshi
@ 2022-05-11 6:54 ` Christoph Hellwig
2022-05-11 11:05 ` Kanchan Joshi
2022-05-11 12:38 ` Jens Axboe
0 siblings, 2 replies; 15+ messages in thread
From: Christoph Hellwig @ 2022-05-11 6:54 UTC (permalink / raw)
To: Kanchan Joshi
Cc: axboe, hch, io-uring, linux-nvme, asml.silence, ming.lei, mcgrof,
shr, joshiiitr, anuj20.g, gost.dev
I think this should go first, and the uring_cmd bits need to do the
right thing from the beginning.
On Wed, May 11, 2022 at 11:17:50AM +0530, Kanchan Joshi wrote:
> From: Jens Axboe <[email protected]>
>
> Commit 73911426aaaa unified the checking of this, but was done before we
> had a few new opcode handlers.
Also please quote the actual commit subject as well here per the usual
Fixes-like reference style.
* Re: [PATCH v5 6/6] io_uring: finish IOPOLL/ioprio prep handler removal
2022-05-11 6:54 ` Christoph Hellwig
@ 2022-05-11 11:05 ` Kanchan Joshi
2022-05-11 12:38 ` Jens Axboe
1 sibling, 0 replies; 15+ messages in thread
From: Kanchan Joshi @ 2022-05-11 11:05 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
On Wed, May 11, 2022 at 08:54:23AM +0200, Christoph Hellwig wrote:
>I think this should go first, and the uring_cmd bits need to do the
>right thing from the beginning.
So this patch should do what it does, but only for xattr/socket.
And then it should move down to the point before uring-cmd.
Jens - let me know if you want me to iterate with that.
* Re: [PATCH v5 6/6] io_uring: finish IOPOLL/ioprio prep handler removal
2022-05-11 6:54 ` Christoph Hellwig
2022-05-11 11:05 ` Kanchan Joshi
@ 2022-05-11 12:38 ` Jens Axboe
1 sibling, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2022-05-11 12:38 UTC (permalink / raw)
To: Christoph Hellwig, Kanchan Joshi
Cc: io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
On 5/11/22 12:54 AM, Christoph Hellwig wrote:
> I think this should go first, and the uring_cmd bits need to do the
> right thing from the beginning.
It just needs folding in, I'll do that.
--
Jens Axboe
* Re: [PATCH v5 0/6] io_uring passthrough for nvme
2022-05-11 5:47 ` [PATCH v5 0/6] io_uring passthrough for nvme Kanchan Joshi
` (5 preceding siblings ...)
[not found] ` <CGME20220511055320epcas5p2457aaf8e7405387282f60831f5a4eca9@epcas5p2.samsung.com>
@ 2022-05-11 12:39 ` Jens Axboe
2022-05-11 13:14 ` Kanchan Joshi
2022-05-11 13:48 ` Jens Axboe
7 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2022-05-11 12:39 UTC (permalink / raw)
To: Kanchan Joshi, hch
Cc: io-uring, linux-nvme, asml.silence, ming.lei, mcgrof, shr,
joshiiitr, anuj20.g, gost.dev
Which patches had changes in this series? I'm assuming it's just patch
1, but the changelog doesn't actually say. Would save me from comparing
to what's in-tree already.
--
Jens Axboe
* Re: [PATCH v5 1/6] fs,io_uring: add infrastructure for uring-cmd
2022-05-11 5:47 ` [PATCH v5 1/6] fs,io_uring: add infrastructure for uring-cmd Kanchan Joshi
@ 2022-05-11 12:42 ` Christoph Hellwig
0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2022-05-11 12:42 UTC (permalink / raw)
To: Kanchan Joshi
Cc: axboe, hch, io-uring, linux-nvme, asml.silence, ming.lei, mcgrof,
shr, joshiiitr, anuj20.g, gost.dev
Looks good (modulo the bits from patch 6 that should be folded in):
Reviewed-by: Christoph Hellwig <[email protected]>
* Re: [PATCH v5 0/6] io_uring passthrough for nvme
2022-05-11 12:39 ` [PATCH v5 0/6] io_uring passthrough for nvme Jens Axboe
@ 2022-05-11 13:14 ` Kanchan Joshi
2022-05-11 13:39 ` Jens Axboe
0 siblings, 1 reply; 15+ messages in thread
From: Kanchan Joshi @ 2022-05-11 13:14 UTC (permalink / raw)
To: Jens Axboe
Cc: Kanchan Joshi, Christoph Hellwig, io-uring, linux-nvme,
Pavel Begunkov, Ming Lei, Luis Chamberlain, Stefan Roesch,
Anuj Gupta, gost.dev
On Wed, May 11, 2022 at 6:09 PM Jens Axboe <[email protected]> wrote:
>
> Which patches had changes in this series? I'm assuming it's just patch
> 1, but the changelog doesn't actually say. Would save me from comparing
> to what's in-tree already.
Compared to in-tree, it is Patch 1 and Patch 4.
This part in patch 4:
+int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd,
+ unsigned int issue_flags)
+{
+ struct cdev *cdev = file_inode(ioucmd->file)->i_cdev;
+ struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev);
+ int srcu_idx = srcu_read_lock(&head->srcu);
+ struct nvme_ns *ns = nvme_find_path(head);
+ int ret = -EINVAL;
+
+ if (ns)
+ ret = nvme_ns_uring_cmd(ns, ioucmd, issue_flags);
+ srcu_read_unlock(&head->srcu, srcu_idx);
+ return ret;
+}
Initializing ret to -EINVAL rather than 0.
We do not support admin commands yet, so ns can be null only if
something goes wrong with multipath.
So if anything does go wrong and ns is null, it is better to return
failure than success.
And I removed the lore links from commit-messages, thinking those will
be refreshed too.
* Re: [PATCH v5 0/6] io_uring passthrough for nvme
2022-05-11 13:14 ` Kanchan Joshi
@ 2022-05-11 13:39 ` Jens Axboe
0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2022-05-11 13:39 UTC (permalink / raw)
To: Kanchan Joshi
Cc: Kanchan Joshi, Christoph Hellwig, io-uring, linux-nvme,
Pavel Begunkov, Ming Lei, Luis Chamberlain, Stefan Roesch,
Anuj Gupta, gost.dev
On 5/11/22 7:14 AM, Kanchan Joshi wrote:
> On Wed, May 11, 2022 at 6:09 PM Jens Axboe <[email protected]> wrote:
>>
>> Which patches had changes in this series? I'm assuming it's just patch
>> 1, but the changelog doesn't actually say. Would save me from comparing
>> to what's in-tree already.
>
> Compared to in-tree, it is Patch 1 and Patch 4.
> This part in patch 4:
> +int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd,
> + unsigned int issue_flags)
> +{
> + struct cdev *cdev = file_inode(ioucmd->file)->i_cdev;
> + struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev);
> + int srcu_idx = srcu_read_lock(&head->srcu);
> + struct nvme_ns *ns = nvme_find_path(head);
> + int ret = -EINVAL;
> +
> + if (ns)
> + ret = nvme_ns_uring_cmd(ns, ioucmd, issue_flags);
> + srcu_read_unlock(&head->srcu, srcu_idx);
> + return ret;
> +}
> Initializing ret to -EINVAL rather than 0.
> We do not support admin commands yet, so ns can be null only if
> something goes wrong with multipath.
> So if at all anything goes wrong and ns is null, it is better to
> return failure than success.
>
> And I removed the lore links from commit-messages, thinking those will
> be refreshed too.
OK all good, that's what I expected. I'll update the series.
--
Jens Axboe
* Re: [PATCH v5 0/6] io_uring passthrough for nvme
2022-05-11 5:47 ` [PATCH v5 0/6] io_uring passthrough for nvme Kanchan Joshi
` (6 preceding siblings ...)
2022-05-11 12:39 ` [PATCH v5 0/6] io_uring passthrough for nvme Jens Axboe
@ 2022-05-11 13:48 ` Jens Axboe
7 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2022-05-11 13:48 UTC (permalink / raw)
To: Christoph Hellwig, joshi.k
Cc: joshiiitr, anuj20.g, shr, ming.lei, linux-nvme, io-uring, mcgrof,
gost.dev, asml.silence
On Wed, 11 May 2022 11:17:44 +0530, Kanchan Joshi wrote:
> This series is against "for-5.19/io_uring-passthrough" branch (linux-block).
> Patches to be refreshed on top of 2bb04df7c ("io_uring: support CQE32").
>
> uring-cmd is the facility to enable io_uring capabilities (async is one
> of those) for any arbitrary command (ioctl, fsctl or whatever else)
> exposed by the command-providers (driver, fs etc.). The series
> introduces uring-cmd, and connects nvme passthrough (over generic device
> /dev/ngXnY) to it.
>
> [...]
Applied, thanks!
[1/6] fs,io_uring: add infrastructure for uring-cmd
commit: ee692a21e9bf8354bd3ec816f1cf4bff8619ed77
[2/6] block: wire-up support for passthrough plugging
commit: 1c2d2fff6dc04662dc8e86b537989643e1abeed9
[3/6] nvme: refactor nvme_submit_user_cmd()
commit: bcad2565b5d64700cf68cc9d48618ab817ff5bc4
[4/6] nvme: wire-up uring-cmd support for io-passthru on char-device.
commit: 456cba386e94f22fa1b1426303fdcac9e66b1417
[5/6] nvme: add vectored-io support for uring-cmd
commit: f569add47119fa910ed7711b26b8d38e21f7ea77
[6/6] io_uring: finish IOPOLL/ioprio prep handler removal
(no commit info)
Best regards,
--
Jens Axboe