* [PATCH v2 01/12] io_uring: support CQE32 in io_uring_cqe
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 02/12] io_uring: wire up inline completion path for CQE32 Stefan Roesch
` (11 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr, Jens Axboe
This adds struct io_uring_cqe_extra to struct io_uring_cqe to
support large CQEs.
Co-developed-by: Jens Axboe <[email protected]>
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
include/uapi/linux/io_uring.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index ee677dbd6a6d..c0e9b5e8d20c 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -111,6 +111,7 @@ enum {
#define IORING_SETUP_R_DISABLED (1U << 6) /* start with ring disabled */
#define IORING_SETUP_SUBMIT_ALL (1U << 7) /* continue submit on error */
#define IORING_SETUP_SQE128 (1U << 8) /* SQEs are 128b */
+#define IORING_SETUP_CQE32 (1U << 9) /* CQEs are 32b */
enum {
IORING_OP_NOP,
@@ -201,6 +202,11 @@ enum {
#define IORING_POLL_UPDATE_EVENTS (1U << 1)
#define IORING_POLL_UPDATE_USER_DATA (1U << 2)
+struct io_uring_cqe_extra {
+ __u64 extra1;
+ __u64 extra2;
+};
+
/*
* IO completion data structure (Completion Queue Entry)
*/
@@ -208,6 +214,12 @@ struct io_uring_cqe {
__u64 user_data; /* sqe->data submission passed back */
__s32 res; /* result code for this event */
__u32 flags;
+
+ /*
+ * If the ring is initialized with IORING_SETUP_CQE32, then this field
+ * contains the two extra 64-bit completion values, doubling the size of the CQE.
+ */
+ struct io_uring_cqe_extra b[];
};
/*
--
2.30.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v2 02/12] io_uring: wire up inline completion path for CQE32
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 01/12] io_uring: support CQE32 in io_uring_cqe Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 03/12] io_uring: change ring size calculation " Stefan Roesch
` (10 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr, Jens Axboe
Rather than always use the slower locked path, wire up use of the
deferred completion path that normal CQEs can take. This reuses the
hash list node for the storage we need to hold the two 64-bit values
that must be passed back.
Co-developed-by: Jens Axboe <[email protected]>
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4c32cf987ef3..bf2b02518332 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -964,7 +964,13 @@ struct io_kiocb {
atomic_t poll_refs;
struct io_task_work io_task_work;
/* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
- struct hlist_node hash_node;
+ union {
+ struct hlist_node hash_node;
+ struct {
+ u64 extra1;
+ u64 extra2;
+ };
+ };
/* internal polling, see IORING_FEAT_FAST_POLL */
struct async_poll *apoll;
/* opcode allocated if it needs to store data for async defer */
--
2.30.2
* [PATCH v2 03/12] io_uring: change ring size calculation for CQE32
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 01/12] io_uring: support CQE32 in io_uring_cqe Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 02/12] io_uring: wire up inline completion path for CQE32 Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 04/12] io_uring: add CQE32 setup processing Stefan Roesch
` (9 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr, Jens Axboe
This changes rings_size() to take large CQEs into account.
Co-developed-by: Jens Axboe <[email protected]>
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index bf2b02518332..9712483d3a17 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9693,8 +9693,8 @@ static void *io_mem_alloc(size_t size)
return (void *) __get_free_pages(gfp, get_order(size));
}
-static unsigned long rings_size(unsigned sq_entries, unsigned cq_entries,
- size_t *sq_offset)
+static unsigned long rings_size(struct io_ring_ctx *ctx, unsigned int sq_entries,
+ unsigned int cq_entries, size_t *sq_offset)
{
struct io_rings *rings;
size_t off, sq_array_size;
@@ -9702,6 +9702,10 @@ static unsigned long rings_size(unsigned sq_entries, unsigned cq_entries,
off = struct_size(rings, cqes, cq_entries);
if (off == SIZE_MAX)
return SIZE_MAX;
+ if (ctx->flags & IORING_SETUP_CQE32) {
+ if (check_shl_overflow(off, 1, &off))
+ return SIZE_MAX;
+ }
#ifdef CONFIG_SMP
off = ALIGN(off, SMP_CACHE_BYTES);
@@ -11365,7 +11369,7 @@ static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
ctx->sq_entries = p->sq_entries;
ctx->cq_entries = p->cq_entries;
- size = rings_size(p->sq_entries, p->cq_entries, &sq_array_offset);
+ size = rings_size(ctx, p->sq_entries, p->cq_entries, &sq_array_offset);
if (size == SIZE_MAX)
return -EOVERFLOW;
--
2.30.2
* [PATCH v2 04/12] io_uring: add CQE32 setup processing
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (2 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 03/12] io_uring: change ring size calculation " Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 05/12] io_uring: add CQE32 completion processing Stefan Roesch
` (8 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr
This adds two new functions to set up and fill the CQE32 result structure.
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 9712483d3a17..abbd2efbe255 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2175,12 +2175,70 @@ static inline bool __io_fill_cqe_req_filled(struct io_ring_ctx *ctx,
req->cqe.res, req->cqe.flags);
}
+static inline bool __io_fill_cqe32_req_filled(struct io_ring_ctx *ctx,
+ struct io_kiocb *req)
+{
+ struct io_uring_cqe *cqe;
+ u64 extra1 = req->extra1;
+ u64 extra2 = req->extra2;
+
+ trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
+ req->cqe.res, req->cqe.flags);
+
+ /*
+ * If we can't get a cq entry, userspace overflowed the
+ * submission (by quite a lot). Increment the overflow count in
+ * the ring.
+ */
+ cqe = io_get_cqe(ctx);
+ if (likely(cqe)) {
+ memcpy(cqe, &req->cqe, sizeof(struct io_uring_cqe));
+ cqe->b[0].extra1 = extra1;
+ cqe->b[0].extra2 = extra2;
+ return true;
+ }
+
+ return io_cqring_event_overflow(ctx, req->cqe.user_data,
+ req->cqe.res, req->cqe.flags, extra1, extra2);
+}
+
static inline bool __io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
{
trace_io_uring_complete(req->ctx, req, req->cqe.user_data, res, cflags);
return __io_fill_cqe(req->ctx, req->cqe.user_data, res, cflags);
}
+static void __io_fill_cqe32_req(struct io_kiocb *req, s32 res, u32 cflags,
+ u64 extra1, u64 extra2)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_uring_cqe *cqe;
+
+ if (WARN_ON_ONCE(!(ctx->flags & IORING_SETUP_CQE32)))
+ return;
+ if (req->flags & REQ_F_CQE_SKIP)
+ return;
+
+ trace_io_uring_complete(ctx, req, req->user_data, res, cflags);
+
+ /*
+ * If we can't get a cq entry, userspace overflowed the
+ * submission (by quite a lot). Increment the overflow count in
+ * the ring.
+ */
+ cqe = io_get_cqe(ctx);
+ if (likely(cqe)) {
+ WRITE_ONCE(cqe->user_data, req->cqe.user_data);
+ WRITE_ONCE(cqe->res, res);
+ WRITE_ONCE(cqe->flags, cflags);
+ WRITE_ONCE(cqe->b[0].extra1, extra1);
+ WRITE_ONCE(cqe->b[0].extra2, extra2);
+ return;
+ }
+
+ io_cqring_event_overflow(ctx, req->cqe.user_data, res, cflags);
+}
+
static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
s32 res, u32 cflags)
{
--
2.30.2
* [PATCH v2 05/12] io_uring: add CQE32 completion processing
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (3 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 04/12] io_uring: add CQE32 setup processing Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-22 1:34 ` Kanchan Joshi
2022-04-20 19:14 ` [PATCH v2 06/12] io_uring: modify io_get_cqe for CQE32 Stefan Roesch
` (7 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr, Jens Axboe
This adds completion processing for large CQEs and makes sure
that the extra1 and extra2 fields are passed through.
Co-developed-by: Jens Axboe <[email protected]>
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 55 +++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 47 insertions(+), 8 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index abbd2efbe255..c93a9353c88d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2247,18 +2247,15 @@ static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
return __io_fill_cqe(ctx, user_data, res, cflags);
}
-static void __io_req_complete_post(struct io_kiocb *req, s32 res,
- u32 cflags)
+static void __io_req_complete_put(struct io_kiocb *req)
{
- struct io_ring_ctx *ctx = req->ctx;
-
- if (!(req->flags & REQ_F_CQE_SKIP))
- __io_fill_cqe_req(req, res, cflags);
/*
* If we're the last reference to this request, add to our locked
* free_list cache.
*/
if (req_ref_put_and_test(req)) {
+ struct io_ring_ctx *ctx = req->ctx;
+
if (req->flags & IO_REQ_LINK_FLAGS) {
if (req->flags & IO_DISARM_MASK)
io_disarm_next(req);
@@ -2281,8 +2278,23 @@ static void __io_req_complete_post(struct io_kiocb *req, s32 res,
}
}
-static void io_req_complete_post(struct io_kiocb *req, s32 res,
- u32 cflags)
+static void __io_req_complete_post(struct io_kiocb *req, s32 res,
+ u32 cflags)
+{
+ if (!(req->flags & REQ_F_CQE_SKIP))
+ __io_fill_cqe_req(req, res, cflags);
+ __io_req_complete_put(req);
+}
+
+static void __io_req_complete_post32(struct io_kiocb *req, s32 res,
+ u32 cflags, u64 extra1, u64 extra2)
+{
+ if (!(req->flags & REQ_F_CQE_SKIP))
+ __io_fill_cqe32_req(req, res, cflags, extra1, extra2);
+ __io_req_complete_put(req);
+}
+
+static void io_req_complete_post(struct io_kiocb *req, s32 res, u32 cflags)
{
struct io_ring_ctx *ctx = req->ctx;
@@ -2293,6 +2305,18 @@ static void io_req_complete_post(struct io_kiocb *req, s32 res,
io_cqring_ev_posted(ctx);
}
+static void io_req_complete_post32(struct io_kiocb *req, s32 res,
+ u32 cflags, u64 extra1, u64 extra2)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ spin_lock(&ctx->completion_lock);
+ __io_req_complete_post32(req, res, cflags, extra1, extra2);
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ io_cqring_ev_posted(ctx);
+}
+
static inline void io_req_complete_state(struct io_kiocb *req, s32 res,
u32 cflags)
{
@@ -2310,6 +2334,21 @@ static inline void __io_req_complete(struct io_kiocb *req, unsigned issue_flags,
io_req_complete_post(req, res, cflags);
}
+static inline void __io_req_complete32(struct io_kiocb *req,
+ unsigned int issue_flags, s32 res,
+ u32 cflags, u64 extra1, u64 extra2)
+{
+ if (issue_flags & IO_URING_F_COMPLETE_DEFER) {
+ req->cqe.res = res;
+ req->cqe.flags = cflags;
+ req->extra1 = extra1;
+ req->extra2 = extra2;
+ req->flags |= REQ_F_COMPLETE_INLINE;
+ } else {
+ io_req_complete_post32(req, res, cflags, extra1, extra2);
+ }
+}
+
static inline void io_req_complete(struct io_kiocb *req, s32 res)
{
__io_req_complete(req, 0, res, 0);
--
2.30.2
* Re: [PATCH v2 05/12] io_uring: add CQE32 completion processing
2022-04-20 19:14 ` [PATCH v2 05/12] io_uring: add CQE32 completion processing Stefan Roesch
@ 2022-04-22 1:34 ` Kanchan Joshi
2022-04-22 21:39 ` Stefan Roesch
0 siblings, 1 reply; 29+ messages in thread
From: Kanchan Joshi @ 2022-04-22 1:34 UTC (permalink / raw)
To: Stefan Roesch; +Cc: io-uring, kernel-team, Jens Axboe
On Thu, Apr 21, 2022 at 10:44 AM Stefan Roesch <[email protected]> wrote:
>
> This adds the completion processing for the large CQE's and makes sure
> that the extra1 and extra2 fields are passed through.
>
> Co-developed-by: Jens Axboe <[email protected]>
> Signed-off-by: Stefan Roesch <[email protected]>
> Signed-off-by: Jens Axboe <[email protected]>
> ---
> fs/io_uring.c | 55 +++++++++++++++++++++++++++++++++++++++++++--------
> 1 file changed, 47 insertions(+), 8 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index abbd2efbe255..c93a9353c88d 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2247,18 +2247,15 @@ static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
> return __io_fill_cqe(ctx, user_data, res, cflags);
> }
>
> -static void __io_req_complete_post(struct io_kiocb *req, s32 res,
> - u32 cflags)
> +static void __io_req_complete_put(struct io_kiocb *req)
> {
> - struct io_ring_ctx *ctx = req->ctx;
> -
> - if (!(req->flags & REQ_F_CQE_SKIP))
> - __io_fill_cqe_req(req, res, cflags);
> /*
> * If we're the last reference to this request, add to our locked
> * free_list cache.
> */
> if (req_ref_put_and_test(req)) {
> + struct io_ring_ctx *ctx = req->ctx;
> +
> if (req->flags & IO_REQ_LINK_FLAGS) {
> if (req->flags & IO_DISARM_MASK)
> io_disarm_next(req);
> @@ -2281,8 +2278,23 @@ static void __io_req_complete_post(struct io_kiocb *req, s32 res,
> }
> }
>
> -static void io_req_complete_post(struct io_kiocb *req, s32 res,
> - u32 cflags)
> +static void __io_req_complete_post(struct io_kiocb *req, s32 res,
> + u32 cflags)
> +{
> + if (!(req->flags & REQ_F_CQE_SKIP))
> + __io_fill_cqe_req(req, res, cflags);
> + __io_req_complete_put(req);
> +}
> +
> +static void __io_req_complete_post32(struct io_kiocb *req, s32 res,
> + u32 cflags, u64 extra1, u64 extra2)
> +{
> + if (!(req->flags & REQ_F_CQE_SKIP))
> + __io_fill_cqe32_req(req, res, cflags, extra1, extra2);
> + __io_req_complete_put(req);
> +}
> +
> +static void io_req_complete_post(struct io_kiocb *req, s32 res, u32 cflags)
> {
> struct io_ring_ctx *ctx = req->ctx;
>
> @@ -2293,6 +2305,18 @@ static void io_req_complete_post(struct io_kiocb *req, s32 res,
> io_cqring_ev_posted(ctx);
> }
>
> +static void io_req_complete_post32(struct io_kiocb *req, s32 res,
> + u32 cflags, u64 extra1, u64 extra2)
> +{
> + struct io_ring_ctx *ctx = req->ctx;
> +
> + spin_lock(&ctx->completion_lock);
> + __io_req_complete_post32(req, res, cflags, extra1, extra2);
> + io_commit_cqring(ctx);
> + spin_unlock(&ctx->completion_lock);
> + io_cqring_ev_posted(ctx);
> +}
> +
> static inline void io_req_complete_state(struct io_kiocb *req, s32 res,
> u32 cflags)
> {
> @@ -2310,6 +2334,21 @@ static inline void __io_req_complete(struct io_kiocb *req, unsigned issue_flags,
> io_req_complete_post(req, res, cflags);
> }
>
> +static inline void __io_req_complete32(struct io_kiocb *req,
> + unsigned int issue_flags, s32 res,
> + u32 cflags, u64 extra1, u64 extra2)
> +{
> + if (issue_flags & IO_URING_F_COMPLETE_DEFER) {
> + req->cqe.res = res;
> + req->cqe.flags = cflags;
> + req->extra1 = extra1;
> + req->extra2 = extra2;
> + req->flags |= REQ_F_COMPLETE_INLINE;
nit: we can use the existing helper (io_req_complete_state) to
populate these fields rather than open-coding.
* Re: [PATCH v2 05/12] io_uring: add CQE32 completion processing
2022-04-22 1:34 ` Kanchan Joshi
@ 2022-04-22 21:39 ` Stefan Roesch
0 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-22 21:39 UTC (permalink / raw)
To: Kanchan Joshi; +Cc: io-uring, kernel-team, Jens Axboe
On 4/21/22 6:34 PM, Kanchan Joshi wrote:
> On Thu, Apr 21, 2022 at 10:44 AM Stefan Roesch <[email protected]> wrote:
>>
>> This adds the completion processing for the large CQE's and makes sure
>> that the extra1 and extra2 fields are passed through.
>>
>> Co-developed-by: Jens Axboe <[email protected]>
>> Signed-off-by: Stefan Roesch <[email protected]>
>> Signed-off-by: Jens Axboe <[email protected]>
>> ---
>> fs/io_uring.c | 55 +++++++++++++++++++++++++++++++++++++++++++--------
>> 1 file changed, 47 insertions(+), 8 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index abbd2efbe255..c93a9353c88d 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -2247,18 +2247,15 @@ static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
>> return __io_fill_cqe(ctx, user_data, res, cflags);
>> }
>>
>> -static void __io_req_complete_post(struct io_kiocb *req, s32 res,
>> - u32 cflags)
>> +static void __io_req_complete_put(struct io_kiocb *req)
>> {
>> - struct io_ring_ctx *ctx = req->ctx;
>> -
>> - if (!(req->flags & REQ_F_CQE_SKIP))
>> - __io_fill_cqe_req(req, res, cflags);
>> /*
>> * If we're the last reference to this request, add to our locked
>> * free_list cache.
>> */
>> if (req_ref_put_and_test(req)) {
>> + struct io_ring_ctx *ctx = req->ctx;
>> +
>> if (req->flags & IO_REQ_LINK_FLAGS) {
>> if (req->flags & IO_DISARM_MASK)
>> io_disarm_next(req);
>> @@ -2281,8 +2278,23 @@ static void __io_req_complete_post(struct io_kiocb *req, s32 res,
>> }
>> }
>>
>> -static void io_req_complete_post(struct io_kiocb *req, s32 res,
>> - u32 cflags)
>> +static void __io_req_complete_post(struct io_kiocb *req, s32 res,
>> + u32 cflags)
>> +{
>> + if (!(req->flags & REQ_F_CQE_SKIP))
>> + __io_fill_cqe_req(req, res, cflags);
>> + __io_req_complete_put(req);
>> +}
>> +
>> +static void __io_req_complete_post32(struct io_kiocb *req, s32 res,
>> + u32 cflags, u64 extra1, u64 extra2)
>> +{
>> + if (!(req->flags & REQ_F_CQE_SKIP))
>> + __io_fill_cqe32_req(req, res, cflags, extra1, extra2);
>> + __io_req_complete_put(req);
>> +}
>> +
>> +static void io_req_complete_post(struct io_kiocb *req, s32 res, u32 cflags)
>> {
>> struct io_ring_ctx *ctx = req->ctx;
>>
>> @@ -2293,6 +2305,18 @@ static void io_req_complete_post(struct io_kiocb *req, s32 res,
>> io_cqring_ev_posted(ctx);
>> }
>>
>> +static void io_req_complete_post32(struct io_kiocb *req, s32 res,
>> + u32 cflags, u64 extra1, u64 extra2)
>> +{
>> + struct io_ring_ctx *ctx = req->ctx;
>> +
>> + spin_lock(&ctx->completion_lock);
>> + __io_req_complete_post32(req, res, cflags, extra1, extra2);
>> + io_commit_cqring(ctx);
>> + spin_unlock(&ctx->completion_lock);
>> + io_cqring_ev_posted(ctx);
>> +}
>> +
>> static inline void io_req_complete_state(struct io_kiocb *req, s32 res,
>> u32 cflags)
>> {
>> @@ -2310,6 +2334,21 @@ static inline void __io_req_complete(struct io_kiocb *req, unsigned issue_flags,
>> io_req_complete_post(req, res, cflags);
>> }
>>
>> +static inline void __io_req_complete32(struct io_kiocb *req,
>> + unsigned int issue_flags, s32 res,
>> + u32 cflags, u64 extra1, u64 extra2)
>> +{
>> + if (issue_flags & IO_URING_F_COMPLETE_DEFER) {
>> + req->cqe.res = res;
>> + req->cqe.flags = cflags;
>> + req->extra1 = extra1;
>> + req->extra2 = extra2;
>> + req->flags |= REQ_F_COMPLETE_INLINE;
>
> nit: we can use the existing helper (io_req_complete_state) to
> populate these fields rather than open-coding.
V3 will have that change.
* [PATCH v2 06/12] io_uring: modify io_get_cqe for CQE32
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (4 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 05/12] io_uring: add CQE32 completion processing Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-22 1:25 ` Kanchan Joshi
2022-04-20 19:14 ` [PATCH v2 07/12] io_uring: flush completions " Stefan Roesch
` (6 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr
Modify accesses to the CQE array to take large CQEs into account. The
index needs to be shifted by one for large CQEs.
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index c93a9353c88d..bd352815b9e7 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1909,8 +1909,12 @@ static noinline struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx)
{
struct io_rings *rings = ctx->rings;
unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1);
+ unsigned int shift = 0;
unsigned int free, queued, len;
+ if (ctx->flags & IORING_SETUP_CQE32)
+ shift = 1;
+
/* userspace may cheat modifying the tail, be safe and do min */
queued = min(__io_cqring_events(ctx), ctx->cq_entries);
free = ctx->cq_entries - queued;
@@ -1922,12 +1926,13 @@ static noinline struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx)
ctx->cached_cq_tail++;
ctx->cqe_cached = &rings->cqes[off];
ctx->cqe_sentinel = ctx->cqe_cached + len;
- return ctx->cqe_cached++;
+ ctx->cqe_cached++;
+ return &rings->cqes[off << shift];
}
static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
{
- if (likely(ctx->cqe_cached < ctx->cqe_sentinel)) {
+ if (likely(ctx->cqe_cached < ctx->cqe_sentinel && !(ctx->flags & IORING_SETUP_CQE32))) {
ctx->cached_cq_tail++;
return ctx->cqe_cached++;
}
--
2.30.2
* Re: [PATCH v2 06/12] io_uring: modify io_get_cqe for CQE32
2022-04-20 19:14 ` [PATCH v2 06/12] io_uring: modify io_get_cqe for CQE32 Stefan Roesch
@ 2022-04-22 1:25 ` Kanchan Joshi
2022-04-22 23:59 ` Stefan Roesch
0 siblings, 1 reply; 29+ messages in thread
From: Kanchan Joshi @ 2022-04-22 1:25 UTC (permalink / raw)
To: Stefan Roesch; +Cc: io-uring, kernel-team
On Thu, Apr 21, 2022 at 3:54 PM Stefan Roesch <[email protected]> wrote:
>
> Modify accesses to the CQE array to take large CQE's into account. The
> index needs to be shifted by one for large CQE's.
>
> Signed-off-by: Stefan Roesch <[email protected]>
> Signed-off-by: Jens Axboe <[email protected]>
> ---
> fs/io_uring.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index c93a9353c88d..bd352815b9e7 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1909,8 +1909,12 @@ static noinline struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx)
> {
> struct io_rings *rings = ctx->rings;
> unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1);
> + unsigned int shift = 0;
> unsigned int free, queued, len;
>
> + if (ctx->flags & IORING_SETUP_CQE32)
> + shift = 1;
> +
> /* userspace may cheat modifying the tail, be safe and do min */
> queued = min(__io_cqring_events(ctx), ctx->cq_entries);
> free = ctx->cq_entries - queued;
> @@ -1922,12 +1926,13 @@ static noinline struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx)
> ctx->cached_cq_tail++;
> ctx->cqe_cached = &rings->cqes[off];
> ctx->cqe_sentinel = ctx->cqe_cached + len;
> - return ctx->cqe_cached++;
> + ctx->cqe_cached++;
> + return &rings->cqes[off << shift];
> }
>
> static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
> {
> - if (likely(ctx->cqe_cached < ctx->cqe_sentinel)) {
> + if (likely(ctx->cqe_cached < ctx->cqe_sentinel && !(ctx->flags & IORING_SETUP_CQE32))) {
> ctx->cached_cq_tail++;
> return ctx->cqe_cached++;
> }
This excludes CQE-caching for 32b CQEs.
How about something like below to have that enabled (adding
io_get_cqe32 for the new ring) -
+static noinline struct io_uring_cqe *__io_get_cqe32(struct io_ring_ctx *ctx)
+{
+ struct io_rings *rings = ctx->rings;
+ unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1);
+ unsigned int free, queued, len;
+
+ /* userspace may cheat modifying the tail, be safe and do min */
+ queued = min(__io_cqring_events(ctx), ctx->cq_entries);
+ free = ctx->cq_entries - queued;
> +        /* we need a contiguous range, limit based on the current array offset */
+ len = min(free, ctx->cq_entries - off);
+ if (!len)
+ return NULL;
+
+ ctx->cached_cq_tail++;
+ /* double increment for 32 CQEs */
+ ctx->cqe_cached = &rings->cqes[off << 1];
+ ctx->cqe_sentinel = ctx->cqe_cached + (len << 1);
+ return ctx->cqe_cached;
+}
+
+static inline struct io_uring_cqe *io_get_cqe32(struct io_ring_ctx *ctx)
+{
+ struct io_uring_cqe *cqe32;
+ if (likely(ctx->cqe_cached < ctx->cqe_sentinel)) {
+ ctx->cached_cq_tail++;
+ cqe32 = ctx->cqe_cached;
+ } else
+ cqe32 = __io_get_cqe32(ctx);
+ /* double increment for 32b CQE*/
+ ctx->cqe_cached += 2;
+ return cqe32;
+}
* Re: [PATCH v2 06/12] io_uring: modify io_get_cqe for CQE32
2022-04-22 1:25 ` Kanchan Joshi
@ 2022-04-22 23:59 ` Stefan Roesch
0 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-22 23:59 UTC (permalink / raw)
To: Kanchan Joshi; +Cc: io-uring, kernel-team
On 4/21/22 6:25 PM, Kanchan Joshi wrote:
> On Thu, Apr 21, 2022 at 3:54 PM Stefan Roesch <[email protected]> wrote:
>>
>> Modify accesses to the CQE array to take large CQE's into account. The
>> index needs to be shifted by one for large CQE's.
>>
>> Signed-off-by: Stefan Roesch <[email protected]>
>> Signed-off-by: Jens Axboe <[email protected]>
>> ---
>> fs/io_uring.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index c93a9353c88d..bd352815b9e7 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1909,8 +1909,12 @@ static noinline struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx)
>> {
>> struct io_rings *rings = ctx->rings;
>> unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1);
>> + unsigned int shift = 0;
>> unsigned int free, queued, len;
>>
>> + if (ctx->flags & IORING_SETUP_CQE32)
>> + shift = 1;
>> +
>> /* userspace may cheat modifying the tail, be safe and do min */
>> queued = min(__io_cqring_events(ctx), ctx->cq_entries);
>> free = ctx->cq_entries - queued;
>> @@ -1922,12 +1926,13 @@ static noinline struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx)
>> ctx->cached_cq_tail++;
>> ctx->cqe_cached = &rings->cqes[off];
>> ctx->cqe_sentinel = ctx->cqe_cached + len;
>> - return ctx->cqe_cached++;
>> + ctx->cqe_cached++;
>> + return &rings->cqes[off << shift];
>> }
>>
>> static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
>> {
>> - if (likely(ctx->cqe_cached < ctx->cqe_sentinel)) {
>> + if (likely(ctx->cqe_cached < ctx->cqe_sentinel && !(ctx->flags & IORING_SETUP_CQE32))) {
>> ctx->cached_cq_tail++;
>> return ctx->cqe_cached++;
>> }
>
> This excludes CQE-caching for 32b CQEs.
> How about something like below to have that enabled (adding
> io_get_cqe32 for the new ring) -
>
I tried to avoid what you describe below: instead I keep the current indexes and
pointers as they are, and only calculate the correct offset into the cqe array
when an element is accessed.
I'll add caching support for V3 in a slightly different way.
> +static noinline struct io_uring_cqe *__io_get_cqe32(struct io_ring_ctx *ctx)
> +{
> + struct io_rings *rings = ctx->rings;
> + unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1);
> + unsigned int free, queued, len;
> +
> + /* userspace may cheat modifying the tail, be safe and do min */
> + queued = min(__io_cqring_events(ctx), ctx->cq_entries);
> + free = ctx->cq_entries - queued;
> + /* we need a contiguous range, limit based on the current
> array offset */
> + len = min(free, ctx->cq_entries - off);
> + if (!len)
> + return NULL;
> +
> + ctx->cached_cq_tail++;
> + /* double increment for 32 CQEs */
> + ctx->cqe_cached = &rings->cqes[off << 1];
> + ctx->cqe_sentinel = ctx->cqe_cached + (len << 1);
> + return ctx->cqe_cached;
> +}
> +
> +static inline struct io_uring_cqe *io_get_cqe32(struct io_ring_ctx *ctx)
> +{
> + struct io_uring_cqe *cqe32;
> + if (likely(ctx->cqe_cached < ctx->cqe_sentinel)) {
> + ctx->cached_cq_tail++;
> + cqe32 = ctx->cqe_cached;
> + } else
> + cqe32 = __io_get_cqe32(ctx);
> + /* double increment for 32b CQE*/
> + ctx->cqe_cached += 2;
> + return cqe32;
> +}
* [PATCH v2 07/12] io_uring: flush completions for CQE32
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (5 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 06/12] io_uring: modify io_get_cqe for CQE32 Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 08/12] io_uring: overflow processing " Stefan Roesch
` (5 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr
This flushes the completions according to their CQE type: the same
processing is done for the default CQE size, but for large CQEs the
extra1 and extra2 fields are filled in.
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index bd352815b9e7..ff6229b6df16 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2877,8 +2877,12 @@ static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
struct io_kiocb *req = container_of(node, struct io_kiocb,
comp_list);
- if (!(req->flags & REQ_F_CQE_SKIP))
- __io_fill_cqe_req_filled(ctx, req);
+ if (!(req->flags & REQ_F_CQE_SKIP)) {
+ if (!(ctx->flags & IORING_SETUP_CQE32))
+ __io_fill_cqe_req_filled(ctx, req);
+ else
+ __io_fill_cqe32_req_filled(ctx, req);
+ }
}
io_commit_cqring(ctx);
--
2.30.2
* [PATCH v2 08/12] io_uring: overflow processing for CQE32
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (6 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 07/12] io_uring: flush completions " Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-22 2:15 ` Kanchan Joshi
2022-04-20 19:14 ` [PATCH v2 09/12] io_uring: add tracing for additional CQE32 fields Stefan Roesch
` (4 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr, Jens Axboe
This adds overflow processing for large CQEs.
It adds two parameters to the io_cqring_event_overflow function and
uses these fields to initialize the large CQE fields.
Allocate enough space for large CQEs in the overflow structure. If no
large CQEs are used, the size of the allocation is unchanged.
The cqe field can have a different size depending on whether it is a
large CQE or not. To be able to allocate different sizes, the two
fields in the structure are re-ordered.
Co-developed-by: Jens Axboe <[email protected]>
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index ff6229b6df16..50efced63ec9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -220,8 +220,8 @@ struct io_mapped_ubuf {
struct io_ring_ctx;
struct io_overflow_cqe {
- struct io_uring_cqe cqe;
struct list_head list;
+ struct io_uring_cqe cqe;
};
struct io_fixed_file {
@@ -2016,13 +2016,17 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
while (!list_empty(&ctx->cq_overflow_list)) {
struct io_uring_cqe *cqe = io_get_cqe(ctx);
struct io_overflow_cqe *ocqe;
+ size_t cqe_size = sizeof(struct io_uring_cqe);
+
+ if (ctx->flags & IORING_SETUP_CQE32)
+ cqe_size <<= 1;
if (!cqe && !force)
break;
ocqe = list_first_entry(&ctx->cq_overflow_list,
struct io_overflow_cqe, list);
if (cqe)
- memcpy(cqe, &ocqe->cqe, sizeof(*cqe));
+ memcpy(cqe, &ocqe->cqe, cqe_size);
else
io_account_cq_overflow(ctx);
@@ -2111,11 +2115,15 @@ static __cold void io_uring_drop_tctx_refs(struct task_struct *task)
}
static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
- s32 res, u32 cflags)
+ s32 res, u32 cflags, u64 extra1, u64 extra2)
{
struct io_overflow_cqe *ocqe;
+ size_t ocq_size = sizeof(struct io_overflow_cqe);
- ocqe = kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT);
+ if (ctx->flags & IORING_SETUP_CQE32)
+ ocq_size += sizeof(struct io_uring_cqe);
+
+ ocqe = kmalloc(ocq_size, GFP_ATOMIC | __GFP_ACCOUNT);
if (!ocqe) {
/*
* If we're in ring overflow flush mode, or in task cancel mode,
@@ -2134,6 +2142,10 @@ static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
ocqe->cqe.user_data = user_data;
ocqe->cqe.res = res;
ocqe->cqe.flags = cflags;
+ if (ctx->flags & IORING_SETUP_CQE32) {
+ ocqe->cqe.b[0].extra1 = extra1;
+ ocqe->cqe.b[0].extra2 = extra2;
+ }
list_add_tail(&ocqe->list, &ctx->cq_overflow_list);
return true;
}
@@ -2155,7 +2167,7 @@ static inline bool __io_fill_cqe(struct io_ring_ctx *ctx, u64 user_data,
WRITE_ONCE(cqe->flags, cflags);
return true;
}
- return io_cqring_event_overflow(ctx, user_data, res, cflags);
+ return io_cqring_event_overflow(ctx, user_data, res, cflags, 0, 0);
}
static inline bool __io_fill_cqe_req_filled(struct io_ring_ctx *ctx,
@@ -2177,7 +2189,7 @@ static inline bool __io_fill_cqe_req_filled(struct io_ring_ctx *ctx,
return true;
}
return io_cqring_event_overflow(ctx, req->cqe.user_data,
- req->cqe.res, req->cqe.flags);
+ req->cqe.res, req->cqe.flags, 0, 0);
}
static inline bool __io_fill_cqe32_req_filled(struct io_ring_ctx *ctx,
@@ -2241,7 +2253,7 @@ static void __io_fill_cqe32_req(struct io_kiocb *req, s32 res, u32 cflags,
return;
}
- io_cqring_event_overflow(ctx, req->cqe.user_data, res, cflags);
+ io_cqring_event_overflow(ctx, req->cqe.user_data, res, cflags, extra1, extra2);
}
static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
--
2.30.2
* Re: [PATCH v2 08/12] io_uring: overflow processing for CQE32
2022-04-20 19:14 ` [PATCH v2 08/12] io_uring: overflow processing " Stefan Roesch
@ 2022-04-22 2:15 ` Kanchan Joshi
2022-04-22 21:27 ` Stefan Roesch
0 siblings, 1 reply; 29+ messages in thread
From: Kanchan Joshi @ 2022-04-22 2:15 UTC (permalink / raw)
To: Stefan Roesch; +Cc: io-uring, kernel-team, Jens Axboe
On Thu, Apr 21, 2022 at 1:37 PM Stefan Roesch <[email protected]> wrote:
>
> This adds overflow processing for large CQEs.
>
> This adds two parameters to the io_cqring_event_overflow function and
> uses these fields to initialize the large CQE fields.
>
> Allocate enough space for large CQEs in the overflow structure. If no
> large CQEs are used, the size of the allocation is unchanged.
>
> The cqe field can have a different size depending on whether it is a
> large CQE or not. To be able to allocate different sizes, the two
> fields in the structure are re-ordered.
>
> Co-developed-by: Jens Axboe <[email protected]>
> Signed-off-by: Stefan Roesch <[email protected]>
> Signed-off-by: Jens Axboe <[email protected]>
> ---
> fs/io_uring.c | 26 +++++++++++++++++++-------
> 1 file changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index ff6229b6df16..50efced63ec9 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -220,8 +220,8 @@ struct io_mapped_ubuf {
> struct io_ring_ctx;
>
> struct io_overflow_cqe {
> - struct io_uring_cqe cqe;
> struct list_head list;
> + struct io_uring_cqe cqe;
> };
>
> struct io_fixed_file {
> @@ -2016,13 +2016,17 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
> while (!list_empty(&ctx->cq_overflow_list)) {
> struct io_uring_cqe *cqe = io_get_cqe(ctx);
> struct io_overflow_cqe *ocqe;
> + size_t cqe_size = sizeof(struct io_uring_cqe);
> +
> + if (ctx->flags & IORING_SETUP_CQE32)
> + cqe_size <<= 1;
>
> if (!cqe && !force)
> break;
> ocqe = list_first_entry(&ctx->cq_overflow_list,
> struct io_overflow_cqe, list);
> if (cqe)
> - memcpy(cqe, &ocqe->cqe, sizeof(*cqe));
> + memcpy(cqe, &ocqe->cqe, cqe_size);
> else
> io_account_cq_overflow(ctx);
>
> @@ -2111,11 +2115,15 @@ static __cold void io_uring_drop_tctx_refs(struct task_struct *task)
> }
>
> static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
> - s32 res, u32 cflags)
> + s32 res, u32 cflags, u64 extra1, u64 extra2)
> {
> struct io_overflow_cqe *ocqe;
> + size_t ocq_size = sizeof(struct io_overflow_cqe);
>
> - ocqe = kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT);
> + if (ctx->flags & IORING_SETUP_CQE32)
This can go into a bool variable, as this check is repeated in
this function.
* Re: [PATCH v2 08/12] io_uring: overflow processing for CQE32
2022-04-22 2:15 ` Kanchan Joshi
@ 2022-04-22 21:27 ` Stefan Roesch
2022-04-25 10:31 ` Kanchan Joshi
0 siblings, 1 reply; 29+ messages in thread
From: Stefan Roesch @ 2022-04-22 21:27 UTC (permalink / raw)
To: Kanchan Joshi; +Cc: io-uring, kernel-team, Jens Axboe
On 4/21/22 7:15 PM, Kanchan Joshi wrote:
> On Thu, Apr 21, 2022 at 1:37 PM Stefan Roesch <[email protected]> wrote:
>>
>> This adds overflow processing for large CQEs.
>>
>> This adds two parameters to the io_cqring_event_overflow function and
>> uses these fields to initialize the large CQE fields.
>>
>> Allocate enough space for large CQEs in the overflow structure. If no
>> large CQEs are used, the size of the allocation is unchanged.
>>
>> The cqe field can have a different size depending on whether it is a
>> large CQE or not. To be able to allocate different sizes, the two
>> fields in the structure are re-ordered.
>>
>> Co-developed-by: Jens Axboe <[email protected]>
>> Signed-off-by: Stefan Roesch <[email protected]>
>> Signed-off-by: Jens Axboe <[email protected]>
>> ---
>> fs/io_uring.c | 26 +++++++++++++++++++-------
>> 1 file changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index ff6229b6df16..50efced63ec9 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -220,8 +220,8 @@ struct io_mapped_ubuf {
>> struct io_ring_ctx;
>>
>> struct io_overflow_cqe {
>> - struct io_uring_cqe cqe;
>> struct list_head list;
>> + struct io_uring_cqe cqe;
>> };
>>
>> struct io_fixed_file {
>> @@ -2016,13 +2016,17 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
>> while (!list_empty(&ctx->cq_overflow_list)) {
>> struct io_uring_cqe *cqe = io_get_cqe(ctx);
>> struct io_overflow_cqe *ocqe;
>> + size_t cqe_size = sizeof(struct io_uring_cqe);
>> +
>> + if (ctx->flags & IORING_SETUP_CQE32)
>> + cqe_size <<= 1;
>>
>> if (!cqe && !force)
>> break;
>> ocqe = list_first_entry(&ctx->cq_overflow_list,
>> struct io_overflow_cqe, list);
>> if (cqe)
>> - memcpy(cqe, &ocqe->cqe, sizeof(*cqe));
>> + memcpy(cqe, &ocqe->cqe, cqe_size);
>> else
>> io_account_cq_overflow(ctx);
>>
>> @@ -2111,11 +2115,15 @@ static __cold void io_uring_drop_tctx_refs(struct task_struct *task)
>> }
>>
>> static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
>> - s32 res, u32 cflags)
>> + s32 res, u32 cflags, u64 extra1, u64 extra2)
>> {
>> struct io_overflow_cqe *ocqe;
>> + size_t ocq_size = sizeof(struct io_overflow_cqe);
>>
>> - ocqe = kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT);
>> + if (ctx->flags & IORING_SETUP_CQE32)
>
> This can go inside in a bool variable, as this check is repeated in
> this function.
V3 will have this change.
* Re: [PATCH v2 08/12] io_uring: overflow processing for CQE32
2022-04-22 21:27 ` Stefan Roesch
@ 2022-04-25 10:31 ` Kanchan Joshi
0 siblings, 0 replies; 29+ messages in thread
From: Kanchan Joshi @ 2022-04-25 10:31 UTC (permalink / raw)
To: Stefan Roesch; +Cc: Kanchan Joshi, io-uring, kernel-team, Jens Axboe
On Fri, Apr 22, 2022 at 02:27:01PM -0700, Stefan Roesch wrote:
>
>
>On 4/21/22 7:15 PM, Kanchan Joshi wrote:
>> On Thu, Apr 21, 2022 at 1:37 PM Stefan Roesch <[email protected]> wrote:
<snip>
>>> static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
>>> - s32 res, u32 cflags)
>>> + s32 res, u32 cflags, u64 extra1, u64 extra2)
>>> {
>>> struct io_overflow_cqe *ocqe;
>>> + size_t ocq_size = sizeof(struct io_overflow_cqe);
>>>
>>> - ocqe = kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT);
>>> + if (ctx->flags & IORING_SETUP_CQE32)
>>
>> This can go inside in a bool variable, as this check is repeated in
>> this function.
>
>V3 will have this change.
While you are at it, good to have this changed in patch 10 too.
* [PATCH v2 09/12] io_uring: add tracing for additional CQE32 fields
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (7 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 08/12] io_uring: overflow processing " Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 10/12] io_uring: support CQE32 in /proc info Stefan Roesch
` (3 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr, Jens Axboe
This adds tracing for the extra1 and extra2 fields.
Co-developed-by: Jens Axboe <[email protected]>
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 11 ++++++-----
include/trace/events/io_uring.h | 18 ++++++++++++++----
2 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 50efced63ec9..366f49969b31 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2176,7 +2176,7 @@ static inline bool __io_fill_cqe_req_filled(struct io_ring_ctx *ctx,
struct io_uring_cqe *cqe;
trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
- req->cqe.res, req->cqe.flags);
+ req->cqe.res, req->cqe.flags, 0, 0);
/*
* If we can't get a cq entry, userspace overflowed the
@@ -2200,7 +2200,7 @@ static inline bool __io_fill_cqe32_req_filled(struct io_ring_ctx *ctx,
u64 extra2 = req->extra2;
trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
- req->cqe.res, req->cqe.flags);
+ req->cqe.res, req->cqe.flags, extra1, extra2);
/*
* If we can't get a cq entry, userspace overflowed the
@@ -2221,7 +2221,7 @@ static inline bool __io_fill_cqe32_req_filled(struct io_ring_ctx *ctx,
static inline bool __io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
{
- trace_io_uring_complete(req->ctx, req, req->cqe.user_data, res, cflags);
+ trace_io_uring_complete(req->ctx, req, req->cqe.user_data, res, cflags, 0, 0);
return __io_fill_cqe(req->ctx, req->cqe.user_data, res, cflags);
}
@@ -2236,7 +2236,8 @@ static void __io_fill_cqe32_req(struct io_kiocb *req, s32 res, u32 cflags,
if (req->flags & REQ_F_CQE_SKIP)
return;
- trace_io_uring_complete(ctx, req, req->user_data, res, cflags);
+ trace_io_uring_complete(ctx, req, req->cqe.user_data, res, cflags,
+ extra1, extra2);
/*
* If we can't get a cq entry, userspace overflowed the
@@ -2260,7 +2261,7 @@ static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
s32 res, u32 cflags)
{
ctx->cq_extra++;
- trace_io_uring_complete(ctx, NULL, user_data, res, cflags);
+ trace_io_uring_complete(ctx, NULL, user_data, res, cflags, 0, 0);
return __io_fill_cqe(ctx, user_data, res, cflags);
}
diff --git a/include/trace/events/io_uring.h b/include/trace/events/io_uring.h
index 8477414d6d06..2eb4f4e47de4 100644
--- a/include/trace/events/io_uring.h
+++ b/include/trace/events/io_uring.h
@@ -318,13 +318,16 @@ TRACE_EVENT(io_uring_fail_link,
* @user_data: user data associated with the request
* @res: result of the request
* @cflags: completion flags
+ * @extra1: extra 64-bit data for CQE32
+ * @extra2: extra 64-bit data for CQE32
*
*/
TRACE_EVENT(io_uring_complete,
- TP_PROTO(void *ctx, void *req, u64 user_data, int res, unsigned cflags),
+ TP_PROTO(void *ctx, void *req, u64 user_data, int res, unsigned cflags,
+ u64 extra1, u64 extra2),
- TP_ARGS(ctx, req, user_data, res, cflags),
+ TP_ARGS(ctx, req, user_data, res, cflags, extra1, extra2),
TP_STRUCT__entry (
__field( void *, ctx )
@@ -332,6 +335,8 @@ TRACE_EVENT(io_uring_complete,
__field( u64, user_data )
__field( int, res )
__field( unsigned, cflags )
+ __field( u64, extra1 )
+ __field( u64, extra2 )
),
TP_fast_assign(
@@ -340,12 +345,17 @@ TRACE_EVENT(io_uring_complete,
__entry->user_data = user_data;
__entry->res = res;
__entry->cflags = cflags;
+ __entry->extra1 = extra1;
+ __entry->extra2 = extra2;
),
- TP_printk("ring %p, req %p, user_data 0x%llx, result %d, cflags 0x%x",
+ TP_printk("ring %p, req %p, user_data 0x%llx, result %d, cflags 0x%x "
+ "extra1 %llu extra2 %llu ",
__entry->ctx, __entry->req,
__entry->user_data,
- __entry->res, __entry->cflags)
+ __entry->res, __entry->cflags,
+ (unsigned long long) __entry->extra1,
+ (unsigned long long) __entry->extra2)
);
/**
--
2.30.2
* [PATCH v2 10/12] io_uring: support CQE32 in /proc info
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (8 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 09/12] io_uring: add tracing for additional CQE32 fields Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 11/12] io_uring: enable CQE32 Stefan Roesch
` (2 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr
This exposes the extra1 and extra2 fields in the /proc output.
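Since a 32-byte CQE occupies two slots of the 16-byte CQE array, the fdinfo loop indexes the raw ring as `(entry & cq_mask) << cq_shift`. A minimal sketch of that index math (the helper name is hypothetical):

```c
#include <stdbool.h>

/*
 * Array index of completion "entry" in the raw CQ ring.
 * cq_mask is cq_entries - 1 (ring sizes are powers of two); a CQE32
 * ring doubles the stride, so the masked index is shifted left by one.
 */
static unsigned int cqe_array_index(unsigned int entry, unsigned int cq_mask,
				    bool cqe32)
{
	unsigned int cq_shift = cqe32 ? 1 : 0;

	return (entry & cq_mask) << cq_shift;
}
```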
Signed-off-by: Stefan Roesch <[email protected]>
---
fs/io_uring.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 366f49969b31..a5fbad91800c 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -11345,10 +11345,14 @@ static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
unsigned int sq_tail = READ_ONCE(r->sq.tail);
unsigned int cq_head = READ_ONCE(r->cq.head);
unsigned int cq_tail = READ_ONCE(r->cq.tail);
+ unsigned int cq_shift = 0;
unsigned int sq_entries, cq_entries;
bool has_lock;
unsigned int i;
+ if (ctx->flags & IORING_SETUP_CQE32)
+ cq_shift = 1;
+
/*
* we may get imprecise sqe and cqe info if uring is actively running
* since we get cached_sq_head and cached_cq_tail without uring_lock
@@ -11381,11 +11385,18 @@ static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
cq_entries = min(cq_tail - cq_head, ctx->cq_entries);
for (i = 0; i < cq_entries; i++) {
unsigned int entry = i + cq_head;
- struct io_uring_cqe *cqe = &r->cqes[entry & cq_mask];
+ struct io_uring_cqe *cqe = &r->cqes[(entry & cq_mask) << cq_shift];
- seq_printf(m, "%5u: user_data:%llu, res:%d, flag:%x\n",
+ if (!(ctx->flags & IORING_SETUP_CQE32)) {
+ seq_printf(m, "%5u: user_data:%llu, res:%d, flag:%x\n",
entry & cq_mask, cqe->user_data, cqe->res,
cqe->flags);
+ } else {
+ seq_printf(m, "%5u: user_data:%llu, res:%d, flag:%x, "
+ "extra1:%llu, extra2:%llu\n",
+ entry & cq_mask, cqe->user_data, cqe->res,
+ cqe->flags, cqe->b[0].extra1, cqe->b[0].extra2);
+ }
}
/*
--
2.30.2
* [PATCH v2 11/12] io_uring: enable CQE32
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (9 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 10/12] io_uring: support CQE32 in /proc info Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-20 19:14 ` [PATCH v2 12/12] io_uring: support CQE32 for nop operation Stefan Roesch
2022-04-20 22:51 ` [PATCH v2 00/12] add large CQE support for io-uring Jens Axboe
12 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr, Jens Axboe
This enables large CQEs in the io_uring setup.
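A sketch of the check this patch relaxes: setup fails with -EINVAL if any flag outside the supported mask is set, and IORING_SETUP_CQE32 now joins that mask. Flag values are taken from the uapi header; the helper name is hypothetical.

```c
#include <stdbool.h>

/* setup flags from include/uapi/linux/io_uring.h */
#define IORING_SETUP_IOPOLL	(1U << 0)
#define IORING_SETUP_SQPOLL	(1U << 1)
#define IORING_SETUP_SQ_AFF	(1U << 2)
#define IORING_SETUP_CQSIZE	(1U << 3)
#define IORING_SETUP_CLAMP	(1U << 4)
#define IORING_SETUP_ATTACH_WQ	(1U << 5)
#define IORING_SETUP_R_DISABLED	(1U << 6)
#define IORING_SETUP_SUBMIT_ALL	(1U << 7)
#define IORING_SETUP_SQE128	(1U << 8)
#define IORING_SETUP_CQE32	(1U << 9)

/* Mirrors the io_uring_setup() validation: reject unknown flags. */
static bool setup_flags_supported(unsigned int flags)
{
	const unsigned int supported =
		IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL |
		IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
		IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ |
		IORING_SETUP_R_DISABLED | IORING_SETUP_SUBMIT_ALL |
		IORING_SETUP_SQE128 | IORING_SETUP_CQE32;

	return !(flags & ~supported);
}
```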
Co-developed-by: Jens Axboe <[email protected]>
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index a5fbad91800c..8bdf253b9462 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -11742,7 +11742,7 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ |
IORING_SETUP_R_DISABLED | IORING_SETUP_SUBMIT_ALL |
- IORING_SETUP_SQE128))
+ IORING_SETUP_SQE128 | IORING_SETUP_CQE32))
return -EINVAL;
return io_uring_create(entries, &p, params);
--
2.30.2
* [PATCH v2 12/12] io_uring: support CQE32 for nop operation
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (10 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 11/12] io_uring: enable CQE32 Stefan Roesch
@ 2022-04-20 19:14 ` Stefan Roesch
2022-04-20 22:51 ` [PATCH v2 00/12] add large CQE support for io-uring Jens Axboe
12 siblings, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-20 19:14 UTC (permalink / raw)
To: io-uring, kernel-team; +Cc: shr, Jens Axboe
This adds support for filling the extra1 and extra2 fields for large
CQEs.
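The flow can be mocked in userspace C: prep stashes `sqe->addr`/`sqe->addr2` into the request only on CQE32 rings, and completion relays them into the big-CQE fields. All `mock_*` types below are hypothetical stand-ins for the kernel structures.

```c
#include <stdint.h>

#define IORING_SETUP_CQE32 (1U << 9)

/* hypothetical stand-ins for the kernel's sqe/req/cqe structures */
struct mock_sqe { uint64_t addr, addr2; };
struct mock_req { unsigned int ctx_flags; uint64_t extra1, extra2; };
struct mock_cqe32 {
	uint64_t user_data;
	int32_t  res;
	uint32_t flags;
	uint64_t extra1, extra2;
};

/* like io_nop_prep(): stash addr/addr2 only when the ring uses big CQEs */
static int nop_prep(struct mock_req *req, const struct mock_sqe *sqe)
{
	if (req->ctx_flags & IORING_SETUP_CQE32) {
		req->extra1 = sqe->addr;
		req->extra2 = sqe->addr2;
	}
	return 0;
}

/* like io_nop(): post res 0 and, on CQE32 rings, the two extra fields */
static void nop_complete(const struct mock_req *req, struct mock_cqe32 *cqe)
{
	cqe->res = 0;
	cqe->flags = 0;
	if (req->ctx_flags & IORING_SETUP_CQE32) {
		cqe->extra1 = req->extra1;
		cqe->extra2 = req->extra2;
	}
}
```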
Co-developed-by: Jens Axboe <[email protected]>
Signed-off-by: Stefan Roesch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8bdf253b9462..3855148b6774 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -744,6 +744,12 @@ struct io_msg {
u32 len;
};
+struct io_nop {
+ struct file *file;
+ u64 extra1;
+ u64 extra2;
+};
+
struct io_async_connect {
struct sockaddr_storage address;
};
@@ -937,6 +943,7 @@ struct io_kiocb {
struct io_msg msg;
struct io_xattr xattr;
struct io_socket sock;
+ struct io_nop nop;
};
u8 opcode;
@@ -4863,6 +4870,19 @@ static int io_splice(struct io_kiocb *req, unsigned int issue_flags)
return 0;
}
+static int io_nop_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ /*
+ * If the ring is setup with CQE32, relay back addr/addr
+ */
+ if (req->ctx->flags & IORING_SETUP_CQE32) {
+ req->nop.extra1 = READ_ONCE(sqe->addr);
+ req->nop.extra2 = READ_ONCE(sqe->addr2);
+ }
+
+ return 0;
+}
+
/*
* IORING_OP_NOP just posts a completion event, nothing else.
*/
@@ -4873,7 +4893,11 @@ static int io_nop(struct io_kiocb *req, unsigned int issue_flags)
if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
return -EINVAL;
- __io_req_complete(req, issue_flags, 0, 0);
+ if (!(ctx->flags & IORING_SETUP_CQE32))
+ __io_req_complete(req, issue_flags, 0, 0);
+ else
+ __io_req_complete32(req, issue_flags, 0, 0, req->nop.extra1,
+ req->nop.extra2);
return 0;
}
@@ -7345,7 +7369,7 @@ static int io_req_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
switch (req->opcode) {
case IORING_OP_NOP:
- return 0;
+ return io_nop_prep(req, sqe);
case IORING_OP_READV:
case IORING_OP_READ_FIXED:
case IORING_OP_READ:
--
2.30.2
* Re: [PATCH v2 00/12] add large CQE support for io-uring
2022-04-20 19:14 [PATCH v2 00/12] add large CQE support for io-uring Stefan Roesch
` (11 preceding siblings ...)
2022-04-20 19:14 ` [PATCH v2 12/12] io_uring: support CQE32 for nop operation Stefan Roesch
@ 2022-04-20 22:51 ` Jens Axboe
2022-04-21 18:42 ` Pavel Begunkov
12 siblings, 1 reply; 29+ messages in thread
From: Jens Axboe @ 2022-04-20 22:51 UTC (permalink / raw)
To: kernel-team, io-uring, shr
On Wed, 20 Apr 2022 12:14:39 -0700, Stefan Roesch wrote:
> This adds the large CQE support for io-uring. Large CQE's are 16 bytes longer.
> To support the longer CQE's the allocation part is changed and when the CQE is
> accessed.
>
> The allocation of the large CQE's is twice as big, so the allocation size is
> doubled. The ring size calculation needs to take this into account.
>
> [...]
Applied, thanks!
[01/12] io_uring: support CQE32 in io_uring_cqe
commit: be428af6b204c2b366dd8b838bea87d1d4d9f2bd
[02/12] io_uring: wire up inline completion path for CQE32
commit: 8fc4fbc38db6538056498c88f606f958fbb24bfd
[03/12] io_uring: change ring size calculation for CQE32
commit: d09d3b8f2986899ff8f535c91d95c137b03595ec
[04/12] io_uring: add CQE32 setup processing
commit: a81124f0283879a7c5e77c0def9c725e84e79cb1
[05/12] io_uring: add CQE32 completion processing
commit: c7050dfe60c484f9084e57c2b1c88b8ab1f8a06d
[06/12] io_uring: modify io_get_cqe for CQE32
commit: f23855c3511dffa54069c9a0ed513b79bec39938
[07/12] io_uring: flush completions for CQE32
commit: 8a5be11b11449a412ef89c46a05e9bbeeab6652d
[08/12] io_uring: overflow processing for CQE32
commit: 2f1bbef557e9b174361ecd2f7c59b683bbca4464
[09/12] io_uring: add tracing for additional CQE32 fields
commit: b4df41b44f8f358f86533148aa0e56b27bca47d6
[10/12] io_uring: support CQE32 in /proc info
commit: 9d1b8d722dc06b9ab96db6e2bb967187c6185727
[11/12] io_uring: enable CQE32
commit: cae6c1bdf9704dee2d3c7803c36ef73ada19e238
[12/12] io_uring: support CQE32 for nop operation
commit: 460527265a0a6aa5107a7e4e4640f8d4b2088455
Best regards,
--
Jens Axboe
* Re: [PATCH v2 00/12] add large CQE support for io-uring
2022-04-20 22:51 ` [PATCH v2 00/12] add large CQE support for io-uring Jens Axboe
@ 2022-04-21 18:42 ` Pavel Begunkov
2022-04-21 18:49 ` Stefan Roesch
0 siblings, 1 reply; 29+ messages in thread
From: Pavel Begunkov @ 2022-04-21 18:42 UTC (permalink / raw)
To: Jens Axboe, kernel-team, io-uring, shr
On 4/20/22 23:51, Jens Axboe wrote:
> On Wed, 20 Apr 2022 12:14:39 -0700, Stefan Roesch wrote:
>> This adds the large CQE support for io-uring. Large CQE's are 16 bytes longer.
>> To support the longer CQE's the allocation part is changed and when the CQE is
>> accessed.
>>
>> The allocation of the large CQE's is twice as big, so the allocation size is
>> doubled. The ring size calculation needs to take this into account.
I'm missing something here, do we have a user for it apart
from no-op requests?
> Applied, thanks!
>
> [01/12] io_uring: support CQE32 in io_uring_cqe
> commit: be428af6b204c2b366dd8b838bea87d1d4d9f2bd
> [02/12] io_uring: wire up inline completion path for CQE32
> commit: 8fc4fbc38db6538056498c88f606f958fbb24bfd
> [03/12] io_uring: change ring size calculation for CQE32
> commit: d09d3b8f2986899ff8f535c91d95c137b03595ec
> [04/12] io_uring: add CQE32 setup processing
> commit: a81124f0283879a7c5e77c0def9c725e84e79cb1
> [05/12] io_uring: add CQE32 completion processing
> commit: c7050dfe60c484f9084e57c2b1c88b8ab1f8a06d
> [06/12] io_uring: modify io_get_cqe for CQE32
> commit: f23855c3511dffa54069c9a0ed513b79bec39938
> [07/12] io_uring: flush completions for CQE32
> commit: 8a5be11b11449a412ef89c46a05e9bbeeab6652d
> [08/12] io_uring: overflow processing for CQE32
> commit: 2f1bbef557e9b174361ecd2f7c59b683bbca4464
> [09/12] io_uring: add tracing for additional CQE32 fields
> commit: b4df41b44f8f358f86533148aa0e56b27bca47d6
> [10/12] io_uring: support CQE32 in /proc info
> commit: 9d1b8d722dc06b9ab96db6e2bb967187c6185727
> [11/12] io_uring: enable CQE32
> commit: cae6c1bdf9704dee2d3c7803c36ef73ada19e238
> [12/12] io_uring: support CQE32 for nop operation
> commit: 460527265a0a6aa5107a7e4e4640f8d4b2088455
>
> Best regards,
--
Pavel Begunkov
* Re: [PATCH v2 00/12] add large CQE support for io-uring
2022-04-21 18:42 ` Pavel Begunkov
@ 2022-04-21 18:49 ` Stefan Roesch
2022-04-21 18:54 ` Jens Axboe
2022-04-21 18:57 ` Pavel Begunkov
0 siblings, 2 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-21 18:49 UTC (permalink / raw)
To: Pavel Begunkov, Jens Axboe, kernel-team, io-uring
On 4/21/22 11:42 AM, Pavel Begunkov wrote:
> On 4/20/22 23:51, Jens Axboe wrote:
>> On Wed, 20 Apr 2022 12:14:39 -0700, Stefan Roesch wrote:
>>> This adds the large CQE support for io-uring. Large CQE's are 16 bytes longer.
>>> To support the longer CQE's the allocation part is changed and when the CQE is
>>> accessed.
>>>
>>> The allocation of the large CQE's is twice as big, so the allocation size is
>>> doubled. The ring size calculation needs to take this into account.
>
> I'm missing something here, do we have a user for it apart
> from no-op requests?
>
Pavel, what started this work is the patch series "io_uring passthru over nvme" from Samsung.
(https://lore.kernel.org/io-uring/[email protected]/)
They will use the large SQE and CQE support.
>
>> Applied, thanks!
>>
>> [01/12] io_uring: support CQE32 in io_uring_cqe
>> commit: be428af6b204c2b366dd8b838bea87d1d4d9f2bd
>> [02/12] io_uring: wire up inline completion path for CQE32
>> commit: 8fc4fbc38db6538056498c88f606f958fbb24bfd
>> [03/12] io_uring: change ring size calculation for CQE32
>> commit: d09d3b8f2986899ff8f535c91d95c137b03595ec
>> [04/12] io_uring: add CQE32 setup processing
>> commit: a81124f0283879a7c5e77c0def9c725e84e79cb1
>> [05/12] io_uring: add CQE32 completion processing
>> commit: c7050dfe60c484f9084e57c2b1c88b8ab1f8a06d
>> [06/12] io_uring: modify io_get_cqe for CQE32
>> commit: f23855c3511dffa54069c9a0ed513b79bec39938
>> [07/12] io_uring: flush completions for CQE32
>> commit: 8a5be11b11449a412ef89c46a05e9bbeeab6652d
>> [08/12] io_uring: overflow processing for CQE32
>> commit: 2f1bbef557e9b174361ecd2f7c59b683bbca4464
>> [09/12] io_uring: add tracing for additional CQE32 fields
>> commit: b4df41b44f8f358f86533148aa0e56b27bca47d6
>> [10/12] io_uring: support CQE32 in /proc info
>> commit: 9d1b8d722dc06b9ab96db6e2bb967187c6185727
>> [11/12] io_uring: enable CQE32
>> commit: cae6c1bdf9704dee2d3c7803c36ef73ada19e238
>> [12/12] io_uring: support CQE32 for nop operation
>> commit: 460527265a0a6aa5107a7e4e4640f8d4b2088455
>>
>> Best regards,
>
* Re: [PATCH v2 00/12] add large CQE support for io-uring
2022-04-21 18:49 ` Stefan Roesch
@ 2022-04-21 18:54 ` Jens Axboe
2022-04-21 18:57 ` Pavel Begunkov
1 sibling, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2022-04-21 18:54 UTC (permalink / raw)
To: Stefan Roesch, Pavel Begunkov, kernel-team, io-uring
On 4/21/22 12:49 PM, Stefan Roesch wrote:
>
>
> On 4/21/22 11:42 AM, Pavel Begunkov wrote:
>> On 4/20/22 23:51, Jens Axboe wrote:
>>> On Wed, 20 Apr 2022 12:14:39 -0700, Stefan Roesch wrote:
>>>> This adds the large CQE support for io-uring. Large CQE's are 16 bytes longer.
>>>> To support the longer CQE's the allocation part is changed and when the CQE is
>>>> accessed.
>>>>
>>>> The allocation of the large CQE's is twice as big, so the allocation size is
>>>> doubled. The ring size calculation needs to take this into account.
>>
>> I'm missing something here, do we have a user for it apart
>> from no-op requests?
>>
>
> Pavel, what started this work is the patch series "io_uring passthru over nvme" from samsung.
> (https://lore.kernel.org/io-uring/[email protected]/)
>
> They will use the large SQE and CQE support.
Indeed - and as such, this will just be a base for that. Doesn't make
sense standalone, but with the passthrough support it does.
--
Jens Axboe
* Re: [PATCH v2 00/12] add large CQE support for io-uring
2022-04-21 18:49 ` Stefan Roesch
2022-04-21 18:54 ` Jens Axboe
@ 2022-04-21 18:57 ` Pavel Begunkov
2022-04-21 18:59 ` Jens Axboe
1 sibling, 1 reply; 29+ messages in thread
From: Pavel Begunkov @ 2022-04-21 18:57 UTC (permalink / raw)
To: Stefan Roesch, Jens Axboe, kernel-team, io-uring
On 4/21/22 19:49, Stefan Roesch wrote:
> On 4/21/22 11:42 AM, Pavel Begunkov wrote:
>> On 4/20/22 23:51, Jens Axboe wrote:
>>> On Wed, 20 Apr 2022 12:14:39 -0700, Stefan Roesch wrote:
>>>> This adds the large CQE support for io-uring. Large CQE's are 16 bytes longer.
>>>> To support the longer CQE's the allocation part is changed and when the CQE is
>>>> accessed.
>>>>
>>>> The allocation of the large CQE's is twice as big, so the allocation size is
>>>> doubled. The ring size calculation needs to take this into account.
>>
>> I'm missing something here, do we have a user for it apart
>> from no-op requests?
>>
>
> Pavel, what started this work is the patch series "io_uring passthru over nvme" from samsung.
> (https://lore.kernel.org/io-uring/[email protected]/)
>
> They will use the large SQE and CQE support.
I see, thanks for clarifying. I saw it used in the passthrough
patches, but it only got me more confused why it's applied
beforehand, separately from the io_uring-cmd and passthrough patches.
>>> Applied, thanks!
>>>
>>> [01/12] io_uring: support CQE32 in io_uring_cqe
>>> commit: be428af6b204c2b366dd8b838bea87d1d4d9f2bd
>>> [02/12] io_uring: wire up inline completion path for CQE32
>>> commit: 8fc4fbc38db6538056498c88f606f958fbb24bfd
>>> [03/12] io_uring: change ring size calculation for CQE32
>>> commit: d09d3b8f2986899ff8f535c91d95c137b03595ec
>>> [04/12] io_uring: add CQE32 setup processing
>>> commit: a81124f0283879a7c5e77c0def9c725e84e79cb1
>>> [05/12] io_uring: add CQE32 completion processing
>>> commit: c7050dfe60c484f9084e57c2b1c88b8ab1f8a06d
>>> [06/12] io_uring: modify io_get_cqe for CQE32
>>> commit: f23855c3511dffa54069c9a0ed513b79bec39938
>>> [07/12] io_uring: flush completions for CQE32
>>> commit: 8a5be11b11449a412ef89c46a05e9bbeeab6652d
>>> [08/12] io_uring: overflow processing for CQE32
>>> commit: 2f1bbef557e9b174361ecd2f7c59b683bbca4464
>>> [09/12] io_uring: add tracing for additional CQE32 fields
>>> commit: b4df41b44f8f358f86533148aa0e56b27bca47d6
>>> [10/12] io_uring: support CQE32 in /proc info
>>> commit: 9d1b8d722dc06b9ab96db6e2bb967187c6185727
>>> [11/12] io_uring: enable CQE32
>>> commit: cae6c1bdf9704dee2d3c7803c36ef73ada19e238
>>> [12/12] io_uring: support CQE32 for nop operation
>>> commit: 460527265a0a6aa5107a7e4e4640f8d4b2088455
>>>
>>> Best regards,
>>
--
Pavel Begunkov
* Re: [PATCH v2 00/12] add large CQE support for io-uring
2022-04-21 18:57 ` Pavel Begunkov
@ 2022-04-21 18:59 ` Jens Axboe
2022-04-22 3:09 ` Kanchan Joshi
0 siblings, 1 reply; 29+ messages in thread
From: Jens Axboe @ 2022-04-21 18:59 UTC (permalink / raw)
To: Pavel Begunkov, Stefan Roesch, kernel-team, io-uring
On 4/21/22 12:57 PM, Pavel Begunkov wrote:
> On 4/21/22 19:49, Stefan Roesch wrote:
>> On 4/21/22 11:42 AM, Pavel Begunkov wrote:
>>> On 4/20/22 23:51, Jens Axboe wrote:
>>>> On Wed, 20 Apr 2022 12:14:39 -0700, Stefan Roesch wrote:
>>>>> This adds the large CQE support for io-uring. Large CQE's are 16 bytes longer.
>>>>> To support the longer CQE's the allocation part is changed and when the CQE is
>>>>> accessed.
>>>>>
>>>>> The allocation of the large CQE's is twice as big, so the allocation size is
>>>>> doubled. The ring size calculation needs to take this into account.
>>>
>>> I'm missing something here, do we have a user for it apart
>>> from no-op requests?
>>>
>>
>> Pavel, what started this work is the patch series "io_uring passthru over nvme" from samsung.
>> (https://lore.kernel.org/io-uring/[email protected]/)
>>
>> They will use the large SQE and CQE support.
>
> I see, thanks for clarifying. I saw it used in passthrough
> patches, but it only got me more confused why it's applied
>>> beforehand, separately from the io_uring-cmd and passthrough
It's just applied to a branch so the passthrough folks have something to
base on, io_uring-big-sqe. It's not queued for 5.19 or anything like
that yet.
--
Jens Axboe
* Re: [PATCH v2 00/12] add large CQE support for io-uring
2022-04-21 18:59 ` Jens Axboe
@ 2022-04-22 3:09 ` Kanchan Joshi
2022-04-22 5:06 ` Kanchan Joshi
2022-04-22 21:03 ` Stefan Roesch
0 siblings, 2 replies; 29+ messages in thread
From: Kanchan Joshi @ 2022-04-22 3:09 UTC (permalink / raw)
To: Jens Axboe; +Cc: Pavel Begunkov, Stefan Roesch, kernel-team, io-uring
On Thu, Apr 21, 2022 at 12:59:42PM -0600, Jens Axboe wrote:
>On 4/21/22 12:57 PM, Pavel Begunkov wrote:
>> On 4/21/22 19:49, Stefan Roesch wrote:
>>> On 4/21/22 11:42 AM, Pavel Begunkov wrote:
>>>> On 4/20/22 23:51, Jens Axboe wrote:
>>>>> On Wed, 20 Apr 2022 12:14:39 -0700, Stefan Roesch wrote:
>>>>>> This adds large CQE support for io_uring. Large CQEs are 16 bytes longer.
>>>>>> To support the longer CQEs, both the allocation path and the places where
>>>>>> the CQE is accessed are changed.
>>>>>>
>>>>>> Large CQEs double the allocation size, so the ring size calculation needs
>>>>>> to take this into account.
>>>>
>>>> I'm missing something here, do we have a user for it apart
>>>> from no-op requests?
>>>>
>>>
>>> Pavel, what started this work is the patch series "io_uring passthru over nvme" from Samsung.
>>> (https://lore.kernel.org/io-uring/[email protected]/)
>>>
>>> They will use the large SQE and CQE support.
>>
>> I see, thanks for clarifying. I saw it used in passthrough
>> patches, but it only got me more confused why it's applied
>> beforehand, separately from the io_uring-cmd and passthrough
>
>It's just applied to a branch so the passthrough folks have something to
>base on, io_uring-big-sqe. It's not queued for 5.19 or anything like
>that yet.
>
Thanks for putting this up.
I am a bit confused whether these big-CQE and big-SQE patches should
continue to be sent (to the nvme list too) as part of the next
uring-cmd/passthrough series?
And does it make sense to squash some patches of this series? At a
high level there is 32b-CQE support, and no-op support.
* Re: [PATCH v2 00/12] add large CQE support for io-uring
2022-04-22 3:09 ` Kanchan Joshi
@ 2022-04-22 5:06 ` Kanchan Joshi
2022-04-22 21:03 ` Stefan Roesch
1 sibling, 0 replies; 29+ messages in thread
From: Kanchan Joshi @ 2022-04-22 5:06 UTC (permalink / raw)
To: Jens Axboe; +Cc: Pavel Begunkov, Stefan Roesch, kernel-team, io-uring
On Fri, Apr 22, 2022 at 08:39:18AM +0530, Kanchan Joshi wrote:
>On Thu, Apr 21, 2022 at 12:59:42PM -0600, Jens Axboe wrote:
>>On 4/21/22 12:57 PM, Pavel Begunkov wrote:
>>>On 4/21/22 19:49, Stefan Roesch wrote:
>>>>On 4/21/22 11:42 AM, Pavel Begunkov wrote:
>>>>>On 4/20/22 23:51, Jens Axboe wrote:
>>>>>>On Wed, 20 Apr 2022 12:14:39 -0700, Stefan Roesch wrote:
>>>>>>>This adds large CQE support for io_uring. Large CQEs are 16 bytes longer.
>>>>>>>To support the longer CQEs, both the allocation path and the places where
>>>>>>>the CQE is accessed are changed.
>>>>>>>
>>>>>>>Large CQEs double the allocation size, so the ring size calculation needs
>>>>>>>to take this into account.
>>>>>
>>>>>I'm missing something here, do we have a user for it apart
>>>>>from no-op requests?
>>>>>
>>>>
>>>>Pavel, what started this work is the patch series "io_uring passthru over nvme" from Samsung.
>>>>(https://lore.kernel.org/io-uring/[email protected]/)
>>>>
>>>>They will use the large SQE and CQE support.
>>>
>>>I see, thanks for clarifying. I saw it used in passthrough
>>>patches, but it only got me more confused why it's applied
>>>beforehand, separately from the io_uring-cmd and passthrough
>>
>>It's just applied to a branch so the passthrough folks have something to
>>base on, io_uring-big-sqe. It's not queued for 5.19 or anything like
>>that yet.
>>
>Thanks for putting this up.
>I am a bit confused whether these big-CQE and big-SQE patches should
>continue to be sent (to the nvme list too) as part of the next
>uring-cmd/passthrough series?
>
>And does it make sense to squash some patches of this series? At a
>high level there is 32b-CQE support, and no-op support.
Maybe as part of v3; there seems to be some scope for that (I made
comments at the respective places).
* Re: [PATCH v2 00/12] add large CQE support for io-uring
2022-04-22 3:09 ` Kanchan Joshi
2022-04-22 5:06 ` Kanchan Joshi
@ 2022-04-22 21:03 ` Stefan Roesch
1 sibling, 0 replies; 29+ messages in thread
From: Stefan Roesch @ 2022-04-22 21:03 UTC (permalink / raw)
To: Kanchan Joshi, Jens Axboe; +Cc: Pavel Begunkov, kernel-team, io-uring
On 4/21/22 8:09 PM, Kanchan Joshi wrote:
> On Thu, Apr 21, 2022 at 12:59:42PM -0600, Jens Axboe wrote:
>> On 4/21/22 12:57 PM, Pavel Begunkov wrote:
>>> On 4/21/22 19:49, Stefan Roesch wrote:
>>>> On 4/21/22 11:42 AM, Pavel Begunkov wrote:
>>>>> On 4/20/22 23:51, Jens Axboe wrote:
>>>>>> On Wed, 20 Apr 2022 12:14:39 -0700, Stefan Roesch wrote:
>>>>>>> This adds large CQE support for io_uring. Large CQEs are 16 bytes longer.
>>>>>>> To support the longer CQEs, both the allocation path and the places where
>>>>>>> the CQE is accessed are changed.
>>>>>>>
>>>>>>> Large CQEs double the allocation size, so the ring size calculation needs
>>>>>>> to take this into account.
>>>>>
>>>>> I'm missing something here, do we have a user for it apart
>>>>> from no-op requests?
>>>>>
>>>>
>>>> Pavel, what started this work is the patch series "io_uring passthru over nvme" from Samsung.
>>>> (https://lore.kernel.org/io-uring/[email protected]/)
>>>>
>>>> They will use the large SQE and CQE support.
>>>
>>> I see, thanks for clarifying. I saw it used in passthrough
>>> patches, but it only got me more confused why it's applied
>>> beforehand, separately from the io_uring-cmd and passthrough
>>
>> It's just applied to a branch so the passthrough folks have something to
>> base on, io_uring-big-sqe. It's not queued for 5.19 or anything like
>> that yet.
>>
> Thanks for putting this up.
> I am a bit confused whether these big-CQE and big-SQE patches should
> continue to be sent (to the nvme list too) as part of the next
> uring-cmd/passthrough series?
>
I'll send version 3 to the nvme list as well.
> And does it make sense to squash some patches of this series? At a
> high level there is 32b-CQE support, and no-op support.
>