* [PATCH 1/6] io_uring: account drain memory to cgroup
2025-05-08 11:52 [PATCH 0/6] drain cleanups and extra Pavel Begunkov
@ 2025-05-08 11:52 ` Pavel Begunkov
2025-05-08 11:52 ` [PATCH 2/6] io_uring: simplify drain ret passing Pavel Begunkov
` (4 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2025-05-08 11:52 UTC
To: io-uring; +Cc: asml.silence
Account drain allocations against memcg. The omission is not a big
problem, as each such allocation is paired with a request that is
already accounted, but it's nicer to follow the limits more closely.
Cc: stable@vger.kernel.org # 6.1
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
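For readers less familiar with memcg accounting, here is a minimal
userspace sketch of what the flag change means in practice. In the
kernel, GFP_KERNEL_ACCOUNT is GFP_KERNEL with __GFP_ACCOUNT added, which
asks the allocator to charge the allocation to the caller's memory
cgroup. Everything prefixed with model_ or MODEL_ below is hypothetical
and exists only for illustration, including the flag values and the
byte-limit bookkeeping; it is not how the kernel implements charging.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits for this userspace model; not the kernel's values. */
#define MODEL_GFP_KERNEL		0x1u
#define MODEL_GFP_ACCOUNT		0x2u
#define MODEL_GFP_KERNEL_ACCOUNT	(MODEL_GFP_KERNEL | MODEL_GFP_ACCOUNT)

/* Stand-in for a memory cgroup: a byte limit and the bytes charged so far. */
struct model_memcg {
	size_t limit;
	size_t charged;
};

/*
 * Allocate size bytes. Only allocations carrying MODEL_GFP_ACCOUNT are
 * charged to the group, and only those can fail because of its limit.
 */
static void *model_kmalloc(struct model_memcg *cg, size_t size,
			   unsigned int flags)
{
	void *p;

	assert(cg->charged <= cg->limit);	/* invariant of the model */

	if ((flags & MODEL_GFP_ACCOUNT) && size > cg->limit - cg->charged)
		return NULL;	/* a kernel caller would report -ENOMEM */

	p = malloc(size);
	if (p && (flags & MODEL_GFP_ACCOUNT))
		cg->charged += size;
	return p;
}
```

With a 64-byte limit, an unaccounted 48-byte allocation leaves the
charge at zero, the first accounted one raises it to 48, and a second
accounted one fails. The last case corresponds to io_drain_req()
failing the request with -ENOMEM.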
io_uring/io_uring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 0d051476008c..23e283e65eeb 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1672,7 +1672,7 @@ static __cold void io_drain_req(struct io_kiocb *req)
 	spin_unlock(&ctx->completion_lock);

 	io_prep_async_link(req);
-	de = kmalloc(sizeof(*de), GFP_KERNEL);
+	de = kmalloc(sizeof(*de), GFP_KERNEL_ACCOUNT);
 	if (!de) {
 		ret = -ENOMEM;
 		io_req_defer_failed(req, ret);
--
2.49.0
* [PATCH 2/6] io_uring: simplify drain ret passing
2025-05-08 11:52 [PATCH 0/6] drain cleanups and extra Pavel Begunkov
2025-05-08 11:52 ` [PATCH 1/6] io_uring: account drain memory to cgroup Pavel Begunkov
@ 2025-05-08 11:52 ` Pavel Begunkov
2025-05-08 11:52 ` [PATCH 3/6] io_uring: remove drain prealloc checks Pavel Begunkov
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2025-05-08 11:52 UTC
To: io-uring; +Cc: asml.silence
"ret" in io_drain_req() is only used in one place, remove it and pass
-ENOMEM directly.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
io_uring/io_uring.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 23e283e65eeb..21b70ad0edc4 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1657,7 +1657,6 @@ static __cold void io_drain_req(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 	struct io_defer_entry *de;
-	int ret;
 	u32 seq = io_get_sequence(req);

 	/* Still need defer if there is pending req in defer list. */
@@ -1674,8 +1673,7 @@ static __cold void io_drain_req(struct io_kiocb *req)
 	io_prep_async_link(req);
 	de = kmalloc(sizeof(*de), GFP_KERNEL_ACCOUNT);
 	if (!de) {
-		ret = -ENOMEM;
-		io_req_defer_failed(req, ret);
+		io_req_defer_failed(req, -ENOMEM);
 		return;
 	}

--
2.49.0
* [PATCH 3/6] io_uring: remove drain prealloc checks
2025-05-08 11:52 [PATCH 0/6] drain cleanups and extra Pavel Begunkov
2025-05-08 11:52 ` [PATCH 1/6] io_uring: account drain memory to cgroup Pavel Begunkov
2025-05-08 11:52 ` [PATCH 2/6] io_uring: simplify drain ret passing Pavel Begunkov
@ 2025-05-08 11:52 ` Pavel Begunkov
2025-05-08 11:52 ` [PATCH 4/6] io_uring: consolidate drain seq checking Pavel Begunkov
` (2 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2025-05-08 11:52 UTC
To: io-uring; +Cc: asml.silence
Currently io_drain_req() has two steps. The first is a fast path that
checks sequence numbers. The second does the allocation, rechecks the
sequence under the lock and actually queues the request. Simplify it
further by removing the first step.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
io_uring/io_uring.c | 15 +++------------
1 file changed, 3 insertions(+), 12 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 21b70ad0edc4..72ae350f4f8b 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1659,17 +1659,6 @@ static __cold void io_drain_req(struct io_kiocb *req)
 	struct io_defer_entry *de;
 	u32 seq = io_get_sequence(req);

-	/* Still need defer if there is pending req in defer list. */
-	spin_lock(&ctx->completion_lock);
-	if (!req_need_defer(req, seq) && list_empty_careful(&ctx->defer_list)) {
-		spin_unlock(&ctx->completion_lock);
-queue:
-		ctx->drain_active = false;
-		io_req_task_queue(req);
-		return;
-	}
-	spin_unlock(&ctx->completion_lock);
-
 	io_prep_async_link(req);
 	de = kmalloc(sizeof(*de), GFP_KERNEL_ACCOUNT);
 	if (!de) {
@@ -1681,7 +1670,9 @@ static __cold void io_drain_req(struct io_kiocb *req)
 	if (!req_need_defer(req, seq) && list_empty(&ctx->defer_list)) {
 		spin_unlock(&ctx->completion_lock);
 		kfree(de);
-		goto queue;
+		ctx->drain_active = false;
+		io_req_task_queue(req);
+		return;
 	}

 	trace_io_uring_defer(req);
--
2.49.0
* [PATCH 4/6] io_uring: consolidate drain seq checking
2025-05-08 11:52 [PATCH 0/6] drain cleanups and extra Pavel Begunkov
` (2 preceding siblings ...)
2025-05-08 11:52 ` [PATCH 3/6] io_uring: remove drain prealloc checks Pavel Begunkov
@ 2025-05-08 11:52 ` Pavel Begunkov
2025-05-08 15:43 ` Pavel Begunkov
2025-05-08 11:52 ` [PATCH 5/6] io_uring/net: move CONFIG_NET guards to Makefile Pavel Begunkov
2025-05-08 11:52 ` [PATCH 6/6] io_uring: add lockdep asserts to io_add_aux_cqe Pavel Begunkov
5 siblings, 1 reply; 8+ messages in thread
From: Pavel Begunkov @ 2025-05-08 11:52 UTC
To: io-uring; +Cc: asml.silence
We check sequences when queuing drained requests as well as when
flushing them. Instead, always queue and immediately try to flush, so
that all seq handling can be kept contained in the flushing code.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
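A rough userspace model of the resulting flow, in case it helps review:
io_drain_req() no longer checks sequences itself, it appends to the
defer list and lets the flush helper decide what can run. The model_
names, the fixed-size issued[] log and the readiness rule (an entry may
run once completed >= seq) are assumptions made for this sketch. The
kernel uses req_need_defer() and holds ->completion_lock around the
list, both of which are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for struct io_defer_entry. */
struct model_defer_entry {
	struct model_defer_entry *next;
	uint32_t seq;	/* completions the entry has to wait for */
	int id;		/* stand-in for the request pointer */
};

/* Simplified stand-in for the drain-related parts of struct io_ring_ctx. */
struct model_ctx {
	struct model_defer_entry *head, *tail;	/* FIFO defer list */
	uint32_t completed;	/* requests completed so far */
	bool drain_active;
	int issued[16];		/* ids handed on for execution, in order */
	int nr_issued;
};

/* Assumed readiness rule; the kernel's req_need_defer() is more involved. */
static bool model_need_defer(const struct model_ctx *ctx, uint32_t seq)
{
	return ctx->completed < seq;
}

/* Flush side: issue entries from the head until one still has to wait. */
static void model_queue_deferred(struct model_ctx *ctx)
{
	while (ctx->head) {
		struct model_defer_entry *de = ctx->head;

		if (model_need_defer(ctx, de->seq))
			break;
		ctx->head = de->next;
		if (!ctx->head)
			ctx->tail = NULL;
		assert(ctx->nr_issued < 16);
		ctx->issued[ctx->nr_issued++] = de->id;
		free(de);
	}
}

/* Queue side: no sequence check here, always append and then try to flush. */
static int model_drain_req(struct model_ctx *ctx, int id, uint32_t seq)
{
	struct model_defer_entry *de = malloc(sizeof(*de));

	if (!de)
		return -1;	/* the patch fails the request with -ENOMEM */
	de->next = NULL;
	de->seq = seq;
	de->id = id;

	if (ctx->tail)
		ctx->tail->next = de;
	else
		ctx->head = de;
	ctx->tail = de;

	model_queue_deferred(ctx);
	if (!ctx->head)
		ctx->drain_active = false;
	return 0;
}
```

In this model a request queued with nothing pending is issued by the
flush call right away, which covers the case the removed early-return
path used to handle. A request queued behind a waiting entry stays
behind it, even if its own sequence is already satisfied, because the
flush loop stops at the first entry that still has to wait.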
io_uring/io_uring.c | 34 +++++++++++++++++-----------------
1 file changed, 17 insertions(+), 17 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 72ae350f4f8b..e50c153d8edc 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -559,9 +559,9 @@ void io_req_queue_iowq(struct io_kiocb *req)
 	io_req_task_work_add(req);
 }

-static __cold noinline void io_queue_deferred(struct io_ring_ctx *ctx)
+
+static __cold noinline void __io_queue_deferred(struct io_ring_ctx *ctx)
 {
-	spin_lock(&ctx->completion_lock);
 	while (!list_empty(&ctx->defer_list)) {
 		struct io_defer_entry *de = list_first_entry(&ctx->defer_list,
 						struct io_defer_entry, list);
@@ -572,7 +572,12 @@ static __cold noinline void io_queue_deferred(struct io_ring_ctx *ctx)
 		io_req_task_queue(de->req);
 		kfree(de);
 	}
-	spin_unlock(&ctx->completion_lock);
+}
+
+static __cold noinline void io_queue_deferred(struct io_ring_ctx *ctx)
+{
+	guard(spinlock)(&ctx->completion_lock);
+	__io_queue_deferred(ctx);
 }

 void __io_commit_cqring_flush(struct io_ring_ctx *ctx)
@@ -1657,29 +1662,24 @@ static __cold void io_drain_req(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 	struct io_defer_entry *de;
-	u32 seq = io_get_sequence(req);

-	io_prep_async_link(req);
 	de = kmalloc(sizeof(*de), GFP_KERNEL_ACCOUNT);
 	if (!de) {
 		io_req_defer_failed(req, -ENOMEM);
 		return;
 	}

-	spin_lock(&ctx->completion_lock);
-	if (!req_need_defer(req, seq) && list_empty(&ctx->defer_list)) {
-		spin_unlock(&ctx->completion_lock);
-		kfree(de);
-		ctx->drain_active = false;
-		io_req_task_queue(req);
-		return;
-	}
-
+	io_prep_async_link(req);
 	trace_io_uring_defer(req);
 	de->req = req;
-	de->seq = seq;
-	list_add_tail(&de->list, &ctx->defer_list);
-	spin_unlock(&ctx->completion_lock);
+	de->seq = io_get_sequence(req);
+
+	scoped_guard(spinlock, &ctx->completion_lock) {
+		list_add_tail(&de->list, &ctx->defer_list);
+		__io_queue_deferred(ctx);
+		if (list_empty(&ctx->defer_list))
+			ctx->drain_active = false;
+	}
 }

 static bool io_assign_file(struct io_kiocb *req, const struct io_issue_def *def,
--
2.49.0
* [PATCH 5/6] io_uring/net: move CONFIG_NET guards to Makefile
2025-05-08 11:52 [PATCH 0/6] drain cleanups and extra Pavel Begunkov
` (3 preceding siblings ...)
2025-05-08 11:52 ` [PATCH 4/6] io_uring: consolidate drain seq checking Pavel Begunkov
@ 2025-05-08 11:52 ` Pavel Begunkov
2025-05-08 11:52 ` [PATCH 6/6] io_uring: add lockdep asserts to io_add_aux_cqe Pavel Begunkov
5 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2025-05-08 11:52 UTC
To: io-uring; +Cc: asml.silence
Instruct the Makefile to never compile net.c without CONFIG_NET and
kill the ifdefs in the file.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
io_uring/Makefile | 4 ++--
io_uring/net.c | 2 --
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/io_uring/Makefile b/io_uring/Makefile
index 75e0ca795685..11a739927a62 100644
--- a/io_uring/Makefile
+++ b/io_uring/Makefile
@@ -7,7 +7,7 @@ GCOV_PROFILE := y
 endif

 obj-$(CONFIG_IO_URING) += io_uring.o opdef.o kbuf.o rsrc.o notif.o \
-					tctx.o filetable.o rw.o net.o poll.o \
+					tctx.o filetable.o rw.o poll.o \
 					eventfd.o uring_cmd.o openclose.o \
 					sqpoll.o xattr.o nop.o fs.o splice.o \
 					sync.o msg_ring.o advise.o openclose.o \
@@ -19,4 +19,4 @@ obj-$(CONFIG_IO_WQ) += io-wq.o
 obj-$(CONFIG_FUTEX) += futex.o
 obj-$(CONFIG_EPOLL) += epoll.o
 obj-$(CONFIG_NET_RX_BUSY_POLL) += napi.o
-obj-$(CONFIG_NET) += cmd_net.o
+obj-$(CONFIG_NET) += net.o cmd_net.o
diff --git a/io_uring/net.c b/io_uring/net.c
index b3a643675ce8..1fbdb2bbb3f3 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -18,7 +18,6 @@
 #include "rsrc.h"
 #include "zcrx.h"

-#if defined(CONFIG_NET)
 struct io_shutdown {
 	struct file *file;
 	int how;
@@ -1836,4 +1835,3 @@ void io_netmsg_cache_free(const void *entry)
 	io_vec_free(&kmsg->vec);
 	kfree(kmsg);
 }
-#endif
--
2.49.0
* [PATCH 6/6] io_uring: add lockdep asserts to io_add_aux_cqe
2025-05-08 11:52 [PATCH 0/6] drain cleanups and extra Pavel Begunkov
` (4 preceding siblings ...)
2025-05-08 11:52 ` [PATCH 5/6] io_uring/net: move CONFIG_NET guards to Makefile Pavel Begunkov
@ 2025-05-08 11:52 ` Pavel Begunkov
5 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2025-05-08 11:52 UTC
To: io-uring; +Cc: asml.silence
io_add_aux_cqe() can only be called for rings with uring_lock-protected
completion queues; add a couple of assertions to check that.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
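A small userspace sketch of the contract the assertions encode, for
illustration only. lockdep assertions warn (when lockdep is enabled)
rather than abort, so the model counts warnings instead of calling
abort(). The model_ names, the boolean standing in for "the current
task holds uring_lock" and the warnings counter are all hypothetical;
real lockdep tracks lock ownership per task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the fields of struct io_ring_ctx used here. */
struct model_ring {
	bool uring_lock_held;	/* stand-in for "current task holds uring_lock" */
	bool lockless_cq;	/* CQ is protected by uring_lock alone */
	unsigned int cq_tail;	/* number of CQEs posted */
	unsigned int warnings;	/* how many contract checks have fired */
};

/* Like WARN_ON(): record the violation and carry on. */
static void model_warn_on(struct model_ring *ring, bool cond)
{
	if (cond)
		ring->warnings++;
}

/* Model of io_add_aux_cqe() with the two new checks in front. */
static void model_add_aux_cqe(struct model_ring *ring)
{
	assert(ring != NULL);	/* hard error, unlike the soft checks below */

	model_warn_on(ring, !ring->uring_lock_held);
	model_warn_on(ring, !ring->lockless_cq);

	ring->cq_tail++;	/* posting the CQE itself is not modelled */
}
```

A caller that holds the lock on a lockless_cq ring posts its CQE
silently. A caller that breaks either precondition still posts the CQE
in this model, but leaves a warning behind for each violated check.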
io_uring/io_uring.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e50c153d8edc..503205ced136 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -853,6 +853,9 @@ bool io_post_aux_cqe(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags
  */
 void io_add_aux_cqe(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags)
 {
+	lockdep_assert_held(&ctx->uring_lock);
+	lockdep_assert(ctx->lockless_cq);
+
 	if (!io_fill_cqe_aux(ctx, user_data, res, cflags)) {
 		spin_lock(&ctx->completion_lock);
 		io_cqring_event_overflow(ctx, user_data, res, cflags, 0, 0);
--
2.49.0