* [PATCH 1/2] io_uring: fix __tctx_task_work() ctx race
2021-02-28 22:04 [PATCH 5.12 0/2] task_work ctx refs fix + xchg cleanup Pavel Begunkov
@ 2021-02-28 22:04 ` Pavel Begunkov
2021-02-28 22:04 ` [PATCH 2/2] io_uring: replace cmpxchg in fallback with xchg Pavel Begunkov
2021-02-28 22:17 ` [PATCH 5.12 0/2] task_work ctx refs fix + xchg cleanup Jens Axboe
2 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2021-02-28 22:04 UTC
To: Jens Axboe, io-uring
There is an unlikely but possible race using a freed context. That's
because req->task_work.func() can free a request, but we won't
necessarily find a completion in submit_state.comp and so all ctx refs
may be put by the time we do mutex_lock(&ctx->uring_lock).
There are several reasons why a request can miss going through
submit_state.comp:
1) req->task_work.func() didn't complete it itself, but punted to iowq
   (e.g. reissue) and it got freed later; or, similarly, it overflowed
   and was flushed by someone else, or was submitted to IRQ completion.
2) As we don't hold the uring_lock, someone else can do
   io_submit_flush_completions() and put our ref.
3) Bugs and code obscurities, e.g. failing to propagate issue_flags
   properly.
One example is as follows:

CPU1                                  |  CPU2
=======================================================================
@req->task_work.func()                |
  -> @req overflowed,                 |
     so submit_state.comp.nr==0       |
                                      |  flush overflows, and free @req
                                      |  ctx refs == 0, free it
ctx is dead, but we do                |
  lock + flush + unlock               |
So take a ctx reference for each new ctx we see in __tctx_task_work(),
and do not release it until we have done all our flushing.
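
The pinning idea above can be modelled in userspace. This is only a
minimal sketch under stated assumptions: struct ctx, ctx_get(),
ctx_put(), task_work_func(), flush(), run_unpinned() and run_pinned()
are hypothetical stand-ins, a plain int replaces the kernel's
percpu_ref, and a "freed" flag replaces the actual free so the
use-after-free case can be observed safely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct io_ring_ctx; not the kernel type. */
struct ctx {
	int refs;     /* models ctx->refs (a percpu_ref in the kernel) */
	bool freed;   /* set when the last reference is dropped */
	int flushed;  /* number of flushes done while the ctx was alive */
};

static void ctx_get(struct ctx *c)
{
	assert(!c->freed);
	c->refs++;
}

static void ctx_put(struct ctx *c)
{
	assert(!c->freed && c->refs > 0);
	if (--c->refs == 0)
		c->freed = true;  /* the kernel would free the ctx here */
}

/* Models a callback whose request is freed elsewhere, dropping its ctx ref. */
static void task_work_func(struct ctx *c)
{
	ctx_put(c);
}

/* Models lock + flush + unlock; reports false if the ctx is already dead. */
static bool flush(struct ctx *c)
{
	if (c->freed)
		return false;  /* this would be a use-after-free */
	c->flushed++;
	return true;
}

/* Shape of the racy path: nothing keeps the ctx alive across the callback. */
static bool run_unpinned(struct ctx *c)
{
	task_work_func(c);
	return flush(c);
}

/* Shape of the fixed path: hold our own reference until flushing is done. */
static bool run_pinned(struct ctx *c)
{
	bool ok;

	ctx_get(c);
	task_work_func(c);
	ok = flush(c);
	ctx_put(c);
	return ok;
}
```

Starting from a ctx with a single reference, run_unpinned() reports an
unsafe flush because the callback dropped the last ref, while
run_pinned() flushes first and only then lets the ctx die.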
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index d0ca0b819f1c..365e75b53a78 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1822,6 +1822,9 @@ static bool __tctx_task_work(struct io_uring_task *tctx)
req = container_of(node, struct io_kiocb, io_task_work.node);
this_ctx = req->ctx;
+ if (this_ctx != ctx)
+ percpu_ref_get(&this_ctx->refs);
+
req->task_work.func(&req->task_work);
node = next;
@@ -1831,14 +1834,18 @@ static bool __tctx_task_work(struct io_uring_task *tctx)
mutex_lock(&ctx->uring_lock);
io_submit_flush_completions(&ctx->submit_state.comp, ctx);
mutex_unlock(&ctx->uring_lock);
+ percpu_ref_put(&ctx->refs);
ctx = node ? this_ctx : NULL;
}
}
- if (ctx && ctx->submit_state.comp.nr) {
- mutex_lock(&ctx->uring_lock);
- io_submit_flush_completions(&ctx->submit_state.comp, ctx);
- mutex_unlock(&ctx->uring_lock);
+ if (ctx) {
+ if (ctx->submit_state.comp.nr) {
+ mutex_lock(&ctx->uring_lock);
+ io_submit_flush_completions(&ctx->submit_state.comp, ctx);
+ mutex_unlock(&ctx->uring_lock);
+ }
+ percpu_ref_put(&ctx->refs);
}
return list.first != NULL;
--
2.24.0
* [PATCH 2/2] io_uring: replace cmpxchg in fallback with xchg
From: Pavel Begunkov @ 2021-02-28 22:04 UTC
To: Jens Axboe, io-uring
io_run_ctx_fallback() can use xchg() instead of cmpxchg(). It's simpler
and faster.
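
Why the two forms are interchangeable here: a compare-and-swap loop
whose new value is always NULL just retries until it has swapped out
whatever the head was, which is exactly what one unconditional
exchange does. Below is a userspace sketch using C11 atomics, not the
kernel primitives; struct work, push(), detach_cmpxchg() and
detach_xchg() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type standing in for struct callback_head. */
struct work {
	struct work *next;
	int id;
};

/* Lock-free push onto the list head; models how work gets queued. */
static void push(_Atomic(struct work *) *head, struct work *w)
{
	struct work *old;

	assert(w != NULL);
	old = atomic_load(head);
	do {
		w->next = old;
	} while (!atomic_compare_exchange_weak(head, &old, w));
}

/* Detach the whole list with a CAS retry loop (shape of the removed code). */
static struct work *detach_cmpxchg(_Atomic(struct work *) *head)
{
	struct work *w = atomic_load(head);

	/* On failure, w is updated to the current head and we retry. */
	while (!atomic_compare_exchange_weak(head, &w, NULL))
		;
	return w;
}

/* Detach the whole list with one unconditional swap (shape of the new code). */
static struct work *detach_xchg(_Atomic(struct work *) *head)
{
	return atomic_exchange(head, NULL);
}
```

Both detach helpers return the old head and leave the list empty; the
exchange version needs no loop and no separate read of the head.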
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 365e75b53a78..42b675939582 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8489,15 +8489,11 @@ static int io_remove_personalities(int id, void *p, void *data)
static bool io_run_ctx_fallback(struct io_ring_ctx *ctx)
{
- struct callback_head *work, *head, *next;
+ struct callback_head *work, *next;
bool executed = false;
do {
- do {
- head = NULL;
- work = READ_ONCE(ctx->exit_task_work);
- } while (cmpxchg(&ctx->exit_task_work, work, head) != work);
-
+ work = xchg(&ctx->exit_task_work, NULL);
if (!work)
break;
--
2.24.0