From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Subject: [PATCH 1/2] io_uring: fix __tctx_task_work() ctx race
Date: Sun, 28 Feb 2021 22:04:53 +0000
Message-ID: <c05ae0077d4e248256e6d6fdcd4e25a0c4f640e6.1614549667.git.asml.silence@gmail.com>
In-Reply-To: <[email protected]>
There is an unlikely but possible race using a freed context. That's
because req->task_work.func() can free a request, but we won't
necessarily find a completion in submit_state.comp and so all ctx refs
may be put by the time we do mutex_lock(&ctx->uring_lock);
There are several reasons why it can miss going through
submit_state.comp:
1) req->task_work.func() didn't complete it itself, but punted to iowq
   (e.g. reissue) and it got freed later, or a similar situation with
   it overflowing and getting flushed by someone else, or being
   submitted to IRQ completion.
2) As we don't hold the uring_lock, someone else can do
   io_submit_flush_completions() and put our ref.
3) Bugs and code obscurities, e.g. failing to propagate issue_flags
   properly.
One example is as follows:

CPU1                                  | CPU2
=======================================================================
@req->task_work.func()                |
  -> @req overflowed,                 |
     so submit_state.comp.nr == 0     |
                                      | flush overflows, and free @req
                                      | ctx refs == 0, free it
ctx is dead, but we do                |
  lock + flush + unlock               |
So take a ctx reference for each new ctx we see in __tctx_task_work(),
and do not release it until we have done all our flushing.
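
The pattern can be illustrated outside the kernel with a small
single-threaded model. This is only a sketch under stated assumptions,
not the kernel code: model_ctx, model_req, ctx_get(), ctx_put(),
run_task_work(), flush() and process_task_list() are hypothetical
names, a plain int stands in for the percpu_ref, and the "CPU2" side of
the race is folded into run_task_work(). With pin == false the loop
behaves like the old code and ends up flushing a dead context; with
pin == true it holds its own reference across the flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct io_ring_ctx. */
struct model_ctx {
	int refs;	/* models ctx->refs; every live request holds one */
	int comp_nr;	/* models ctx->submit_state.comp.nr */
	bool freed;	/* set when the last reference is dropped */
};

/* Hypothetical stand-in for struct io_kiocb. */
struct model_req {
	struct model_ctx *ctx;
	bool overflows;	/* completion bypasses submit_state.comp */
};

static void ctx_get(struct model_ctx *ctx)
{
	assert(!ctx->freed);
	ctx->refs++;
}

static void ctx_put(struct model_ctx *ctx)
{
	assert(!ctx->freed && ctx->refs > 0);
	if (--ctx->refs == 0)
		ctx->freed = true;	/* the kernel would free the ctx here */
}

/* Models req->task_work.func(). */
static void run_task_work(struct model_req *req)
{
	if (req->overflows)
		ctx_put(req->ctx);	/* "CPU2" flushed the overflow, freed @req */
	else
		req->ctx->comp_nr++;	/* completion batched in submit_state.comp */
}

/* Models lock + io_submit_flush_completions() + unlock. */
static bool flush(struct model_ctx *ctx)
{
	if (ctx->freed)
		return false;		/* a use-after-free in the real code */
	while (ctx->comp_nr > 0) {
		ctx->comp_nr--;
		ctx_put(ctx);		/* each flushed request drops its ref */
	}
	return true;
}

/*
 * Simplified model of the __tctx_task_work() loop. Returns false if any
 * flush touched an already freed context.
 */
bool process_task_list(struct model_req *reqs, size_t n, bool pin)
{
	struct model_ctx *ctx = NULL;
	bool ok = true;

	for (size_t i = 0; i < n; i++) {
		struct model_ctx *this_ctx = reqs[i].ctx;

		if (pin && this_ctx != ctx)
			ctx_get(this_ctx);	/* pin each newly seen ctx */
		run_task_work(&reqs[i]);

		if (!ctx) {
			ctx = this_ctx;
		} else if (ctx != this_ctx) {
			ok = flush(ctx) && ok;
			if (pin)
				ctx_put(ctx);	/* unpin only after flushing */
			ctx = this_ctx;
		}
	}
	if (ctx) {
		ok = flush(ctx) && ok;
		if (pin)
			ctx_put(ctx);
	}
	return ok;
}
```

Calling process_task_list() on a single overflowing request whose
context starts with refs == 1 reproduces the diagram above when pin is
false, and completes cleanly when pin is true.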
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index d0ca0b819f1c..365e75b53a78 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1822,6 +1822,9 @@ static bool __tctx_task_work(struct io_uring_task *tctx)
 		req = container_of(node, struct io_kiocb, io_task_work.node);
 		this_ctx = req->ctx;
+		if (this_ctx != ctx)
+			percpu_ref_get(&this_ctx->refs);
+
 		req->task_work.func(&req->task_work);
 		node = next;
@@ -1831,14 +1834,18 @@ static bool __tctx_task_work(struct io_uring_task *tctx)
 			mutex_lock(&ctx->uring_lock);
 			io_submit_flush_completions(&ctx->submit_state.comp, ctx);
 			mutex_unlock(&ctx->uring_lock);
+			percpu_ref_put(&ctx->refs);
 			ctx = node ? this_ctx : NULL;
 		}
 	}
-	if (ctx && ctx->submit_state.comp.nr) {
-		mutex_lock(&ctx->uring_lock);
-		io_submit_flush_completions(&ctx->submit_state.comp, ctx);
-		mutex_unlock(&ctx->uring_lock);
+	if (ctx) {
+		if (ctx->submit_state.comp.nr) {
+			mutex_lock(&ctx->uring_lock);
+			io_submit_flush_completions(&ctx->submit_state.comp, ctx);
+			mutex_unlock(&ctx->uring_lock);
+		}
+		percpu_ref_put(&ctx->refs);
 	}
 	return list.first != NULL;
--
2.24.0
Thread overview: 4+ messages
2021-02-28 22:04 [PATCH 5.12 0/2] task_work ctx refs fix + xchg cleanup Pavel Begunkov
2021-02-28 22:04 ` Pavel Begunkov [this message]
2021-02-28 22:04 ` [PATCH 2/2] io_uring: replace cmpxchg in fallback with xchg Pavel Begunkov
2021-02-28 22:17 ` [PATCH 5.12 0/2] task_work ctx refs fix + xchg cleanup Jens Axboe