From: Jens Axboe <[email protected]>
To: Oleg Nesterov <[email protected]>
Cc: [email protected], [email protected],
Peter Zijlstra <[email protected]>
Subject: Re: [PATCH 4/4] io_uring: flush task work before waiting for ring exit
Date: Thu, 9 Apr 2020 17:29:17 -0700 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 4/9/20 11:50 AM, Oleg Nesterov wrote:
> On 04/08, Jens Axboe wrote:
>>
>> So the question remains, we basically have this:
>>
>> A B
>> task_work_run(tsk)
>> task_work_add(tsk, io_poll_task_func())
>> process cbs
>> wait_for_completion()
>>
>> with the last wait needing to flush the work added on the B side, since
>> that isn't part of the initial list.
>
> I don't understand you, even remotely :/
>
> maybe you can write some pseudo-code ?
Sorry, I guess my explanation skills aren't stellar, or perhaps they
assume too much prior knowledge about this.
I just wasted about a day on debugging this because I had Peter's patch
applied, and I should have reviewed that more carefully. So a lot of the
hangs were due to just missing the running of some of the work.
> who does wait_for_completion(), a callback? or this "tsk" after it does
> task_work_run() ? Who does complete() ? How can this wait_for_completion()
> help to flush the work added on the B side? And why do you need to do
> something special to flush that work?
I'll try to explain this. Each io_uring operation has a request
associated with it. Each request grabs a reference to the ctx, and the
ctx can't exit before we reach zero references (obviously). When we
enter ctx teardown, we wait for us to hit zero references. Each ctx is
basically just a file descriptor, which means that we most often end up
waiting for zero references off the fput() path.
Some requests can spawn task_works work items. If we have ordering
issues on the task_works list, we can have the fput() ordered before
items that need to complete. These items are what ultimately put the
request, so you end up in a situation where you block waiting for
requests to complete, but these requests can't complete until the the
current work has completed. This is the deadlock.
Once I got rid of Peter's patch, I solved this by just making the
wait-and-free operation go async. This allows any dependent work to run
and complete just fine. Here's the patch:
https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.7&id=2baf397719af3a8121fcaba964470950356a4a7a
and I could perhaps make this smarter by checking if current->task_works
!= NULL, but I don't think there's much point to that. The important
part is that we complete the fput() task_work without blocking, so the
remaining work gets a chance to run.
Hope this helps! As mentioned in the commit message, we could have a
separate list of task_works for io_uring, which is what I originally
proposed. But I can also work around it like in the patch referenced
above.
--
Jens Axboe
next prev parent reply other threads:[~2020-04-10 0:29 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-07 16:02 [PATCHSET v2] io_uring and task_work interactions Jens Axboe
2020-04-07 16:02 ` [PATCH 1/4] task_work: add task_work_pending() helper Jens Axboe
2020-04-07 17:52 ` Jann Horn
2020-04-07 16:02 ` [PATCH 2/4] task_work: kill current->task_works checking in callers Jens Axboe
2020-04-07 16:02 ` [PATCH 3/4] task_work: make exit_work externally visible Jens Axboe
2020-04-07 16:02 ` [PATCH 4/4] io_uring: flush task work before waiting for ring exit Jens Axboe
2020-04-07 16:24 ` Oleg Nesterov
2020-04-07 16:38 ` Oleg Nesterov
2020-04-07 20:30 ` Jens Axboe
2020-04-07 20:39 ` Jens Axboe
2020-04-08 18:40 ` Oleg Nesterov
2020-04-08 18:48 ` Jens Axboe
2020-04-08 19:06 ` Jens Axboe
2020-04-08 20:17 ` Oleg Nesterov
2020-04-08 20:25 ` Jens Axboe
2020-04-08 21:19 ` Jens Axboe
2020-04-09 18:50 ` Oleg Nesterov
2020-04-10 0:29 ` Jens Axboe [this message]
-- strict thread matches above, loose matches on Subject: below --
2020-04-06 19:48 [PATCHSET 0/4] io_uring and task_work interactions Jens Axboe
2020-04-06 19:48 ` [PATCH 4/4] io_uring: flush task work before waiting for ring exit Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox