From: Jens Axboe <[email protected]>
To: Nadav Amit <[email protected]>
Cc: [email protected]
Subject: Re: Race between io_wqe_worker() and io_wqe_wake_worker()
Date: Tue, 3 Aug 2021 07:22:11 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 8/2/21 7:05 PM, Nadav Amit wrote:
> Hello Jens,
>
> I encountered an issue, which appears to be a race between
> io_wqe_worker() and io_wqe_wake_worker(). I am not sure how to address
> this issue and whether I am missing something, since this seems to
> occur in a common scenario. Your feedback (or fix ;-)) would be
> appreciated.
>
> On 5.13, I run a workload that issues multiple async read operations
> that should run concurrently. Some reads may block for an unbounded
> time (e.g., a read from a pipe that is never written to). The problem
> is that occasionally another read operation that should complete gets
> stuck. My understanding, based on debugging and reading the code, is
> that the following race (or a similar one) occurs:
>
>
> cpu0                                    cpu1
> ----                                    ----
>                                         io_wqe_worker()
>                                          schedule_timeout()
>                                          // timed out
> io_wqe_enqueue()
>  io_wqe_wake_worker()
>   // work_flags & IO_WQ_WORK_CONCURRENT
>   io_wqe_activate_free_worker()
>                                          io_worker_exit()
>
>
> Basically, io_wqe_wake_worker() can find a worker, but this worker is
> about to exit and is not going to process further work. Once the
> worker exits, the concurrency level decreases and async work might be
> blocked behind other work. I had a look at 5.14, but did not see
> anything that might address this issue.
>
> Am I missing something?
>
> If not, all my ideas for a solution are either complicated (track
> the required concurrency level) or relaxed (spawn another worker on
> io_worker_exit if the work_list of unbounded work is not empty).
>
> As said, feedback would be appreciated.

You are right, there's definitely a race here: the freelist check can
find a worker that is already exiting. Let me mull over this a bit,
I'll post something for you to try later today.
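
To make the window concrete, here is a minimal single-threaded model of
the interleaving described in the quoted report. It is only a sketch and
not io-wq code: every name in it (toy_wqe, toy_enqueue, toy_replay and
so on) is hypothetical, and the respawn_on_exit switch models the
"relaxed" idea from the quoted mail (spawn a replacement on exit if work
is still queued), not a fix that has been posted.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the report; NOT kernel code. All names are hypothetical. */
struct toy_wqe {
	int free_workers;   /* idle workers still visible on the free list */
	int exiting;        /* idle workers that timed out and will exit */
	int pending;        /* queued work items nobody has picked up */
	int spawned;        /* workers created to handle queued work */
};

/* Worker side: the idle timeout fires. The worker has decided to exit
 * but has not yet been removed from the free list. */
static void toy_worker_timed_out(struct toy_wqe *wqe)
{
	wqe->exiting++;
}

/* Waker side: queue one item, then wake a listed free worker if there
 * is one; otherwise create a new worker for the item. */
static void toy_enqueue(struct toy_wqe *wqe)
{
	wqe->pending++;
	if (wqe->free_workers > 0) {
		/* A listed worker is "woken"; it only runs the item if it
		 * has not already decided to exit. */
		if (wqe->free_workers > wqe->exiting)
			wqe->pending--;
		return;
	}
	wqe->spawned++;         /* no free worker: create one for the item */
	wqe->pending--;
}

/* Worker side: the exit completes. With respawn_on_exit set, the
 * exiting worker creates a replacement if work is still queued. */
static void toy_worker_exit(struct toy_wqe *wqe, bool respawn_on_exit)
{
	assert(wqe->free_workers > 0 && wqe->exiting > 0);
	wqe->free_workers--;
	wqe->exiting--;
	if (respawn_on_exit && wqe->pending > 0) {
		wqe->spawned++;
		wqe->pending--;
	}
}

/* Replays the sequence and returns the number of items left without a
 * worker to run them. */
int toy_replay(bool worker_times_out, bool respawn_on_exit)
{
	struct toy_wqe wqe = { .free_workers = 1 };

	if (worker_times_out)
		toy_worker_timed_out(&wqe);
	toy_enqueue(&wqe);
	if (worker_times_out)
		toy_worker_exit(&wqe, respawn_on_exit);
	return wqe.pending;
}
```

In this model toy_replay(true, false) leaves one item stranded, while
toy_replay(false, false) and toy_replay(true, true) leave none. The real
code runs the two sides concurrently, so this only illustrates the
ordering, not the locking needed to close the window.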
--
Jens Axboe