From: Nadav Amit <[email protected]>
To: Jens Axboe <[email protected]>
Cc: [email protected], Hao Xu <[email protected]>
Subject: Re: Race between io_wqe_worker() and io_wqe_wake_worker()
Date: Tue, 3 Aug 2021 11:04:44 -0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
> On Aug 3, 2021, at 7:37 AM, Jens Axboe <[email protected]> wrote:
>
> On 8/3/21 7:22 AM, Jens Axboe wrote:
>> On 8/2/21 7:05 PM, Nadav Amit wrote:
>>> Hello Jens,
>>>
>>> I encountered an issue, which appears to be a race between
>>> io_wqe_worker() and io_wqe_wake_worker(). I am not sure how to address
>>> this issue and whether I am missing something, since this seems to
>>> occur in a common scenario. Your feedback (or fix ;-)) would be
>>> appreciated.
>>>
>>> I run a workload on 5.13 that issues multiple async read operations
>>> that should run concurrently. Some read operations may not complete
>>> for an unbounded time (e.g., a read from a pipe that is never written
>>> to). The problem is that occasionally another read operation that
>>> should complete gets stuck. My understanding, based on debugging and
>>> the code, is that the following race (or similar) occurs:
>>>
>>>
>>> cpu0                                     cpu1
>>> ----                                     ----
>>>                                          io_wqe_worker()
>>>                                           schedule_timeout()
>>>                                           // timed out
>>> io_wqe_enqueue()
>>>  io_wqe_wake_worker()
>>>   // work_flags & IO_WQ_WORK_CONCURRENT
>>>   io_wqe_activate_free_worker()
>>>                                           io_worker_exit()
>>>
>>>
>>> Basically, io_wqe_wake_worker() can find a worker, but this worker is
>>> about to exit and is not going to process further work. Once the
>>> worker exits, the concurrency level decreases and async work might be
>>> blocked by another work. I had a look at 5.14, but did not see
>>> anything that might address this issue.
>>>
>>> Am I missing something?
>>>
>>> If not, all my ideas for a solution are either complicated (track
>>> required concurrency-level) or relaxed (spawn another worker on
>>> io_worker_exit if work_list of unbounded work is not empty).
>>>
>>> As said, feedback would be appreciated.
>>
>> You are right that there's definitely a race here between checking the
>> freelist and finding a worker, but that worker is already exiting. Let
>> me mull over this a bit, I'll post something for you to try later today.
>
> Can you try something like this? Just consider it a first tester, need
> to spend a bit more time on it to ensure we fully close the gap.

Thanks for the quick response.

I tried your version. It works better, but my workload still gets stuck
occasionally (less frequently though). It is pretty obvious that the
version you sent still has a race, so I didn’t put the effort into
debugging it.

I should note that I have an ugly hack that does make my test pass. I
include it, although it is obviously not the right solution.

diff --git a/fs/io-wq.c b/fs/io-wq.c
index b3e8624a37d0..a8792809e416 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -165,6 +165,17 @@ static void io_worker_ref_put(struct io_wq *wq)
 		complete(&wq->worker_done);
 }
 
+static void io_queue_worker_create(struct io_wqe *wqe, struct io_wqe_acct *acct);
+
+static inline bool io_wqe_run_queue(struct io_wqe *wqe)
+	__must_hold(wqe->lock)
+{
+	if (!wq_list_empty(&wqe->work_list) &&
+	    !(wqe->flags & IO_WQE_FLAG_STALLED))
+		return true;
+	return false;
+}
+
 static void io_worker_exit(struct io_worker *worker)
 {
 	struct io_wqe *wqe = worker->wqe;
@@ -192,17 +203,17 @@ static void io_worker_exit(struct io_worker *worker)
 	raw_spin_unlock_irq(&wqe->lock);
 
 	kfree_rcu(worker, rcu);
+	raw_spin_lock_irq(&wqe->lock);
+
+	if (!(flags & IO_WORKER_F_BOUND) && io_wqe_run_queue(wqe)) {
+		atomic_inc(&acct->nr_running);
+		atomic_inc(&wqe->wq->worker_refs);
+		io_queue_worker_create(wqe, acct);
+	}
 	io_worker_ref_put(wqe->wq);
-	do_exit(0);
-}
 
-static inline bool io_wqe_run_queue(struct io_wqe *wqe)
-	__must_hold(wqe->lock)
-{
-	if (!wq_list_empty(&wqe->work_list) &&
-	    !(wqe->flags & IO_WQE_FLAG_STALLED))
-		return true;
-	return false;
+	raw_spin_unlock_irq(&wqe->lock);
+	do_exit(0);
 }
 
 /*
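
For readers following the thread, the interleaving described in the quoted
report, and the effect of the hack in the diff, can be illustrated with a
toy single-threaded model. This is only a sketch under stated assumptions:
it is not kernel code, every name prefixed with toy_ is hypothetical, and
it collapses the free list to a single idle worker. It replays the three
steps in order (idle worker times out, enqueue finds that worker on the
free list, worker exits) and counts the work left with nobody to run it.

```c
#include <stdbool.h>

/* Toy model only; the toy_* names are hypothetical, not io-wq symbols. */
struct toy_wqe {
	int nr_workers;       /* workers that still exist */
	bool free_worker;     /* one idle worker sits on the free list */
	bool free_is_exiting; /* ...but its idle timeout already expired */
	int nr_pending;       /* queued work that no worker has picked up */
};

/* cpu0 side: queue one item, preferring to wake an idle worker. */
static void toy_enqueue(struct toy_wqe *wqe)
{
	wqe->nr_pending++;
	if (wqe->free_worker) {
		/* A worker that already decided to exit ignores the wakeup. */
		if (!wqe->free_is_exiting) {
			wqe->free_worker = false;
			wqe->nr_pending--;
		}
		return; /* a free worker was found, so none is created */
	}
	/* No free worker: create one, and it takes the item. */
	wqe->nr_workers++;
	wqe->nr_pending--;
}

/* cpu1 side: the timed-out worker leaves; optionally respawn for pending work. */
static void toy_worker_exit(struct toy_wqe *wqe, bool respawn_if_pending)
{
	wqe->free_worker = false;
	wqe->free_is_exiting = false;
	wqe->nr_workers--;
	if (respawn_if_pending && wqe->nr_pending > 0) {
		/* Mirrors the idea of the hack: replace the exiting worker. */
		wqe->nr_workers++;
		wqe->nr_pending--;
	}
}

/* Replay the interleaving; returns the number of work items left stuck. */
static int toy_run(bool respawn_if_pending)
{
	/* Two workers: one blocked forever (e.g. on a pipe read), one idle. */
	struct toy_wqe wqe = { .nr_workers = 2, .free_worker = true };

	wqe.free_is_exiting = true;                /* idle worker timed out */
	toy_enqueue(&wqe);                         /* finds the free worker */
	toy_worker_exit(&wqe, respawn_if_pending); /* worker exits anyway */
	return wqe.nr_pending;
}
```

In this model toy_run(false) leaves one item stuck, while toy_run(true)
leaves none. The model says nothing about the remaining windows in the
real code, which depend on locking and ordering it does not represent.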