From: Jens Axboe <[email protected]>
To: Hao Xu <[email protected]>
Cc: io-uring <[email protected]>,
Pavel Begunkov <[email protected]>,
Joseph Qi <[email protected]>
Subject: Re: [PATCH 2/3] io-wq: fix no lock protection of acct->nr_worker
Date: Sat, 7 Aug 2021 07:51:31 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 8/7/21 3:56 AM, Hao Xu wrote:
> On 2021/8/6 10:27 PM, Jens Axboe wrote:
>> On Thu, Aug 5, 2021 at 4:05 AM Hao Xu <[email protected]> wrote:
>>>
>>> There is an access to acct->nr_worker without lock protection. Consider
>>> this case: two callers call io_wqe_wake_worker(), one from the original
>>> context and the other from an io-worker (by calling
>>> io_wqe_enqueue(wqe, linked)), in parallel on two CPUs. This may cause
>>> nr_worker to become larger than max_worker.
>>> Let's fix it by taking the lock around it, and let's do nr_workers++ before
>>> create_io_worker. There may be an edge case where the first caller fails
>>> to create an io-worker, but the second caller doesn't know that and so
>>> also gives up on creating an io-worker:
>>>
>>> say nr_worker = max_worker - 1
>>>         cpu 0                           cpu 1
>>>   io_wqe_wake_worker()            io_wqe_wake_worker()
>>>     nr_worker < max_worker
>>>     nr_worker++
>>>     create_io_worker()              nr_worker == max_worker
>>>       failed                        return
>>>     return
>>>
>>> But the chance of this case is very slim.
>>>
>>> Fixes: 685fe7feedb9 ("io-wq: eliminate the need for a manager thread")
>>> Signed-off-by: Hao Xu <[email protected]>
>>> ---
>>> fs/io-wq.c | 17 ++++++++++++-----
>>> 1 file changed, 12 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>>> index cd4fd4d6268f..88d0ba7be1fb 100644
>>> --- a/fs/io-wq.c
>>> +++ b/fs/io-wq.c
>>> @@ -247,9 +247,14 @@ static void io_wqe_wake_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
>>>  	ret = io_wqe_activate_free_worker(wqe);
>>>  	rcu_read_unlock();
>>>
>>> -	if (!ret && acct->nr_workers < acct->max_workers) {
>>> -		atomic_inc(&acct->nr_running);
>>> -		atomic_inc(&wqe->wq->worker_refs);
>>> +	if (!ret) {
>>> +		raw_spin_lock_irq(&wqe->lock);
>>> +		if (acct->nr_workers < acct->max_workers) {
>>> +			atomic_inc(&acct->nr_running);
>>> +			atomic_inc(&wqe->wq->worker_refs);
>>> +			acct->nr_workers++;
>>> +		}
>>> +		raw_spin_unlock_irq(&wqe->lock);
>>>  		create_io_worker(wqe->wq, wqe, acct->index);
>>>  	}
>>>  }
>>
>> There's a pretty grave bug in this patch, in that you now call
>> create_io_worker() unconditionally. This causes obvious problems with
>> misaccounting, and stalls that hit the idle timeout...
>>
> This is surely a silly mistake, I'll check this patch and the 3/3 again.
Please do - and please always run the full set of tests before sending
out changes like this, you would have seen the slower runs and/or
timeouts from the regression suite. I ended up wasting time on this
thinking it was a change I made that broke it, before then debugging
this one.
--
Jens Axboe
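
For readers following the thread, here is a minimal userspace sketch of the
point raised above: the worker should only be created when the locked check
actually reserved a slot. This is an illustration under stated assumptions,
not the kernel code and not necessarily the fix that was eventually applied.
The names acct_model, create_worker_stub, wake_worker_model and run_wakers are
hypothetical; a pthread mutex stands in for wqe->lock, and the nr_running and
worker_refs counters are left out.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct io_wqe_acct; not the kernel type. */
struct acct_model {
	pthread_mutex_t lock;		/* plays the role of wqe->lock */
	unsigned int nr_workers;	/* protected by lock */
	unsigned int max_workers;
	atomic_uint nr_created;		/* how often the create stub ran */
};

static void acct_model_init(struct acct_model *acct, unsigned int max_workers)
{
	pthread_mutex_init(&acct->lock, NULL);
	acct->nr_workers = 0;
	acct->max_workers = max_workers;
	atomic_init(&acct->nr_created, 0);
}

/* Stub standing in for create_io_worker(); it only counts invocations. */
static void create_worker_stub(struct acct_model *acct)
{
	atomic_fetch_add(&acct->nr_created, 1);
}

/*
 * Reserve a worker slot under the lock, then create the worker only if
 * the reservation succeeded. Returns true if a worker was created.
 */
static bool wake_worker_model(struct acct_model *acct, bool activated_free_worker)
{
	bool do_create = false;

	if (activated_free_worker)
		return false;

	pthread_mutex_lock(&acct->lock);
	if (acct->nr_workers < acct->max_workers) {
		acct->nr_workers++;
		do_create = true;
	}
	pthread_mutex_unlock(&acct->lock);

	/* The create call is guarded by the result of the locked check. */
	if (do_create)
		create_worker_stub(acct);
	return do_create;
}

static void *waker_thread(void *arg)
{
	wake_worker_model(arg, false);
	return NULL;
}

/* Run up to 64 concurrent wakers; returns 0 on success, -1 on error. */
static int run_wakers(struct acct_model *acct, int nthreads)
{
	pthread_t tids[64];
	int i, started = 0, ret = 0;

	if (nthreads < 0 || nthreads > 64)
		return -1;
	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, waker_thread, acct) != 0) {
			ret = -1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	return ret;
}
```

With max_workers set to 4 and 16 concurrent wakers, the model ends with
nr_workers and nr_created both at 4. Dropping the do_create guard would call
the stub 16 times while nr_workers stays at 4, which is the kind of
misaccounting described above.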
Thread overview: 13+ messages
2021-08-05 10:05 [PATCH 0/3] code clean and nr_worker fixes Hao Xu
2021-08-05 10:05 ` [PATCH 1/3] io-wq: clean code of task state setting Hao Xu
2021-08-05 14:23 ` Jens Axboe
2021-08-05 17:37 ` Hao Xu
2021-08-05 10:05 ` [PATCH 2/3] io-wq: fix no lock protection of acct->nr_worker Hao Xu
2021-08-06 14:27 ` Jens Axboe
2021-08-07 9:56 ` Hao Xu
2021-08-07 13:51 ` Jens Axboe [this message]
2021-08-09 20:19 ` Olivier Langlois
2021-08-09 20:34 ` Pavel Begunkov
2021-08-09 20:35 ` Jens Axboe
2021-08-05 10:05 ` [PATCH 3/3] io-wq: fix lack of acct->nr_workers < acct->max_workers judgement Hao Xu
2021-08-05 14:58 ` [PATCH 0/3] code clean and nr_worker fixes Jens Axboe