public inbox for [email protected]
 help / color / mirror / Atom feed
From: Linus Torvalds <[email protected]>
To: Jens Axboe <[email protected]>
Cc: io-uring <[email protected]>
Subject: Re: [GIT PULL] io_uring fix for 5.16-rc6
Date: Fri, 17 Dec 2021 11:45:50 -0800	[thread overview]
Message-ID: <CAHk-=wgFYF+W7QZ1KW3u-uFU=rC0jbyUFZBzCVX1-SH9-qe16w@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>

On Fri, Dec 17, 2021 at 9:00 AM Jens Axboe <[email protected]> wrote:
>
> Just a single fix, fixing an issue with the worker creation change that
> was merged last week.

Hmm. I've pulled, but looking at the result, this is a classic no-no.

You can't just randomly drop and re-take a lock and sat it's "safe".

Because I don't think it's necessarily safe at all.

When you drop the wqe->lock in the middle of io_wqe_dec_running to
create a new worker, it means - for example - that "io_worker_exit()"
can now run immediately on the new worker as far as I can tell.

So one io_worker_exit() m,ay literally race with another one, where
both are inside that io_wqe_dec_running() at the same time. And then
they both end up doing

        worker->flags = 0;
        current->flags &= ~PF_IO_WORKER;

afterwards in the caller, and not necessarily in the original order.
And then they'll both possible do

        kfree_rcu(worker, rcu);

which sounds like a disaster.

Maybe this is all safe and things like the above cannot happen, but it
sure is *not* obviously so. Any time you release a lock in the middle
of holding it, basically *everything* you do afterwards when you
re-take it is suspect.

Don't perpetuate this broken pattern. Because even if it happens to be
safe in this situation, it's _alweays_ broken garbage unless you have
a big and exhaustive comment talking about why re-taking it suddenly
makes everything that follows ok.

The way to do this properly is to either

 (a) make the code you ran under the lock ok to run under the lock

 (b) make the locked region have a *return value* that the code then
uses to decide what to do after it has actually released the lock

But the whole "release and re-take" pattern is broken, broken, broken.

As mentioned, I've pulled this, but I seriously considered unpulling.
Because I think that fix is wrong.

                  Linus

  reply	other threads:[~2021-12-17 19:46 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-17 17:00 [GIT PULL] io_uring fix for 5.16-rc6 Jens Axboe
2021-12-17 19:45 ` Linus Torvalds [this message]
2021-12-17 20:11   ` Jens Axboe
2021-12-17 20:48     ` Jens Axboe
2021-12-17 21:43 ` pr-tracker-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wgFYF+W7QZ1KW3u-uFU=rC0jbyUFZBzCVX1-SH9-qe16w@mail.gmail.com' \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox