Re: [RFC] .flush and io_uring_cancel_files

public inbox for [email protected]
 help / color / mirror / Atom feed

From: Pavel Begunkov <[email protected]>
To: Jann Horn <[email protected]>
Cc: io-uring <[email protected]>, Jens Axboe <[email protected]>
Subject: Re: [RFC] .flush and io_uring_cancel_files
Date: Fri, 29 May 2020 11:09:18 +0300	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAG48ez0-2jcGk3qTqQqrDr+j1UWv4K4wF6rm0xkifVtkFz76Wg@mail.gmail.com>

On 27/05/2020 21:04, Jann Horn wrote:
> On Wed, May 27, 2020 at 12:14 PM Pavel Begunkov <[email protected]> wrote:
>> On 27/05/2020 01:04, Jann Horn wrote:
>>> On Tue, May 26, 2020 at 8:11 PM Pavel Begunkov <[email protected]> wrote:
>>>> It looks like taking ->uring_lock should work like kind of grace
>>>> period for struct files_struct and io_uring_flush(), and that would
>>>> solve the race with "fcheck(ctx->ring_fd) == ctx->ring_file".
>>>>
>>>> Can you take a look? If you like it, I'll send a proper patch
>>>> and a bunch of cleanups on top.
>>>>
>>>>
>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>> index a3dbd5f40391..012af200dc72 100644
>>>> --- a/fs/io_uring.c
>>>> +++ b/fs/io_uring.c
>>>> @@ -5557,12 +5557,11 @@ static int io_grab_files(struct io_kiocb *req)
>>>>          * the fd has changed since we started down this path, and disallow
>>>>          * this operation if it has.
>>>>          */
>>>> -       if (fcheck(ctx->ring_fd) == ctx->ring_file) {
>>>> -               list_add(&req->inflight_entry, &ctx->inflight_list);
>>>> -               req->flags |= REQ_F_INFLIGHT;
>>>> -               req->work.files = current->files;
>>>> -               ret = 0;
>>>> -       }
>>>> +       list_add(&req->inflight_entry, &ctx->inflight_list);
>>>> +       req->flags |= REQ_F_INFLIGHT;
>>>> +       req->work.files = current->files;
>>>> +       ret = 0;
>>>> +
>>>>         spin_unlock_irq(&ctx->inflight_lock);
>>>>         rcu_read_unlock();
>>>>
>>>> @@ -7479,6 +7478,10 @@ static int io_uring_release(struct inode *inode, struct
>>>> file *file)
>>>>  static void io_uring_cancel_files(struct io_ring_ctx *ctx,
>>>>                                   struct files_struct *files)
>>>>  {
>>>> +       /* wait all submitters that can race for @files */
>>>> +       mutex_lock(&ctx->uring_lock);
>>>> +       mutex_unlock(&ctx->uring_lock);
>>>> +
>>>>         while (!list_empty_careful(&ctx->inflight_list)) {
>>>>                 struct io_kiocb *cancel_req = NULL, *req;
>>>>                 DEFINE_WAIT(wait);
>>>
>>> First off: You're removing a check in io_grab_files() without changing
>>> the comment that describes the check; and the new comment you're
>>> adding in io_uring_cancel_files() is IMO too short to be useful.
>>
>> Obviously, it was stripped down to show the idea, nobody is talking about
>> commiting it as is. I hoped Jens remembers it well enough to understand.
>> Let me describe it in more details then:
>>
>>>
>>> I'm trying to figure out how your change is supposed to work, and I
>>> don't get it. If a submitter is just past fdget() (at which point no
>>> locks are held), the ->flush() caller can instantly take and drop the
>>> ->uring_lock, and then later the rest of the submission path will grab
>>> an unprotected pointer to the files_struct. Am I missing something?
>>
>> old = tsk->files;
>> task_lock(tsk);
>> tsk->files = files;
>> task_unlock(tsk);
>> put_files_struct(old); (i.e. ->flush(old))
>>
>> It's from reset_files_struct(), and I presume the whole idea of
>> io_uring->flush() is to protect against racing for similarly going away @old
>> files. I.e. ensuring of not having io_uring requests holding @old files.
> 
> Kind of. We use the ->flush() handler to be notified when the
> files_struct goes away, so that instead of holding a reference to the
> files_struct (which would cause a reference loop), we can clean up our
> references when it goes away.
> 
>> The only place, where current->files are accessed and copied by io_uring, is
>> io_grab_files(), which is called in the submission path. And the whole
>> submission path is done under @uring_mtx.
> 
> No it isn't. We do fdget(fd) at the start of the io_uring_enter
> syscall, and at that point we obviously can't hold the uring_mtx yet.

__directly__ accessing ->files, or hand-coded. fdget() by itself shouldn't be a
problem.

> 
>> For your case, the submitter will take @uring_mtx only after this lock/unlock
>> happened, so it won't see old files (happens-before by locking mutex).
> 
> No, it will see the old files. The concurrent operation we're worried

From what you wrote below, it will see the old files just because nobody would
try to replace them. Is that it?

> about is not that the files_struct goes away somehow (that can't
> happen); what we want to guard against is a concurrent close() or
> dup2() or so removing the uring fd from the files_struct, because if
> someone calls close() before we stash a pointer to current->files,
> that pointer isn't protected anymore.

Let me guess, you mean maintenance of ->files itself, such as expand_files()?

> 
> 
>> The thing I don't know is why current->files is originally accessed without
>> protection in io_grab_files(), but presumably rcu_read_lock() there is for that
>> reason.
> 
> No, it's because current->files can never change under you; pretty
> much the only places where current->files can change are unshare() and
> execve().

I see, good to know


-- 
Pavel Begunkov

     prev parent reply	other threads:[~2020-05-29  8:10 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-26 18:09 [RFC] .flush and io_uring_cancel_files Pavel Begunkov
2020-05-26 21:12 ` Jens Axboe
2020-05-26 22:04 ` Jann Horn
2020-05-27 10:13   ` Pavel Begunkov
2020-05-27 18:04     ` Jann Horn
2020-05-29  8:09       ` Pavel Begunkov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox