From: Linus Torvalds <[email protected]>
To: Al Viro <[email protected]>
Cc: Christian Brauner <[email protected]>,
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected],
[email protected]
Subject: Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes
Date: Sun, 5 May 2024 13:53:48 -0700 [thread overview]
Message-ID: <CAHk-=whFg8-WyMbVUGW5c0baurGzqmRtzFLoU-gxtRXq2nVZ+w@mail.gmail.com> (raw)
In-Reply-To: <20240505203052.GJ2118490@ZenIV>
On Sun, 5 May 2024 at 13:30, Al Viro <[email protected]> wrote:
>
> 0. special-cased ->f_count rule for ->poll() is a wart and it's
> better to get rid of it.
>
> 1. fs/eventpoll.c is a steaming pile of shit and I'd be glad to see
> git rm taken to it. Short of that, by all means, let's grab reference
> in there around the call of vfs_poll() (see (0)).
Agreed on 0/1.
> 2. having ->poll() instances grab extra references to file passed
> to them is not something that should be encouraged; there's a plenty
> of potential problems, and "caller has it pinned, so we are fine with
> grabbing extra refs" is nowhere near enough to eliminate those.
So it's not clear why you hate it so much, since those extra
references are totally normal in all the other VFS paths.
I mean, they are perhaps not the *common* case, but we have a lot of
random get_file() calls sprinkled around in various places when you
end up passing a file descriptor off to some asynchronous operation
thing.
Yeah, I think most of them tend to be special operations (eg the tty
TIOCCONS ioctl to redirect the console), but it's not like vfs_ioctl()
is *that* different from vfs_poll. Different operation, not somehow
"one is more special than the other".
cachefiles and backing-file does it for regular IO, and drop it at IO
completion - not that different from what dma-buf does. It's in
->read_iter() rather than ->poll(), but again: different operations,
but not "one of them is somehow fundamentally different".
> 3. dma-buf uses of get_file() are probably safe (epoll shite aside),
> but they do look fishy. That has nothing to do with epoll.
Now, what dma-buf basically seems to do is to avoid ref-counting its
own fundamental data structure, and replaces that by refcounting the
'struct file' that *points* to it instead.
And it is a bit odd, but it actually makes some amount of sense,
because then what it passes around is that file pointer (and it allows
passing it around from user space *as* that file).
And honestly, if you look at why it then needs to add its refcount to
it all, it actually makes sense. dma-bufs have this notion of
"fences" that are basically completion points for the asynchronous
DMA. Doing a "poll()" operation will add a note to the fence to get
that wakeup when it's done.
And yes, logically it takes a ref to the "struct dma_buf", but because
of how the lifetime of the dma_buf is associated with the lifetime of
the 'struct file', that then turns into taking a ref on the file.
Unusual? Yes. But not illogical. Not obviously broken. Tying the
lifetime of the dma_buf to the lifetime of a file that is passed along
makes _sense_ for that use.
I'm sure dma-bufs could add another level of refcounting on the
'struct dma_buf' itself, and not make it be 1:1 with the file, but
it's not clear to me what the advantage would really be, or why it
would be wrong to re-use a refcount that is already there.
Linus
next prev parent reply other threads:[~2024-05-05 20:54 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-08 8:26 [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove syzbot
2024-04-15 14:31 ` Jens Axboe
2024-04-15 14:57 ` Pavel Begunkov
2024-05-03 11:54 ` Bui Quang Minh
2024-05-03 18:26 ` get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove) Kees Cook
2024-05-03 18:49 ` Jens Axboe
2024-05-03 19:22 ` Kees Cook
2024-05-03 19:35 ` Jens Axboe
2024-05-03 19:59 ` Kees Cook
2024-05-03 20:28 ` Kees Cook
2024-05-03 21:11 ` Al Viro
2024-05-03 21:24 ` Linus Torvalds
2024-05-03 21:30 ` Al Viro
2024-05-06 17:46 ` Stefan Metzmacher
2024-05-06 18:17 ` Linus Torvalds
2024-05-08 8:47 ` David Laight
2024-05-03 21:36 ` Al Viro
2024-05-03 21:42 ` Linus Torvalds
2024-05-03 21:53 ` Al Viro
2024-05-06 12:23 ` Daniel Vetter
2024-05-04 9:59 ` Christian Brauner
2024-05-03 21:11 ` [PATCH] epoll: try to be a _bit_ better about file lifetimes Linus Torvalds
2024-05-03 21:24 ` Al Viro
2024-05-03 21:33 ` Linus Torvalds
2024-05-03 21:45 ` Al Viro
2024-05-03 21:52 ` Linus Torvalds
2024-05-03 22:01 ` Al Viro
2024-05-03 22:07 ` Al Viro
2024-05-03 23:16 ` Linus Torvalds
2024-05-03 23:39 ` Al Viro
2024-05-03 23:54 ` Linus Torvalds
2024-05-04 10:44 ` Christian Brauner
2024-05-03 22:46 ` Kees Cook
2024-05-03 23:03 ` Al Viro
2024-05-03 23:23 ` Kees Cook
2024-05-03 23:41 ` Linus Torvalds
2024-05-04 9:19 ` Christian Brauner
2024-05-06 12:37 ` Daniel Vetter
2024-05-04 9:37 ` Christian Brauner
2024-05-04 15:32 ` Linus Torvalds
2024-05-04 15:40 ` Linus Torvalds
2024-05-04 15:53 ` Linus Torvalds
2024-05-05 19:46 ` Al Viro
2024-05-05 20:03 ` Linus Torvalds
2024-05-05 20:30 ` Al Viro
2024-05-05 20:53 ` Linus Torvalds [this message]
2024-05-06 12:47 ` Daniel Vetter
2024-05-06 14:46 ` Christian Brauner
2024-05-07 10:58 ` Daniel Vetter
2024-05-06 16:15 ` Christian König
2024-05-05 10:50 ` Christian Brauner
2024-05-05 16:46 ` Linus Torvalds
2024-05-05 17:55 ` [PATCH v2] epoll: be " Linus Torvalds
2024-05-05 18:04 ` Jens Axboe
2024-05-05 20:01 ` David Laight
2024-05-05 20:16 ` Linus Torvalds
2024-05-05 20:12 ` [PATCH] epoll: try to be a _bit_ " Al Viro
2024-05-06 8:45 ` Christian Brauner
2024-05-06 9:26 ` Christian Brauner
2024-05-06 14:19 ` Christian Brauner
2024-05-07 21:02 ` David Laight
2024-05-04 18:20 ` Linus Torvalds
2024-05-06 14:29 ` [Linaro-mm-sig] " Christian König
2024-05-07 11:02 ` Daniel Vetter
2024-05-07 16:46 ` Linus Torvalds
2024-05-07 17:45 ` Christian König
2024-05-08 7:51 ` Michel Dänzer
2024-05-08 7:59 ` Christian König
2024-05-08 8:23 ` Christian Brauner
2024-05-08 9:10 ` Christian König
2024-05-07 18:04 ` Daniel Vetter
2024-05-07 19:07 ` Linus Torvalds
2024-05-08 5:55 ` Christian König
2024-05-08 8:32 ` Daniel Vetter
2024-05-08 10:16 ` Christian Brauner
2024-05-08 8:05 ` Christian Brauner
2024-05-08 16:19 ` Linus Torvalds
2024-05-08 17:14 ` Linus Torvalds
2024-05-09 11:38 ` Christian Brauner
2024-05-09 15:48 ` Linus Torvalds
2024-05-10 6:33 ` Christian Brauner
2024-05-08 10:08 ` Christian Brauner
2024-05-08 15:45 ` Daniel Vetter
2024-05-10 10:55 ` Christian Brauner
2024-05-11 18:25 ` David Laight
2024-05-05 17:31 ` Jens Axboe
2024-05-04 9:45 ` get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove) Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAHk-=whFg8-WyMbVUGW5c0baurGzqmRtzFLoU-gxtRXq2nVZ+w@mail.gmail.com' \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox