public inbox for [email protected]
 help / color / mirror / Atom feed
From: Christian Brauner <[email protected]>
To: Daniel Vetter <[email protected]>
Cc: Linus Torvalds <[email protected]>,
	 Al Viro <[email protected]>,
	[email protected], [email protected], [email protected],
	 [email protected], [email protected],
	[email protected], [email protected],
	 [email protected], [email protected],
	[email protected],  [email protected],
	[email protected], [email protected],
	 [email protected],
	[email protected]
Subject: Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes
Date: Mon, 6 May 2024 16:46:54 +0200	[thread overview]
Message-ID: <20240506-zweisamkeit-zinsniveau-615a2e6d7c67@brauner> (raw)
In-Reply-To: <[email protected]>

On Mon, May 06, 2024 at 02:47:23PM +0200, Daniel Vetter wrote:
> On Sun, May 05, 2024 at 01:53:48PM -0700, Linus Torvalds wrote:
> > On Sun, 5 May 2024 at 13:30, Al Viro <[email protected]> wrote:
> > >
> > > 0.      special-cased ->f_count rule for ->poll() is a wart and it's
> > > better to get rid of it.
> > >
> > > 1.      fs/eventpoll.c is a steaming pile of shit and I'd be glad to see
> > > git rm taken to it.  Short of that, by all means, let's grab reference
> > > in there around the call of vfs_poll() (see (0)).
> > 
> > Agreed on 0/1.
> > 
> > > 2.      having ->poll() instances grab extra references to file passed
> > > to them is not something that should be encouraged; there's a plenty
> > > of potential problems, and "caller has it pinned, so we are fine with
> > > grabbing extra refs" is nowhere near enough to eliminate those.
> > 
> > So it's not clear why you hate it so much, since those extra
> > references are totally normal in all the other VFS paths.
> > 
> > I mean, they are perhaps not the *common* case, but we have a lot of
> > random get_file() calls sprinkled around in various places when you
> > end up passing a file descriptor off to some asynchronous operation
> > thing.
> > 
> > Yeah, I think most of them tend to be special operations (eg the tty
> > TIOCCONS ioctl to redirect the console), but it's not like vfs_ioctl()
> > is *that* different from vfs_poll. Different operation, not somehow
> > "one is more special than the other".
> > 
> > cachefiles and backing-file does it for regular IO, and drop it at IO
> > completion - not that different from what dma-buf does. It's in
> > ->read_iter() rather than ->poll(), but again: different operations,
> > but not "one of them is somehow fundamentally different".
> > 
> > > 3.      dma-buf uses of get_file() are probably safe (epoll shite aside),
> > > but they do look fishy.  That has nothing to do with epoll.
> > 
> > Now, what dma-buf basically seems to do is to avoid ref-counting its
> > own fundamental data structure, and replaces that by refcounting the
> > 'struct file' that *points* to it instead.
> > 
> > And it is a bit odd, but it actually makes some amount of sense,
> > because then what it passes around is that file pointer (and it allows
> > passing it around from user space *as* that file).
> > 
> > And honestly, if you look at why it then needs to add its refcount to
> > it all, it actually makes sense.  dma-bufs have this notion of
> > "fences" that are basically completion points for the asynchronous
> > DMA. Doing a "poll()" operation will add a note to the fence to get
> > that wakeup when it's done.
> > 
> > And yes, logically it takes a ref to the "struct dma_buf", but because
> > of how the lifetime of the dma_buf is associated with the lifetime of
> > the 'struct file', that then turns into taking a ref on the file.
> > 
> > Unusual? Yes. But not illogical. Not obviously broken. Tying the
> > lifetime of the dma_buf to the lifetime of a file that is passed along
> > makes _sense_ for that use.
> > 
> > I'm sure dma-bufs could add another level of refcounting on the
> > 'struct dma_buf' itself, and not make it be 1:1 with the file, but
> > it's not clear to me what the advantage would really be, or why it
> > would be wrong to re-use a refcount that is already there.
> 
> So there is generally another refcount, because dma_buf is just the
> cross-driver interface to some kind of real underlying buffer object from
> the various graphics related subsystems we have.
> 
> And since it's a pure file based api thing that ceases to serve any
> function once the fd/file is gone we tied all the dma_buf refcounting to
> the refcount struct file already maintains. But the underlying buffer
> object can easily outlive the dma_buf, and over the lifetime of an
> underlying buffer object you might actually end up creating different
> dma_buf api wrappers for it (but at least in drm we guarantee there's at
> most one, hence why vmwgfx does the atomic_inc_unless_zero trick, which I
> don't particularly like and isn't really needed).
> 
> But we could add another refcount, it just means we have 3 of those then
> when only really 2 are needed.

Fwiw, the TTM thing described upthread and in the other thread really
tries hard to work around the dma_buf == file lifetime choice by hooking
into the dma-buf specific release function so it can access the dmabuf
and then the file. All that seems like a pretty error prone thing to me.
So a separate refcount for dma_buf wouldn't be the worst as that would
allow that TTM thing to benefit and remove that nasty hacking into your
generic dma_buf ops. But maybe I'm the only one who sees it that way and
I'm certainly not familiar enough with dma-buf.

  reply	other threads:[~2024-05-06 14:47 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-08  8:26 [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove syzbot
2024-04-15 14:31 ` Jens Axboe
2024-04-15 14:57   ` Pavel Begunkov
2024-05-03 11:54 ` Bui Quang Minh
2024-05-03 18:26   ` get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove) Kees Cook
2024-05-03 18:49     ` Jens Axboe
2024-05-03 19:22       ` Kees Cook
2024-05-03 19:35         ` Jens Axboe
2024-05-03 19:59           ` Kees Cook
2024-05-03 20:28             ` Kees Cook
2024-05-03 21:11               ` Al Viro
2024-05-03 21:24                 ` Linus Torvalds
2024-05-03 21:30                   ` Al Viro
2024-05-06 17:46                   ` Stefan Metzmacher
2024-05-06 18:17                     ` Linus Torvalds
2024-05-08  8:47                       ` David Laight
2024-05-03 21:36                 ` Al Viro
2024-05-03 21:42                   ` Linus Torvalds
2024-05-03 21:53                     ` Al Viro
2024-05-06 12:23                       ` Daniel Vetter
2024-05-04  9:59             ` Christian Brauner
2024-05-03 21:11     ` [PATCH] epoll: try to be a _bit_ better about file lifetimes Linus Torvalds
2024-05-03 21:24       ` Al Viro
2024-05-03 21:33         ` Linus Torvalds
2024-05-03 21:45           ` Al Viro
2024-05-03 21:52             ` Linus Torvalds
2024-05-03 22:01               ` Al Viro
2024-05-03 22:07                 ` Al Viro
2024-05-03 23:16                   ` Linus Torvalds
2024-05-03 23:39                     ` Al Viro
2024-05-03 23:54                       ` Linus Torvalds
2024-05-04 10:44                       ` Christian Brauner
2024-05-03 22:46               ` Kees Cook
2024-05-03 23:03                 ` Al Viro
2024-05-03 23:23                   ` Kees Cook
2024-05-03 23:41                     ` Linus Torvalds
2024-05-04  9:19                       ` Christian Brauner
2024-05-06 12:37                       ` Daniel Vetter
2024-05-04  9:37           ` Christian Brauner
2024-05-04 15:32             ` Linus Torvalds
2024-05-04 15:40               ` Linus Torvalds
2024-05-04 15:53                 ` Linus Torvalds
2024-05-05 19:46                   ` Al Viro
2024-05-05 20:03                     ` Linus Torvalds
2024-05-05 20:30                       ` Al Viro
2024-05-05 20:53                         ` Linus Torvalds
2024-05-06 12:47                           ` Daniel Vetter
2024-05-06 14:46                             ` Christian Brauner [this message]
2024-05-07 10:58                               ` Daniel Vetter
2024-05-06 16:15                           ` Christian König
2024-05-05 10:50                 ` Christian Brauner
2024-05-05 16:46                   ` Linus Torvalds
2024-05-05 17:55                     ` [PATCH v2] epoll: be " Linus Torvalds
2024-05-05 18:04                       ` Jens Axboe
2024-05-05 20:01                       ` David Laight
2024-05-05 20:16                         ` Linus Torvalds
2024-05-05 20:12                     ` [PATCH] epoll: try to be a _bit_ " Al Viro
2024-05-06  8:45                     ` Christian Brauner
2024-05-06  9:26                       ` Christian Brauner
2024-05-06 14:19                         ` Christian Brauner
2024-05-07 21:02                       ` David Laight
2024-05-04 18:20               ` Linus Torvalds
2024-05-06 14:29                 ` [Linaro-mm-sig] " Christian König
2024-05-07 11:02                   ` Daniel Vetter
2024-05-07 16:46                     ` Linus Torvalds
2024-05-07 17:45                       ` Christian König
2024-05-08  7:51                         ` Michel Dänzer
2024-05-08  7:59                           ` Christian König
2024-05-08  8:23                         ` Christian Brauner
2024-05-08  9:10                           ` Christian König
2024-05-07 18:04                       ` Daniel Vetter
2024-05-07 19:07                         ` Linus Torvalds
2024-05-08  5:55                           ` Christian König
2024-05-08  8:32                             ` Daniel Vetter
2024-05-08 10:16                               ` Christian Brauner
2024-05-08  8:05                           ` Christian Brauner
2024-05-08 16:19                           ` Linus Torvalds
2024-05-08 17:14                             ` Linus Torvalds
2024-05-09 11:38                               ` Christian Brauner
2024-05-09 15:48                                 ` Linus Torvalds
2024-05-10  6:33                                   ` Christian Brauner
2024-05-08 10:08                   ` Christian Brauner
2024-05-08 15:45                     ` Daniel Vetter
2024-05-10 10:55                       ` Christian Brauner
2024-05-11 18:25                         ` David Laight
2024-05-05 17:31       ` Jens Axboe
2024-05-04  9:45     ` get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove) Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240506-zweisamkeit-zinsniveau-615a2e6d7c67@brauner \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox