From: Christian Brauner <[email protected]>
To: Kees Cook <[email protected]>
Cc: "Bui Quang Minh" <[email protected]>,
"Al Viro" <[email protected]>,
syzbot <[email protected]>,
[email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected],
"Sumit Semwal" <[email protected]>,
"Christian König" <[email protected]>,
[email protected], [email protected],
[email protected],
"Laura Abbott" <[email protected]>
Subject: Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)
Date: Sat, 4 May 2024 11:45:18 +0200 [thread overview]
Message-ID: <20240504-probanden-wahrsagung-f82cddd37718@brauner> (raw)
In-Reply-To: <202405031110.6F47982593@keescook>
On Fri, May 03, 2024 at 11:26:32AM -0700, Kees Cook wrote:
> On Fri, May 03, 2024 at 06:54:22PM +0700, Bui Quang Minh wrote:
> > [...]
> > Root cause:
> > AFAIK, eventpoll (epoll) does not increase the registered file's reference.
> > To ensure the safety, when the registered file is deallocated in __fput,
> > eventpoll_release is called to unregister the file from epoll. When calling
> > poll on epoll, epoll will loop through registered files and call vfs_poll on
> > these files. In the file's poll, file is guaranteed to be alive, however, as
> > epoll does not increase the registered file's reference, the file may be
> > dying
> > and it's not safe the get the file for later use. And dma_buf_poll violates
> > this. In the dma_buf_poll, it tries to get_file to use in the callback. This
> > leads to a race where the dmabuf->file can be fput twice.
> >
> > Here is the race occurs in the above proof-of-concept
> >
> > close(dmabuf->file)
> > __fput_sync (f_count == 1, last ref)
> > f_count-- (f_count == 0 now)
> > __fput
> > epoll_wait
> > vfs_poll(dmabuf->file)
> > get_file(dmabuf->file)(f_count == 1)
> > eventpoll_release
> > dmabuf->file deallocation
> > fput(dmabuf->file) (f_count == 1)
> > f_count--
> > dmabuf->file deallocation
> >
> > I am not familiar with the dma_buf so I don't know the proper fix for the
> > issue. About the rule that don't get the file for later use in poll callback
> > of
> > file, I wonder if it is there when only select/poll exist or just after
> > epoll
> > appears.
> >
> > I hope the analysis helps us to fix the issue.
>
> Thanks for doing this analysis! I suspect at least a start of a fix
> would be this:
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 8fe5aa67b167..15e8f74ee0f2 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -267,9 +267,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>
> if (events & EPOLLOUT) {
> /* Paired with fput in dma_buf_poll_cb */
> - get_file(dmabuf->file);
> -
> - if (!dma_buf_poll_add_cb(resv, true, dcb))
> + if (!atomic_long_inc_not_zero(&dmabuf->file) &&
> + !dma_buf_poll_add_cb(resv, true, dcb))
> /* No callback queued, wake up any other waiters */
> dma_buf_poll_cb(NULL, &dcb->cb);
> else
> @@ -290,9 +289,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>
> if (events & EPOLLIN) {
> /* Paired with fput in dma_buf_poll_cb */
> - get_file(dmabuf->file);
> -
> - if (!dma_buf_poll_add_cb(resv, false, dcb))
> + if (!atomic_long_inc_not_zero(&dmabuf->file) &&
> + !dma_buf_poll_add_cb(resv, false, dcb))
> /* No callback queued, wake up any other waiters */
> dma_buf_poll_cb(NULL, &dcb->cb);
> else
>
>
> But this ends up leaving "active" non-zero, and at close time it runs
> into:
>
> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
>
> But the bottom line is that get_file() is unsafe to use in some places,
> one of which appears to be in the poll handler. There are maybe some
> other fragile places too, like in drivers/gpu/drm/vmwgfx/ttm_object.c:
>
> static bool __must_check get_dma_buf_unless_doomed(struct dma_buf *dmabuf)
> {
> return atomic_long_inc_not_zero(&dmabuf->file->f_count) != 0L;
> }
>
> Which I also note involves a dmabuf...
>
> Due to this issue I've proposed fixing get_file() to detect pathological states:
> https://lore.kernel.org/lkml/[email protected]/
>
> But that has run into some push-back. I'm hoping that seeing this epoll
> example will help illustrate what needs fixing a little better.
>
> I think the best current proposal is to just WARN sooner instead of a
> full refcount_t implementation:
>
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8dfd53b52744..e09107d0a3d6 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1040,7 +1040,8 @@ struct file_handle {
>
> static inline struct file *get_file(struct file *f)
> {
> - atomic_long_inc(&f->f_count);
> + long prior = atomic_long_fetch_inc_relaxed(&f->f_count);
> + WARN_ONCE(!prior, "struct file::f_count incremented from zero; use-after-free condition present!\n");
> return f;
> }
>
>
>
> What's the right way to deal with the dmabuf situation? (And I suspect
> it applies to get_dma_buf_unless_doomed() as well...)
No, it doesn't. That's safe afaict as I've explained at lenght in
the other thread.
prev parent reply other threads:[~2024-05-04 9:45 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-08 8:26 [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove syzbot
2024-04-15 14:31 ` Jens Axboe
2024-04-15 14:57 ` Pavel Begunkov
2024-05-03 11:54 ` Bui Quang Minh
2024-05-03 18:26 ` get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove) Kees Cook
2024-05-03 18:49 ` Jens Axboe
2024-05-03 19:22 ` Kees Cook
2024-05-03 19:35 ` Jens Axboe
2024-05-03 19:59 ` Kees Cook
2024-05-03 20:28 ` Kees Cook
2024-05-03 21:11 ` Al Viro
2024-05-03 21:24 ` Linus Torvalds
2024-05-03 21:30 ` Al Viro
2024-05-06 17:46 ` Stefan Metzmacher
2024-05-06 18:17 ` Linus Torvalds
2024-05-08 8:47 ` David Laight
2024-05-03 21:36 ` Al Viro
2024-05-03 21:42 ` Linus Torvalds
2024-05-03 21:53 ` Al Viro
2024-05-06 12:23 ` Daniel Vetter
2024-05-04 9:59 ` Christian Brauner
2024-05-03 21:11 ` [PATCH] epoll: try to be a _bit_ better about file lifetimes Linus Torvalds
2024-05-03 21:24 ` Al Viro
2024-05-03 21:33 ` Linus Torvalds
2024-05-03 21:45 ` Al Viro
2024-05-03 21:52 ` Linus Torvalds
2024-05-03 22:01 ` Al Viro
2024-05-03 22:07 ` Al Viro
2024-05-03 23:16 ` Linus Torvalds
2024-05-03 23:39 ` Al Viro
2024-05-03 23:54 ` Linus Torvalds
2024-05-04 10:44 ` Christian Brauner
2024-05-03 22:46 ` Kees Cook
2024-05-03 23:03 ` Al Viro
2024-05-03 23:23 ` Kees Cook
2024-05-03 23:41 ` Linus Torvalds
2024-05-04 9:19 ` Christian Brauner
2024-05-06 12:37 ` Daniel Vetter
2024-05-04 9:37 ` Christian Brauner
2024-05-04 15:32 ` Linus Torvalds
2024-05-04 15:40 ` Linus Torvalds
2024-05-04 15:53 ` Linus Torvalds
2024-05-05 19:46 ` Al Viro
2024-05-05 20:03 ` Linus Torvalds
2024-05-05 20:30 ` Al Viro
2024-05-05 20:53 ` Linus Torvalds
2024-05-06 12:47 ` Daniel Vetter
2024-05-06 14:46 ` Christian Brauner
2024-05-07 10:58 ` Daniel Vetter
2024-05-06 16:15 ` Christian König
2024-05-05 10:50 ` Christian Brauner
2024-05-05 16:46 ` Linus Torvalds
2024-05-05 17:55 ` [PATCH v2] epoll: be " Linus Torvalds
2024-05-05 18:04 ` Jens Axboe
2024-05-05 20:01 ` David Laight
2024-05-05 20:16 ` Linus Torvalds
2024-05-05 20:12 ` [PATCH] epoll: try to be a _bit_ " Al Viro
2024-05-06 8:45 ` Christian Brauner
2024-05-06 9:26 ` Christian Brauner
2024-05-06 14:19 ` Christian Brauner
2024-05-07 21:02 ` David Laight
2024-05-04 18:20 ` Linus Torvalds
2024-05-06 14:29 ` [Linaro-mm-sig] " Christian König
2024-05-07 11:02 ` Daniel Vetter
2024-05-07 16:46 ` Linus Torvalds
2024-05-07 17:45 ` Christian König
2024-05-08 7:51 ` Michel Dänzer
2024-05-08 7:59 ` Christian König
2024-05-08 8:23 ` Christian Brauner
2024-05-08 9:10 ` Christian König
2024-05-07 18:04 ` Daniel Vetter
2024-05-07 19:07 ` Linus Torvalds
2024-05-08 5:55 ` Christian König
2024-05-08 8:32 ` Daniel Vetter
2024-05-08 10:16 ` Christian Brauner
2024-05-08 8:05 ` Christian Brauner
2024-05-08 16:19 ` Linus Torvalds
2024-05-08 17:14 ` Linus Torvalds
2024-05-09 11:38 ` Christian Brauner
2024-05-09 15:48 ` Linus Torvalds
2024-05-10 6:33 ` Christian Brauner
2024-05-08 10:08 ` Christian Brauner
2024-05-08 15:45 ` Daniel Vetter
2024-05-10 10:55 ` Christian Brauner
2024-05-11 18:25 ` David Laight
2024-05-05 17:31 ` Jens Axboe
2024-05-04 9:45 ` Christian Brauner [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240504-probanden-wahrsagung-f82cddd37718@brauner \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox