public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jann Horn <[email protected]>
To: Jens Axboe <[email protected]>
Cc: Pavel Begunkov <[email protected]>,
	io-uring <[email protected]>,
	 kernel list <[email protected]>
Subject: Re: io_uring: risky use of task work, especially wrt fdget()
Date: Tue, 28 Nov 2023 17:36:34 +0100	[thread overview]
Message-ID: <CAG48ez2PeV8WiMdHakqTrVHtbO=N8RKtqAq1v6bNbNOMdMJ9QA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>

On Tue, Nov 28, 2023 at 5:19 PM Jens Axboe <[email protected]> wrote:
> On 11/28/23 8:58 AM, Jens Axboe wrote:
> > On 11/27/23 2:53 PM, Jann Horn wrote:
> >> Hi!
> >>
> >> I noticed something that I think does not currently cause any
> >> significant security issues, but could be problematic in the future:
> >>
> >> io_uring sometimes processes task work in the middle of syscalls,
> >> including between fdget() and fdput(). My understanding of task work
> >> is that it is expected to run in a context similar to directly at
> >> syscall entry/exit: task context, no locks held, sleeping is okay, and
> >> it doesn't execute in the middle of some syscall that expects private
> >> state of the task_struct to stay the same.
> >>
> >> An example of another user of task work is the keyring subsystem,
> >> which does task_work_add() in keyctl_session_to_parent() to change the
> >> cred pointers of another task.
> >>
> >> Several places in io_uring process task work while holding an fdget()
> >> reference to some file descriptor. For example, the io_uring_enter
> >> syscall handler calls io_iopoll_check() while the io_ring_ctx is only
> >> referenced via fdget(). This means that if there were another kernel
> >> subsystem that uses task work to close file descriptors, io_uring
> >> would become unsafe. And io_uring does _almost_ that itself, I think:
> >> io_queue_worker_create() can be run on a workqueue, and uses task work
> >> to launch a worker thread from the context of a userspace thread; and
> >> this worker thread can then accept commands to close file descriptors.
> >> Except it doesn't accept commands to close io_uring file descriptors.
> >>
> >> A closer miss might be io_sync_cancel(), which holds a reference to
> >> some normal file with fdget()/fdput() while calling into
> >> io_run_task_work_sig(). However, from what I can tell, the only things
> >> that are actually done with this file pointer are pointer comparisons,
> >> so this also shouldn't have significant security impact.
> >>
> >> Would it make sense to use fget()/fput() instead of fdget()/fdput() in
> >> io_sync_cancel(), io_uring_enter and io_uring_register? These
> >> functions probably usually run in multithreaded environments anyway
> >> (thanks to the io_uring worker threads), so I would think fdget()
> >> shouldn't bring significant performance savings here?
> >
> > Let me run some testing on that. It's a mistake to think that it's
> > usually multithreaded, generally if you end up using io-wq then it's not
> > a fast path. A fast networked setup, for example, would never touch the
> > threads and hence no threading would be implied by using io_uring. Ditto
> > on the storage front, if you're just reading/writing or eg doing polled
> > IO. That said, those workloads are generally threaded _anyway_ - not
> > because of io_uring, but because that's how these kinds of workloads are
> > written to begin with.

Aah, because with polled I/O, when the fd is signalled as ready, the
actual execution of work is done via task_work? Thanks for the
explanation, I missed that.

> > So probably won't be much of a concern to do the swap. The only
> > "interesting" part of the above mix of cancel/register/enter is
> > obviously the enter part. The rest are not really fast path.
>
> Did all three and ran the usual testing, which just so happens to be
> multithreaded to begin with anyway. No discernable change from using
> fget/fput over fdget/fdput.

Oh, nice!

> IOW, we may as well do this. Do you want to send a patch? Or I can send
> out mine, up to you.

Ah, if you already cooked up a patch to do the testing, I guess it's
easier if you commit that?

  parent reply	other threads:[~2023-11-28 16:37 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-27 21:53 io_uring: risky use of task work, especially wrt fdget() Jann Horn
2023-11-28 15:58 ` Jens Axboe
2023-11-28 16:19   ` Jens Axboe
2023-11-28 16:28     ` Jens Axboe
2023-11-28 16:36     ` Jann Horn [this message]
2023-11-28 16:46       ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAG48ez2PeV8WiMdHakqTrVHtbO=N8RKtqAq1v6bNbNOMdMJ9QA@mail.gmail.com' \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox