From: Andy Lutomirski <[email protected]>
To: Jens Axboe <[email protected]>
Cc: Andy Lutomirski <[email protected]>,
Andres Freund <[email protected]>,
Stefano Garzarella <[email protected]>,
Christoph Hellwig <[email protected]>, Kees Cook <[email protected]>,
Pavel Begunkov <[email protected]>,
Miklos Szeredi <[email protected]>,
Matthew Wilcox <[email protected]>,
Jann Horn <[email protected]>,
Christian Brauner <[email protected]>,
[email protected], [email protected],
Linux API <[email protected]>,
Linux FS Devel <[email protected]>,
LKML <[email protected]>,
Michael Kerrisk <[email protected]>,
Stefan Hajnoczi <[email protected]>
Subject: Re: strace of io_uring events?
Date: Tue, 21 Jul 2020 10:44:06 -0700 [thread overview]
Message-ID: <CALCETrUvOuKZWiQeZhf9DXyjS4OQdyW+s1YMh+vwe605jBS3LQ@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
On Tue, Jul 21, 2020 at 10:30 AM Jens Axboe <[email protected]> wrote:
>
> On 7/21/20 11:23 AM, Andy Lutomirski wrote:
> > On Tue, Jul 21, 2020 at 8:31 AM Jens Axboe <[email protected]> wrote:
> >>
> >> On 7/21/20 9:27 AM, Andy Lutomirski wrote:
> >>> On Fri, Jul 17, 2020 at 1:02 AM Stefano Garzarella <[email protected]> wrote:
> >>>>
> >>>> On Thu, Jul 16, 2020 at 08:12:35AM -0700, Kees Cook wrote:
> >>>>> On Thu, Jul 16, 2020 at 03:14:04PM +0200, Stefano Garzarella wrote:
> >>>
> >>>>> access (IIUC) is possible without actually calling any of the io_uring
> >>>>> syscalls. Is that correct? A process would receive an fd (via SCM_RIGHTS,
> >>>>> pidfd_getfd, or soon seccomp addfd), and then call mmap() on it to gain
> >>>>> access to the SQ and CQ, and off it goes? (The only glitch I see is
> >>>>> waking up the worker thread?)
> >>>>
> >>>> It is true only if the io_uring istance is created with SQPOLL flag (not the
> >>>> default behaviour and it requires CAP_SYS_ADMIN). In this case the
> >>>> kthread is created and you can also set an higher idle time for it, so
> >>>> also the waking up syscall can be avoided.
> >>>
> >>> I stared at the io_uring code for a while, and I'm wondering if we're
> >>> approaching this the wrong way. It seems to me that most of the
> >>> complications here come from the fact that io_uring SQEs don't clearly
> >>> belong to any particular security principle. (We have struct creds,
> >>> but we don't really have a task or mm.) But I'm also not convinced
> >>> that io_uring actually supports cross-mm submission except by accident
> >>> -- as it stands, unless a user is very careful to only submit SQEs
> >>> that don't use user pointers, the results will be unpredictable.
> >>
> >> How so?
> >
> > Unless I've missed something, either current->mm or sqo_mm will be
> > used depending on which thread ends up doing the IO. (And there might
> > be similar issues with threads.) Having the user memory references
> > end up somewhere that is an implementation detail seems suboptimal.
>
> current->mm is always used from the entering task - obviously if done
> synchronously, but also if it needs to go async. The only exception is a
> setup with SQPOLL, in which case ctx->sqo_mm is the task that set up the
> ring. SQPOLL requires root privileges to setup, and there's no task
> entering the io_uring at all necessarily. It'll just submit sqes with
> the credentials that are registered with the ring.
Really? I admit I haven't fully followed how the code works, but it
looks like anything that goes through the io_queue_async_work() path
will use sqo_mm, and can't most requests that end up blocking end up
there? It looks like, even if SQPOLL is not set, the mm used will
depend on whether the request ends up blocking and thus getting queued
for later completion.
Or does some magic I missed make this a nonissue.
>
> This is just one known use case, there may very well be others. Outside
> of SQPOLL, which is special, I don't see a reason to restrict this.
> Given that you may have a fuller understanding of it after the above
> explanation, please clearly state what problem you're seeing that
> warrants a change.
I see two fundamental issues:
1. The above. This may be less of an issue than it seems to me, but,
if you submit io from outside sqo_mm, the mm that ends up being used
depends on whether the IO is completed from io_uring_enter() or from
the workqueue. For something like Postgres, I guess this is okay
because the memory is MAP_ANONYMOUS | MAP_SHARED and the pointers all
point the same place regardless.
2. If you create an io_uring and io_uring_enter() it from a different
mm, it's unclear what seccomp is supposed to do. (Or audit, for that
matter.) Which task did the IO? Which mm did the IO? Whose sandbox
is supposed to be applied?
--Andy
next prev parent reply other threads:[~2020-07-21 17:44 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-15 11:12 strace of io_uring events? Miklos Szeredi
2020-07-15 14:35 ` Andy Lutomirski
2020-07-15 17:11 ` Matthew Wilcox
2020-07-15 19:42 ` Pavel Begunkov
2020-07-15 20:09 ` Miklos Szeredi
2020-07-15 20:20 ` Pavel Begunkov
2020-07-15 23:07 ` Kees Cook
2020-07-16 13:14 ` Stefano Garzarella
2020-07-16 15:12 ` Kees Cook
2020-07-17 8:01 ` Stefano Garzarella
2020-07-21 15:27 ` Andy Lutomirski
2020-07-21 15:31 ` Jens Axboe
2020-07-21 17:23 ` Andy Lutomirski
2020-07-21 17:30 ` Jens Axboe
2020-07-21 17:44 ` Andy Lutomirski [this message]
2020-07-21 18:39 ` Jens Axboe
2020-07-21 19:44 ` Andy Lutomirski
2020-07-21 19:48 ` Jens Axboe
2020-07-21 19:56 ` Andres Freund
2020-07-21 19:37 ` Andres Freund
2020-07-21 15:58 ` Stefano Garzarella
2020-07-23 10:39 ` Stefan Hajnoczi
2020-07-23 13:37 ` Colin Walters
2020-07-24 7:25 ` Stefano Garzarella
2020-07-16 13:17 ` Aleksa Sarai
2020-07-16 15:19 ` Kees Cook
2020-07-17 8:17 ` Cyril Hrubis
2020-07-16 16:24 ` Andy Lutomirski
2020-07-16 0:12 ` tytso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CALCETrUvOuKZWiQeZhf9DXyjS4OQdyW+s1YMh+vwe605jBS3LQ@mail.gmail.com \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox