From: Ming Lei <[email protected]>
To: Keith Busch <[email protected]>
Cc: Jeff Moyer <[email protected]>, Keith Busch <[email protected]>,
[email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected],
Kanchan Joshi <[email protected]>
Subject: Re: [PATCH 1/2] iouring: one capable call per iouring instance
Date: Tue, 5 Dec 2023 13:25:44 +0800
Message-ID: <ZW60WPf/hmAUoxPv@fedora>
In-Reply-To: <ZW6nmR2ytIBApXE0@kbusch-mbp>
On Mon, Dec 04, 2023 at 09:31:21PM -0700, Keith Busch wrote:
> On Tue, Dec 05, 2023 at 12:14:22PM +0800, Ming Lei wrote:
> > On Mon, Dec 04, 2023 at 11:57:55AM -0700, Keith Busch wrote:
> > > On Mon, Dec 04, 2023 at 01:40:58PM -0500, Jeff Moyer wrote:
> > > > I added a CC: linux-security-module@vger
> > > > Keith Busch <[email protected]> writes:
> > > > > From: Keith Busch <[email protected]>
> > > > >
> > > > > The uring_cmd operation is often used for privileged actions, so drivers
> > > > > subscribing to this interface check capable() for each command. The
> > > > > capable() function is not fast path friendly for many kernel configs,
> > > > > and this can really harm performance. Stash the capable sys admin
> > > > > attribute in the io_uring context and set a new issue_flag for the
> > > > > uring_cmd interface.
> > > >
> > > > I have a few questions. What privileged actions are performance
> > > > sensitive? I would hope that anything requiring privileges would not
> > > > be in a fast path (but clearly that's not the case).
> > >
> > > Protocol specifics that don't have a generic equivalent. For example,
> > > NVMe FDP is reachable only through the uring_cmd and ioctl interfaces,
> > > but you use it like normal reads and writes, so it has to be as fast as the
> > > generic interfaces.
> >
> > But normal read/write pt command doesn't require ADMIN any more since
> > commit 855b7717f44b ("nvme: fine-granular CAP_SYS_ADMIN for nvme io commands"),
> > why do you have to pay the cost of checking capable(CAP_SYS_ADMIN)?
>
> Good question. The "capable" check had always been first so even with
> the relaxed permissions, it was still paying the price. I have changed
> that order in a commit staged here (not yet upstream):
>
> http://git.infradead.org/nvme.git/commitdiff/7be866b1cf0bf1dfa74480fe8097daeceda68622
With this change, I guess you shouldn't see the following big gap, right?
> Before: 970k IOPs
> After: 1750k IOPs
>
> Note that this only prevents the costly capable() check if the inexpensive
> checks could make a determination. That's still not solving the problem
> long term since we aim for forward compatibility where we have no idea
> which opcodes, admin identifications, or vendor specifics could be
> deemed "safe" for non-root users in the future, so those conditions
> would always fall back to the more expensive check that this patch was
> trying to mitigate for admin processes.
I'm not sure I follow. This comes down to nvme's permission model for
user pt commands, and:
1) it should always be checked on entry to the nvme user pt command path
2) only the following two types of commands require ADMIN, per commit
855b7717f44b ("nvme: fine-granular CAP_SYS_ADMIN for nvme io commands"):
- any admin command
- vendor-specific and fabrics commands
Can you provide more details on why the expensive check can't be avoided for
fast read/write user IO commands?
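
To make the question concrete, here is a minimal userspace sketch in C of the
two check orderings being discussed. It is not the actual nvme code: struct
pt_cmd, needs_admin() and slow_capable_check() are hypothetical names,
slow_capable_check() is only a stand-in for capable(CAP_SYS_ADMIN), and the
opcode constants are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical passthrough command descriptor (not struct nvme_command). */
struct pt_cmd {
	bool is_admin;   /* true if the command targets the admin queue */
	uint8_t opcode;  /* command opcode */
};

/* Illustrative opcode values, assumed for this sketch. */
#define OPC_FABRICS       0x7f
#define OPC_VENDOR_START  0x80

static unsigned long capable_calls; /* how many times the slow path ran */
static bool caller_is_admin;        /* simulated privilege of the caller */

/* Stand-in for capable(CAP_SYS_ADMIN); counts invocations. */
static bool slow_capable_check(void)
{
	capable_calls++;
	return caller_is_admin;
}

/* Cheap test: does this command fall into a class that requires ADMIN? */
static bool needs_admin(const struct pt_cmd *cmd)
{
	if (cmd->is_admin)
		return true;
	return cmd->opcode == OPC_FABRICS || cmd->opcode >= OPC_VENDOR_START;
}

/* Privilege check first: every command pays for the slow check. */
static bool allowed_capable_first(const struct pt_cmd *cmd)
{
	assert(cmd != NULL);
	if (slow_capable_check())
		return true;
	return !needs_admin(cmd);
}

/* Cheap checks first: the slow check runs only when they cannot decide. */
static bool allowed_cheap_first(const struct pt_cmd *cmd)
{
	assert(cmd != NULL);
	if (!needs_admin(cmd))
		return true;
	return slow_capable_check();
}
```

With the cheap-first ordering, a plain read or write never reaches
slow_capable_check(); only the admin, vendor-specific and fabrics cases fall
through to it.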
Thanks,
Ming