public inbox for [email protected]
 help / color / mirror / Atom feed
From: Ming Lei <[email protected]>
To: Keith Busch <[email protected]>
Cc: Jeff Moyer <[email protected]>, Keith Busch <[email protected]>,
	[email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected],
	Kanchan Joshi <[email protected]>
Subject: Re: [PATCH 1/2] iouring: one capable call per iouring instance
Date: Tue, 5 Dec 2023 13:25:44 +0800	[thread overview]
Message-ID: <ZW60WPf/hmAUoxPv@fedora> (raw)
In-Reply-To: <ZW6nmR2ytIBApXE0@kbusch-mbp>

On Mon, Dec 04, 2023 at 09:31:21PM -0700, Keith Busch wrote:
> On Tue, Dec 05, 2023 at 12:14:22PM +0800, Ming Lei wrote:
> > On Mon, Dec 04, 2023 at 11:57:55AM -0700, Keith Busch wrote:
> > > On Mon, Dec 04, 2023 at 01:40:58PM -0500, Jeff Moyer wrote:
> > > > I added a CC: linux-security-module@vger
> > > > Keith Busch <[email protected]> writes:
> > > > > From: Keith Busch <[email protected]>
> > > > >
> > > > > The uring_cmd operation is often used for privileged actions, so drivers
> > > > > subscribing to this interface check capable() for each command. The
> > > > > capable() function is not fast path friendly for many kernel configs,
> > > > > and this can really harm performance. Stash the capable sys admin
> > > > > attribute in the io_uring context and set a new issue_flag for the
> > > > > uring_cmd interface.
> > > > 
> > > > I have a few questions.  What privileged actions are performance
> > > > sensitive? I would hope that anything requiring privileges would not
> > > > be in a fast path (but clearly that's not the case).
> > > 
> > > Protocol specifics that don't have a generic equivalent. For example,
> > > NVMe FDP is reachable only through the uring_cmd and ioctl interfaces,
> > > but you use it like normal reads and writes so has to be as fast as the
> > > generic interfaces.
> > 
> > But normal read/write pt command doesn't require ADMIN any more since 
> > commit 855b7717f44b ("nvme: fine-granular CAP_SYS_ADMIN for nvme io commands"),
> > why do you have to pay the cost of checking capable(CAP_SYS_ADMIN)?
> 
> Good question. The "capable" check had always been first so even with
> the relaxed permissions, it was still paying the price. I have changed
> that order in commit staged here (not yet upstream):
> 
>   http://git.infradead.org/nvme.git/commitdiff/7be866b1cf0bf1dfa74480fe8097daeceda68622

With this change, I guess you shouldn't see the following big gap, right?

> Before: 970k IOPs
> After: 1750k IOPs

> 
> Note that only prevents the costly capable() check if the inexpensive
> checks could make a determination. That's still not solving the problem
> long term since we aim for forward compatibility where we have no idea
> which opcodes, admin identifications, or vendor specifics could be
> deemed "safe" for non-root users in the future, so those conditions
> would always fall back to the more expensive check that this patch was
> trying to mitigate for admin processes.

Not sure I get the idea, it is related with nvme's permission model for
user pt command, and:

1) it should be always checked in entry of nvme user pt command

2) only the following two types of commands require ADMIN, per commit
855b7717f44b ("nvme: fine-granular CAP_SYS_ADMIN for nvme io commands")

    - any admin-cmd is not allowed
    - vendor-specific and fabric commmand are not allowed

Can you provide more details why the expensive check can't be avoided for
fast read/write user IO commands?

Thanks, 
Ming


  reply	other threads:[~2023-12-05  5:26 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-04 17:53 [PATCH 1/2] iouring: one capable call per iouring instance Keith Busch
2023-12-04 17:53 ` [PATCH 2/2] nvme: use uring_cmd sys_admin flag Keith Busch
2023-12-04 18:05 ` [PATCH 1/2] iouring: one capable call per iouring instance Jens Axboe
2023-12-04 18:45   ` Pavel Begunkov
2023-12-05 16:21   ` Kanchan Joshi
2023-12-06 21:09     ` Keith Busch
2023-12-04 18:15 ` Jens Axboe
2023-12-04 18:40 ` Jeff Moyer
2023-12-04 18:57   ` Keith Busch
2023-12-05  4:14     ` Ming Lei
2023-12-05  4:31       ` Keith Busch
2023-12-05  5:25         ` Ming Lei [this message]
2023-12-05 15:45           ` Keith Busch
2023-12-06  3:08             ` Ming Lei
2023-12-06 15:31               ` Keith Busch
2023-12-07  1:23                 ` Ming Lei
2023-12-07 17:48                   ` Christoph Hellwig
2023-12-04 19:01   ` Jens Axboe
2023-12-04 19:22     ` Jeff Moyer
2023-12-04 19:33       ` Jens Axboe
2023-12-04 19:37       ` Keith Busch

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZW60WPf/hmAUoxPv@fedora \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox