From: Jens Axboe <[email protected]>
To: Kanchan Joshi <[email protected]>, [email protected]
Cc: [email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected]
Subject: Re: [LSF/MM/BPF ATTEND][LSF/MM/BPF Topic] Non-block IO
Date: Fri, 10 Feb 2023 12:53:50 -0700	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 2/10/23 11:00 AM, Kanchan Joshi wrote:
> Non-block IO is getting more common than it used to be.
> NVMe is no longer tied to block storage. The command sets in the NVMe
> 2.0 spec opened an excellent way to present non-block interfaces to the
> host. ZNS and KV came along with it, and new command sets are emerging.
> 
> OTOH, kernel IO advances have historically centered on the block IO
> path. A passthrough IO path existed, but it stayed far behind all the
> advances, be it new features or performance.
> 
> Current state & discussion points:
> ---------------------------------
> The status quo changed in the recent past with the new passthrough
> path (the ng char interface + io_uring command). Feature parity does
> not exist yet, but performance parity does.
> Adoption brings asks. I propose a session covering a few voices and
> finding a path forward for some of these ideas.
> 
> 1. Command cancellation: while NVMe mandates support for the Abort
> command, we do not have a way to trigger it from user space. There are
> ways to go about it (with or without the uring-cancel interface), but
> not without certain tradeoffs. It would be good to discuss the choices
> in person.
> 
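
For reference, a minimal sketch of what driving this through the
existing uring-cancel interface could look like from user space,
assuming the passthrough command went in as an IORING_OP_URING_CMD
with a known user_data tag (a hypothetical value here); note that, as
said above, this does not currently translate into an NVMe Abort on
the wire:

    /* Sketch: ask io_uring to cancel an in-flight passthrough command,
     * keyed on the user_data of the original SQE. Assumes liburing 2.2+
     * and a ring created with IORING_SETUP_SQE128, since NVMe uring_cmd
     * needs the big SQEs.
     */
    #include <liburing.h>

    static int cancel_passthru(struct io_uring *ring, __u64 tag)
    {
            struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
            struct io_uring_cqe *cqe;
            int ret;

            if (!sqe)
                    return -EBUSY;
            io_uring_prep_cancel64(sqe, tag, 0);
            sqe->user_data = tag + 1;   /* distinct tag for the cancel */

            ret = io_uring_submit(ring);
            if (ret < 0)
                    return ret;
            ret = io_uring_wait_cqe(ring, &cqe);
            if (ret < 0)
                    return ret;
            /* for brevity, assume this CQE is for the cancel itself */
            ret = cqe->res;             /* 0, -ENOENT or -EALREADY */
            io_uring_cqe_seen(ring, cqe);
            return ret;
    }
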
> 2. Cgroups: this works only for block devices at the moment. Are there
> outright objections to extending it to char-interface IO?
> 
> 3. DMA cost: this is high in the presence of an IOMMU. Keith posted
> work[1] on the block IO path last year. I imagine the plumbing gets a
> bit simpler with passthrough-only support. But what else must be
> sorted out to make progress on moving the DMA cost out of the fast
> path?

Yeah, this one is still pending... Would be nice to make some progress
there at some point.
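
As a point of reference for the buffer side of the cost: per-IO pinning
can already be amortized on the passthrough path with pre-registered
buffers. A rough sketch, assuming liburing 2.2+, a ring created with
IORING_SETUP_SQE128, and the struct nvme_uring_cmd payload filled in
elsewhere; the IOMMU mapping itself is still set up per IO, which is
what the work above targets:

    #include <string.h>
    #include <liburing.h>
    #include <linux/nvme_ioctl.h>

    static int queue_fixed_passthru(struct io_uring *ring, int ns_fd,
                                    void *buf, size_t len)
    {
            struct iovec iov = { .iov_base = buf, .iov_len = len };
            struct io_uring_sqe *sqe;
            int ret;

            /* one-time cost: pin the buffer and map it into the ring */
            ret = io_uring_register_buffers(ring, &iov, 1);
            if (ret)
                    return ret;

            sqe = io_uring_get_sqe(ring);
            if (!sqe)
                    return -EBUSY;
            memset(sqe, 0, sizeof(*sqe));
            sqe->opcode = IORING_OP_URING_CMD;
            sqe->fd = ns_fd;                /* /dev/ngXnY char device */
            sqe->cmd_op = NVME_URING_CMD_IO;
            sqe->uring_cmd_flags = IORING_URING_CMD_FIXED;
            sqe->buf_index = 0;             /* registered buffer index */
            /* struct nvme_uring_cmd lives in the big-SQE command area */
            return io_uring_submit(ring);
    }

In real use the register step happens once at setup while the submit
side runs per IO; keeping the DMA mapping alive across IOs the same way
is the part that needs the plumbing discussed here.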

> 4. Direct NVMe queues: would there be interest in io_uring-managed
> NVMe queues? Sort of a new ring, in which IO is destaged from the
> io_uring SQE to the NVMe SQE without going through the intermediate
> constructs (i.e., bio/request). Hopefully that can further amp up IO
> efficiency.

This is interesting, and I've pondered something like that before too. I
think it's worth investigating and hacking up a prototype. I recently
had one user of IOPOLL assume that setting up a ring with IOPOLL would
automatically create a polled queue on the driver side, and that this
queue would be used for their IO. While that's not how it currently
works, it definitely makes sense, and we could make some things faster
that way. It could also make the cancelation referenced in #1 above
easier to support, if it's restricted to the queue(s) that the ring
"owns".
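
For context, the IOPOLL setup on the user side amounts to no more than
something like this, so that mental model isn't unreasonable:

    #include <liburing.h>

    int main(void)
    {
            struct io_uring ring;

            /* Completion-side polling only: today this relies on the
             * driver already exposing polled queues (e.g. the nvme
             * poll_queues module parameter); it does not create or
             * grant ownership of a driver-side queue to this ring.
             */
            return io_uring_queue_init(64, &ring, IORING_SETUP_IOPOLL);
    }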

-- 
Jens Axboe


