public inbox for [email protected]
 help / color / mirror / Atom feed
From: Willem de Bruijn <[email protected]>
To: David Wei <[email protected]>,
	 [email protected],  [email protected]
Cc: Jens Axboe <[email protected]>,
	 Pavel Begunkov <[email protected]>,
	 Jakub Kicinski <[email protected]>,
	 Paolo Abeni <[email protected]>,
	 "David S. Miller" <[email protected]>,
	 Eric Dumazet <[email protected]>,
	 Jesper Dangaard Brouer <[email protected]>,
	 David Ahern <[email protected]>,
	 Mina Almasry <[email protected]>,
	 [email protected],  [email protected]
Subject: Re: [RFC PATCH v3 07/20] io_uring: add interface queue
Date: Thu, 21 Dec 2023 12:57:35 -0500	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

David Wei wrote:
> From: David Wei <[email protected]>
> 
> This patch introduces a new object in io_uring called an interface queue
> (ifq) which contains:
> 
> * A pool region allocated by userspace and registered w/ io_uring where
>   Rx data is written to.
> * A net device and one specific Rx queue in it that will be configured
>   for ZC Rx.
> * A pair of shared ringbuffers w/ userspace, dubbed registered buf
>   (rbuf) rings. Each entry contains a pool region id and an offset + len
>   within that region. The kernel writes entries into the completion ring
>   to tell userspace where RX data is relative to the start of a region.
>   Userspace writes entries into the refill ring to tell the kernel when
>   it is done with the data.
> 
> For now, each io_uring instance has a single ifq, and each ifq has a
> single pool region associated with one Rx queue.
> 
> Add a new opcode to io_uring_register that sets up an ifq. Size and
> offsets of shared ringbuffers are returned to userspace for it to mmap.
> The implementation will be added in a later patch.
> 
> Signed-off-by: David Wei <[email protected]>

This is quite similar to AF_XDP, of course. Is it at all possible to
reuse all or some of that? If not, why not?

As a side effect, unification would also show a path of moving AF_XDP
from its custom allocator to the page_pool infra.

Related: what is the story wrt the process crashing while user memory
is posted to the NIC or present in the kernel stack.

SO_DEVMEM already demonstrates zerocopy into user buffers using usdma.
To a certain extent that and asyncronous I/O with iouring are two
independent goals. SO_DEVMEM imposes limitations on the stack because
it might hold opaque device mem. That is too strong for this case.

But for this iouring provider, is there anything ioring specific about
it beyond being user memory? If not, maybe just call it a umem
provider, and anticipate it being usable for AF_XDP in the future too?

Besides delivery up to the intended socket, packets may also end up
in other code paths, such as packet sockets or forwarding. All of
this is simpler with userspace backed buffers than with device mem.
But good to call out explicitly how this is handled. MSG_ZEROCOPY
makes a deep packet copy in unexpected code paths, for instance. To
avoid indefinite latency to buffer reclaim.

  parent reply	other threads:[~2023-12-21 17:57 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-19 21:03 [RFC PATCH v3 00/20] Zero copy Rx using io_uring David Wei
2023-12-19 21:03 ` [RFC PATCH v3 01/20] net: page_pool: add ppiov mangling helper David Wei
2023-12-19 23:22   ` Mina Almasry
2023-12-19 23:59     ` Pavel Begunkov
2023-12-19 21:03 ` [RFC PATCH v3 02/20] tcp: don't allow non-devmem originated ppiov David Wei
2023-12-19 23:24   ` Mina Almasry
2023-12-20  1:29     ` Pavel Begunkov
2024-01-02 16:11       ` Mina Almasry
2023-12-19 21:03 ` [RFC PATCH v3 03/20] net: page pool: rework ppiov life cycle David Wei
2023-12-19 23:35   ` Mina Almasry
2023-12-20  0:49     ` Pavel Begunkov
2023-12-19 21:03 ` [RFC PATCH v3 04/20] net: enable napi_pp_put_page for ppiov David Wei
2023-12-19 21:03 ` [RFC PATCH v3 05/20] net: page_pool: add ->scrub mem provider callback David Wei
2023-12-19 21:03 ` [RFC PATCH v3 06/20] io_uring: separate header for exported net bits David Wei
2023-12-20 16:01   ` Jens Axboe
2023-12-19 21:03 ` [RFC PATCH v3 07/20] io_uring: add interface queue David Wei
2023-12-20 16:13   ` Jens Axboe
2023-12-20 16:23     ` Pavel Begunkov
2023-12-21  1:44     ` David Wei
2023-12-21 17:57   ` Willem de Bruijn [this message]
2023-12-30 16:25     ` Pavel Begunkov
2023-12-31 22:25       ` Willem de Bruijn
2023-12-19 21:03 ` [RFC PATCH v3 08/20] io_uring: add mmap support for shared ifq ringbuffers David Wei
2023-12-20 16:13   ` Jens Axboe
2023-12-19 21:03 ` [RFC PATCH v3 09/20] netdev: add XDP_SETUP_ZC_RX command David Wei
2023-12-19 21:03 ` [RFC PATCH v3 10/20] io_uring: setup ZC for an Rx queue when registering an ifq David Wei
2023-12-20 16:06   ` Jens Axboe
2023-12-20 16:24     ` Pavel Begunkov
2023-12-19 21:03 ` [RFC PATCH v3 11/20] io_uring/zcrx: implement socket registration David Wei
2023-12-19 21:03 ` [RFC PATCH v3 12/20] io_uring: add ZC buf and pool David Wei
2023-12-19 21:03 ` [RFC PATCH v3 13/20] io_uring: implement pp memory provider for zc rx David Wei
2023-12-19 23:44   ` Mina Almasry
2023-12-20  0:39     ` Pavel Begunkov
2023-12-21 19:36   ` Pavel Begunkov
2023-12-19 21:03 ` [RFC PATCH v3 14/20] net: page pool: add io_uring memory provider David Wei
2023-12-19 23:39   ` Mina Almasry
2023-12-20  0:04     ` Pavel Begunkov
2023-12-19 21:03 ` [RFC PATCH v3 15/20] io_uring: add io_recvzc request David Wei
2023-12-20 16:27   ` Jens Axboe
2023-12-20 17:04     ` Pavel Begunkov
2023-12-20 18:09       ` Jens Axboe
2023-12-21 18:59         ` Pavel Begunkov
2023-12-21 21:32           ` Jens Axboe
2023-12-30 21:15             ` Pavel Begunkov
2023-12-19 21:03 ` [RFC PATCH v3 16/20] net: execute custom callback from napi David Wei
2023-12-19 21:03 ` [RFC PATCH v3 17/20] io_uring/zcrx: add copy fallback David Wei
2023-12-19 21:03 ` [RFC PATCH v3 18/20] veth: add support for io_uring zc rx David Wei
2023-12-19 21:03 ` [RFC PATCH v3 19/20] net: page pool: generalise ppiov dma address get David Wei
2023-12-21 19:51   ` Mina Almasry
2023-12-19 21:03 ` [RFC PATCH v3 20/20] bnxt: enable io_uring zc page pool David Wei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=65847c8f83f71_82de329487@willemb.c.googlers.com.notmuch \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox