public inbox for [email protected]
From: Dust Li <[email protected]>
To: Jonathan Lemon <[email protected]>, [email protected]
Cc: [email protected]
Subject: Re: [PATCH v1 00/15] zero-copy RX for io_uring
Date: Wed, 9 Nov 2022 14:37:42 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On Mon, Nov 07, 2022 at 09:05:06PM -0800, Jonathan Lemon wrote:
>This series introduces network RX zerocopy for io_uring.
>
>This is an evolution of the earlier zctap work, re-targeted to use
>io_uring as the userspace API.  The code is intended to provide a
>ZC RX path for upper-level networking protocols (such as TCP and UDP),
>with a focus on host-provided memory (not GPU memory).
>
>This patch contains the upper-level core code required for operation,
>but does not contain the network driver side changes required for
>true zero-copy operation.  The io_uring RECV_ZC opcode will work 
>without hardware support, albeit in copy mode.
>
>The intent is to use a network driver which provides header/data
>splitting, so the frame header which is processed by the networking
>stack is not placed in user memory.
>
>The code is successfully receiving a zero-copy TCP stream from a
>remote sender.
>
>There is a liburing fork providing the needed wrappers:
>
>    https://github.com/jlemon/liburing/tree/zctap
>
>Which contains an examples/io_uring-net test application exercising
>these features.  A sample run:
>
>  # ./io_uring-net -i eth1 -q 20 -p 9999 -r 3000
>   copy bytes: 1938872
>     ZC bytes: 996683008
>  Total bytes: 998621880, nsec:1025219375
>         Rate: 7.79 Gb/s
>
>If no queue is specified, then non-zc mode is used:
>
>  # ./io_uring-net -p 9999
>   copy bytes: 998621880
>     ZC bytes: 0
>  Total bytes: 998621880, nsec:1051515726
>         Rate: 7.60 Gb/s

I haven't dived into your test case yet, but the performance data
looks disappointing.

I don't know why we need zerocopy if we can't get a big performance
gain.
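
For reference, the two sample runs quoted above can be compared
directly. The following is a minimal sketch in Python (not part of the
series; the helper name rate_gbps is made up for illustration) that
recomputes both rates from the reported byte and nanosecond totals.
It assumes the printed rate is simply total bits divided by elapsed
nanoseconds, which matches the quoted output to two decimal places.

```python
def rate_gbps(total_bytes: int, nsec: int) -> float:
    """Bits per nanosecond is numerically equal to Gb/s (1e9 bits / 1e9 ns)."""
    return total_bytes * 8 / nsec


# Totals copied from the two sample runs quoted above.
zc = rate_gbps(998621880, 1025219375)    # run with "-q 20" (ZC mode)
copy = rate_gbps(998621880, 1051515726)  # run without a queue (copy mode)

# Relative throughput gain of the ZC run over the copy run, in percent.
gain_pct = (zc / copy - 1) * 100

# Share of the ZC run's bytes that actually took the zero-copy path.
zc_fraction = 996683008 / 998621880

print(f"ZC: {zc:.2f} Gb/s, copy: {copy:.2f} Gb/s, gain: {gain_pct:.1f}%")
print(f"zero-copy share of bytes: {zc_fraction:.4f}")
```

On these numbers the throughput difference is under 3%, even though
nearly all bytes in the first run went through the ZC path.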

Have you tested large messages with jumbo frames or LRO enabled?

Thanks

>
>There is also an iperf3 fork:
>
>   https://github.com/jlemon/iperf/tree/io_uring
>
>This allows running single tests with either:
>   * select (normal iperf3)
>   * io_uring READ
>   * io_uring RECV_ZC copy mode
>   * io_uring RECV_ZC hardware mode
>
>Current testing shows similar bandwidth between RECV_ZC and READ modes
>(running at 22 Gbit/sec), but a ~50% reduction in memory bandwidth.
>
>High level description:
>
>The application allocates a frame backing store, and provides this
>to the kernel for use.  An interface queue is requested from the
>networking device, and incoming frames are deposited into the provided
>memory region.  The NIC should provide a header splitting feature, so
>only the frame payload is placed in the user space area.
>
>Responsibility for correctly steering incoming frames to the queue
>is outside the scope of this work - it is assumed that the user 
>has set steering rules up separately.
>
>Incoming frames are sent up the stack as skb's and eventually
>land in the application's socket receive queue.  This differs
>from AF_XDP, which receives raw frames directly to userspace,
>without protocol processing.
>
>The RECV_ZC opcode then returns an iov[] style vector which points
>to the data in userspace memory.  When the application has completed
>processing of the data, the buffers are returned to the kernel
>through a fill ring for reuse.
>
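
The buffer lifecycle described above (application-owned backing store,
iov[]-style results pointing into it, and a fill ring for returning
buffers) can be modelled in a few lines. This is only a toy simulation
in Python under stated assumptions, not the zctap or io_uring API: the
class ToyZcRegion, its methods, and the 4096-byte frame size are all
hypothetical, and the "kernel side" here copies bytes where real
hardware would DMA the payload into place.

```python
from collections import deque

FRAME_SIZE = 4096  # assumed frame size, chosen only for illustration


class ToyZcRegion:
    """Toy model of the buffer lifecycle; hypothetical, not the zctap API."""

    def __init__(self, nframes: int):
        # Application-owned backing store handed to the "kernel".
        self.memory = bytearray(nframes * FRAME_SIZE)
        # Fill ring: indices of frames the "kernel" is allowed to fill.
        self.fill_ring = deque(range(nframes))

    def deliver(self, payload: bytes):
        """Kernel side: place payload in free frames, return (offset, length) pairs."""
        needed = -(-len(payload) // FRAME_SIZE)  # ceiling division
        if needed > len(self.fill_ring):
            raise BufferError("fill ring drained; application must return buffers")
        iov = []
        for pos in range(0, len(payload), FRAME_SIZE):
            chunk = payload[pos:pos + FRAME_SIZE]
            off = self.fill_ring.popleft() * FRAME_SIZE
            self.memory[off:off + len(chunk)] = chunk  # stands in for NIC DMA
            iov.append((off, len(chunk)))
        return iov

    def read(self, iov):
        """Application side: view the data in place instead of copying it out."""
        view = memoryview(self.memory)
        return [view[off:off + length] for off, length in iov]

    def recycle(self, iov):
        """Application side: hand finished frames back through the fill ring."""
        for off, _ in iov:
            self.fill_ring.append(off // FRAME_SIZE)


region = ToyZcRegion(nframes=4)
iov = region.deliver(b"x" * 5000)      # spans two frames
data = b"".join(bytes(v) for v in region.read(iov))
region.recycle(iov)                    # all four frames are free again
```

If the application never calls recycle(), deliver() eventually fails
because the fill ring is empty, which is the situation the last patch
in the series ("Notify the application as the fillq is drained")
appears to address.
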
>Jonathan Lemon (15):
>  io_uring: add zctap ifq definition
>  netdevice: add SETUP_ZCTAP to the netdev_bpf structure
>  io_uring: add register ifq opcode
>  io_uring: create a zctap region for a mapped buffer
>  io_uring: mark pages in ifq region with zctap information.
>  io_uring: Provide driver API for zctap packet buffers.
>  io_uring: Allocate zctap device buffers and dma map them.
>  io_uring: Add zctap buffer get/put functions and refcounting.
>  skbuff: Introduce SKBFL_FIXED_FRAG and skb_fixed()
>  io_uring: Allocate a uarg for use by the ifq RX
>  io_uring: Define the zctap iov[] returned to the user.
>  io_uring: add OP_RECV_ZC command.
>  io_uring: Make remove_ifq_region a delayed work call
>  io_uring: Add a buffer caching mechanism for zctap.
>  io_uring: Notify the application as the fillq is drained.
>
> include/linux/io_uring.h       |   47 ++
> include/linux/io_uring_types.h |   12 +
> include/linux/netdevice.h      |    6 +
> include/linux/skbuff.h         |   10 +-
> include/uapi/linux/io_uring.h  |   24 +
> io_uring/Makefile              |    3 +-
> io_uring/io_uring.c            |    8 +
> io_uring/kbuf.c                |   13 +
> io_uring/kbuf.h                |    2 +
> io_uring/net.c                 |  123 ++++
> io_uring/opdef.c               |   15 +
> io_uring/zctap.c               | 1001 ++++++++++++++++++++++++++++++++
> io_uring/zctap.h               |   31 +
> 13 files changed, 1293 insertions(+), 2 deletions(-)
> create mode 100644 io_uring/zctap.c
> create mode 100644 io_uring/zctap.h
>
>-- 
>2.30.2

Thread overview: 24+ messages
2022-11-08  5:05 [PATCH v1 00/15] zero-copy RX for io_uring Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 01/15] io_uring: add zctap ifq definition Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 02/15] netdevice: add SETUP_ZCTAP to the netdev_bpf structure Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 03/15] io_uring: add register ifq opcode Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 04/15] io_uring: create a zctap region for a mapped buffer Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 05/15] io_uring: mark pages in ifq region with zctap information Jonathan Lemon
2022-11-16  8:12   ` Christoph Hellwig
2022-11-17 20:48     ` Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 06/15] io_uring: Provide driver API for zctap packet buffers Jonathan Lemon
2022-11-16  8:17   ` Christoph Hellwig
2022-11-17 21:01     ` Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 07/15] io_uring: Allocate zctap device buffers and dma map them Jonathan Lemon
2022-11-16  8:15   ` Christoph Hellwig
2022-11-17 20:51     ` Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 08/15] io_uring: Add zctap buffer get/put functions and refcounting Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 09/15] skbuff: Introduce SKBFL_FIXED_FRAG and skb_fixed() Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 10/15] io_uring: Allocate a uarg for use by the ifq RX Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 11/15] io_uring: Define the zctap iov[] returned to the user Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 12/15] io_uring: add OP_RECV_ZC command Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 13/15] io_uring: Make remove_ifq_region a delayed work call Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 14/15] io_uring: Add a buffer caching mechanism for zctap Jonathan Lemon
2022-11-08  5:05 ` [PATCH v1 15/15] io_uring: Notify the application as the fillq is drained Jonathan Lemon
2022-11-09  6:37 ` Dust Li [this message]
2022-11-09 15:27   ` [PATCH v1 00/15] zero-copy RX for io_uring Jonathan Lemon
