public inbox for [email protected]
From: Jonathan Lemon <[email protected]>
To: <[email protected]>
Cc: <[email protected]>
Subject: [RFC PATCH v3 00/15] zero-copy RX for io_uring
Date: Wed, 2 Nov 2022 16:32:29 -0700
Message-ID: <[email protected]>

This series is an RFC for io_uring/zctap.  This is an evolution of
the earlier zctap work, re-targeted to use io_uring as the userspace
API.  The current code is intended to provide a zero-copy RX path for
upper-level networking protocols (i.e. TCP and UDP).  The current draft
focuses on host-provided memory (not GPU memory).

This RFC contains the upper-level core code required for operation,
with the intent of soliciting feedback on the general API.  This does
not contain the network driver side changes required for complete
operation.  Also please note that as an RFC, there are some things
which are incomplete or in need of rework.

The intent is to use a network driver which provides header/data
splitting, so the frame header (which is processed by the networking
stack) does not reside in user memory.

The code is successfully receiving a zero-copy TCP stream from a
remote sender.  As an RFC, the intent is to solicit feedback on the
API and overall design.  The current code will also work with
system pages, copying the data out to the application - this is
intended as a fallback/testing path.

There is a liburing fork:

  https://github.com/jlemon/liburing/tree/zctap

which contains an examples/io_uring-net test application exercising
these features.  A sample run:

  # ./io_uring-net -i eth1 -q 20 -p 9999 -r 3000
   copy bytes: 1938872
     ZC bytes: 996683008
  Total bytes: 998621880, nsec:1025219375
         Rate: 7.79 Gb/s

If no queue is specified, then non-zc mode is used:

  # ./io_uring-net -p 9999
   copy bytes: 998621880
     ZC bytes: 0
  Total bytes: 998621880, nsec:1051515726
         Rate: 7.60 Gb/s
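
For reference, the Rate lines above are consistent with dividing
the total bit count by the elapsed nanoseconds, since bits per
nanosecond is numerically equal to Gb/s.  The helper below is only a
sketch of that arithmetic; rate_gbps() is a name made up for this
illustration, and the test application may compute the figure
differently.

```c
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name, not taken from the test app).
 * bytes * 8 gives bits; bits per nanosecond equals gigabits per second.
 * Returns 0.0 when no time has elapsed, to avoid dividing by zero.
 */
static double rate_gbps(uint64_t bytes, uint64_t nsec)
{
	if (nsec == 0)
		return 0.0;
	return (double)bytes * 8.0 / (double)nsec;
}
```

With the figures from the two runs, rate_gbps(998621880, 1025219375)
and rate_gbps(998621880, 1051515726) round to the reported 7.79 and
7.60.  In the first run the copy and ZC byte counts sum to the total
(1938872 + 996683008 = 998621880), so about 99.8% of the payload took
the zero-copy path.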

High level description:

The application allocates a frame backing store, and provides this
to the kernel for use.  An interface queue is requested from the
networking device, and incoming frames are deposited into the provided
memory region.
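
As a rough userspace sketch of the first step, the backing store can
be thought of as a page-aligned allocation carved into fixed-size
buffers that are addressed by index.  Everything here (struct
frame_region and its helpers, and the assumption of one buffer per
page) is hypothetical and for illustration only.  Handing the region
to the kernel is not shown, because the registration uapi is defined
in the patches rather than in this cover letter.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical application-side view of the frame backing store. */
struct frame_region {
	uint8_t *base;      /* start of the page-aligned allocation */
	size_t   buf_size;  /* bytes per buffer (assumed: one page) */
	uint32_t nr_bufs;   /* number of buffers in the region */
};

/* Allocate nr_bufs page-sized buffers; returns 0 on success, -1 on error. */
static int frame_region_alloc(struct frame_region *r, uint32_t nr_bufs,
			      size_t page_size)
{
	/* page_size must be a non-zero power of two */
	if (nr_bufs == 0 || page_size == 0 || (page_size & (page_size - 1)))
		return -1;

	/* C11 aligned_alloc: total size is a multiple of the alignment */
	r->base = aligned_alloc(page_size, (size_t)nr_bufs * page_size);
	if (!r->base)
		return -1;

	r->buf_size = page_size;
	r->nr_bufs = nr_bufs;
	return 0;
}

/* Translate a buffer index into its address, or NULL if out of range. */
static void *frame_region_buf(const struct frame_region *r, uint32_t idx)
{
	if (idx >= r->nr_bufs)
		return NULL;
	return r->base + (size_t)idx * r->buf_size;
}

static void frame_region_free(struct frame_region *r)
{
	free(r->base);
	r->base = NULL;
	r->nr_bufs = 0;
}
```

Addressing buffers by index rather than by pointer keeps the
application and kernel views of the region independent of where each
side has it mapped.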

Responsibility for correctly steering incoming frames to the queue
is outside the scope of this work - it is assumed that the user 
has set steering rules up separately.

Incoming frames are sent up the stack as skb's and eventually
land in the application's socket receive queue.  This differs
from AF_XDP, which receives raw frames directly to userspace,
without protocol processing.

The RECV_ZC opcode then returns an iov[] style vector which points
to the data in userspace memory.  When the application has finished
processing the data, the buffer is returned to the kernel through a
fill ring for reuse.
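
The receive/recycle cycle described above can be modelled roughly as
follows.  This is a single-threaded sketch under stated assumptions,
not the uapi from this series: struct zc_desc, struct fill_ring and
their fields are invented for illustration (the real iov[] layout is
defined in patch 11), and a ring actually shared with the kernel
would also need acquire/release ordering on the head and tail
indices.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one returned data segment. */
struct zc_desc {
	uint32_t bid;  /* buffer index within the registered region */
	uint32_t off;  /* payload offset inside that buffer */
	uint32_t len;  /* payload length in bytes */
};

/* Hypothetical fill ring of buffer indices; size is a power of two. */
struct fill_ring {
	uint32_t *slots;
	uint32_t  mask;  /* number of entries - 1 */
	uint32_t  head;  /* consumer position (kernel side in the design) */
	uint32_t  tail;  /* producer position (application side) */
};

/* Resolve a descriptor to a pointer; NULL if it overruns its buffer. */
static const uint8_t *zc_desc_data(const uint8_t *base, size_t buf_size,
				   const struct zc_desc *d)
{
	if ((size_t)d->off + d->len > buf_size)
		return NULL;
	return base + (size_t)d->bid * buf_size + d->off;
}

/* Application side: hand a processed buffer back; -1 if the ring is full. */
static int fill_ring_push(struct fill_ring *q, uint32_t bid)
{
	if (q->tail - q->head > q->mask)
		return -1;
	q->slots[q->tail & q->mask] = bid;
	q->tail++;
	return 0;
}

/* Consumer side, included only so the model can be exercised. */
static int fill_ring_pop(struct fill_ring *q, uint32_t *bid)
{
	if (q->head == q->tail)
		return -1;
	*bid = q->slots[q->head & q->mask];
	q->head++;
	return 0;
}
```

In this model the application would call zc_desc_data() for each
entry of the returned vector, process the payload in place, and then
call fill_ring_push() with the buffer index once it no longer needs
the data.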

Changelog:
 v1: initial version
 v2: Remove separate PROVIDE_REGION opcode, fold this functionality
     into REGISTER_IFQ.  Remove page_pool hooks, as it appears the 
     page pool is currently incompatible with user-mapped memory.
     Add io_zctap_buffers and network driver API.
 v3: Change freelist so it holds a zctap buffer index, instead of
     a pointer.  Add caching mechanism for better performance.  Add
     notify mechanism which informs the app as buffers are removed
     from the fillq.  Clean up some refcount issues.

Jonathan Lemon (15):
  io_uring: add zctap ifq definition
  netdevice: add SETUP_ZCTAP to the netdev_bpf structure
  io_uring: add register ifq opcode
  io_uring: create a zctap region for a mapped buffer
  io_uring: mark pages in ifq region with zctap information.
  io_uring: Provide driver API for zctap packet buffers.
  io_uring: Allocate zctap device buffers and dma map them.
  io_uring: Add zctap buffer get/put functions and refcounting.
  skbuff: Introduce SKBFL_FIXED_FRAG and skb_fixed()
  io_uring: Allocate a uarg for use by the ifq RX
  io_uring: Define the zctap iov[] returned to the user.
  io_uring: add OP_RECV_ZC command.
  io_uring: Make remove_ifq_region a delayed work call
  io_uring: Add a buffer caching mechanism for zctap.
  io_uring: Notify the application as the fillq is drained.

 include/linux/io_uring.h       |  47 ++
 include/linux/io_uring_types.h |  12 +
 include/linux/netdevice.h      |   6 +
 include/linux/skbuff.h         |  10 +-
 include/uapi/linux/io_uring.h  |  24 +
 io_uring/Makefile              |   3 +-
 io_uring/io_uring.c            |   8 +
 io_uring/kbuf.c                |  13 +
 io_uring/kbuf.h                |   2 +
 io_uring/net.c                 | 121 ++++
 io_uring/opdef.c               |  15 +
 io_uring/zctap.c               | 976 +++++++++++++++++++++++++++++++++
 io_uring/zctap.h               |  31 ++
 13 files changed, 1266 insertions(+), 2 deletions(-)
 create mode 100644 io_uring/zctap.c
 create mode 100644 io_uring/zctap.h

-- 
2.30.2


