From: Dylan Yudaken <[email protected]>
To: Jens Axboe <[email protected]>,
	Pavel Begunkov <[email protected]>,
	<[email protected]>
Cc: <[email protected]>, <[email protected]>,
	Dylan Yudaken <[email protected]>
Subject: [PATCH v2 for-next 00/12] io_uring: multishot recv
Date: Thu, 30 Jun 2022 02:12:19 -0700
Message-ID: <[email protected]>

This series adds support for multishot recv/recvmsg to io_uring.

The idea is that generally socket applications will be continually
enqueuing a new recv() when the previous one completes. This can be
improved on by allowing the application to queue a multishot receive,
which will post completions as and when data is available. It uses the
provided buffers feature to receive new data into a pool provided by
the application.
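
As a rough illustration of what the application sees per completion, here is
a minimal sketch of decoding one multishot recv CQE. The flag values match
include/uapi/linux/io_uring.h; struct sketch_cqe, struct sketch_recv_event
and sketch_decode() are hypothetical names used only for this example, and
the struct merely mirrors the first three fields of struct io_uring_cqe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same values as the definitions in include/uapi/linux/io_uring.h. */
#define SKETCH_CQE_F_BUFFER     (1U << 0)  /* IORING_CQE_F_BUFFER */
#define SKETCH_CQE_F_MORE       (1U << 1)  /* IORING_CQE_F_MORE */
#define SKETCH_CQE_BUFFER_SHIFT 16         /* IORING_CQE_BUFFER_SHIFT */

/* Hypothetical stand-in for the leading fields of struct io_uring_cqe. */
struct sketch_cqe {
	uint64_t user_data;
	int32_t  res;
	uint32_t flags;
};

/* Hypothetical decoded view of one multishot recv completion. */
struct sketch_recv_event {
	bool     has_buffer; /* a provided buffer was consumed */
	uint16_t buffer_id;  /* valid only if has_buffer is set */
	int32_t  res;        /* bytes received, or a negative errno */
	bool     more;       /* further completions will follow */
};

static struct sketch_recv_event sketch_decode(const struct sketch_cqe *cqe)
{
	struct sketch_recv_event ev;

	assert(cqe != NULL);
	ev.has_buffer = (cqe->flags & SKETCH_CQE_F_BUFFER) != 0;
	/* The buffer ID lives in the upper 16 bits of the flags word. */
	ev.buffer_id = ev.has_buffer ?
		(uint16_t)(cqe->flags >> SKETCH_CQE_BUFFER_SHIFT) : 0;
	ev.res = cqe->res;
	ev.more = (cqe->flags & SKETCH_CQE_F_MORE) != 0;
	return ev;
}
```

The application would look up the data by buffer_id in its own pool, process
res bytes, and hand the buffer back to the kernel when done.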

This is more performant in a few ways:
* Subsequent receives are queued up straight away without requiring the
  application to finish a processing loop.
* If there is more data in the socket (say the provided buffer
  size is smaller than the socket buffer) then the data is immediately
  returned, improving batching.
* Poll is only armed once and reused, saving CPU cycles.

Running a small network benchmark [1] shows improved QPS of ~6-8% over a range of loads.

[1]: https://github.com/DylanZA/netbench/tree/multishot_recv

While building this I noticed a small problem in multishot poll which is a really
big problem for receive. If CQEs overflow, then they will be returned to the user
out of order. This is annoying for the existing use cases of poll and accept but
doesn't totally break the functionality, as both of these return results that
aren't strictly ordered except for the IORING_CQE_F_MORE flag. For receive,
ordering is obviously a critical requirement, as otherwise data will be received
out of order by the application.

To fix this, when a multishot CQE hits overflow the multishot request is
terminated. The application should then process CQEs until it sees that CQE and,
noticing that IORING_CQE_F_MORE is not set, can re-issue the multishot request.
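
The application side of this can be sketched roughly as below. This is a
minimal, self-contained model rather than real liburing code: struct ms_cqe
and ms_drain_needs_rearm() are hypothetical names, and the array stands in
for the CQEs reaped from the ring in order. Only the IORING_CQE_F_MORE value
is taken from the uapi header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MS_CQE_F_MORE (1U << 1) /* same value as IORING_CQE_F_MORE */

/* Hypothetical stand-in for the leading fields of struct io_uring_cqe. */
struct ms_cqe {
	uint64_t user_data;
	int32_t  res;
	uint32_t flags;
};

/*
 * Walk the reaped CQEs in order. Returns true if the request tagged `tag`
 * posted a CQE without F_MORE, meaning the multishot request has ended and
 * the caller should re-issue it. *consumed is set to the number of CQEs
 * walked, up to and including that terminating CQE.
 */
static bool ms_drain_needs_rearm(const struct ms_cqe *cqes, size_t n,
				 uint64_t tag, size_t *consumed)
{
	size_t i;

	assert(cqes != NULL || n == 0);
	assert(consumed != NULL);

	for (i = 0; i < n; i++) {
		if (cqes[i].user_data != tag)
			continue; /* belongs to some other request */

		/* The application would consume cqes[i].res here. */

		if (!(cqes[i].flags & MS_CQE_F_MORE)) {
			*consumed = i + 1;
			return true;
		}
	}

	*consumed = n;
	return false;
}
```

When this returns true the caller would queue a fresh multishot recv with the
same tag. Because the earlier CQEs are all handled before the terminating one,
the data stays in order across the re-issue.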

Patches:
1-3: relax restrictions around provided buffers to allow a length of 0
4: recycles more buffers on the kernel side in error conditions
5-6: clean up multishot poll API a bit allowing it to end with successful
error conditions
7-9: fix existing problems with multishot poll and accept on overflow
10: is the multishot receive patch
11-12: are small fixes to tracing of CQEs

v2:
* Added patches 6,7,8 (fixing multishot poll bugs)
* Added patches 11,12 (trace cleanups)
* Added io_recv_finish to reduce duplicate logic


Dylan Yudaken (12):
  io_uring: allow 0 length for buffer select
  io_uring: restore bgid in io_put_kbuf
  io_uring: allow iov_len = 0 for recvmsg and buffer select
  io_uring: recycle buffers on error
  io_uring: clean up io_poll_check_events return values
  io_uring: add IOU_STOP_MULTISHOT return code
  io_uring: add allow_overflow to io_post_aux_cqe
  io_uring: fix multishot poll on overflow
  io_uring: fix multishot accept ordering
  io_uring: multishot recv
  io_uring: fix io_uring_cqe_overflow trace format
  io_uring: only trace one of complete or overflow

 include/trace/events/io_uring.h |   2 +-
 include/uapi/linux/io_uring.h   |   5 ++
 io_uring/io_uring.c             |  17 ++--
 io_uring/io_uring.h             |  20 +++--
 io_uring/kbuf.c                 |   4 +-
 io_uring/kbuf.h                 |   9 ++-
 io_uring/msg_ring.c             |   4 +-
 io_uring/net.c                  | 139 ++++++++++++++++++++++++++------
 io_uring/poll.c                 |  44 ++++++----
 io_uring/rsrc.c                 |   4 +-
 10 files changed, 190 insertions(+), 58 deletions(-)


base-commit: 864a15ca4f196184e3f44d72efc1782a7017cbbd
-- 
2.30.2



Thread overview: 15+ messages
2022-06-30  9:12 Dylan Yudaken [this message]
2022-06-30  9:12 ` [PATCH v2 for-next 01/12] io_uring: allow 0 length for buffer select Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 02/12] io_uring: restore bgid in io_put_kbuf Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 03/12] io_uring: allow iov_len = 0 for recvmsg and buffer select Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 04/12] io_uring: recycle buffers on error Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 05/12] io_uring: clean up io_poll_check_events return values Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 06/12] io_uring: add IOU_STOP_MULTISHOT return code Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 07/12] io_uring: add allow_overflow to io_post_aux_cqe Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 08/12] io_uring: fix multishot poll on overflow Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 09/12] io_uring: fix multishot accept ordering Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 10/12] io_uring: multishot recv Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 11/12] io_uring: fix io_uring_cqe_overflow trace format Dylan Yudaken
2022-06-30  9:12 ` [PATCH v2 for-next 12/12] io_uring: only trace one of complete or overflow Dylan Yudaken
2022-06-30 20:19 ` [PATCH v2 for-next 00/12] io_uring: multishot recv Jens Axboe
2022-06-30 20:32 ` Jens Axboe