From: Jens Axboe <[email protected]>
To: [email protected]
Cc: [email protected]
Subject: [PATCHSET v4 0/5] Improve MSG_RING DEFER_TASKRUN performance
Date: Tue, 18 Jun 2024 12:48:39 -0600
Message-ID: <[email protected]>
Hi,
For v1 and replies to that and tons of perf measurements, go here:
https://lore.kernel.org/io-uring/[email protected]/
and find v2 here:
https://lore.kernel.org/io-uring/[email protected]/
and v3 here:
https://lore.kernel.org/io-uring/[email protected]/
and you can find the git tree here:
https://git.kernel.dk/cgit/linux/log/?h=io_uring-msg-ring.1
and the silly test app being used here:
https://kernel.dk/msg-lat.c
Patches are based on top of the pending 6.11 io_uring changes.
tldr is that this series greatly improves the latency, overhead, and
throughput of sending messages to other rings. It does so by using the
existing io_uring task_work for passing messages, rather than the
big hammer of TWA_SIGNAL based generic kernel task_work.
Note that this differs from v3 of this posting, as that used the
CQE overflow approach. While the CQE overflow approach still performs
a bit better than this approach, this one is a bit cleaner.
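As a reading aid for the runs quoted below: each one reports
init_flags=3000. Assuming the test app prints that value in hex, it
is the OR of IORING_SETUP_SINGLE_ISSUER (1U << 12) and
IORING_SETUP_DEFER_TASKRUN (1U << 13), which is the setup this series
targets. A minimal sketch of that decoding follows. The macro names
and describe_init_flags() are made up for illustration and the values
are copied by hand from the uapi header, so treat it as a sketch
rather than anything the test app actually contains:

```c
#include <assert.h>
#include <string.h>

/* Values mirror include/uapi/linux/io_uring.h; they are redefined
 * here (under hypothetical names) so the sketch builds without
 * kernel headers. */
#define SETUP_SINGLE_ISSUER	(1U << 12)
#define SETUP_DEFER_TASKRUN	(1U << 13)

/* Hypothetical helper: write the names of the two flags of interest
 * that are set in 'flags' into 'buf', separated by '|'. Other bits
 * are ignored. Returns 'buf'. */
static const char *describe_init_flags(unsigned int flags, char *buf,
				       size_t len)
{
	assert(len > 0);
	buf[0] = '\0';
	if (flags & SETUP_SINGLE_ISSUER)
		strncat(buf, "SINGLE_ISSUER", len - strlen(buf) - 1);
	if (flags & SETUP_DEFER_TASKRUN) {
		if (buf[0])
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, "DEFER_TASKRUN", len - strlen(buf) - 1);
	}
	return buf;
}
```

Calling describe_init_flags(0x3000, buf, sizeof(buf)) with a 64 byte
buffer reports both flags as set.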
Performance for local (same node CPUs) message passing before this
change:
init_flags=3000, delay=10 usec
Latencies for: Receiver (msg=82631)
percentiles (nsec):
| 1.0000th=[ 3088], 5.0000th=[ 3088], 10.0000th=[ 3120],
| 20.0000th=[ 3248], 30.0000th=[ 3280], 40.0000th=[ 3312],
| 50.0000th=[ 3408], 60.0000th=[ 3440], 70.0000th=[ 3472],
| 80.0000th=[ 3504], 90.0000th=[ 3600], 95.0000th=[ 3696],
| 99.0000th=[ 6368], 99.5000th=[ 6496], 99.9000th=[ 6880],
| 99.9500th=[ 7008], 99.9900th=[12352]
Latencies for: Sender (msg=82631)
percentiles (nsec):
| 1.0000th=[ 5280], 5.0000th=[ 5280], 10.0000th=[ 5344],
| 20.0000th=[ 5408], 30.0000th=[ 5472], 40.0000th=[ 5472],
| 50.0000th=[ 5600], 60.0000th=[ 5600], 70.0000th=[ 5664],
| 80.0000th=[ 5664], 90.0000th=[ 5792], 95.0000th=[ 5920],
| 99.0000th=[ 8512], 99.5000th=[ 8640], 99.9000th=[ 8896],
| 99.9500th=[ 9280], 99.9900th=[19840]
and after:
init_flags=3000, delay=10 usec
Latencies for: Sender (msg=236763)
percentiles (nsec):
| 1.0000th=[ 225], 5.0000th=[ 245], 10.0000th=[ 278],
| 20.0000th=[ 294], 30.0000th=[ 330], 40.0000th=[ 378],
| 50.0000th=[ 418], 60.0000th=[ 466], 70.0000th=[ 524],
| 80.0000th=[ 604], 90.0000th=[ 708], 95.0000th=[ 804],
| 99.0000th=[ 1864], 99.5000th=[ 2480], 99.9000th=[ 2768],
| 99.9500th=[ 2864], 99.9900th=[ 3056]
Latencies for: Receiver (msg=236763)
percentiles (nsec):
| 1.0000th=[ 764], 5.0000th=[ 940], 10.0000th=[ 1096],
| 20.0000th=[ 1416], 30.0000th=[ 1736], 40.0000th=[ 2040],
| 50.0000th=[ 2352], 60.0000th=[ 2704], 70.0000th=[ 3152],
| 80.0000th=[ 3856], 90.0000th=[ 4960], 95.0000th=[ 6176],
| 99.0000th=[ 8032], 99.5000th=[ 8256], 99.9000th=[ 8768],
| 99.9500th=[10304], 99.9900th=[91648]
and for remote (different nodes) CPUs, before:
init_flags=3000, delay=10 usec
Latencies for: Receiver (msg=44002)
percentiles (nsec):
| 1.0000th=[ 7264], 5.0000th=[ 8384], 10.0000th=[ 8512],
| 20.0000th=[ 8640], 30.0000th=[ 8896], 40.0000th=[ 9024],
| 50.0000th=[ 9152], 60.0000th=[ 9280], 70.0000th=[ 9408],
| 80.0000th=[ 9536], 90.0000th=[ 9792], 95.0000th=[ 9920],
| 99.0000th=[10304], 99.5000th=[13376], 99.9000th=[19840],
| 99.9500th=[20608], 99.9900th=[25728]
Latencies for: Sender (msg=44002)
percentiles (nsec):
| 1.0000th=[11712], 5.0000th=[12864], 10.0000th=[12864],
| 20.0000th=[13120], 30.0000th=[13248], 40.0000th=[13376],
| 50.0000th=[13504], 60.0000th=[13760], 70.0000th=[13888],
| 80.0000th=[14144], 90.0000th=[14272], 95.0000th=[14400],
| 99.0000th=[15936], 99.5000th=[21632], 99.9000th=[24704],
| 99.9500th=[25984], 99.9900th=[37632]
and after the changes:
init_flags=3000, delay=10 usec
Latencies for: Sender (msg=192598)
percentiles (nsec):
| 1.0000th=[ 402], 5.0000th=[ 430], 10.0000th=[ 446],
| 20.0000th=[ 482], 30.0000th=[ 700], 40.0000th=[ 804],
| 50.0000th=[ 932], 60.0000th=[ 1176], 70.0000th=[ 1304],
| 80.0000th=[ 1480], 90.0000th=[ 1752], 95.0000th=[ 2128],
| 99.0000th=[ 2736], 99.5000th=[ 2928], 99.9000th=[ 4256],
| 99.9500th=[ 8768], 99.9900th=[12864]
Latencies for: Receiver (msg=192598)
percentiles (nsec):
| 1.0000th=[ 2024], 5.0000th=[ 2544], 10.0000th=[ 2928],
| 20.0000th=[ 3600], 30.0000th=[ 4048], 40.0000th=[ 4448],
| 50.0000th=[ 4896], 60.0000th=[ 5408], 70.0000th=[ 5920],
| 80.0000th=[ 6752], 90.0000th=[ 7904], 95.0000th=[ 9408],
| 99.0000th=[10816], 99.5000th=[11712], 99.9000th=[16320],
| 99.9500th=[18304], 99.9900th=[22656]
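To put rough numbers on the tables above, here is a small sketch that
turns the message counts and median (50th percentile) latencies into
before/after ratios. The struct and helper names are made up for
illustration. Reading the message count ratio as a throughput gain
assumes the before and after runs covered the same amount of
wall-clock time, which the output above does not state:

```c
#include <assert.h>

/* One run of the test app, reduced to the figures quoted above. */
struct run {
	unsigned int msgs;		/* messages passed during the run */
	unsigned int sender_p50;	/* sender median latency, nsec */
	unsigned int receiver_p50;	/* receiver median latency, nsec */
};

static const struct run local_before  = {  82631,  5600, 3408 };
static const struct run local_after   = { 236763,   418, 2352 };
static const struct run remote_before = {  44002, 13504, 9152 };
static const struct run remote_after  = { 192598,   932, 4896 };

/* How many times more messages the 'after' run passed. */
static double msg_gain(const struct run *before, const struct run *after)
{
	assert(before->msgs > 0);
	return (double)after->msgs / before->msgs;
}

/* How many times lower the sender median latency is after the change. */
static double sender_p50_gain(const struct run *before,
			      const struct run *after)
{
	assert(after->sender_p50 > 0);
	return (double)before->sender_p50 / after->sender_p50;
}

/* How many times lower the receiver median latency is after the change. */
static double receiver_p50_gain(const struct run *before,
				const struct run *after)
{
	assert(after->receiver_p50 > 0);
	return (double)before->receiver_p50 / after->receiver_p50;
}
```

With these inputs the message count goes up by roughly 2.9x for the
local case and 4.4x for the remote case. Sender median latency drops
by more than 13x in both cases, while the receiver median improves by
a more modest 1.4x (local) and 1.9x (remote).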
include/linux/io_uring_types.h | 3 +
io_uring/io_uring.c | 53 ++++++++++++---
io_uring/io_uring.h | 3 +
io_uring/msg_ring.c | 119 ++++++++++++++++++++-------------
io_uring/msg_ring.h | 1 +
5 files changed, 124 insertions(+), 55 deletions(-)
Since v3:
- Switch back to task_work approach, rather than utilize overflows
for this
- Retain old task_work approach for fd passing
- Various tweaks and cleanups
--
Jens Axboe
Thread overview: 7+ messages
2024-06-18 18:48 Jens Axboe [this message]
2024-06-18 18:48 ` [PATCH 1/5] io_uring/msg_ring: tighten requirement for remote posting Jens Axboe
2024-06-18 18:48 ` [PATCH 2/5] io_uring: add remote task_work execution helper Jens Axboe
2024-06-18 18:48 ` [PATCH 3/5] io_uring: add io_add_aux_cqe() helper Jens Axboe
2024-06-18 18:48 ` [PATCH 4/5] io_uring/msg_ring: improve handling of target CQE posting Jens Axboe
2024-07-01 13:06 ` Pavel Begunkov
2024-06-18 18:48 ` [PATCH 5/5] io_uring/msg_ring: add an alloc cache for io_kiocb entries Jens Axboe