From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>, [email protected]
Subject: Re: [PATCHSET 0/3] Improve MSG_RING SINGLE_ISSUER performance
Date: Tue, 28 May 2024 19:35:23 -0600 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 5/28/24 5:04 PM, Jens Axboe wrote:
> On 5/28/24 12:31 PM, Jens Axboe wrote:
>> I suspect a bug in the previous patches, because this is what the
>> forward port looks like. First, for reference, the current results:
>
> Got it sorted, and pinned sender and receiver on CPUs to avoid the
> variation. It looks like this with the task_work approach that I sent
> out as v1:
>
> Latencies for: Sender
> percentiles (nsec):
> | 1.0000th=[ 2160], 5.0000th=[ 2672], 10.0000th=[ 2768],
> | 20.0000th=[ 3568], 30.0000th=[ 3568], 40.0000th=[ 3600],
> | 50.0000th=[ 3600], 60.0000th=[ 3600], 70.0000th=[ 3632],
> | 80.0000th=[ 3632], 90.0000th=[ 3664], 95.0000th=[ 3696],
> | 99.0000th=[ 4832], 99.5000th=[15168], 99.9000th=[16192],
> | 99.9500th=[16320], 99.9900th=[18304]
> Latencies for: Receiver
> percentiles (nsec):
> | 1.0000th=[ 1528], 5.0000th=[ 1576], 10.0000th=[ 1656],
> | 20.0000th=[ 2040], 30.0000th=[ 2064], 40.0000th=[ 2064],
> | 50.0000th=[ 2064], 60.0000th=[ 2064], 70.0000th=[ 2096],
> | 80.0000th=[ 2096], 90.0000th=[ 2128], 95.0000th=[ 2160],
> | 99.0000th=[ 3472], 99.5000th=[14784], 99.9000th=[15168],
> | 99.9500th=[15424], 99.9900th=[17280]
>
> and here's the exact same test run on the current patches:
>
> Latencies for: Sender
> percentiles (nsec):
> | 1.0000th=[ 362], 5.0000th=[ 362], 10.0000th=[ 370],
> | 20.0000th=[ 370], 30.0000th=[ 370], 40.0000th=[ 370],
> | 50.0000th=[ 374], 60.0000th=[ 382], 70.0000th=[ 382],
> | 80.0000th=[ 382], 90.0000th=[ 382], 95.0000th=[ 390],
> | 99.0000th=[ 402], 99.5000th=[ 430], 99.9000th=[ 900],
> | 99.9500th=[ 972], 99.9900th=[ 1432]
> Latencies for: Receiver
> percentiles (nsec):
> | 1.0000th=[ 1528], 5.0000th=[ 1544], 10.0000th=[ 1560],
> | 20.0000th=[ 1576], 30.0000th=[ 1592], 40.0000th=[ 1592],
> | 50.0000th=[ 1592], 60.0000th=[ 1608], 70.0000th=[ 1608],
> | 80.0000th=[ 1640], 90.0000th=[ 1672], 95.0000th=[ 1688],
> | 99.0000th=[ 1848], 99.5000th=[ 2128], 99.9000th=[14272],
> | 99.9500th=[14784], 99.9900th=[73216]
>
> I'll try and augment the test app to do proper rated submissions, so I
> can ramp up the rates a bit and see what happens.
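A rated sender along those lines can be sketched roughly like this (a
minimal standalone illustration; the struct and helper names here are
mine, not the actual test app's):

```c
#include <stdint.h>

/* Hypothetical pacing state: one send every interval_ns nanoseconds. */
struct pacer {
	uint64_t interval_ns;	/* time between sends */
	uint64_t next_ns;	/* deadline for the next send */
};

static void pacer_init(struct pacer *p, uint64_t msgs_per_sec, uint64_t now_ns)
{
	p->interval_ns = 1000000000ULL / msgs_per_sec;
	p->next_ns = now_ns;
}

/*
 * Returns nonzero if a send is due. The deadline advances by one full
 * interval rather than resetting to 'now', so a briefly stalled sender
 * catches back up and the long-run rate stays at the target.
 */
static int pacer_should_send(struct pacer *p, uint64_t now_ns)
{
	if (now_ns < p->next_ns)
		return 0;
	p->next_ns += p->interval_ns;
	return 1;
}
```

The send loop would then check pacer_should_send() against a
CLOCK_MONOTONIC timestamp before issuing each MSG_RING sqe.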
And the final one, with the rated sends sorted out. One key observation
is that v1 trails the current edition; it just can't keep up as the rate
is increased. If we cap the rate at what should be 33K messages per
second, v1 gets ~28K messages and has the following latency profile (for
a 3-second run):
Latencies for: Receiver (msg=83863)
percentiles (nsec):
| 1.0000th=[ 1208], 5.0000th=[ 1336], 10.0000th=[ 1400],
| 20.0000th=[ 1768], 30.0000th=[ 1912], 40.0000th=[ 1976],
| 50.0000th=[ 2040], 60.0000th=[ 2160], 70.0000th=[ 2256],
| 80.0000th=[ 2480], 90.0000th=[ 2736], 95.0000th=[ 3024],
| 99.0000th=[ 4080], 99.5000th=[ 4896], 99.9000th=[ 9664],
| 99.9500th=[ 17024], 99.9900th=[218112]
Latencies for: Sender (msg=83863)
percentiles (nsec):
| 1.0000th=[ 1928], 5.0000th=[ 2064], 10.0000th=[ 2160],
| 20.0000th=[ 2608], 30.0000th=[ 2672], 40.0000th=[ 2736],
| 50.0000th=[ 2864], 60.0000th=[ 2960], 70.0000th=[ 3152],
| 80.0000th=[ 3408], 90.0000th=[ 4128], 95.0000th=[ 4576],
| 99.0000th=[ 5920], 99.5000th=[ 6752], 99.9000th=[ 13376],
| 99.9500th=[ 22912], 99.9900th=[261120]
and the current edition does:
Latencies for: Sender (msg=94488)
percentiles (nsec):
| 1.0000th=[ 181], 5.0000th=[ 191], 10.0000th=[ 201],
| 20.0000th=[ 215], 30.0000th=[ 225], 40.0000th=[ 235],
| 50.0000th=[ 262], 60.0000th=[ 306], 70.0000th=[ 430],
| 80.0000th=[ 1004], 90.0000th=[ 2480], 95.0000th=[ 3632],
| 99.0000th=[ 8096], 99.5000th=[12352], 99.9000th=[18048],
| 99.9500th=[19584], 99.9900th=[23680]
Latencies for: Receiver (msg=94488)
percentiles (nsec):
| 1.0000th=[ 342], 5.0000th=[ 398], 10.0000th=[ 482],
| 20.0000th=[ 652], 30.0000th=[ 812], 40.0000th=[ 972],
| 50.0000th=[ 1240], 60.0000th=[ 1640], 70.0000th=[ 1944],
| 80.0000th=[ 2448], 90.0000th=[ 3248], 95.0000th=[ 5216],
| 99.0000th=[10304], 99.5000th=[12352], 99.9000th=[18048],
| 99.9500th=[19840], 99.9900th=[23168]
If we cap it where v1 keeps up, at 13K messages per second, v1 does:
Latencies for: Receiver (msg=38820)
percentiles (nsec):
| 1.0000th=[ 1160], 5.0000th=[ 1256], 10.0000th=[ 1352],
| 20.0000th=[ 1688], 30.0000th=[ 1928], 40.0000th=[ 1976],
| 50.0000th=[ 2064], 60.0000th=[ 2384], 70.0000th=[ 2480],
| 80.0000th=[ 2768], 90.0000th=[ 3280], 95.0000th=[ 3472],
| 99.0000th=[ 4192], 99.5000th=[ 4512], 99.9000th=[ 6624],
| 99.9500th=[ 8768], 99.9900th=[14272]
Latencies for: Sender (msg=38820)
percentiles (nsec):
| 1.0000th=[ 1848], 5.0000th=[ 1928], 10.0000th=[ 2040],
| 20.0000th=[ 2608], 30.0000th=[ 2640], 40.0000th=[ 2736],
| 50.0000th=[ 3024], 60.0000th=[ 3120], 70.0000th=[ 3376],
| 80.0000th=[ 3824], 90.0000th=[ 4512], 95.0000th=[ 4768],
| 99.0000th=[ 5536], 99.5000th=[ 6048], 99.9000th=[ 9024],
| 99.9500th=[10304], 99.9900th=[23424]
and v2 does:
Latencies for: Sender (msg=39005)
percentiles (nsec):
| 1.0000th=[ 191], 5.0000th=[ 211], 10.0000th=[ 262],
| 20.0000th=[ 342], 30.0000th=[ 382], 40.0000th=[ 402],
| 50.0000th=[ 450], 60.0000th=[ 532], 70.0000th=[ 1080],
| 80.0000th=[ 1848], 90.0000th=[ 4768], 95.0000th=[10944],
| 99.0000th=[16512], 99.5000th=[18304], 99.9000th=[22400],
| 99.9500th=[26496], 99.9900th=[41728]
Latencies for: Receiver (msg=39005)
percentiles (nsec):
| 1.0000th=[ 410], 5.0000th=[ 604], 10.0000th=[ 700],
| 20.0000th=[ 900], 30.0000th=[ 1128], 40.0000th=[ 1320],
| 50.0000th=[ 1672], 60.0000th=[ 2256], 70.0000th=[ 2736],
| 80.0000th=[ 3760], 90.0000th=[ 5408], 95.0000th=[11072],
| 99.0000th=[18304], 99.5000th=[20096], 99.9000th=[24704],
| 99.9500th=[27520], 99.9900th=[35584]
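For reference, the percentile lines above are fio-style nearest-rank
percentiles over the per-message latency samples. A simplified sketch of
the computation (my own illustration, not the actual reporting code,
which buckets samples rather than sorting them):

```c
#include <stdint.h>
#include <stdlib.h>

static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/*
 * Nearest-rank percentile: sort the samples, then take the entry at
 * rank ceil(pct / 100 * n). Sorts 'samples' in place.
 */
static uint64_t percentile_ns(uint64_t *samples, size_t n, double pct)
{
	size_t rank = (size_t)(pct * n / 100.0);

	qsort(samples, n, sizeof(*samples), cmp_u64);
	if ((double)rank * 100.0 < pct * n)
		rank++;		/* round up without needing math.h */
	if (!rank)
		rank = 1;
	return samples[rank - 1];
}
```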
--
Jens Axboe
Thread overview: 23+ messages
2024-05-24 22:58 [PATCHSET 0/3] Improve MSG_RING SINGLE_ISSUER performance Jens Axboe
2024-05-24 22:58 ` [PATCH 1/3] io_uring/msg_ring: split fd installing into a helper Jens Axboe
2024-05-24 22:58 ` [PATCH 2/3] io_uring/msg_ring: avoid double indirection task_work for data messages Jens Axboe
2024-05-28 13:18 ` Pavel Begunkov
2024-05-28 14:23 ` Jens Axboe
2024-05-28 13:32 ` Pavel Begunkov
2024-05-28 14:23 ` Jens Axboe
2024-05-28 16:23 ` Pavel Begunkov
2024-05-28 17:59 ` Jens Axboe
2024-05-29 2:04 ` Pavel Begunkov
2024-05-29 2:43 ` Jens Axboe
2024-05-24 22:58 ` [PATCH 3/3] io_uring/msg_ring: avoid double indirection task_work for fd passing Jens Axboe
2024-05-28 13:31 ` [PATCHSET 0/3] Improve MSG_RING SINGLE_ISSUER performance Pavel Begunkov
2024-05-28 14:34 ` Jens Axboe
2024-05-28 14:39 ` Jens Axboe
2024-05-28 15:27 ` Jens Axboe
2024-05-28 16:50 ` Pavel Begunkov
2024-05-28 18:07 ` Jens Axboe
2024-05-28 18:31 ` Jens Axboe
2024-05-28 23:04 ` Jens Axboe
2024-05-29 1:35 ` Jens Axboe [this message]
2024-05-29 2:08 ` Pavel Begunkov
2024-05-29 2:42 ` Jens Axboe