public inbox for [email protected]
From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Subject: Re: [PATCHSET 0/3] Improve MSG_RING SINGLE_ISSUER performance
Date: Wed, 29 May 2024 03:08:33 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 5/29/24 02:35, Jens Axboe wrote:
> On 5/28/24 5:04 PM, Jens Axboe wrote:
>> On 5/28/24 12:31 PM, Jens Axboe wrote:
>>> I suspect a bug in the previous patches, because this is what the
>>> forward port looks like. First, for reference, the current results:
>>
>> Got it sorted, and pinned sender and receiver on CPUs to avoid the
>> variation. It looks like this with the task_work approach that I sent
>> out as v1:
>>
>> Latencies for: Sender
>>      percentiles (nsec):
>>       |  1.0000th=[ 2160],  5.0000th=[ 2672], 10.0000th=[ 2768],
>>       | 20.0000th=[ 3568], 30.0000th=[ 3568], 40.0000th=[ 3600],
>>       | 50.0000th=[ 3600], 60.0000th=[ 3600], 70.0000th=[ 3632],
>>       | 80.0000th=[ 3632], 90.0000th=[ 3664], 95.0000th=[ 3696],
>>       | 99.0000th=[ 4832], 99.5000th=[15168], 99.9000th=[16192],
>>       | 99.9500th=[16320], 99.9900th=[18304]
>> Latencies for: Receiver
>>      percentiles (nsec):
>>       |  1.0000th=[ 1528],  5.0000th=[ 1576], 10.0000th=[ 1656],
>>       | 20.0000th=[ 2040], 30.0000th=[ 2064], 40.0000th=[ 2064],
>>       | 50.0000th=[ 2064], 60.0000th=[ 2064], 70.0000th=[ 2096],
>>       | 80.0000th=[ 2096], 90.0000th=[ 2128], 95.0000th=[ 2160],
>>       | 99.0000th=[ 3472], 99.5000th=[14784], 99.9000th=[15168],
>>       | 99.9500th=[15424], 99.9900th=[17280]
>>
>> and here's the exact same test run on the current patches:
>>
>> Latencies for: Sender
>>      percentiles (nsec):
>>       |  1.0000th=[  362],  5.0000th=[  362], 10.0000th=[  370],
>>       | 20.0000th=[  370], 30.0000th=[  370], 40.0000th=[  370],
>>       | 50.0000th=[  374], 60.0000th=[  382], 70.0000th=[  382],
>>       | 80.0000th=[  382], 90.0000th=[  382], 95.0000th=[  390],
>>       | 99.0000th=[  402], 99.5000th=[  430], 99.9000th=[  900],
>>       | 99.9500th=[  972], 99.9900th=[ 1432]
>> Latencies for: Receiver
>>      percentiles (nsec):
>>       |  1.0000th=[ 1528],  5.0000th=[ 1544], 10.0000th=[ 1560],
>>       | 20.0000th=[ 1576], 30.0000th=[ 1592], 40.0000th=[ 1592],
>>       | 50.0000th=[ 1592], 60.0000th=[ 1608], 70.0000th=[ 1608],
>>       | 80.0000th=[ 1640], 90.0000th=[ 1672], 95.0000th=[ 1688],
>>       | 99.0000th=[ 1848], 99.5000th=[ 2128], 99.9000th=[14272],
>>       | 99.9500th=[14784], 99.9900th=[73216]
>>
>> I'll try and augment the test app to do proper rated submissions, so I
>> can ramp up the rates a bit and see what happens.
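
For reference, a rated sender can be as simple as a paced submit loop;
a minimal sketch, assuming liburing and a target ring fd already passed
over from the receiver (the function name and pacing scheme are made
up, not the actual test app):

	/* pace MSG_RING sends at 'rate' msgs/sec using absolute sleeps */
	static void send_rated(struct io_uring *ring, int target_fd,
			       long rate, long nr_msgs)
	{
		long step_ns = 1000000000L / rate;
		struct timespec next;

		clock_gettime(CLOCK_MONOTONIC, &next);
		for (long i = 0; i < nr_msgs; i++) {
			struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
			struct io_uring_cqe *cqe;

			/* post a CQE into the target ring; 'i' arrives
			 * there as user_data, 0 as the CQE res/len */
			io_uring_prep_msg_ring(sqe, target_fd, 0, i, 0);
			io_uring_submit(ring);

			/* reap the local completion (res < 0 on error) */
			io_uring_wait_cqe(ring, &cqe);
			io_uring_cqe_seen(ring, cqe);

			/* absolute-time pacing keeps the rate steady
			 * regardless of per-iteration overhead */
			next.tv_nsec += step_ns;
			if (next.tv_nsec >= 1000000000L) {
				next.tv_nsec -= 1000000000L;
				next.tv_sec++;
			}
			clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
					&next, NULL);
		}
	}

The CPU pinning mentioned above can then stay external to the app,
e.g. "taskset -c 2 ./sender" and "taskset -c 3 ./receiver".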
> 
> And the final one, with the rated sends sorted out. One key observation
> is that v1 trails the current edition: it just can't keep up as the
> rate is increased. If we cap the rate at what should be 33K messages
> per second, v1 manages only ~28K messages/sec and has the following
> latency profile (for a 3-second run):

Do you see where the receiver latency comes from? The wakeups are
quite similar in nature, assuming it's all wait(nr=1) and the CPUs
are not 100% consumed. Is it the hop back that spoils the scheduling
timing?
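
FWIW, the receive side I have in mind is a plain single-CQE wait loop;
a minimal sketch (handle_msg() is hypothetical, and it assumes the
sender puts something useful, e.g. a timestamp, into the MSG_RING data):

	/* wait(nr=1): block for one CQE posted by the remote ring's
	 * MSG_RING request, consume it, repeat */
	for (;;) {
		struct io_uring_cqe *cqe;

		if (io_uring_wait_cqe(ring, &cqe) < 0)
			break;
		/* cqe->user_data is the 'data' the sender passed to
		 * io_uring_prep_msg_ring() */
		handle_msg(cqe->user_data);
		io_uring_cqe_seen(ring, cqe);
	}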


> Latencies for: Receiver (msg=83863)
>      percentiles (nsec):
>       |  1.0000th=[  1208],  5.0000th=[  1336], 10.0000th=[  1400],
>       | 20.0000th=[  1768], 30.0000th=[  1912], 40.0000th=[  1976],
>       | 50.0000th=[  2040], 60.0000th=[  2160], 70.0000th=[  2256],
>       | 80.0000th=[  2480], 90.0000th=[  2736], 95.0000th=[  3024],
>       | 99.0000th=[  4080], 99.5000th=[  4896], 99.9000th=[  9664],
>       | 99.9500th=[ 17024], 99.9900th=[218112]
> Latencies for: Sender (msg=83863)
>      percentiles (nsec):
>       |  1.0000th=[  1928],  5.0000th=[  2064], 10.0000th=[  2160],
>       | 20.0000th=[  2608], 30.0000th=[  2672], 40.0000th=[  2736],
>       | 50.0000th=[  2864], 60.0000th=[  2960], 70.0000th=[  3152],
>       | 80.0000th=[  3408], 90.0000th=[  4128], 95.0000th=[  4576],
>       | 99.0000th=[  5920], 99.5000th=[  6752], 99.9000th=[ 13376],
>       | 99.9500th=[ 22912], 99.9900th=[261120]
> 
> and the current edition does:
> 
> Latencies for: Sender (msg=94488)
>      percentiles (nsec):
>       |  1.0000th=[  181],  5.0000th=[  191], 10.0000th=[  201],
>       | 20.0000th=[  215], 30.0000th=[  225], 40.0000th=[  235],
>       | 50.0000th=[  262], 60.0000th=[  306], 70.0000th=[  430],
>       | 80.0000th=[ 1004], 90.0000th=[ 2480], 95.0000th=[ 3632],
>       | 99.0000th=[ 8096], 99.5000th=[12352], 99.9000th=[18048],
>       | 99.9500th=[19584], 99.9900th=[23680]
> Latencies for: Receiver (msg=94488)
>      percentiles (nsec):
>       |  1.0000th=[  342],  5.0000th=[  398], 10.0000th=[  482],
>       | 20.0000th=[  652], 30.0000th=[  812], 40.0000th=[  972],
>       | 50.0000th=[ 1240], 60.0000th=[ 1640], 70.0000th=[ 1944],
>       | 80.0000th=[ 2448], 90.0000th=[ 3248], 95.0000th=[ 5216],
>       | 99.0000th=[10304], 99.5000th=[12352], 99.9000th=[18048],
>       | 99.9500th=[19840], 99.9900th=[23168]
> 
> If we cap it where v1 keeps up, at 13K messages per second, v1 does:
> 
> Latencies for: Receiver (msg=38820)
>      percentiles (nsec):
>       |  1.0000th=[ 1160],  5.0000th=[ 1256], 10.0000th=[ 1352],
>       | 20.0000th=[ 1688], 30.0000th=[ 1928], 40.0000th=[ 1976],
>       | 50.0000th=[ 2064], 60.0000th=[ 2384], 70.0000th=[ 2480],
>       | 80.0000th=[ 2768], 90.0000th=[ 3280], 95.0000th=[ 3472],
>       | 99.0000th=[ 4192], 99.5000th=[ 4512], 99.9000th=[ 6624],
>       | 99.9500th=[ 8768], 99.9900th=[14272]
> Latencies for: Sender (msg=38820)
>      percentiles (nsec):
>       |  1.0000th=[ 1848],  5.0000th=[ 1928], 10.0000th=[ 2040],
>       | 20.0000th=[ 2608], 30.0000th=[ 2640], 40.0000th=[ 2736],
>       | 50.0000th=[ 3024], 60.0000th=[ 3120], 70.0000th=[ 3376],
>       | 80.0000th=[ 3824], 90.0000th=[ 4512], 95.0000th=[ 4768],
>       | 99.0000th=[ 5536], 99.5000th=[ 6048], 99.9000th=[ 9024],
>       | 99.9500th=[10304], 99.9900th=[23424]
> 
> and v2 does:
> 
> Latencies for: Sender (msg=39005)
>      percentiles (nsec):
>       |  1.0000th=[  191],  5.0000th=[  211], 10.0000th=[  262],
>       | 20.0000th=[  342], 30.0000th=[  382], 40.0000th=[  402],
>       | 50.0000th=[  450], 60.0000th=[  532], 70.0000th=[ 1080],
>       | 80.0000th=[ 1848], 90.0000th=[ 4768], 95.0000th=[10944],
>       | 99.0000th=[16512], 99.5000th=[18304], 99.9000th=[22400],
>       | 99.9500th=[26496], 99.9900th=[41728]
> Latencies for: Receiver (msg=39005)
>      percentiles (nsec):
>       |  1.0000th=[  410],  5.0000th=[  604], 10.0000th=[  700],
>       | 20.0000th=[  900], 30.0000th=[ 1128], 40.0000th=[ 1320],
>       | 50.0000th=[ 1672], 60.0000th=[ 2256], 70.0000th=[ 2736],
>       | 80.0000th=[ 3760], 90.0000th=[ 5408], 95.0000th=[11072],
>       | 99.0000th=[18304], 99.5000th=[20096], 99.9000th=[24704],
>       | 99.9500th=[27520], 99.9900th=[35584]
> 

-- 
Pavel Begunkov

Thread overview: 23+ messages
2024-05-24 22:58 [PATCHSET 0/3] Improve MSG_RING SINGLE_ISSUER performance Jens Axboe
2024-05-24 22:58 ` [PATCH 1/3] io_uring/msg_ring: split fd installing into a helper Jens Axboe
2024-05-24 22:58 ` [PATCH 2/3] io_uring/msg_ring: avoid double indirection task_work for data messages Jens Axboe
2024-05-28 13:18   ` Pavel Begunkov
2024-05-28 14:23     ` Jens Axboe
2024-05-28 13:32   ` Pavel Begunkov
2024-05-28 14:23     ` Jens Axboe
2024-05-28 16:23       ` Pavel Begunkov
2024-05-28 17:59         ` Jens Axboe
2024-05-29  2:04           ` Pavel Begunkov
2024-05-29  2:43             ` Jens Axboe
2024-05-24 22:58 ` [PATCH 3/3] io_uring/msg_ring: avoid double indirection task_work for fd passing Jens Axboe
2024-05-28 13:31 ` [PATCHSET 0/3] Improve MSG_RING SINGLE_ISSUER performance Pavel Begunkov
2024-05-28 14:34   ` Jens Axboe
2024-05-28 14:39     ` Jens Axboe
2024-05-28 15:27     ` Jens Axboe
2024-05-28 16:50     ` Pavel Begunkov
2024-05-28 18:07       ` Jens Axboe
2024-05-28 18:31         ` Jens Axboe
2024-05-28 23:04           ` Jens Axboe
2024-05-29  1:35             ` Jens Axboe
2024-05-29  2:08               ` Pavel Begunkov [this message]
2024-05-29  2:42                 ` Jens Axboe
