* [PATCHSET 0/3] Improve MSG_RING SINGLE_ISSUER performance
@ 2024-05-24 22:58 Jens Axboe
From: Jens Axboe @ 2024-05-24 22:58 UTC (permalink / raw)
  To: io-uring

Hi,

A ring set up with IORING_SETUP_SINGLE_ISSUER, which is required to
use IORING_SETUP_DEFER_TASKRUN, will need two round trips through
generic task_work for a MSG_RING message. This isn't ideal. This
patchset attempts to rectify
that, taking a new approach rather than trying to use the io_uring
task_work infrastructure to handle it as in previous postings.

In a sample test app that has one thread send messages to another and
logs both the time between the sender sending and the receiver
receiving, and just the time for the sender to post a message and get
the CQE back, I see the following sender latencies with the stock kernel:

Latencies for: Sender
    percentiles (nsec):
     |  1.0000th=[ 4384],  5.0000th=[ 4512], 10.0000th=[ 4576],
     | 20.0000th=[ 4768], 30.0000th=[ 4896], 40.0000th=[ 5024],
     | 50.0000th=[ 5088], 60.0000th=[ 5152], 70.0000th=[ 5280],
     | 80.0000th=[ 5344], 90.0000th=[ 5536], 95.0000th=[ 5728],
     | 99.0000th=[ 8032], 99.5000th=[18048], 99.9000th=[21376],
     | 99.9500th=[26496], 99.9900th=[59136]

and with the patches:

Latencies for: Sender
    percentiles (nsec):
     |  1.0000th=[  756],  5.0000th=[  820], 10.0000th=[  828],
     | 20.0000th=[  844], 30.0000th=[  852], 40.0000th=[  852],
     | 50.0000th=[  860], 60.0000th=[  860], 70.0000th=[  868],
     | 80.0000th=[  884], 90.0000th=[  964], 95.0000th=[  988],
     | 99.0000th=[ 1128], 99.5000th=[ 1208], 99.9000th=[ 1544],
     | 99.9500th=[ 1944], 99.9900th=[ 2896]

For the receiving side the win is smaller as it only "suffers" from
a single generic task_work, about a 10% win in latencies there.

The idea here is to utilize the CQE overflow infrastructure for this,
as that allows the right task to post the CQE to the ring.
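
The patches themselves are not part of this message, so the following is only a toy, single-threaded model of the idea as described: a remote sender queues the completion on a side list attached to the target ring, and the task that owns the ring later moves it into the CQ. Every name here is hypothetical and none is a kernel identifier; locking, wakeups, and memory accounting are left out on purpose.

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy model only: these are NOT kernel structures. */
struct toy_cqe {
	uint64_t user_data;
	int32_t res;
};

struct toy_overflow {
	struct toy_cqe cqe;
	struct toy_overflow *next;
};

#define TOY_CQ_ENTRIES 8

struct toy_ring {
	struct toy_cqe cq[TOY_CQ_ENTRIES];   /* only the owner writes here */
	unsigned cq_tail;
	struct toy_overflow *of_head;        /* side list any sender may append to */
	struct toy_overflow *of_tail;
};

/*
 * Called by a remote sender: instead of touching the CQ directly, queue
 * the completion on the overflow list. A real implementation would need
 * a lock around this and a way to notify the owner.
 */
static int toy_post_remote(struct toy_ring *r, uint64_t user_data, int32_t res)
{
	struct toy_overflow *o = malloc(sizeof(*o));

	if (!o)
		return -1;
	o->cqe.user_data = user_data;
	o->cqe.res = res;
	o->next = NULL;
	if (r->of_tail)
		r->of_tail->next = o;
	else
		r->of_head = o;
	r->of_tail = o;
	return 0;
}

/*
 * Called by the ring owner: move queued entries into the CQ in FIFO
 * order until the list is empty or the CQ is full. Returns the number
 * of entries moved.
 */
static unsigned toy_flush_overflow(struct toy_ring *r)
{
	unsigned moved = 0;

	while (r->of_head && r->cq_tail < TOY_CQ_ENTRIES) {
		struct toy_overflow *o = r->of_head;

		r->cq[r->cq_tail++] = o->cqe;
		r->of_head = o->next;
		if (!r->of_head)
			r->of_tail = NULL;
		free(o);
		moved++;
	}
	return moved;
}
```

The point of the model is that the sender never writes to the CQ itself, so the single-issuer rule for the target ring is kept without bouncing through task_work on the sender side.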

Patch 1 is a basic refactoring prep patch, patch 2 adds support for
normal messages, and patch 3 adopts the same approach for fd passing.

 io_uring/msg_ring.c | 151 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 138 insertions(+), 13 deletions(-)

-- 
Jens Axboe



