From: Jens Axboe <[email protected]>
To: Max Kellermann <[email protected]>
Cc: [email protected], [email protected],
	[email protected]
Subject: Re: [PATCH 0/8] Various io_uring micro-optimizations (reducing lock contention)
Date: Sat, 1 Feb 2025 08:25:57 -0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 1/31/25 9:13 AM, Jens Axboe wrote:
> On 1/29/25 11:01 AM, Max Kellermann wrote:
>> On Wed, Jan 29, 2025 at 6:45 PM Jens Axboe <[email protected]> wrote:
>>> Why are you combining it with epoll in the first place? It's a lot more
>>> efficient to wait on a/multiple events in io_uring_enter() rather than
>>> go back to a serialized one-event-per-notification model by using epoll
>>> to wait on completions on the io_uring side.
>>
>> Yes, I wish I could do that, but that works only if everything is
>> io_uring - all or nothing. Most of the code is built around an
>> epoll-based loop and will not be ported to io_uring so quickly.
>>
>> Maybe what's missing is epoll_wait as io_uring opcode. Then I could
>> wrap it the other way. Or am I supposed to use io_uring
>> poll_add_multishot for that?
> 
> Not a huge fan of adding more epoll logic to io_uring, but you are right
> this case may indeed make sense as it allows you to integrate better
> that way in existing event loops. I'll take a look.

Here's a series doing that:

https://git.kernel.dk/cgit/linux/log/?h=io_uring-epoll-wait

Could actually work pretty well - the last patch adds multishot support
as well, which means we can avoid the write lock dance for repeated
triggers of this epoll event. That should actually end up being more
efficient than regular epoll_wait(2).
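
For comparison, here is a minimal sketch of the regular epoll_wait(2) path
that such a request would stand in for. It only uses the standard Linux
epoll and eventfd calls. The function name epoll_wait_once_demo() and the
choice of an eventfd as the watched descriptor are assumptions made for
illustration; nothing here comes from the series itself.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

/*
 * Register an eventfd with an epoll instance, make it readable, and poll
 * the epoll set once without blocking. Returns the number of ready events
 * reported by epoll_wait(2), or -1 if setup failed.
 */
static int epoll_wait_once_demo(void)
{
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event ready[4];
	uint64_t one = 1;
	int epfd, efd, n = -1;

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	efd = eventfd(0, EFD_NONBLOCK);
	if (efd < 0)
		goto out_ep;

	/* Level-triggered interest in readability of the eventfd. */
	ev.data.fd = efd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) < 0)
		goto out;

	/* A non-zero counter makes the eventfd readable. */
	if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one))
		goto out;

	/* Timeout 0: report what is ready right now and return. */
	n = epoll_wait(epfd, ready, 4, 0);
	if (n > 0)
		assert(ready[0].data.fd == efd);
out:
	close(efd);
out_ep:
	close(epfd);
	return n;
}
```

The epoll fd, the events array and its capacity passed to epoll_wait(2)
here are the same three values an epoll wait request would carry.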

Wrote a basic test case to exercise it, and it seems to work fine for
me, but it's obviously not well tested yet. Below is the liburing diff,
which just adds the helper to prepare one of these epoll wait requests.
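
As a rough illustration of what such a helper fills in, here is a
stand-alone sketch that compiles without liburing. struct sketch_sqe, the
SKETCH_* constants and sketch_prep_epoll_wait() are hypothetical stand-ins,
not the real liburing definitions, and the opcode value is a placeholder.
It assumes io_uring_prep_rw() stores its arguments as fd, address, length
and offset, in that order.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical stand-ins, NOT the real liburing/kernel definitions. */
#define SKETCH_OP_EPOLL_WAIT		1u		/* placeholder opcode value */
#define SKETCH_EPOLL_WAIT_MULTISHOT	(1U << 0)	/* same bit as in the diff */

/* Simplified SQE carrying only the fields this request uses. */
struct sketch_sqe {
	uint8_t		opcode;
	int32_t		fd;
	uint64_t	off;
	uint64_t	addr;
	uint32_t	len;
	uint32_t	epoll_flags;
};

/*
 * Mirror of the helper in the diff: the epoll fd goes in ->fd, the
 * events array in ->addr, its capacity in ->len, and the request flags
 * in ->epoll_flags. The offset is unused and set to 0.
 */
static void sketch_prep_epoll_wait(struct sketch_sqe *sqe, int fd,
				   struct epoll_event *events,
				   int maxevents, unsigned flags)
{
	sqe->opcode = SKETCH_OP_EPOLL_WAIT;
	sqe->fd = fd;
	sqe->off = 0;
	sqe->addr = (uint64_t)(uintptr_t)events;
	sqe->len = (uint32_t)maxevents;
	sqe->epoll_flags = flags;
}
```

A caller would pass the epoll fd plus a buffer for up to maxevents events,
optionally with the multishot flag set to keep the request armed across
repeated triggers.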


diff --git a/src/include/liburing.h b/src/include/liburing.h
index 49b4edf437b2..a95c475496f4 100644
--- a/src/include/liburing.h
+++ b/src/include/liburing.h
@@ -729,6 +729,15 @@ IOURINGINLINE void io_uring_prep_listen(struct io_uring_sqe *sqe, int fd,
 	io_uring_prep_rw(IORING_OP_LISTEN, sqe, fd, 0, backlog, 0);
 }
 
+struct epoll_event;
+IOURINGINLINE void io_uring_prep_epoll_wait(struct io_uring_sqe *sqe, int fd,
+					    struct epoll_event *events,
+					    int maxevents, unsigned flags)
+{
+	io_uring_prep_rw(IORING_OP_EPOLL_WAIT, sqe, fd, events, maxevents, 0);
+	sqe->epoll_flags = flags;
+}
+
 IOURINGINLINE void io_uring_prep_files_update(struct io_uring_sqe *sqe,
 					      int *fds, unsigned nr_fds,
 					      int offset)
diff --git a/src/include/liburing/io_uring.h b/src/include/liburing/io_uring.h
index 765919883cff..bc725787ceb7 100644
--- a/src/include/liburing/io_uring.h
+++ b/src/include/liburing/io_uring.h
@@ -73,6 +73,7 @@ struct io_uring_sqe {
 		__u32		futex_flags;
 		__u32		install_fd_flags;
 		__u32		nop_flags;
+		__u32		epoll_flags;
 	};
 	__u64	user_data;	/* data to be passed back at completion time */
 	/* pack this to avoid bogus arm OABI complaints */
@@ -262,6 +263,7 @@ enum io_uring_op {
 	IORING_OP_FTRUNCATE,
 	IORING_OP_BIND,
 	IORING_OP_LISTEN,
+	IORING_OP_EPOLL_WAIT,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
@@ -388,6 +390,11 @@ enum io_uring_op {
 #define IORING_ACCEPT_DONTWAIT	(1U << 1)
 #define IORING_ACCEPT_POLL_FIRST	(1U << 2)
 
+/*
+ * epoll_wait flags, stored in sqe->epoll_flags
+ */
+#define IORING_EPOLL_WAIT_MULTISHOT	(1U << 0)
+
 /*
  * IORING_OP_MSG_RING command types, stored in sqe->addr
  */

-- 
Jens Axboe

Thread overview: 30+ messages
2025-01-28 13:39 [PATCH 0/8] Various io_uring micro-optimizations (reducing lock contention) Max Kellermann
2025-01-28 13:39 ` [PATCH 1/8] io_uring/io-wq: eliminate redundant io_work_get_acct() calls Max Kellermann
2025-01-28 13:39 ` [PATCH 2/8] io_uring/io-wq: add io_worker.acct pointer Max Kellermann
2025-01-28 13:39 ` [PATCH 3/8] io_uring/io-wq: move worker lists to struct io_wq_acct Max Kellermann
2025-01-28 13:39 ` [PATCH 4/8] io_uring/io-wq: cache work->flags in variable Max Kellermann
2025-01-29 18:57   ` Pavel Begunkov
2025-01-29 19:11     ` Max Kellermann
2025-01-29 23:41       ` Pavel Begunkov
2025-01-30  5:36         ` Max Kellermann
2025-01-30 14:57           ` Jens Axboe
2025-01-31 14:06             ` Pavel Begunkov
2025-01-30 14:54         ` Jens Axboe
2025-01-28 13:39 ` [PATCH 5/8] io_uring/io-wq: do not use bogus hash value Max Kellermann
2025-01-28 13:39 ` [PATCH 6/8] io_uring/io-wq: pass io_wq to io_get_next_work() Max Kellermann
2025-01-28 13:39 ` [PATCH 7/8] io_uring: cache io_kiocb->flags in variable Max Kellermann
2025-01-29 19:11   ` Pavel Begunkov
2025-01-28 13:39 ` [PATCH 8/8] io_uring: skip redundant poll wakeups Max Kellermann
2025-01-31 13:54   ` Pavel Begunkov
2025-01-31 17:16     ` Max Kellermann
2025-01-31 17:25       ` Pavel Begunkov
2025-01-29 17:18 ` [PATCH 0/8] Various io_uring micro-optimizations (reducing lock contention) Jens Axboe
2025-01-29 17:39   ` Max Kellermann
2025-01-29 17:45     ` Jens Axboe
2025-01-29 18:01       ` Max Kellermann
2025-01-31 16:13         ` Jens Axboe
2025-02-01 15:25           ` Jens Axboe [this message]
2025-02-01 15:30             ` Max Kellermann
2025-02-01 15:38               ` Jens Axboe
2025-01-29 19:30     ` Pavel Begunkov
2025-01-29 19:43       ` Max Kellermann
