From: Caleb Sander Mateos <csander@purestorage.com>
To: Jens Axboe <axboe@kernel.dk>,
io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Joanne Koong <joannelkoong@gmail.com>,
Caleb Sander Mateos <csander@purestorage.com>
Subject: [PATCH v6 0/6] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
Date: Wed, 17 Dec 2025 19:44:53 -0700
Message-ID: <20251218024459.1083572-1-csander@purestorage.com>
Setting IORING_SETUP_SINGLE_ISSUER when creating an io_uring doesn't
actually enable any additional optimizations (aside from being a
requirement for IORING_SETUP_DEFER_TASKRUN). This series leverages
IORING_SETUP_SINGLE_ISSUER's guarantee that only one task submits SQEs
to skip taking the uring_lock mutex for the issue and task work paths.
First, we need to disable this optimization for IORING_SETUP_SQPOLL by
clearing the IORING_SETUP_SINGLE_ISSUER flag. For IORING_SETUP_SQPOLL,
the SQ thread is the one taking the uring_lock mutex in the issue path.
Since concurrent io_uring_register() syscalls are allowed on the thread
that created/enabled the io_uring, some additional mechanism would be
required to synchronize the two threads. This is possible in principle
by having io_uring_register() schedule a task work item to suspend the
SQ thread, but that seems complex for a niche use case.
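As a rough illustration of this first step, the flag handling can be modeled in userspace as below. This is only a sketch, not code from the series: model_sanitize_flags() and the MODEL_* constants are hypothetical names, and the bit positions are assumed for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the real setup flags; bit values are assumed. */
#define MODEL_SETUP_SQPOLL        (1u << 1)
#define MODEL_SETUP_SINGLE_ISSUER (1u << 12)

/*
 * Model of the first step: with SQPOLL, the SQ thread rather than the
 * creating task issues requests, so the single-issuer flag is cleared and
 * the lock elision never applies to such rings.
 */
static inline uint32_t model_sanitize_flags(uint32_t flags)
{
	if (flags & MODEL_SETUP_SQPOLL)
		flags &= ~MODEL_SETUP_SINGLE_ISSUER;
	return flags;
}
```

Clearing the flag up front means later code only needs to test one bit to decide whether the optimization is in play.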
Then we factor out helpers for interacting with uring_lock to centralize
the logic.
Finally, we implement the optimization for IORING_SETUP_SINGLE_ISSUER.
If the io_ring_ctx is set up with IORING_SETUP_SINGLE_ISSUER, skip the
uring_lock mutex_lock() and mutex_unlock() on the submitter_task. On
other tasks acquiring the ctx uring lock, use a task work item to
suspend the submitter_task for the critical section.
If the io_ring_ctx is IORING_SETUP_R_DISABLED (possible during
io_uring_setup(), io_uring_register(), or io_uring exit), submitter_task
may be set concurrently, so acquire the uring_lock before checking it.
If submitter_task isn't set yet, the uring_lock suffices to provide
mutual exclusion. If task work can't be queued because submitter_task
has exited, also use the uring_lock for mutual exclusion.
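The cases described above can be summarized as a small decision function. The sketch below is a single-threaded userspace model of one reading of this cover letter, not the helpers added by the series: struct model_ctx, struct model_task, model_pick_lock_mode() and the flag value are all hypothetical. It only shows how exclusion against submitter_task is obtained in each case, and does not model whether the mutex is additionally held while the submitter is suspended.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real flag; bit value is assumed. */
#define MODEL_SETUP_SINGLE_ISSUER (1u << 12)

struct model_task {
	bool exited;	/* task work can no longer be queued to this task */
};

struct model_ctx {
	uint32_t flags;
	struct model_task *submitter_task;	/* NULL until it has been set */
};

enum model_lock_mode {
	MODEL_LOCK_MUTEX,		/* uring_lock provides mutual exclusion */
	MODEL_LOCK_ELIDED,		/* submitter_task skips lock/unlock */
	MODEL_LOCK_SUSPEND_SUBMITTER,	/* task work parks submitter_task */
};

static inline enum model_lock_mode
model_pick_lock_mode(const struct model_ctx *ctx,
		     const struct model_task *current)
{
	/* Without SINGLE_ISSUER nothing changes: take the mutex. */
	if (!(ctx->flags & MODEL_SETUP_SINGLE_ISSUER))
		return MODEL_LOCK_MUTEX;

	/*
	 * While the ring is R_DISABLED, submitter_task may not be set yet.
	 * The cover letter says the uring_lock is acquired before this
	 * check; the model only records that the mutex then suffices.
	 */
	if (ctx->submitter_task == NULL)
		return MODEL_LOCK_MUTEX;

	/* The single issuer itself needs no lock. */
	if (ctx->submitter_task == current)
		return MODEL_LOCK_ELIDED;

	/* Task work cannot be queued to an exited task: fall back. */
	if (ctx->submitter_task->exited)
		return MODEL_LOCK_MUTEX;

	/* Any other task suspends the submitter for its critical section. */
	return MODEL_LOCK_SUSPEND_SUBMITTER;
}
```

Only the second branch from the bottom is new in v6 according to the changelog; the rest follows the description of the earlier versions.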
v6:
- Release submitter_task reference last in io_ring_ctx_free() (syzbot)
- Use the uring_lock to provide mutual exclusion if task_work_add()
  fails because submitter_task has exited
- Add Reviewed-by tag

v5:
- Ensure submitter_task is initialized in io_uring_create() before
  calling io_ring_ctx_wait_and_kill() (kernel test robot)
- Correct Fixes tag (Joanne)
- Add Reviewed-by tag

v4:
- Handle IORING_SETUP_SINGLE_ISSUER and IORING_SETUP_R_DISABLED
  correctly (syzbot)
- Remove separate set of helpers for io_uring_register()
- Add preliminary fix to prevent races between accessing ctx->flags and
  submitter_task

v3:
- Ensure mutual exclusion on threads other than submitter_task via a
  task work item to suspend submitter_task
- Drop patches already merged

v2:
- Don't enable these optimizations for IORING_SETUP_SQPOLL, as we still
  need to synchronize SQ thread submission with io_uring_register()
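
The v4 entry about races between ctx->flags and submitter_task corresponds to the release-acquire ordering named in the first patch's title. The general pattern can be sketched with C11 atomics, which are swapped in here so the example builds in userspace; kernel code would use its own primitives. All names below (struct model_ctx, model_enable_rings(), model_submitter_if_enabled()) are hypothetical and the flag value is assumed.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real flag; bit value is assumed. */
#define MODEL_SETUP_R_DISABLED (1u << 6)

struct model_ctx {
	_Atomic uint32_t flags;
	const void *submitter_task;	/* plain field, published via flags */
};

/*
 * Enable path: write submitter_task first, then clear R_DISABLED with
 * release semantics so the earlier write is visible to any reader that
 * observes the cleared flag.
 */
static inline void model_enable_rings(struct model_ctx *ctx, const void *task)
{
	ctx->submitter_task = task;
	atomic_fetch_and_explicit(&ctx->flags, ~MODEL_SETUP_R_DISABLED,
				  memory_order_release);
}

/*
 * Reader path: an acquire load of flags pairs with the release above.
 * submitter_task is only read once R_DISABLED is seen to be clear.
 */
static inline const void *model_submitter_if_enabled(struct model_ctx *ctx)
{
	uint32_t flags = atomic_load_explicit(&ctx->flags,
					      memory_order_acquire);

	if (flags & MODEL_SETUP_R_DISABLED)
		return NULL;
	return ctx->submitter_task;
}
```

A reader that still sees R_DISABLED set learns nothing about submitter_task from this pattern, which is why the description above falls back to the uring_lock in that state.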
Caleb Sander Mateos (6):
  io_uring: use release-acquire ordering for IORING_SETUP_R_DISABLED
  io_uring: clear IORING_SETUP_SINGLE_ISSUER for IORING_SETUP_SQPOLL
  io_uring: ensure submitter_task is valid for io_ring_ctx's lifetime
  io_uring: use io_ring_submit_lock() in io_iopoll_req_issued()
  io_uring: factor out uring_lock helpers
  io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
include/linux/io_uring_types.h | 12 +-
io_uring/cancel.c | 40 +++---
io_uring/cancel.h | 5 +-
io_uring/eventfd.c | 5 +-
io_uring/fdinfo.c | 8 +-
io_uring/filetable.c | 8 +-
io_uring/futex.c | 14 +-
io_uring/io_uring.c | 232 ++++++++++++++++++++-------------
io_uring/io_uring.h | 187 +++++++++++++++++++++++---
io_uring/kbuf.c | 32 +++--
io_uring/memmap.h | 2 +-
io_uring/msg_ring.c | 33 +++--
io_uring/notif.c | 5 +-
io_uring/notif.h | 3 +-
io_uring/openclose.c | 14 +-
io_uring/poll.c | 21 +--
io_uring/register.c | 81 ++++++------
io_uring/rsrc.c | 51 +++++---
io_uring/rsrc.h | 6 +-
io_uring/rw.c | 2 +-
io_uring/splice.c | 5 +-
io_uring/sqpoll.c | 5 +-
io_uring/tctx.c | 27 ++--
io_uring/tctx.h | 5 +-
io_uring/uring_cmd.c | 13 +-
io_uring/waitid.c | 13 +-
io_uring/zcrx.c | 2 +-
27 files changed, 555 insertions(+), 276 deletions(-)
--
2.45.2
Thread overview: 9+ messages
2025-12-18 2:44 Caleb Sander Mateos [this message]
2025-12-18 2:44 ` [PATCH v6 1/6] io_uring: use release-acquire ordering for IORING_SETUP_R_DISABLED Caleb Sander Mateos
2025-12-18 2:44 ` [PATCH v6 2/6] io_uring: clear IORING_SETUP_SINGLE_ISSUER for IORING_SETUP_SQPOLL Caleb Sander Mateos
2025-12-18 2:44 ` [PATCH v6 3/6] io_uring: ensure submitter_task is valid for io_ring_ctx's lifetime Caleb Sander Mateos
2025-12-18 2:44 ` [PATCH v6 4/6] io_uring: use io_ring_submit_lock() in io_iopoll_req_issued() Caleb Sander Mateos
2025-12-18 2:44 ` [PATCH v6 5/6] io_uring: factor out uring_lock helpers Caleb Sander Mateos
2025-12-18 2:44 ` [PATCH v6 6/6] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
2025-12-18 8:01 ` [syzbot ci] " syzbot ci
2025-12-22 20:19 ` Caleb Sander Mateos