* [PATCH 0/2] io_uring: replace wq users and add WQ_PERCPU to alloc_workqueue() users
@ 2025-09-05 9:02 Marco Crivellari
2025-09-05 9:02 ` [PATCH 1/2] io_uring: replace use of system_wq with system_percpu_wq Marco Crivellari
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Marco Crivellari @ 2025-09-05 9:02 UTC (permalink / raw)
To: linux-kernel, io-uring
Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko,
Jens Axboe, Pavel Begunkov
Hi!
Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:
"workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
=== Current situation: problems ===
Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, for !WQ_UNBOUND the local CPU is selected.
This leads to different behavior when a work item is scheduled from an
isolated CPU, depending on whether the "delay" value is 0 or greater than 0:
schedule_delayed_work(, 0);
This will be handled by __queue_work() that will queue the work item on the
current local (isolated) CPU, while:
schedule_delayed_work(, 1);
will move the timer to a housekeeping CPU, and schedule the work there.
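The two cases above can be illustrated with a small user-space model. This is
only a sketch and not kernel code: the CPU layout, the model_* names and the
"pick the first housekeeping CPU" rule are assumptions made for illustration,
and the real timer placement is decided by the kernel's timer code.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical layout: CPUs 0-1 are housekeeping, CPUs 2-3 are isolated. */
static const bool model_housekeeping[MODEL_NR_CPUS] = { true, true, false, false };

/* Return the lowest-numbered housekeeping CPU, or -1 if there is none. */
static int model_first_housekeeping_cpu(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_housekeeping[cpu])
			return cpu;
	return -1;
}

/*
 * Model of where a work item queued on a per-cpu wq from local_cpu ends up
 * (local_cpu is assumed to be a valid index below MODEL_NR_CPUS):
 *  - delay == 0: queued directly, so it stays on the local CPU
 *  - delay  > 0: the timer is moved off an isolated CPU and the work
 *                runs where the timer fires
 * Leaving a housekeeping local CPU unchanged is a simplification of
 * this model, not a statement about the kernel.
 */
static int model_target_cpu(int local_cpu, unsigned long delay)
{
	if (delay == 0 || model_housekeeping[local_cpu])
		return local_cpu;
	return model_first_housekeeping_cpu();
}
```

In this model, model_target_cpu(2, 0) stays on the isolated CPU 2, while
model_target_cpu(2, 1) lands on the housekeeping CPU 0, which is the
inconsistency described above.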
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
=== Plan and future plans ===
This patchset is the first step of the refactoring needed to address the
points above; in the long term it will also benefit cpu isolation, by
moving away from per-cpu workqueues in favor of an unbound model.
These are the main steps:
1) API refactoring (introduced by this patchset)
- Make the system wq names clearer and more uniform, both per-cpu and
unbound, to avoid any confusion about which one should be used.
- Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
introduced in this patchset and used on all the callers that are not
currently using WQ_UNBOUND.
WQ_UNBOUND will be removed in a future release cycle.
Most users don't need to be per-cpu because they have no locality
requirements; a future step will therefore make "unbound" the default
behavior.
2) Check who really needs to be per-cpu
- Remove the WQ_PERCPU flag when it is not strictly required.
3) Add a new API (prefer local cpu)
- Some users don't require local execution, as mentioned above; even so,
local execution yields a performance gain.
This new API will prefer local execution, without requiring it.
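The relation between WQ_PERCPU and WQ_UNBOUND in step 1 can be sketched with
a small user-space model. This is not kernel code: the MODEL_* bit values and
the function names are made up for illustration, and the real definitions
live in include/linux/workqueue.h. The point is only that, once every caller
that does not pass WQ_UNBOUND has been tagged with WQ_PERCPU, the default can
be flipped without changing the behavior of any existing caller.

```c
#include <stdbool.h>

/* Illustrative bit values only; they are not the kernel's definitions. */
enum model_wq_flags {
	MODEL_WQ_UNBOUND = 1u << 0,
	MODEL_WQ_PERCPU  = 1u << 1,
};

/* Today: a workqueue is per-cpu unless WQ_UNBOUND is passed. */
static bool model_percpu_current_default(unsigned int flags)
{
	return !(flags & MODEL_WQ_UNBOUND);
}

/* Planned: a workqueue is unbound unless WQ_PERCPU is passed. */
static bool model_percpu_future_default(unsigned int flags)
{
	return (flags & MODEL_WQ_PERCPU) != 0;
}

/* Step 1 of the plan: tag every caller that does not pass WQ_UNBOUND. */
static unsigned int model_annotate_caller(unsigned int flags)
{
	if (!(flags & MODEL_WQ_UNBOUND))
		flags |= MODEL_WQ_PERCPU;
	return flags;
}
```

For both kinds of caller (flags without WQ_UNBOUND and flags with it),
model_percpu_current_default(flags) equals
model_percpu_future_default(model_annotate_caller(flags)) in this model.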
=== Changes introduced by this series ===
1) [P 1-2] Replace use of system_wq and system_unbound_wq
system_wq is a per-CPU workqueue, but its name does not make that clear.
system_unbound_wq is to be used when locality is not required.
Because of that, system_wq has been renamed to system_percpu_wq, and
system_unbound_wq has been renamed to system_dfl_wq.
=== For Maintainers ===
There are prerequisites for this series, already merged in the master branch.
The commits are:
128ea9f6ccfb6960293ae4212f4f97165e42222d ("workqueue: Add system_percpu_wq and
system_dfl_wq")
930c2ea566aff59e962c50b2421d5fcc3b98b8be ("workqueue: Add new WQ_PERCPU flag")
Thanks!
Marco Crivellari (2):
io_uring: replace use of system_wq with system_percpu_wq
io_uring: replace use of system_unbound_wq with system_dfl_wq
io_uring/io_uring.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--
2.51.0
* [PATCH 1/2] io_uring: replace use of system_wq with system_percpu_wq
From: Marco Crivellari @ 2025-09-05 9:02 UTC (permalink / raw)
To: linux-kernel, io-uring
Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko,
Jens Axboe, Pavel Begunkov
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_wq is a per-CPU workqueue, yet nothing in its name indicates that
CPU affinity constraint, which is very often not required by users. Make
it clear by adding a system_percpu_wq.
queue_work() / queue_delayed_work() / mod_delayed_work() will now use the
new per-cpu wq: if the user still sticks to the old name, a warning will
be printed and the wq will be redirected to the new one.
This patch adds the new system_percpu_wq, except for the mm, fs and net
subsystems, which are handled in separate patches.
The old wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
io_uring/io_uring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index c6209fe44cb1..2a6ead3c7d36 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2986,7 +2986,7 @@ static __cold void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
* Use system_unbound_wq to avoid spawning tons of event kworkers
* if we're exiting a ton of rings at the same time. It just adds
* noise and overhead, there's no discernable change in runtime
- * over using system_wq.
+ * over using system_percpu_wq.
*/
queue_work(iou_wq, &ctx->exit_work);
}
--
2.51.0
* [PATCH 2/2] io_uring: replace use of system_unbound_wq with system_dfl_wq
From: Marco Crivellari @ 2025-09-05 9:02 UTC (permalink / raw)
To: linux-kernel, io-uring
Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko,
Jens Axboe, Pavel Begunkov
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Add system_dfl_wq to encourage its use when unbound work should be used.
queue_work() / queue_delayed_work() / mod_delayed_work() will now use the
new unbound wq: if the user still uses the old wq, a warning will be
printed and the wq will be redirected to the new one.
The old system_unbound_wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
io_uring/io_uring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 2a6ead3c7d36..74972ecf2045 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2983,7 +2983,7 @@ static __cold void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
INIT_WORK(&ctx->exit_work, io_ring_exit_work);
/*
- * Use system_unbound_wq to avoid spawning tons of event kworkers
+ * Use system_dfl_wq to avoid spawning tons of event kworkers
* if we're exiting a ton of rings at the same time. It just adds
* noise and overhead, there's no discernable change in runtime
* over using system_percpu_wq.
--
2.51.0
* Re: [PATCH 0/2] io_uring: replace wq users and add WQ_PERCPU to alloc_workqueue() users
From: Jens Axboe @ 2025-09-09 15:12 UTC (permalink / raw)
To: linux-kernel, io-uring, Marco Crivellari
Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
Sebastian Andrzej Siewior, Michal Hocko, Pavel Begunkov
On Fri, 05 Sep 2025 11:02:38 +0200, Marco Crivellari wrote:
> Below is a summary of a discussion about the Workqueue API and cpu isolation
> considerations. Details and more information are available here:
>
> "workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
> https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
>
> === Current situation: problems ===
>
> [...]
Applied, thanks!
[1/2] io_uring: replace use of system_wq with system_percpu_wq
commit: e92f5c03d32409c957864f9bc611872861f8157e
[2/2] io_uring: replace use of system_unbound_wq with system_dfl_wq
commit: 59cfd1fa5a5b87969190c3178180d025ee9251a7
Best regards,
--
Jens Axboe
* Re: [PATCH 0/2] io_uring: replace wq users and add WQ_PERCPU to alloc_workqueue() users
From: Marco Crivellari @ 2025-09-09 15:13 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-kernel, io-uring, Tejun Heo, Lai Jiangshan,
Frederic Weisbecker, Sebastian Andrzej Siewior, Michal Hocko,
Pavel Begunkov
On Tue, Sep 9, 2025 at 5:12 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> Applied, thanks!
>
> [1/2] io_uring: replace use of system_wq with system_percpu_wq
> commit: e92f5c03d32409c957864f9bc611872861f8157e
> [2/2] io_uring: replace use of system_unbound_wq with system_dfl_wq
> commit: 59cfd1fa5a5b87969190c3178180d025ee9251a7
>
Many thanks!
--
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com