public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH v2 0/2] fix task_work interation with freezing
@ 2024-07-09 14:27 Pavel Begunkov
  2024-07-09 14:27 ` [PATCH v2 1/2] io_uring/io-wq: limit retrying worker initialisation Pavel Begunkov
  2024-07-09 14:27 ` [PATCH v2 2/2] kernel: rerun task_work while freezing in get_signal() Pavel Begunkov
  0 siblings, 2 replies; 6+ messages in thread
From: Pavel Begunkov @ 2024-07-09 14:27 UTC (permalink / raw)
  To: io-uring
  Cc: Jens Axboe, asml.silence, Oleg Nesterov, Andrew Morton,
	Christian Brauner, Tycho Andersen, Thomas Gleixner, linux-kernel,
	Julian Orth, Peter Zijlstra, Tejun Heo

It's reported [1] that a task_work queued at a wrong time can prevent
freezing and make the tasks to spin in get_signal() taking 100%
of CPU. Patch 1 is a preparation. Patch 2 addresses the issue.

[1] https://github.com/systemd/systemd/issues/33626

v2: move task_work_run() into do_freezer_trap()
    change Fixes tag

Pavel Begunkov (2):
  io_uring/io-wq: limit retrying worker initialisation
  kernel: rerun task_work while freezing in get_signal()

 io_uring/io-wq.c | 10 +++++++---
 kernel/signal.c  |  8 ++++++++
 2 files changed, 15 insertions(+), 3 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/2] io_uring/io-wq: limit retrying worker initialisation
  2024-07-09 14:27 [PATCH v2 0/2] fix task_work interation with freezing Pavel Begunkov
@ 2024-07-09 14:27 ` Pavel Begunkov
  2024-07-09 14:27 ` [PATCH v2 2/2] kernel: rerun task_work while freezing in get_signal() Pavel Begunkov
  1 sibling, 0 replies; 6+ messages in thread
From: Pavel Begunkov @ 2024-07-09 14:27 UTC (permalink / raw)
  To: io-uring
  Cc: Jens Axboe, asml.silence, Oleg Nesterov, Andrew Morton,
	Christian Brauner, Tycho Andersen, Thomas Gleixner, linux-kernel,
	Julian Orth, Peter Zijlstra, Tejun Heo

If io-wq worker creation fails, we retry it by queueing up a task_work.
tasK_work is needed because it should be done from the user process
context. The problem is that retries are not limited, and if queueing a
task_work is the reason for the failure, we might get into an infinite
loop.

It doesn't seem to happen now but it will with the following patch
executing task_work in the freezing loop. For now, arbitrarily limit the
number of attempts to create a worker.

Cc: [email protected]
Fixes: 3146cba99aa28 ("io-wq: make worker creation resilient against signals")
Reported-by: Julian Orth <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/io-wq.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 913c92249522..f1e7c670add8 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -23,6 +23,7 @@
 #include "io_uring.h"
 
 #define WORKER_IDLE_TIMEOUT	(5 * HZ)
+#define WORKER_INIT_LIMIT	3
 
 enum {
 	IO_WORKER_F_UP		= 0,	/* up and active */
@@ -58,6 +59,7 @@ struct io_worker {
 
 	unsigned long create_state;
 	struct callback_head create_work;
+	int init_retries;
 
 	union {
 		struct rcu_head rcu;
@@ -745,7 +747,7 @@ static bool io_wq_work_match_all(struct io_wq_work *work, void *data)
 	return true;
 }
 
-static inline bool io_should_retry_thread(long err)
+static inline bool io_should_retry_thread(struct io_worker *worker, long err)
 {
 	/*
 	 * Prevent perpetual task_work retry, if the task (or its group) is
@@ -753,6 +755,8 @@ static inline bool io_should_retry_thread(long err)
 	 */
 	if (fatal_signal_pending(current))
 		return false;
+	if (worker->init_retries++ >= WORKER_INIT_LIMIT)
+		return false;
 
 	switch (err) {
 	case -EAGAIN:
@@ -779,7 +783,7 @@ static void create_worker_cont(struct callback_head *cb)
 		io_init_new_worker(wq, worker, tsk);
 		io_worker_release(worker);
 		return;
-	} else if (!io_should_retry_thread(PTR_ERR(tsk))) {
+	} else if (!io_should_retry_thread(worker, PTR_ERR(tsk))) {
 		struct io_wq_acct *acct = io_wq_get_acct(worker);
 
 		atomic_dec(&acct->nr_running);
@@ -846,7 +850,7 @@ static bool create_io_worker(struct io_wq *wq, int index)
 	tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE);
 	if (!IS_ERR(tsk)) {
 		io_init_new_worker(wq, worker, tsk);
-	} else if (!io_should_retry_thread(PTR_ERR(tsk))) {
+	} else if (!io_should_retry_thread(worker, PTR_ERR(tsk))) {
 		kfree(worker);
 		goto fail;
 	} else {
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/2] kernel: rerun task_work while freezing in get_signal()
  2024-07-09 14:27 [PATCH v2 0/2] fix task_work interation with freezing Pavel Begunkov
  2024-07-09 14:27 ` [PATCH v2 1/2] io_uring/io-wq: limit retrying worker initialisation Pavel Begunkov
@ 2024-07-09 14:27 ` Pavel Begunkov
  2024-07-09 14:42   ` Oleg Nesterov
  2024-07-10  0:57   ` Tejun Heo
  1 sibling, 2 replies; 6+ messages in thread
From: Pavel Begunkov @ 2024-07-09 14:27 UTC (permalink / raw)
  To: io-uring
  Cc: Jens Axboe, asml.silence, Oleg Nesterov, Andrew Morton,
	Christian Brauner, Tycho Andersen, Thomas Gleixner, linux-kernel,
	Julian Orth, Peter Zijlstra, Tejun Heo

io_uring can asynchronously add a task_work while the task is getting
freezed. TIF_NOTIFY_SIGNAL will prevent the task from sleeping in
do_freezer_trap(), and since the get_signal()'s relock loop doesn't
retry task_work, the task will spin there not being able to sleep
until the freezing is cancelled / the task is killed / etc.

Cc: [email protected]
Link: https://github.com/systemd/systemd/issues/33626
Fixes: 12db8b690010c ("entry: Add support for TIF_NOTIFY_SIGNAL")
Reported-by: Julian Orth <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
 kernel/signal.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/signal.c b/kernel/signal.c
index 1f9dd41c04be..60c737e423a1 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2600,6 +2600,14 @@ static void do_freezer_trap(void)
 	spin_unlock_irq(&current->sighand->siglock);
 	cgroup_enter_frozen();
 	schedule();
+
+	/*
+	 * We could've been woken by task_work, run it to clear
+	 * TIF_NOTIFY_SIGNAL. The caller will retry if necessary.
+	 */
+	clear_notify_signal();
+	if (unlikely(task_work_pending(current)))
+		task_work_run();
 }
 
 static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 2/2] kernel: rerun task_work while freezing in get_signal()
  2024-07-09 14:27 ` [PATCH v2 2/2] kernel: rerun task_work while freezing in get_signal() Pavel Begunkov
@ 2024-07-09 14:42   ` Oleg Nesterov
  2024-07-10  0:57   ` Tejun Heo
  1 sibling, 0 replies; 6+ messages in thread
From: Oleg Nesterov @ 2024-07-09 14:42 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, Jens Axboe, Andrew Morton, Christian Brauner,
	Tycho Andersen, Thomas Gleixner, linux-kernel, Julian Orth,
	Peter Zijlstra, Tejun Heo

On 07/09, Pavel Begunkov wrote:
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2600,6 +2600,14 @@ static void do_freezer_trap(void)
>  	spin_unlock_irq(&current->sighand->siglock);
>  	cgroup_enter_frozen();
>  	schedule();
> +
> +	/*
> +	 * We could've been woken by task_work, run it to clear
> +	 * TIF_NOTIFY_SIGNAL. The caller will retry if necessary.
> +	 */
> +	clear_notify_signal();
> +	if (unlikely(task_work_pending(current)))
> +		task_work_run();
>  }

Acked-by: Oleg Nesterov <[email protected]>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 2/2] kernel: rerun task_work while freezing in get_signal()
  2024-07-09 14:27 ` [PATCH v2 2/2] kernel: rerun task_work while freezing in get_signal() Pavel Begunkov
  2024-07-09 14:42   ` Oleg Nesterov
@ 2024-07-10  0:57   ` Tejun Heo
  2024-07-10 17:55     ` Pavel Begunkov
  1 sibling, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2024-07-10  0:57 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, Jens Axboe, Oleg Nesterov, Andrew Morton,
	Christian Brauner, Tycho Andersen, Thomas Gleixner, linux-kernel,
	Julian Orth, Peter Zijlstra

On Tue, Jul 09, 2024 at 03:27:19PM +0100, Pavel Begunkov wrote:
> io_uring can asynchronously add a task_work while the task is getting
> freezed. TIF_NOTIFY_SIGNAL will prevent the task from sleeping in
> do_freezer_trap(), and since the get_signal()'s relock loop doesn't
> retry task_work, the task will spin there not being able to sleep
> until the freezing is cancelled / the task is killed / etc.
> 
> Cc: [email protected]
> Link: https://github.com/systemd/systemd/issues/33626
> Fixes: 12db8b690010c ("entry: Add support for TIF_NOTIFY_SIGNAL")
> Reported-by: Julian Orth <[email protected]>
> Signed-off-by: Pavel Begunkov <[email protected]>

I haven't looked at the signal code for too long to be all that useful but
the problem described and the patch does make sense to me. FWIW,

Acked-by: Tejun Heo <[email protected]>

Maybe note that this is structured specifically to ease backport and we need
further cleanups? It's not great that this is special cased in
do_freezer_trap() instead of being integrated into the outer loop.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 2/2] kernel: rerun task_work while freezing in get_signal()
  2024-07-10  0:57   ` Tejun Heo
@ 2024-07-10 17:55     ` Pavel Begunkov
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Begunkov @ 2024-07-10 17:55 UTC (permalink / raw)
  To: Tejun Heo
  Cc: io-uring, Jens Axboe, Oleg Nesterov, Andrew Morton,
	Christian Brauner, Tycho Andersen, Thomas Gleixner, linux-kernel,
	Julian Orth, Peter Zijlstra

On 7/10/24 01:57, Tejun Heo wrote:
> On Tue, Jul 09, 2024 at 03:27:19PM +0100, Pavel Begunkov wrote:
>> io_uring can asynchronously add a task_work while the task is getting
>> freezed. TIF_NOTIFY_SIGNAL will prevent the task from sleeping in
>> do_freezer_trap(), and since the get_signal()'s relock loop doesn't
>> retry task_work, the task will spin there not being able to sleep
>> until the freezing is cancelled / the task is killed / etc.
>>
>> Cc: [email protected]
>> Link: https://github.com/systemd/systemd/issues/33626
>> Fixes: 12db8b690010c ("entry: Add support for TIF_NOTIFY_SIGNAL")
>> Reported-by: Julian Orth <[email protected]>
>> Signed-off-by: Pavel Begunkov <[email protected]>
> 
> I haven't looked at the signal code for too long to be all that useful but
> the problem described and the patch does make sense to me. FWIW,
> 
> Acked-by: Tejun Heo <[email protected]>
> 
> Maybe note that this is structured specifically to ease backport and we need
> further cleanups? It's not great that this is special cased in

I'll add a couple of words

> do_freezer_trap() instead of being integrated into the outer loop.

v1 had it in the common loop, but might actually be good it's
limited to freezing, need more digging.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-07-10 17:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-09 14:27 [PATCH v2 0/2] fix task_work interation with freezing Pavel Begunkov
2024-07-09 14:27 ` [PATCH v2 1/2] io_uring/io-wq: limit retrying worker initialisation Pavel Begunkov
2024-07-09 14:27 ` [PATCH v2 2/2] kernel: rerun task_work while freezing in get_signal() Pavel Begunkov
2024-07-09 14:42   ` Oleg Nesterov
2024-07-10  0:57   ` Tejun Heo
2024-07-10 17:55     ` Pavel Begunkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox