public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: Stefan Metzmacher <[email protected]>, [email protected]
Cc: [email protected], [email protected],
	[email protected]
Subject: Re: [PATCH 09/18] io-wq: fork worker threads from original task
Date: Thu, 4 Mar 2021 11:19:18 -0700	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 3/4/21 10:32 AM, Jens Axboe wrote:
> On 3/4/21 10:09 AM, Stefan Metzmacher wrote:
>>
>> Am 04.03.21 um 17:42 schrieb Jens Axboe:
>>> On 3/4/21 9:13 AM, Stefan Metzmacher wrote:
>>>>
>>>> Am 04.03.21 um 14:19 schrieb Stefan Metzmacher:
>>>>> Hi Jens,
>>>>>
>>>>>>> Can you please explain why CLONE_SIGHAND is used here?
>>>>>>
>>>>>> We can't have CLONE_THREAD without CLONE_SIGHAND... The io-wq workers
>>>>>> don't really care about signals, we don't use them internally.
>>>>>
>>>>> I'm 100% sure, but I heard rumors that in some situations signals get
>>>>> randomly delivered to any thread of a userspace process.
>>>>
>>>> Ok, from task_struct:
>>>>
>>>>         /* Signal handlers: */
>>>>         struct signal_struct            *signal;
>>>>         struct sighand_struct __rcu             *sighand;
>>>>         sigset_t                        blocked;
>>>>         sigset_t                        real_blocked;
>>>>         /* Restored if set_restore_sigmask() was used: */
>>>>         sigset_t                        saved_sigmask;
>>>>         struct sigpending               pending;
>>>>
>>>> The signal handlers are shared, but 'blocked' is per thread/task.
>>>>
>>>>> My fear was that the related logic may select a kernel thread if they
>>>>> share the same signal handlers.
>>>>
>>>> I found the related logic in the interaction between
>>>> complete_signal() and wants_signal().
>>>>
>>>> static inline bool wants_signal(int sig, struct task_struct *p)
>>>> {
>>>>         if (sigismember(&p->blocked, sig))
>>>>                 return false;
>>>>
>>>> ...
>>>>
>>>> Would it make sense to set up task->blocked to block all signals?
>>>>
>>>> Something like this:
>>>>
>>>> --- a/fs/io-wq.c
>>>> +++ b/fs/io-wq.c
>>>> @@ -611,15 +611,15 @@ pid_t io_wq_fork_thread(int (*fn)(void *), void *arg)
>>>>  {
>>>>         unsigned long flags = CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|
>>>>                                 CLONE_IO|SIGCHLD;
>>>> -       struct kernel_clone_args args = {
>>>> -               .flags          = ((lower_32_bits(flags) | CLONE_VM |
>>>> -                                   CLONE_UNTRACED) & ~CSIGNAL),
>>>> -               .exit_signal    = (lower_32_bits(flags) & CSIGNAL),
>>>> -               .stack          = (unsigned long)fn,
>>>> -               .stack_size     = (unsigned long)arg,
>>>> -       };
>>>> +       sigset_t mask, oldmask;
>>>> +       pid_t pid;
>>>>
>>>> -       return kernel_clone(&args);
>>>> +       sigfillset(&mask);
>>>> +       sigprocmask(SIG_BLOCK, &mask, &oldmask);
>>>> +       pid = kernel_thread(fn, arg, flags);
>>>> +       sigprocmask(SIG_SETMASK, &oldmask, NULL);
>>>> +
>>>> +       return ret;
>>>>  }
>>>>
>>>> I think using kernel_thread() would be a good simplification anyway.
>>>
>>> I like this approach, we're really not interested in signals for those
>>> threads, and this makes it explicit. Ditto on just using the kernel_thread()
>>> helper, looks fine too. I'll run this through the testing. Do you want to
>>> send this as a "real" patch, or should I just attribute you in the commit
>>> message?
>>
>> You can do the patch, it was mostly an example.
>> I'm not sure if sigprocmask() is the correct function here.
>>
>> Or if we better use something like this:
>>
>>         set_restore_sigmask();
>>         current->saved_sigmask = current->blocked;
>>         set_current_blocked(&kmask);
>>         pid = kernel_thread(fn, arg, flags);
>>         restore_saved_sigmask();
> 
> Might be cleaner, and allows fatal signals.

How about this - it moves the signal fiddling into the task
itself, and leaves the parent alone. Also allows future cleanups
of how we wait for thread creation. And moves the PF_IO_WORKER
into a contained spot, instead of having it throughout where we
call the worker fork.

Later cleanups can focus on having copy_process() do the right
thing and we can kill this PF_IO_WORKER dance completely


commit eeb485abb7a189058858f941fb3432bee945a861
Author: Jens Axboe <[email protected]>
Date:   Thu Mar 4 09:46:37 2021 -0700

    io-wq: block signals by default for any io-wq worker
    
    We're not interested in signals, so let's make it explicit and block
    it for any worker. Wrap the thread creation in our own handler, so we
    can set blocked signals and serialize creation of them. A future cleanup
    can now simplify the waiting on thread creation from the other paths,
    just pushing the 'startup' completion further down.
    
    Move the PF_IO_WORKER flag dance in there as well. This will go away in
    the future when we teach copy_process() how to deal with this
    automatically.
    
    Reported-by: Stefan Metzmacher <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 19f18389ead2..bf5df1a31a0e 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -607,19 +607,54 @@ static int task_thread_unbound(void *data)
 	return task_thread(data, IO_WQ_ACCT_UNBOUND);
 }
 
+struct thread_start_data {
+	struct completion startup;
+	int (*fn)(void *);
+	void *arg;
+};
+
+static int thread_start(void *data)
+{
+	struct thread_start_data *d = data;
+	int (*fn)(void *) = d->fn;
+	void *arg = d->arg;
+	sigset_t mask;
+	int ret;
+
+	sigfillset(&mask);
+	set_restore_sigmask();
+	current->saved_sigmask = current->blocked;
+	set_current_blocked(&mask);
+	current->flags |= PF_IO_WORKER;
+	complete(&d->startup);
+	ret = fn(arg);
+	restore_saved_sigmask();
+	return ret;
+}
+
 pid_t io_wq_fork_thread(int (*fn)(void *), void *arg)
 {
 	unsigned long flags = CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|
 				CLONE_IO|SIGCHLD;
-	struct kernel_clone_args args = {
-		.flags		= ((lower_32_bits(flags) | CLONE_VM |
-				    CLONE_UNTRACED) & ~CSIGNAL),
-		.exit_signal	= (lower_32_bits(flags) & CSIGNAL),
-		.stack		= (unsigned long)fn,
-		.stack_size	= (unsigned long)arg,
+	struct thread_start_data data = {
+		.startup	= COMPLETION_INITIALIZER_ONSTACK(data.startup),
+		.fn		= fn,
+		.arg		= arg
 	};
+	bool clear = false;
+	pid_t pid;
 
-	return kernel_clone(&args);
+	/* task path doesn't have it, manager path does */
+	if (!(current->flags & PF_IO_WORKER)) {
+		current->flags |= PF_IO_WORKER;
+		clear = true;
+	}
+	pid = kernel_thread(thread_start, &data, flags);
+	if (clear)
+		current->flags &= ~PF_IO_WORKER;
+	if (pid >= 0)
+		wait_for_completion(&data.startup);
+	return pid;
 }
 
 static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
@@ -752,7 +787,6 @@ static int io_wq_manager(void *data)
 
 	sprintf(buf, "iou-mgr-%d", wq->task_pid);
 	set_task_comm(current, buf);
-	current->flags |= PF_IO_WORKER;
 	wq->manager = get_task_struct(current);
 
 	complete(&wq->started);
@@ -821,9 +855,7 @@ static int io_wq_fork_manager(struct io_wq *wq)
 		return 0;
 
 	reinit_completion(&wq->worker_done);
-	current->flags |= PF_IO_WORKER;
 	ret = io_wq_fork_thread(io_wq_manager, wq);
-	current->flags &= ~PF_IO_WORKER;
 	if (ret >= 0) {
 		wait_for_completion(&wq->started);
 		return 0;
diff --git a/fs/io_uring.c b/fs/io_uring.c
index e55369555e5c..995f506e3a60 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7824,9 +7824,7 @@ static int io_sq_thread_fork(struct io_sq_data *sqd, struct io_ring_ctx *ctx)
 	reinit_completion(&sqd->completion);
 	ctx->sqo_exec = 0;
 	sqd->task_pid = current->pid;
-	current->flags |= PF_IO_WORKER;
 	ret = io_wq_fork_thread(io_sq_thread, sqd);
-	current->flags &= ~PF_IO_WORKER;
 	if (ret < 0) {
 		sqd->thread = NULL;
 		return ret;
@@ -7896,9 +7894,7 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
 		}
 
 		sqd->task_pid = current->pid;
-		current->flags |= PF_IO_WORKER;
 		ret = io_wq_fork_thread(io_sq_thread, sqd);
-		current->flags &= ~PF_IO_WORKER;
 		if (ret < 0) {
 			sqd->thread = NULL;
 			goto err;

-- 
Jens Axboe


  reply	other threads:[~2021-03-04 18:21 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-19 17:09 [PATCHSET RFC 0/18] Remove kthread usage from io_uring Jens Axboe
2021-02-19 17:09 ` [PATCH 01/18] io_uring: remove the need for relying on an io-wq fallback worker Jens Axboe
2021-02-19 20:25   ` Eric W. Biederman
2021-02-19 20:37     ` Jens Axboe
2021-02-22 13:46   ` Pavel Begunkov
2021-02-19 17:09 ` [PATCH 02/18] io-wq: don't create any IO workers upfront Jens Axboe
2021-02-19 17:09 ` [PATCH 03/18] io_uring: disable io-wq attaching Jens Axboe
2021-02-19 17:09 ` [PATCH 04/18] io-wq: get rid of wq->use_refs Jens Axboe
2021-02-19 17:09 ` [PATCH 05/18] io_uring: tie async worker side to the task context Jens Axboe
2021-02-20  8:11   ` Hao Xu
2021-02-20 14:38     ` Jens Axboe
2021-02-21  9:16       ` Hao Xu
2021-02-19 17:09 ` [PATCH 06/18] io-wq: don't pass 'wqe' needlessly around Jens Axboe
2021-02-19 17:09 ` [PATCH 07/18] arch: setup PF_IO_WORKER threads like PF_KTHREAD Jens Axboe
2021-02-19 22:21   ` Eric W. Biederman
2021-02-19 23:26     ` Jens Axboe
2021-02-19 17:10 ` [PATCH 08/18] kernel: treat PF_IO_WORKER like PF_KTHREAD for ptrace/signals Jens Axboe
2021-02-19 17:10 ` [PATCH 09/18] io-wq: fork worker threads from original task Jens Axboe
2021-03-04 12:23   ` Stefan Metzmacher
2021-03-04 13:05     ` Jens Axboe
2021-03-04 13:19       ` Stefan Metzmacher
2021-03-04 16:13         ` Stefan Metzmacher
2021-03-04 16:42           ` Jens Axboe
2021-03-04 17:09             ` Stefan Metzmacher
2021-03-04 17:32               ` Jens Axboe
2021-03-04 18:19                 ` Jens Axboe [this message]
2021-03-04 18:56                   ` Linus Torvalds
2021-03-04 19:19                     ` Jens Axboe
2021-03-04 19:46                       ` Linus Torvalds
2021-03-04 19:54                         ` Jens Axboe
2021-03-04 20:00                           ` Jens Axboe
2021-03-04 20:23                             ` Jens Axboe
2021-03-04 20:50                           ` Linus Torvalds
2021-03-04 20:54                             ` Jens Axboe
2021-03-05 19:16           ` Eric W. Biederman
2021-03-05 19:00       ` Eric W. Biederman
2021-02-19 17:10 ` [PATCH 10/18] io-wq: worker idling always returns false Jens Axboe
2021-02-19 17:10 ` [PATCH 11/18] io_uring: remove any grabbing of context Jens Axboe
2021-02-19 17:10 ` [PATCH 12/18] io_uring: remove io_identity Jens Axboe
2021-02-19 17:10 ` [PATCH 13/18] io-wq: only remove worker from free_list, if it was there Jens Axboe
2021-02-19 17:10 ` [PATCH 14/18] io-wq: make io_wq_fork_thread() available to other users Jens Axboe
2021-02-19 17:10 ` [PATCH 15/18] io_uring: move SQPOLL thread io-wq forked worker Jens Axboe
2021-02-19 17:10 ` [PATCH 16/18] Revert "proc: don't allow async path resolution of /proc/thread-self components" Jens Axboe
2021-02-19 17:10 ` [PATCH 17/18] Revert "proc: don't allow async path resolution of /proc/self components" Jens Axboe
2021-02-19 17:10 ` [PATCH 18/18] net: remove cmsg restriction from io_uring based send/recvmsg calls Jens Axboe
2021-02-19 23:44 ` [PATCHSET RFC 0/18] Remove kthread usage from io_uring Stefan Metzmacher
2021-02-19 23:51   ` Jens Axboe
2021-02-21  5:04 ` Linus Torvalds
2021-02-21 21:22   ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox