From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50848C433DB for ; Sat, 20 Mar 2021 16:28:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0609E61922 for ; Sat, 20 Mar 2021 16:28:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229618AbhCTQ1z (ORCPT ); Sat, 20 Mar 2021 12:27:55 -0400 Received: from out02.mta.xmission.com ([166.70.13.232]:36330 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229766AbhCTQ1i (ORCPT ); Sat, 20 Mar 2021 12:27:38 -0400 Received: from in01.mta.xmission.com ([166.70.13.51]) by out02.mta.xmission.com with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.93) (envelope-from ) id 1lNeRs-00FKYa-PS; Sat, 20 Mar 2021 10:27:37 -0600 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=fess.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.87) (envelope-from ) id 1lNeRs-00036F-1S; Sat, 20 Mar 2021 10:27:36 -0600 From: ebiederm@xmission.com (Eric W. Biederman) To: Jens Axboe Cc: io-uring@vger.kernel.org, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, oleg@redhat.com References: <20210320153832.1033687-1-axboe@kernel.dk> Date: Sat, 20 Mar 2021 11:26:32 -0500 In-Reply-To: <20210320153832.1033687-1-axboe@kernel.dk> (Jens Axboe's message of "Sat, 20 Mar 2021 09:38:30 -0600") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-XM-SPF: eid=1lNeRs-00036F-1S;;;mid=;;;hst=in01.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX19zror9kzthLCxE3e77LWrSLBp4IIqBOdU= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: Re: [PATCHSET 0/2] PF_IO_WORKER signal tweaks X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Jens Axboe writes: > Hi, > > Been trying to ensure that we do the right thing wrt signals and > PF_IO_WORKER threads, and I think there are two cases we need to handle > explicitly: > > 1) Just don't allow signals to them in general. We do mask everything > as blocked, outside of SIGKILL, so things like wants_signal() will > never return true for them. But it's still possible to send them a > signal via (ultimately) group_send_sig_info(). This will then deliver > the signal to the original io_uring owning task, and that seems a bit > unexpected. So just don't allow them in general. > > 2) STOP is done a bit differently, and we should not allow that either. > > Outside of that, I've been looking at same_thread_group(). This will > currently return true for an io_uring task and it's IO workers, since > they do share ->signal. From looking at the kernel users of this, that > actually seems OK for the cases I checked. One is accounting related, > which we obviously want, and others are related to permissions between > tasks. FWIW, I ran with the below and didn't observe any ill effects, > but I'd like someone to actually think about and verify that PF_IO_WORKER > same_thread_group() usage is sane. They are helper threads running in kernel space. Allowing the ordinary threads not to block. Why would same_thread_group be a problem? I don't hate this set of patches. But I also don't see much explanation why the changes are the right thing semantically. That makes me uneasy. Because especially the SIGSTOP changes feels like it is the wrong thing semantically. The group_send_sig_info change simply feels like it is unnecessary. Like we are maybe playing whak-a-mole with symptoms rather than make certain these are the right fixes for the symptoms. Eric > diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h > index 3f6a0fcaa10c..a580bc0f8aa3 100644 > --- a/include/linux/sched/signal.h > +++ b/include/linux/sched/signal.h > @@ -667,10 +667,17 @@ static inline bool thread_group_leader(struct task_struct *p) > return p->exit_signal >= 0; > } > > +static inline > +bool same_thread_group_account(struct task_struct *p1, struct task_struct *p2) > +{ > + return p1->signal == p2->signal > +} > + > static inline > bool same_thread_group(struct task_struct *p1, struct task_struct *p2) > { > - return p1->signal == p2->signal; > + return same_thread_group_account(p1, p2) && > + !((p1->flags | p2->flags) & PF_IO_WORKER); > } > > static inline struct task_struct *next_thread(const struct task_struct *p) > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c > index 5f611658eeab..625110cacc2a 100644 > --- a/kernel/sched/cputime.c > +++ b/kernel/sched/cputime.c > @@ -307,7 +307,7 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times) > * those pending times and rely only on values updated on tick or > * other scheduler action. > */ > - if (same_thread_group(current, tsk)) > + if (same_thread_group_account(current, tsk)) > (void) task_sched_runtime(current); > > rcu_read_lock();