Re: [PATCH 2/2] kernel: rerun task_work while freezing in get_signal()

public inbox for [email protected]
 help / color / mirror / Atom feed

From: Pavel Begunkov <[email protected]>
To: Oleg Nesterov <[email protected]>
Cc: [email protected], Jens Axboe <[email protected]>,
	Andrew Morton <[email protected]>,
	Christian Brauner <[email protected]>,
	Tycho Andersen <[email protected]>,
	Thomas Gleixner <[email protected]>,
	[email protected], Julian Orth <[email protected]>,
	Tejun Heo <[email protected]>, Peter Zijlstra <[email protected]>
Subject: Re: [PATCH 2/2] kernel: rerun task_work while freezing in get_signal()
Date: Tue, 9 Jul 2024 15:05:21 +0100	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 7/9/24 11:36, Oleg Nesterov wrote:
> On 07/08, Pavel Begunkov wrote:
>>
>> On 7/8/24 11:42, Oleg Nesterov wrote:
>>> I don't think we should blame io_uring even if so far it is the only user
>>> of TWA_SIGNAL.
>>
>> And it's not entirely correct even for backporting purposes,
>> I'll pin it to when freezing was introduced then.
> 
> This is another problem introduced by 12db8b690010 ("entry: Add support for
> TIF_NOTIFY_SIGNAL")

Ah, yes, I forgot NOTIFY_SIGNAL was split out of SIGPENDING

> We need much more changes. Say, zap_threads() does the same and assumes
> that only SIGKILL or freezeing can make dump_interrupted() true.
> 
> There are more similar problems. I'll try to think, so far I do not see
> a simple solution...

Thanks. And there was some patching done before against dumping
being interrupted by task_work, indeed a reoccurring issue.


> As for this particular problem, I agree it needs a simple/backportable fix.
> 
>>>>   relock:
>>>> +	clear_notify_signal();
>>>> +	if (unlikely(task_work_pending(current)))
>>>> +		task_work_run();
>>>> +
>>>>   	spin_lock_irq(&sighand->siglock);
>>>
>>> Well, but can't we kill the same code at the start of get_signal() then?
>>> Of course, in this case get_signal() should check signal_pending(), not
>>> task_sigpending().
>>
>> Should be fine,
> 
> Well, not really at least performance-wise... get_signal() should return
> asap if TIF_NOTIFY_SIGNAL was the only reason to call get_signal().
> 
>> but I didn't want to change the
>> try_to_freeze() -> __refrigerator() path, which also reschedules.
> 
> Could you spell please?

Let's say it calls get_signal() for freezing with a task_work pending.
Currently, it executes task_work and calls try_to_freeze(), which
puts the task to sleep. If we remove that task_work_run() before
try_to_freeze(), it would not be able to sleep. Sounds like it should
be fine, it races anyway, but I'm trying to avoid side effect for fixes.

>>> Or perhaps something like the patch below makes more sense? I dunno...
>>
>> It needs a far backporting, I'd really prefer to keep it
>> lean and without more side effects if possible, unless
>> there is a strong opinion on that.
> 
> Well, I don't think my patch is really worse in this sense. Just it
> is buggy ;) it needs another recalc_sigpending() before goto start,
> so lets forget it.
> 
> So I am starting to agree with your change as a workaround until we
> find a clean solution (if ever ;).
> 
> But can I ask you to add this additional clear_notify_signal() +
> task_work_run() to the end of do_freezer_trap() ? get_signal() is
> already a mess...

Will change

> -----------------------------------------------------------------------
> Either way I have no idea whether a cgroup_task_frozen() task should
> react to task_work_add(TWA_SIGNAL) or not.
> 
> Documentation/admin-guide/cgroup-v2.rst says
> 
> 	Writing "1" to the file causes freezing of the cgroup and all
> 	descendant cgroups. This means that all belonging processes will
> 	be stopped and will not run until the cgroup will be explicitly
> 	unfrozen.
> 
> AFAICS this is not accurate, they can run but can't return to user-mode.
> So I guess task_work_run() is fine.

IIUC it's a user facing doc, so maybe it's accurate enough from that
perspective. But I do agree that the semantics around task_work is
not exactly clear.

-- 
Pavel Begunkov

next prev parent reply	other threads:[~2024-07-09 14:05 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-07 16:32 [PATCH 0/2] fix task_work interation with freezing Pavel Begunkov
2024-07-07 16:32 ` [PATCH 1/2] io_uring/io-wq: limit retrying worker initialisation Pavel Begunkov
2024-07-07 16:32 ` [PATCH 2/2] kernel: rerun task_work while freezing in get_signal() Pavel Begunkov
2024-07-08 10:42   ` Oleg Nesterov
2024-07-08 15:40     ` Pavel Begunkov
2024-07-08 18:48       ` Peter Zijlstra
2024-07-09 10:36       ` Oleg Nesterov
2024-07-09 14:05         ` Pavel Begunkov [this message]
2024-07-09 16:39           ` Tejun Heo
2024-07-09 19:07             ` Oleg Nesterov
2024-07-09 19:26               ` Pavel Begunkov
2024-07-09 19:38                 ` Oleg Nesterov
2024-07-09 19:55                   ` Pavel Begunkov
2024-07-10  0:54                     ` Tejun Heo
2024-07-10 17:53                       ` Pavel Begunkov
2024-07-10 19:10                         ` Oleg Nesterov
2024-07-10 19:20                           ` Tejun Heo
2024-07-10 21:34                             ` Oleg Nesterov
2024-07-10 22:01                               ` Tejun Heo
2024-07-10 22:17                                 ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox