public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: io-uring@vger.kernel.org, asml.silence@gmail.com,
	brauner@kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 1/5] fs: gate final fput task_work on PF_NO_TASKWORK
Date: Mon, 14 Apr 2025 13:35:26 -0600	[thread overview]
Message-ID: <c83ab29b-48e5-44b1-8902-2711a806739f@kernel.dk> (raw)
In-Reply-To: <gj6liprp6wtwgabimozkpaw6rv5xfotyi62zuegy5ffjxjdrrs@325g7wcnir6t>

On 4/14/25 11:11 AM, Mateusz Guzik wrote:
> On Wed, Apr 09, 2025 at 07:35:19AM -0600, Jens Axboe wrote:
>> fput currently gates whether or not a task can run task_work on the
>> PF_KTHREAD flag, which excludes kernel threads as they don't usually run
>> task_work as they never exit to userspace. This punts the final fput
>> done from a kthread to a delayed work item instead of using task_work.
>>
>> It's perfectly viable to have the final fput done by the kthread itself,
>> as long as it will actually run the task_work. Add a PF_NO_TASKWORK flag
>> which is set by default by a kernel thread, and gate the task_work fput
>> on that instead. This enables a kernel thread to clear this flag
>> temporarily while putting files, as long as it runs its task_work
>> manually.
>>
>> This enables users like io_uring to ensure that when the final fput of a
>> file is done as part of ring teardown to run the local task_work and
>> hence know that all files have been properly put, without needing to
>> resort to workqueue flushing tricks which can deadlock.
>>
>> No functional changes in this patch.
>>
>> Cc: Christian Brauner <brauner@kernel.org>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  fs/file_table.c       | 2 +-
>>  include/linux/sched.h | 2 +-
>>  kernel/fork.c         | 2 +-
>>  3 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/file_table.c b/fs/file_table.c
>> index c04ed94cdc4b..e3c3dd1b820d 100644
>> --- a/fs/file_table.c
>> +++ b/fs/file_table.c
>> @@ -521,7 +521,7 @@ static void __fput_deferred(struct file *file)
>>  		return;
>>  	}
>>  
>> -	if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
>> +	if (likely(!in_interrupt() && !(task->flags & PF_NO_TASKWORK))) {
>>  		init_task_work(&file->f_task_work, ____fput);
>>  		if (!task_work_add(task, &file->f_task_work, TWA_RESUME))
>>  			return;
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index f96ac1982893..349c993fc32b 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1736,7 +1736,7 @@ extern struct pid *cad_pid;
>>  						 * I am cleaning dirty pages from some other bdi. */
>>  #define PF_KTHREAD		0x00200000	/* I am a kernel thread */
>>  #define PF_RANDOMIZE		0x00400000	/* Randomize virtual address space */
>> -#define PF__HOLE__00800000	0x00800000
>> +#define PF_NO_TASKWORK		0x00800000	/* task doesn't run task_work */
>>  #define PF__HOLE__01000000	0x01000000
>>  #define PF__HOLE__02000000	0x02000000
>>  #define PF_NO_SETAFFINITY	0x04000000	/* Userland is not allowed to meddle with cpus_mask */
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index c4b26cd8998b..8dd0b8a5348d 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -2261,7 +2261,7 @@ __latent_entropy struct task_struct *copy_process(
>>  		goto fork_out;
>>  	p->flags &= ~PF_KTHREAD;
>>  	if (args->kthread)
>> -		p->flags |= PF_KTHREAD;
>> +		p->flags |= PF_KTHREAD | PF_NO_TASKWORK;
>>  	if (args->user_worker) {
>>  		/*
>>  		 * Mark us a user worker, and block any signal that isn't
> 
> I don't have comments on the semantics here, I do have comments on some
> future-proofing.
> 
> To my reading kthreads on the stock kernel never execute task_work.

Correct

> This suggests it would be nice for task_work_add() to at least WARN_ON
> when executing with a kthread. After all you don't want a task_work_add
> consumer adding work which will never execute.

I don't think there's much need for that, as I'm not aware of any kernel
usage that had a bug due to that. And if you did, you'd find it pretty
quick during testing as that work would just never execute.

> But then for your patch to not produce any splats there would have to be
> a flag blessing select kthreads as legitimate task_work consumers.

This patchset very much adds a specific flag for that, PF_NO_TASKWORK,
and kernel threads have it set by default. It just separates the "do I
run task_work" flag from PF_KTHREAD. So yes you could add:

WARN_ON_ONCE(task->flags & PF_NO_TASKWORK);

to task_work_add(), but I'm not really convinced it'd be super useful.

-- 
Jens Axboe

  reply	other threads:[~2025-04-14 19:35 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-09 13:35 [PATCHSET v3 0/5] Cancel and wait for all requests on exit Jens Axboe
2025-04-09 13:35 ` [PATCH 1/5] fs: gate final fput task_work on PF_NO_TASKWORK Jens Axboe
2025-04-11 13:48   ` Christian Brauner
2025-04-11 14:37     ` Jens Axboe
2025-04-14 10:10       ` Christian Brauner
2025-04-14 14:29         ` Jens Axboe
2025-04-14 17:11   ` Mateusz Guzik
2025-04-14 19:35     ` Jens Axboe [this message]
2025-04-09 13:35 ` [PATCH 2/5] io_uring: mark exit side kworkers as task_work capable Jens Axboe
2025-04-11 13:55   ` Christian Brauner
2025-04-11 14:38     ` Jens Axboe
2025-04-09 13:35 ` [PATCH 3/5] io_uring: consider ring dead once the ref is marked dying Jens Axboe
2025-04-09 13:35 ` [PATCH 4/5] io_uring: wait for cancelations on final ring put Jens Axboe
2025-04-09 13:35 ` [PATCH 5/5] io_uring: switch away from percpu refcounts Jens Axboe
  -- strict thread matches above, loose matches on Subject: below --
2025-03-21 19:24 [PATCHSET RFC v2 0/5] Cancel and wait for all requests on exit Jens Axboe
2025-03-21 19:24 ` [PATCH 1/5] fs: gate final fput task_work on PF_NO_TASKWORK Jens Axboe
2024-06-04 19:01 [PATCHSET RFC 0/5] Wait on cancelations at release time Jens Axboe
2024-06-04 19:01 ` [PATCH 1/5] fs: gate final fput task_work on PF_NO_TASKWORK Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c83ab29b-48e5-44b1-8902-2711a806739f@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=asml.silence@gmail.com \
    --cc=brauner@kernel.org \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=mjguzik@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox