* [PATCH 1/4] task_work: add task_work_pending() helper
2020-04-06 19:48 [PATCHSET 0/4] io_uring and task_work interactions Jens Axboe
@ 2020-04-06 19:48 ` Jens Axboe
2020-04-07 11:28 ` Oleg Nesterov
2020-04-06 19:48 ` [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued Jens Axboe
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2020-04-06 19:48 UTC (permalink / raw)
To: io-uring; +Cc: peterz, Jens Axboe
Various callsites currently check current->task_works != NULL to know
when to run work. Add a helper that we use internally for that. This is
in preparation for also not running the work if the exit work has been
queued.
Signed-off-by: Jens Axboe <[email protected]>
---
include/linux/task_work.h | 13 ++++++++++++-
kernel/task_work.c | 2 +-
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index bd9a6a91c097..54c911bbf754 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -15,7 +15,18 @@ init_task_work(struct callback_head *twork, task_work_func_t func)
int task_work_add(struct task_struct *task, struct callback_head *twork, bool);
struct callback_head *task_work_cancel(struct task_struct *, task_work_func_t);
-void task_work_run(void);
+void __task_work_run(void);
+
+static inline bool task_work_pending(void)
+{
+ return current->task_works;
+}
+
+static inline void task_work_run(void)
+{
+ if (task_work_pending())
+ __task_work_run();
+}
static inline void exit_task_work(struct task_struct *task)
{
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 825f28259a19..9620333423a3 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -87,7 +87,7 @@ task_work_cancel(struct task_struct *task, task_work_func_t func)
* it exits. In the latter case task_work_add() can no longer add the
* new work after task_work_run() returns.
*/
-void task_work_run(void)
+void __task_work_run(void)
{
struct task_struct *task = current;
struct callback_head *work, *head, *next;
--
2.26.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] task_work: add task_work_pending() helper
2020-04-06 19:48 ` [PATCH 1/4] task_work: add task_work_pending() helper Jens Axboe
@ 2020-04-07 11:28 ` Oleg Nesterov
2020-04-07 15:43 ` Jens Axboe
0 siblings, 1 reply; 15+ messages in thread
From: Oleg Nesterov @ 2020-04-07 11:28 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, peterz
On 04/06, Jens Axboe wrote:
>
> +static inline bool task_work_pending(void)
> +{
> + return current->task_works;
> +}
> +
> +static inline void task_work_run(void)
> +{
> + if (task_work_pending())
> + __task_work_run();
> +}
No, this is wrong. exit_task_work() must always call __task_work_run()
to install work_exited.
This helper (and 3/4) probably makes sense but please change exit_task_work()
to use __task_work_run() then.
Oleg.
* Re: [PATCH 1/4] task_work: add task_work_pending() helper
2020-04-07 11:28 ` Oleg Nesterov
@ 2020-04-07 15:43 ` Jens Axboe
0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2020-04-07 15:43 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: io-uring, peterz
On 4/7/20 4:28 AM, Oleg Nesterov wrote:
> On 04/06, Jens Axboe wrote:
>>
>> +static inline bool task_work_pending(void)
>> +{
>> + return current->task_works;
>> +}
>> +
>> +static inline void task_work_run(void)
>> +{
>> + if (task_work_pending())
>> + __task_work_run();
>> +}
>
> No, this is wrong. exit_task_work() must always call __task_work_run()
> to install work_exited.
>
> This helper (and 3/4) probably makes sense but please change exit_task_work()
> to use __task_work_run() then.
Good catch, thanks. I'll make the change.
--
Jens Axboe
* [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued
2020-04-06 19:48 [PATCHSET 0/4] io_uring and task_work interactions Jens Axboe
2020-04-06 19:48 ` [PATCH 1/4] task_work: add task_work_pending() helper Jens Axboe
@ 2020-04-06 19:48 ` Jens Axboe
2020-04-07 11:39 ` Oleg Nesterov
2020-04-07 12:47 ` Peter Zijlstra
2020-04-06 19:48 ` [PATCH 3/4] task_work: kill current->task_works checking in callers Jens Axboe
2020-04-06 19:48 ` [PATCH 4/4] io_uring: flush task work before waiting for ring exit Jens Axboe
3 siblings, 2 replies; 15+ messages in thread
From: Jens Axboe @ 2020-04-06 19:48 UTC (permalink / raw)
To: io-uring; +Cc: peterz, Jens Axboe
If task_work has already been run on task exit, we don't always know
if it's safe to run again. Check for task_work_exited in the
task_work_pending() helper. This makes it less fragile to call from
the exit files path, for example.
Signed-off-by: Jens Axboe <[email protected]>
---
include/linux/task_work.h | 4 +++-
kernel/task_work.c | 8 ++++----
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 54c911bbf754..24f977a8fc35 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -7,6 +7,8 @@
typedef void (*task_work_func_t)(struct callback_head *);
+extern struct callback_head task_work_exited;
+
static inline void
init_task_work(struct callback_head *twork, task_work_func_t func)
{
@@ -19,7 +21,7 @@ void __task_work_run(void);
static inline bool task_work_pending(void)
{
- return current->task_works;
+ return current->task_works && current->task_works != &task_work_exited;
}
static inline void task_work_run(void)
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 9620333423a3..d6a8b4ab4858 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -3,7 +3,7 @@
#include <linux/task_work.h>
#include <linux/tracehook.h>
-static struct callback_head work_exited; /* all we need is ->next == NULL */
+struct callback_head task_work_exited = { };
/**
* task_work_add - ask the @task to execute @work->func()
@@ -31,7 +31,7 @@ task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
do {
head = READ_ONCE(task->task_works);
- if (unlikely(head == &work_exited))
+ if (unlikely(head == &task_work_exited))
return -ESRCH;
work->next = head;
} while (cmpxchg(&task->task_works, head, work) != head);
@@ -95,14 +95,14 @@ void __task_work_run(void)
for (;;) {
/*
* work->func() can do task_work_add(), do not set
- * work_exited unless the list is empty.
+ * task_work_exited unless the list is empty.
*/
do {
head = NULL;
work = READ_ONCE(task->task_works);
if (!work) {
if (task->flags & PF_EXITING)
- head = &work_exited;
+ head = &task_work_exited;
else
break;
}
--
2.26.0
* Re: [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued
2020-04-06 19:48 ` [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued Jens Axboe
@ 2020-04-07 11:39 ` Oleg Nesterov
2020-04-07 11:54 ` Oleg Nesterov
2020-04-07 15:40 ` Jens Axboe
2020-04-07 12:47 ` Peter Zijlstra
1 sibling, 2 replies; 15+ messages in thread
From: Oleg Nesterov @ 2020-04-07 11:39 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, peterz
On 04/06, Jens Axboe wrote:
>
> +extern struct callback_head task_work_exited;
> +
> static inline void
> init_task_work(struct callback_head *twork, task_work_func_t func)
> {
> @@ -19,7 +21,7 @@ void __task_work_run(void);
>
> static inline bool task_work_pending(void)
> {
> - return current->task_works;
> + return current->task_works && current->task_works != &task_work_exited;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Well, this penalizes all the current users, they can't hit work_exited.
IIUC, this is needed for the next change which adds task_work_run() into
io_ring_ctx_wait_and_kill(), right?
could you explain how the exiting task can call io_ring_ctx_wait_and_kill()
after it has passed exit_task_work()?
Oleg.
* Re: [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued
2020-04-07 11:39 ` Oleg Nesterov
@ 2020-04-07 11:54 ` Oleg Nesterov
2020-04-07 15:40 ` Jens Axboe
1 sibling, 0 replies; 15+ messages in thread
From: Oleg Nesterov @ 2020-04-07 11:54 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, peterz
On 04/07, Oleg Nesterov wrote:
>
> On 04/06, Jens Axboe wrote:
> >
> > +extern struct callback_head task_work_exited;
> > +
> > static inline void
> > init_task_work(struct callback_head *twork, task_work_func_t func)
> > {
> > @@ -19,7 +21,7 @@ void __task_work_run(void);
> >
> > static inline bool task_work_pending(void)
> > {
> > - return current->task_works;
> > + return current->task_works && current->task_works != &task_work_exited;
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Well, this penalizes all the current users, they can't hit work_exited.
>
> IIUC, this is needed for the next change which adds task_work_run() into
> io_ring_ctx_wait_and_kill(), right?
>
> could you explain how the exiting task can call io_ring_ctx_wait_and_kill()
> after it has passed exit_task_work()?
Note also that currently we assume that task_work_run() must not be called
by the exiting process (except from exit_task_work()). If io_uring sets that
precedent we need to change the PF_EXITING logic in task_work_run().
Oleg.
* Re: [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued
2020-04-07 11:39 ` Oleg Nesterov
2020-04-07 11:54 ` Oleg Nesterov
@ 2020-04-07 15:40 ` Jens Axboe
2020-04-07 16:19 ` Oleg Nesterov
1 sibling, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2020-04-07 15:40 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: io-uring, peterz
On 4/7/20 4:39 AM, Oleg Nesterov wrote:
> On 04/06, Jens Axboe wrote:
>>
>> +extern struct callback_head task_work_exited;
>> +
>> static inline void
>> init_task_work(struct callback_head *twork, task_work_func_t func)
>> {
>> @@ -19,7 +21,7 @@ void __task_work_run(void);
>>
>> static inline bool task_work_pending(void)
>> {
>> - return current->task_works;
>> + return current->task_works && current->task_works != &task_work_exited;
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Well, this penalizes all the current users, they can't hit work_exited.
>
> IIUC, this is needed for the next change which adds task_work_run() into
> io_ring_ctx_wait_and_kill(), right?
Right - so you'd rather I localize that check there instead? Can certainly
do that.
> could you explain how the exiting task can call io_ring_ctx_wait_and_kill()
> after it has passed exit_task_work()?
Sure, here's a trace where it happens:
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 0 P4D 0
Oops: 0010 [#1] SMP
CPU: 51 PID: 7290 Comm: mc_worker Kdump: loaded Not tainted 5.2.9-03696-gf2db01aa1e97 #190
Hardware name: Quanta Leopard ORv2-DDR4/Leopard ORv2-DDR4, BIOS F06_3B17 03/16/2018
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9002721bc78 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffffff82d10ff0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffc9002721bc60 RDI: ffffffff82d10ff0
RBP: ffff889fd220e8f0 R08: 0000000000000000 R09: ffffffff812f1000
R10: ffff88bfa5fcb100 R11: 0000000000000000 R12: ffff889fd220e200
R13: ffff889fd220e92c R14: ffffffff82d10ff0 R15: 0000000000000000
FS: 00007f03161ff700(0000) GS:ffff88bfff9c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000002409004 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__task_work_run+0x66/0xa0
io_ring_ctx_wait_and_kill+0x14e/0x3c0
io_uring_release+0x1c/0x20
__fput+0xaa/0x200
__task_work_run+0x66/0xa0
do_exit+0x9cf/0xb40
do_group_exit+0x3a/0xa0
get_signal+0x152/0x800
do_signal+0x36/0x640
? __audit_syscall_exit+0x23c/0x290
exit_to_usermode_loop+0x65/0x100
do_syscall_64+0xd4/0x100
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f77057fe8ce
Code: Bad RIP value.
RSP: 002b:00007f03161f8960 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
RAX: 000000000000002a RBX: 00007f03161f8a30 RCX: 00007f77057fe8ce
RDX: 0000000000004040 RSI: 00007f03161f8a30 RDI: 00000000000057a4
RBP: 00007f03161f8980 R08: 0000000000000000 R09: 00007f03161fb7b8
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 0000000000004040 R14: 00007f02dc12bc00 R15: 00007f02dc21b900
--
Jens Axboe
* Re: [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued
2020-04-07 15:40 ` Jens Axboe
@ 2020-04-07 16:19 ` Oleg Nesterov
2020-04-07 16:59 ` Jens Axboe
0 siblings, 1 reply; 15+ messages in thread
From: Oleg Nesterov @ 2020-04-07 16:19 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, peterz
On 04/07, Jens Axboe wrote:
>
> On 4/7/20 4:39 AM, Oleg Nesterov wrote:
> >
> > IIUC, this is needed for the next change which adds task_work_run() into
> > io_ring_ctx_wait_and_kill(), right?
>
> Right - so you'd rather I localize that check there instead? Can certainly
> do that.
I am still not sure we need this check at all... probably this is because
I don't understand the problem.
> > could you explain how the exiting task can call io_ring_ctx_wait_and_kill()
> > after it has passed exit_task_work()?
>
> Sure, here's a trace where it happens:
but this task has not passed exit_task_work(),
> __task_work_run+0x66/0xa0
> io_ring_ctx_wait_and_kill+0x14e/0x3c0
> io_uring_release+0x1c/0x20
> __fput+0xaa/0x200
> __task_work_run+0x66/0xa0
> do_exit+0x9cf/0xb40
So task_work_run() is called recursively from exit_task_work()->task_work_run().
See my other email: this is wrong with or without this series. And that is
why I think task_work_run() hits work_exited.
Could you explain why io_ring_ctx_wait_and_kill() needs task_work_run() ?
Oleg.
* Re: [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued
2020-04-07 16:19 ` Oleg Nesterov
@ 2020-04-07 16:59 ` Jens Axboe
2020-04-07 17:43 ` Oleg Nesterov
0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2020-04-07 16:59 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: io-uring, peterz
On 4/7/20 9:19 AM, Oleg Nesterov wrote:
> On 04/07, Jens Axboe wrote:
>>
>> On 4/7/20 4:39 AM, Oleg Nesterov wrote:
>>>
>>> IIUC, this is needed for the next change which adds task_work_run() into
>>> io_ring_ctx_wait_and_kill(), right?
>>
>> Right - so you'd rather I localize that check there instead? Can certainly
>> do that.
>
> I am still not sure we need this check at all... probably this is because
> I don't understand the problem.
Probably because I'm not explaining it very well... Let me try. io_uring
uses the task_work to handle completion of poll requests. Either an
explicit poll, or one done implicitly because someone did a recv/send
(or whatever) on a socket that wasn't ready. When we get the poll
waitqueue callback, we queue up task_work to handle the completion of
it.
These can come in at any time, obviously, as space or data becomes
available. If the task is exiting, our task_work_add() fails, and we
queue the work with someone else. But there seems to be a case where it
does get queued locally, and then io_uring doesn't know if it's safe to
run task_work or not. Sometimes that manifests itself in hitting the
RIP == 0 case that I included here. With the work_pending && work !=
exit_work check in place, it works fine.
>>> could you explain how the exiting task can call io_ring_ctx_wait_and_kill()
>>> after it has passed exit_task_work()?
>>
>> Sure, here's a trace where it happens:
>
> but this task has not passed exit_task_work(),
But it's definitely hitting callback->func == NULL, which is the
exit work. So if it's not from having passed exit_task_work(), where is
it from?
>
>> __task_work_run+0x66/0xa0
>> io_ring_ctx_wait_and_kill+0x14e/0x3c0
>> io_uring_release+0x1c/0x20
>> __fput+0xaa/0x200
>> __task_work_run+0x66/0xa0
>> do_exit+0x9cf/0xb40
>
> So task_work_run() is called recursively from
> exit_task_work()->task_work_run(). See my other email: this is
> wrong with or without this series. And that is why I think
> task_work_run() hits work_exited.
I see your newer email on this, I'll go read it.
> Could you explain why io_ring_ctx_wait_and_kill() needs
> task_work_run() ?
Hopefully the above helped! If I'm way off somehow, cluebat away.
--
Jens Axboe
* Re: [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued
2020-04-07 16:59 ` Jens Axboe
@ 2020-04-07 17:43 ` Oleg Nesterov
0 siblings, 0 replies; 15+ messages in thread
From: Oleg Nesterov @ 2020-04-07 17:43 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, peterz
Jens,
I'll read your email tomorrow, just one note for now ...
On 04/07, Jens Axboe wrote:
>
> On 4/7/20 9:19 AM, Oleg Nesterov wrote:
> >
> > but this task has not passed exit_task_work(),
>
> But it's definitely hitting callback->func == NULL, which is the
> exit_work. So if it's not from past exit_task_work(), where is it from?
I guess it comes from task_work_run() added by the next patch ;)
> I see your newer email on this, I'll go read it.
please look at the "bad_work_func" example.
Oleg.
* Re: [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued
2020-04-06 19:48 ` [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued Jens Axboe
2020-04-07 11:39 ` Oleg Nesterov
@ 2020-04-07 12:47 ` Peter Zijlstra
2020-04-07 15:43 ` Jens Axboe
1 sibling, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2020-04-07 12:47 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, Oleg Nesterov, viro
You seem to have lost Oleg and Al from the Cc list..
On Mon, Apr 06, 2020 at 01:48:51PM -0600, Jens Axboe wrote:
> If task_work has already been run on task exit, we don't always know
> if it's safe to run again. Check for task_work_exited in the
> task_work_pending() helper. This makes it less fragile in calling
> from the exit files path, for example.
>
> Signed-off-by: Jens Axboe <[email protected]>
> ---
> include/linux/task_work.h | 4 +++-
> kernel/task_work.c | 8 ++++----
> 2 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/task_work.h b/include/linux/task_work.h
> index 54c911bbf754..24f977a8fc35 100644
> --- a/include/linux/task_work.h
> +++ b/include/linux/task_work.h
> @@ -7,6 +7,8 @@
>
> typedef void (*task_work_func_t)(struct callback_head *);
>
> +extern struct callback_head task_work_exited;
> +
> static inline void
> init_task_work(struct callback_head *twork, task_work_func_t func)
> {
> @@ -19,7 +21,7 @@ void __task_work_run(void);
>
> static inline bool task_work_pending(void)
> {
> - return current->task_works;
> + return current->task_works && current->task_works != &task_work_exited;
> }
Hurmph.. not sure I like this. It inlines that second condition to
every caller of task_work_run() even though for pretty much all of them
this is impossible.
* Re: [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued
2020-04-07 12:47 ` Peter Zijlstra
@ 2020-04-07 15:43 ` Jens Axboe
0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2020-04-07 15:43 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: io-uring, Oleg Nesterov, viro
On 4/7/20 5:47 AM, Peter Zijlstra wrote:
>
> You seem to have lost Oleg and Al from the Cc list..
I'll add them for v2; I did point Oleg at it!
> On Mon, Apr 06, 2020 at 01:48:51PM -0600, Jens Axboe wrote:
>> If task_work has already been run on task exit, we don't always know
>> if it's safe to run again. Check for task_work_exited in the
>> task_work_pending() helper. This makes it less fragile in calling
>> from the exit files path, for example.
>>
>> Signed-off-by: Jens Axboe <[email protected]>
>> ---
>> include/linux/task_work.h | 4 +++-
>> kernel/task_work.c | 8 ++++----
>> 2 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/task_work.h b/include/linux/task_work.h
>> index 54c911bbf754..24f977a8fc35 100644
>> --- a/include/linux/task_work.h
>> +++ b/include/linux/task_work.h
>> @@ -7,6 +7,8 @@
>>
>> typedef void (*task_work_func_t)(struct callback_head *);
>>
>> +extern struct callback_head task_work_exited;
>> +
>> static inline void
>> init_task_work(struct callback_head *twork, task_work_func_t func)
>> {
>> @@ -19,7 +21,7 @@ void __task_work_run(void);
>>
>> static inline bool task_work_pending(void)
>> {
>> - return current->task_works;
>> + return current->task_works && current->task_works != &task_work_exited;
>> }
>
> Hurmph.. not sure I like this. It inlines that second condition to
> every caller of task_work_run() even though for pretty much all of them
> this is impossible.
Oleg had the same concern, and I agree with both of you. Would you
prefer we just leave task_work_run() as:
static inline void task_work_run(void)
{
if (current->task_works)
__task_work_run();
}
and then have the io_uring caller do:
if (current->task_works != &task_work_exited)
task_work_run();
instead?
--
Jens Axboe
* [PATCH 3/4] task_work: kill current->task_works checking in callers
2020-04-06 19:48 [PATCHSET 0/4] io_uring and task_work interactions Jens Axboe
2020-04-06 19:48 ` [PATCH 1/4] task_work: add task_work_pending() helper Jens Axboe
2020-04-06 19:48 ` [PATCH 2/4] task_work: don't run task_work if task_work_exited is queued Jens Axboe
@ 2020-04-06 19:48 ` Jens Axboe
2020-04-06 19:48 ` [PATCH 4/4] io_uring: flush task work before waiting for ring exit Jens Axboe
3 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2020-04-06 19:48 UTC (permalink / raw)
To: io-uring; +Cc: peterz, Jens Axboe
If the callsite cares, use task_work_pending(). If not, just call
task_work_run() unconditionally; that makes the check inline and
doesn't add any extra overhead.
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io-wq.c | 7 ++-----
fs/io_uring.c | 20 ++++++++------------
include/linux/tracehook.h | 3 +--
kernel/signal.c | 7 +++----
4 files changed, 14 insertions(+), 23 deletions(-)
diff --git a/fs/io-wq.c b/fs/io-wq.c
index 4023c9846860..5bee3f5f67e1 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -717,8 +717,7 @@ static int io_wq_manager(void *data)
complete(&wq->done);
while (!kthread_should_stop()) {
- if (current->task_works)
- task_work_run();
+ task_work_run();
for_each_node(node) {
struct io_wqe *wqe = wq->wqes[node];
@@ -742,9 +741,7 @@ static int io_wq_manager(void *data)
schedule_timeout(HZ);
}
- if (current->task_works)
- task_work_run();
-
+ task_work_run();
return 0;
err:
set_bit(IO_WQ_BIT_ERROR, &wq->state);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 79bd22289d73..1579390c7c53 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5967,8 +5967,7 @@ static int io_sq_thread(void *data)
if (!list_empty(&ctx->poll_list) ||
(!time_after(jiffies, timeout) && ret != -EBUSY &&
!percpu_ref_is_dying(&ctx->refs))) {
- if (current->task_works)
- task_work_run();
+ task_work_run();
cond_resched();
continue;
}
@@ -6000,8 +5999,8 @@ static int io_sq_thread(void *data)
finish_wait(&ctx->sqo_wait, &wait);
break;
}
- if (current->task_works) {
- task_work_run();
+ if (task_work_pending()) {
+ __task_work_run();
finish_wait(&ctx->sqo_wait, &wait);
continue;
}
@@ -6024,8 +6023,7 @@ static int io_sq_thread(void *data)
timeout = jiffies + ctx->sq_thread_idle;
}
- if (current->task_works)
- task_work_run();
+ task_work_run();
set_fs(old_fs);
if (cur_mm) {
@@ -6094,9 +6092,9 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
do {
if (io_cqring_events(ctx, false) >= min_events)
return 0;
- if (!current->task_works)
+ if (!task_work_pending())
break;
- task_work_run();
+ __task_work_run();
} while (1);
if (sig) {
@@ -6117,8 +6115,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
do {
prepare_to_wait_exclusive(&ctx->wait, &iowq.wq,
TASK_INTERRUPTIBLE);
- if (current->task_works)
- task_work_run();
+ task_work_run();
if (io_should_wake(&iowq, false))
break;
schedule();
@@ -7467,8 +7464,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
int submitted = 0;
struct fd f;
- if (current->task_works)
- task_work_run();
+ task_work_run();
if (flags & ~(IORING_ENTER_GETEVENTS | IORING_ENTER_SQ_WAKEUP))
return -EINVAL;
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 36fb3bbed6b2..608a2d12bc14 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -184,8 +184,7 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
* hlist_add_head(task->task_works);
*/
smp_mb__after_atomic();
- if (unlikely(current->task_works))
- task_work_run();
+ task_work_run();
#ifdef CONFIG_KEYS_REQUEST_CACHE
if (unlikely(current->cached_requested_key)) {
diff --git a/kernel/signal.c b/kernel/signal.c
index e58a6c619824..d62b7a3f2045 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2271,8 +2271,8 @@ static void ptrace_do_notify(int signr, int exit_code, int why)
void ptrace_notify(int exit_code)
{
BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
- if (unlikely(current->task_works))
- task_work_run();
+
+ task_work_run();
spin_lock_irq(¤t->sighand->siglock);
ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED);
@@ -2529,8 +2529,7 @@ bool get_signal(struct ksignal *ksig)
struct signal_struct *signal = current->signal;
int signr;
- if (unlikely(current->task_works))
- task_work_run();
+ task_work_run();
if (unlikely(uprobe_deny_signal()))
return false;
--
2.26.0
* [PATCH 4/4] io_uring: flush task work before waiting for ring exit
2020-04-06 19:48 [PATCHSET 0/4] io_uring and task_work interactions Jens Axboe
` (2 preceding siblings ...)
2020-04-06 19:48 ` [PATCH 3/4] task_work: kill current->task_works checking in callers Jens Axboe
@ 2020-04-06 19:48 ` Jens Axboe
3 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2020-04-06 19:48 UTC (permalink / raw)
To: io-uring; +Cc: peterz, Jens Axboe
We could have triggered task work when we removed poll completions.
It's now safe to call task_work_run() unconditionally, since it checks
for the exited task work; do so in case we have pending items there.
Ensure we do so before flushing the CQ ring overflow.
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1579390c7c53..183bd28761e3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7295,10 +7295,13 @@ static void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
io_wq_cancel_all(ctx->io_wq);
io_iopoll_reap_events(ctx);
+ idr_for_each(&ctx->personality_idr, io_remove_personalities, ctx);
+ task_work_run();
+
/* if we failed setting up the ctx, we might not have any rings */
if (ctx->rings)
io_cqring_overflow_flush(ctx, true);
- idr_for_each(&ctx->personality_idr, io_remove_personalities, ctx);
+
wait_for_completion(&ctx->completions[0]);
io_ring_ctx_free(ctx);
}
--
2.26.0