* [PATCH 5.11 0/2] syzbot fixes
@ 2021-01-26 15:28 Pavel Begunkov
2021-01-26 15:28 ` [PATCH 1/2] io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE Pavel Begunkov
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Pavel Begunkov @ 2021-01-26 15:28 UTC (permalink / raw)
To: Jens Axboe, io-uring
2/2 is a reincarnation of an issue found by syzbot that was previously
fixed by the cancel-all-reqs patches.
Pavel Begunkov (2):
io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE
io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE
fs/io_uring.c | 52 ++++++++++++++++++++++++++++-----------------------
1 file changed, 29 insertions(+), 23 deletions(-)
--
2.24.0
* [PATCH 1/2] io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE
2021-01-26 15:28 [PATCH 5.11 0/2] syzbot fixes Pavel Begunkov
@ 2021-01-26 15:28 ` Pavel Begunkov
2021-01-26 15:28 ` [PATCH 2/2] io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE Pavel Begunkov
2021-01-26 15:54 ` [PATCH 5.11 0/2] syzbot fixes Jens Axboe
2 siblings, 0 replies; 6+ messages in thread
From: Pavel Begunkov @ 2021-01-26 15:28 UTC (permalink / raw)
To: Jens Axboe, io-uring; +Cc: stable
If the tctx inflight number hasn't changed because of cancellation,
__io_uring_task_cancel() will continue, leaving the task in the
TASK_UNINTERRUPTIBLE state, which is not expected by
__io_uring_files_cancel(). Ensure we always call finish_wait() before
retrying.
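For readers following along, here is a minimal sketch of the loop shape
this change produces, reconstructed from the diff below (the actual
cancellation call inside the loop is elided here); it is an illustration,
not the applied patch:

	DEFINE_WAIT(wait);

	do {
		/* sample how much of this task's work is still in flight */
		inflight = tctx_inflight(tctx);
		if (!inflight)
			break;

		/* ... issue the actual cancellations here (elided) ... */

		prepare_to_wait(&tctx->wait, &wait, TASK_UNINTERRUPTIBLE);
		/*
		 * Sleep only if nothing completed since the count was taken;
		 * a racing completion wakes us through tctx->wait.
		 */
		if (inflight == tctx_inflight(tctx))
			schedule();
		/* always leave TASK_UNINTERRUPTIBLE before looping again */
		finish_wait(&tctx->wait, &wait);
	} while (1);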
Cc: [email protected] # 5.9+
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2166c469789d..09aada153a71 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9124,16 +9124,15 @@ void __io_uring_task_cancel(void)
prepare_to_wait(&tctx->wait, &wait, TASK_UNINTERRUPTIBLE);
/*
- * If we've seen completions, retry. This avoids a race where
- * a completion comes in before we did prepare_to_wait().
+ * If we've seen completions, retry without waiting. This
+ * avoids a race where a completion comes in before we did
+ * prepare_to_wait().
*/
- if (inflight != tctx_inflight(tctx))
- continue;
- schedule();
+ if (inflight == tctx_inflight(tctx))
+ schedule();
finish_wait(&tctx->wait, &wait);
} while (1);
- finish_wait(&tctx->wait, &wait);
atomic_dec(&tctx->in_idle);
io_uring_remove_task_files(tctx);
--
2.24.0
* [PATCH 2/2] io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE
2021-01-26 15:28 [PATCH 5.11 0/2] syzbot fixes Pavel Begunkov
2021-01-26 15:28 ` [PATCH 1/2] io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE Pavel Begunkov
@ 2021-01-26 15:28 ` Pavel Begunkov
2021-01-26 15:56 ` Pavel Begunkov
2021-01-26 15:54 ` [PATCH 5.11 0/2] syzbot fixes Jens Axboe
2 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2021-01-26 15:28 UTC (permalink / raw)
To: Jens Axboe, io-uring; +Cc: stable, syzbot+f655445043a26a7cfab8
do not call blocking ops when !TASK_RUNNING; state=2 set at
[<00000000ced9dbfc>] prepare_to_wait+0x1f4/0x3b0
kernel/sched/wait.c:262
WARNING: CPU: 1 PID: 19888 at kernel/sched/core.c:7853
__might_sleep+0xed/0x100 kernel/sched/core.c:7848
RIP: 0010:__might_sleep+0xed/0x100 kernel/sched/core.c:7848
Call Trace:
__mutex_lock_common+0xc4/0x2ef0 kernel/locking/mutex.c:935
__mutex_lock kernel/locking/mutex.c:1103 [inline]
mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1118
io_wq_submit_work+0x39a/0x720 fs/io_uring.c:6411
io_run_cancel fs/io-wq.c:856 [inline]
io_wqe_cancel_pending_work fs/io-wq.c:990 [inline]
io_wq_cancel_cb+0x614/0xcb0 fs/io-wq.c:1027
io_uring_cancel_files fs/io_uring.c:8874 [inline]
io_uring_cancel_task_requests fs/io_uring.c:8952 [inline]
__io_uring_files_cancel+0x115d/0x19e0 fs/io_uring.c:9038
io_uring_files_cancel include/linux/io_uring.h:51 [inline]
do_exit+0x2e6/0x2490 kernel/exit.c:780
do_group_exit+0x168/0x2d0 kernel/exit.c:922
get_signal+0x16b5/0x2030 kernel/signal.c:2770
arch_do_signal_or_restart+0x8e/0x6a0 arch/x86/kernel/signal.c:811
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0xac/0x1e0 kernel/entry/common.c:201
__syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
syscall_exit_to_user_mode+0x48/0x190 kernel/entry/common.c:302
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Rewrite io_uring_cancel_files() to mimic __io_uring_task_cancel()'s
counting scheme, so it does all the heavy work before setting
TASK_UNINTERRUPTIBLE.
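Put differently, a sketch of the resulting loop (reconstructed from the
hunks below; the steps the diff context elides are marked as such) looks
roughly like this -- all work that may sleep runs while the task is still
TASK_RUNNING, and the state is only changed right before schedule():

	while (!list_empty_careful(&ctx->inflight_list)) {
		struct io_task_cancel cancel = { .task = task, .files = files };
		DEFINE_WAIT(wait);
		int inflight;

		/* heavy work first, while still TASK_RUNNING */
		inflight = io_uring_count_inflight(ctx, task, files);
		if (!inflight)
			break;

		io_wq_cancel_cb(ctx->io_wq, io_cancel_task_cb, &cancel, true);
		/* ... further cancellation steps elided by the diff context ... */
		io_cqring_overflow_flush(ctx, true, task, files);
		/* cancellations _may_ trigger task work */
		io_run_task_work();

		/* only now (maybe) go to sleep */
		prepare_to_wait(&task->io_uring->wait, &wait,
				TASK_UNINTERRUPTIBLE);
		if (inflight == io_uring_count_inflight(ctx, task, files))
			schedule();
		finish_wait(&task->io_uring->wait, &wait);
	}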
Cc: [email protected] # 5.9+
Reported-by: [email protected]
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 41 ++++++++++++++++++++++++-----------------
1 file changed, 24 insertions(+), 17 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 09aada153a71..f3f2b37e7021 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8873,30 +8873,33 @@ static void io_cancel_defer_files(struct io_ring_ctx *ctx,
}
}
+static int io_uring_count_inflight(struct io_ring_ctx *ctx,
+ struct task_struct *task,
+ struct files_struct *files)
+{
+ struct io_kiocb *req;
+ int cnt = 0;
+
+ spin_lock_irq(&ctx->inflight_lock);
+ list_for_each_entry(req, &ctx->inflight_list, inflight_entry) {
+ if (!io_match_task(req, task, files))
+ cnt++;
+ }
+ spin_unlock_irq(&ctx->inflight_lock);
+ return cnt;
+}
+
static void io_uring_cancel_files(struct io_ring_ctx *ctx,
struct task_struct *task,
struct files_struct *files)
{
while (!list_empty_careful(&ctx->inflight_list)) {
struct io_task_cancel cancel = { .task = task, .files = files };
- struct io_kiocb *req;
DEFINE_WAIT(wait);
- bool found = false;
+ int inflight;
- spin_lock_irq(&ctx->inflight_lock);
- list_for_each_entry(req, &ctx->inflight_list, inflight_entry) {
- if (!io_match_task(req, task, files))
- continue;
- found = true;
- break;
- }
- if (found)
- prepare_to_wait(&task->io_uring->wait, &wait,
- TASK_UNINTERRUPTIBLE);
- spin_unlock_irq(&ctx->inflight_lock);
-
- /* We need to keep going until we don't find a matching req */
- if (!found)
+ inflight = io_uring_count_inflight(ctx, task, files);
+ if (!inflight)
break;
io_wq_cancel_cb(ctx->io_wq, io_cancel_task_cb, &cancel, true);
@@ -8905,7 +8908,11 @@ static void io_uring_cancel_files(struct io_ring_ctx *ctx,
io_cqring_overflow_flush(ctx, true, task, files);
/* cancellations _may_ trigger task work */
io_run_task_work();
- schedule();
+
+ prepare_to_wait(&task->io_uring->wait, &wait,
+ TASK_UNINTERRUPTIBLE);
+ if (inflight == io_uring_count_inflight(ctx, task, files))
+ schedule();
finish_wait(&task->io_uring->wait, &wait);
}
}
--
2.24.0
* Re: [PATCH 5.11 0/2] syzbot fixes
2021-01-26 15:28 [PATCH 5.11 0/2] syzbot fixes Pavel Begunkov
2021-01-26 15:28 ` [PATCH 1/2] io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE Pavel Begunkov
2021-01-26 15:28 ` [PATCH 2/2] io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE Pavel Begunkov
@ 2021-01-26 15:54 ` Jens Axboe
2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2021-01-26 15:54 UTC (permalink / raw)
To: Pavel Begunkov, io-uring
On 1/26/21 8:28 AM, Pavel Begunkov wrote:
> 2/2 is a reincarnation of an issue found by syzbot that was previously
> fixed by the cancel-all-reqs patches.
>
> Pavel Begunkov (2):
> io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE
> io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE
>
> fs/io_uring.c | 52 ++++++++++++++++++++++++++++-----------------------
> 1 file changed, 29 insertions(+), 23 deletions(-)
LGTM, applied.
--
Jens Axboe
* Re: [PATCH 2/2] io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE
2021-01-26 15:28 ` [PATCH 2/2] io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE Pavel Begunkov
@ 2021-01-26 15:56 ` Pavel Begunkov
2021-01-26 16:01 ` Jens Axboe
0 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2021-01-26 15:56 UTC (permalink / raw)
To: Jens Axboe, io-uring; +Cc: stable, syzbot+f655445043a26a7cfab8
On 26/01/2021 15:28, Pavel Begunkov wrote:
> do not call blocking ops when !TASK_RUNNING; state=2 set at
> [<00000000ced9dbfc>] prepare_to_wait+0x1f4/0x3b0
> kernel/sched/wait.c:262
> WARNING: CPU: 1 PID: 19888 at kernel/sched/core.c:7853
> __might_sleep+0xed/0x100 kernel/sched/core.c:7848
> RIP: 0010:__might_sleep+0xed/0x100 kernel/sched/core.c:7848
> Call Trace:
> __mutex_lock_common+0xc4/0x2ef0 kernel/locking/mutex.c:935
> __mutex_lock kernel/locking/mutex.c:1103 [inline]
> mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1118
> io_wq_submit_work+0x39a/0x720 fs/io_uring.c:6411
> io_run_cancel fs/io-wq.c:856 [inline]
> io_wqe_cancel_pending_work fs/io-wq.c:990 [inline]
> io_wq_cancel_cb+0x614/0xcb0 fs/io-wq.c:1027
> io_uring_cancel_files fs/io_uring.c:8874 [inline]
> io_uring_cancel_task_requests fs/io_uring.c:8952 [inline]
> __io_uring_files_cancel+0x115d/0x19e0 fs/io_uring.c:9038
> io_uring_files_cancel include/linux/io_uring.h:51 [inline]
> do_exit+0x2e6/0x2490 kernel/exit.c:780
> do_group_exit+0x168/0x2d0 kernel/exit.c:922
> get_signal+0x16b5/0x2030 kernel/signal.c:2770
> arch_do_signal_or_restart+0x8e/0x6a0 arch/x86/kernel/signal.c:811
> handle_signal_work kernel/entry/common.c:147 [inline]
> exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
> exit_to_user_mode_prepare+0xac/0x1e0 kernel/entry/common.c:201
> __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
> syscall_exit_to_user_mode+0x48/0x190 kernel/entry/common.c:302
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Rewrite io_uring_cancel_files() to mimic __io_uring_task_cancel()'s
> counting scheme, so it does all the heavy work before setting
> TASK_UNINTERRUPTIBLE.
>
> Cc: [email protected] # 5.9+
> Reported-by: [email protected]
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
> fs/io_uring.c | 41 ++++++++++++++++++++++++-----------------
> 1 file changed, 24 insertions(+), 17 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 09aada153a71..f3f2b37e7021 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -8873,30 +8873,33 @@ static void io_cancel_defer_files(struct io_ring_ctx *ctx,
> }
> }
>
> +static int io_uring_count_inflight(struct io_ring_ctx *ctx,
> + struct task_struct *task,
> + struct files_struct *files)
> +{
> + struct io_kiocb *req;
> + int cnt = 0;
> +
> + spin_lock_irq(&ctx->inflight_lock);
> + list_for_each_entry(req, &ctx->inflight_list, inflight_entry) {
> + if (!io_match_task(req, task, files))
This condition should be inverted. Jens, please drop it.
p.s. I wonder how the tests didn't catch that
> + cnt++;
> + }
> + spin_unlock_irq(&ctx->inflight_lock);
> + return cnt;
> +}
> +
> static void io_uring_cancel_files(struct io_ring_ctx *ctx,
> struct task_struct *task,
> struct files_struct *files)
> {
> while (!list_empty_careful(&ctx->inflight_list)) {
> struct io_task_cancel cancel = { .task = task, .files = files };
> - struct io_kiocb *req;
> DEFINE_WAIT(wait);
> - bool found = false;
> + int inflight;
>
> - spin_lock_irq(&ctx->inflight_lock);
> - list_for_each_entry(req, &ctx->inflight_list, inflight_entry) {
> - if (!io_match_task(req, task, files))
> - continue;
> - found = true;
> - break;
> - }
> - if (found)
> - prepare_to_wait(&task->io_uring->wait, &wait,
> - TASK_UNINTERRUPTIBLE);
> - spin_unlock_irq(&ctx->inflight_lock);
> -
> - /* We need to keep going until we don't find a matching req */
> - if (!found)
> + inflight = io_uring_count_inflight(ctx, task, files);
> + if (!inflight)
> break;
>
> io_wq_cancel_cb(ctx->io_wq, io_cancel_task_cb, &cancel, true);
> @@ -8905,7 +8908,11 @@ static void io_uring_cancel_files(struct io_ring_ctx *ctx,
> io_cqring_overflow_flush(ctx, true, task, files);
> /* cancellations _may_ trigger task work */
> io_run_task_work();
> - schedule();
> +
> + prepare_to_wait(&task->io_uring->wait, &wait,
> + TASK_UNINTERRUPTIBLE);
> + if (inflight == io_uring_count_inflight(ctx, task, files))
> + schedule();
> finish_wait(&task->io_uring->wait, &wait);
> }
> }
>
--
Pavel Begunkov
* Re: [PATCH 2/2] io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE
2021-01-26 15:56 ` Pavel Begunkov
@ 2021-01-26 16:01 ` Jens Axboe
0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2021-01-26 16:01 UTC (permalink / raw)
To: Pavel Begunkov, io-uring; +Cc: stable, syzbot+f655445043a26a7cfab8
On 1/26/21 8:56 AM, Pavel Begunkov wrote:
> On 26/01/2021 15:28, Pavel Begunkov wrote:
>> do not call blocking ops when !TASK_RUNNING; state=2 set at
>> [<00000000ced9dbfc>] prepare_to_wait+0x1f4/0x3b0
>> kernel/sched/wait.c:262
>> WARNING: CPU: 1 PID: 19888 at kernel/sched/core.c:7853
>> __might_sleep+0xed/0x100 kernel/sched/core.c:7848
>> RIP: 0010:__might_sleep+0xed/0x100 kernel/sched/core.c:7848
>> Call Trace:
>> __mutex_lock_common+0xc4/0x2ef0 kernel/locking/mutex.c:935
>> __mutex_lock kernel/locking/mutex.c:1103 [inline]
>> mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1118
>> io_wq_submit_work+0x39a/0x720 fs/io_uring.c:6411
>> io_run_cancel fs/io-wq.c:856 [inline]
>> io_wqe_cancel_pending_work fs/io-wq.c:990 [inline]
>> io_wq_cancel_cb+0x614/0xcb0 fs/io-wq.c:1027
>> io_uring_cancel_files fs/io_uring.c:8874 [inline]
>> io_uring_cancel_task_requests fs/io_uring.c:8952 [inline]
>> __io_uring_files_cancel+0x115d/0x19e0 fs/io_uring.c:9038
>> io_uring_files_cancel include/linux/io_uring.h:51 [inline]
>> do_exit+0x2e6/0x2490 kernel/exit.c:780
>> do_group_exit+0x168/0x2d0 kernel/exit.c:922
>> get_signal+0x16b5/0x2030 kernel/signal.c:2770
>> arch_do_signal_or_restart+0x8e/0x6a0 arch/x86/kernel/signal.c:811
>> handle_signal_work kernel/entry/common.c:147 [inline]
>> exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
>> exit_to_user_mode_prepare+0xac/0x1e0 kernel/entry/common.c:201
>> __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
>> syscall_exit_to_user_mode+0x48/0x190 kernel/entry/common.c:302
>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> Rewrite io_uring_cancel_files() to mimic __io_uring_task_cancel()'s
>> counting scheme, so it does all the heavy work before setting
>> TASK_UNINTERRUPTIBLE.
>>
>> Cc: [email protected] # 5.9+
>> Reported-by: [email protected]
>> Signed-off-by: Pavel Begunkov <[email protected]>
>> ---
>> fs/io_uring.c | 41 ++++++++++++++++++++++++-----------------
>> 1 file changed, 24 insertions(+), 17 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 09aada153a71..f3f2b37e7021 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -8873,30 +8873,33 @@ static void io_cancel_defer_files(struct io_ring_ctx *ctx,
>> }
>> }
>>
>> +static int io_uring_count_inflight(struct io_ring_ctx *ctx,
>> + struct task_struct *task,
>> + struct files_struct *files)
>> +{
>> + struct io_kiocb *req;
>> + int cnt = 0;
>> +
>> + spin_lock_irq(&ctx->inflight_lock);
>> + list_for_each_entry(req, &ctx->inflight_list, inflight_entry) {
>> + if (!io_match_task(req, task, files))
>
> This condition should be inverted. Jens, please drop it.
>
> p.s. I wonder how the tests didn't catch that
Probably just make it cnt += io_match_task(req, task, files) to simplify
it. But yes, it looks reversed.
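A sketch of the helper with that simplification applied (an illustration,
not an applied patch; it assumes io_match_task() returns true for requests
belonging to the given task/files, as used in the hunk above):

	static int io_uring_count_inflight(struct io_ring_ctx *ctx,
					   struct task_struct *task,
					   struct files_struct *files)
	{
		struct io_kiocb *req;
		int cnt = 0;

		spin_lock_irq(&ctx->inflight_lock);
		list_for_each_entry(req, &ctx->inflight_list, inflight_entry)
			/* count only requests matching this task/files */
			cnt += io_match_task(req, task, files);
		spin_unlock_irq(&ctx->inflight_lock);
		return cnt;
	}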
--
Jens Axboe