public inbox for [email protected]
* [PATCH] io_uring: don't modify identity's files uncess identity is cowed
@ 2021-02-04  9:20 Xiaoguang Wang
  2021-02-04 11:05 ` Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Xiaoguang Wang @ 2021-02-04  9:20 UTC (permalink / raw)
  To: io-uring; +Cc: axboe, asml.silence, joseph.qi

Abaci Robot reported the following panic:
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 800000010ef3f067 P4D 800000010ef3f067 PUD 10d9df067 PMD 0
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 1869 Comm: io_wqe_worker-0 Not tainted 5.11.0-rc3+ #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:put_files_struct+0x1b/0x120
Code: 24 18 c7 00 f4 ff ff ff e9 4d fd ff ff 66 90 0f 1f 44 00 00 41 57 41 56 49 89 fe 41 55 41 54 55 53 48 83 ec 08 e8 b5 6b db ff  41 ff 0e 74 13 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 9c
RSP: 0000:ffffc90002147d48 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88810d9a5300 RCX: 0000000000000000
RDX: ffff88810d87c280 RSI: ffffffff8144ba6b RDI: 0000000000000000
RBP: 0000000000000080 R08: 0000000000000001 R09: ffffffff81431500
R10: ffff8881001be000 R11: 0000000000000000 R12: ffff88810ac2f800
R13: ffff88810af38a00 R14: 0000000000000000 R15: ffff8881057130c0
FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010dbaa002 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __io_clean_op+0x10c/0x2a0
 io_dismantle_req+0x3c7/0x600
 __io_free_req+0x34/0x280
 io_put_req+0x63/0xb0
 io_worker_handle_work+0x60e/0x830
 ? io_wqe_worker+0x135/0x520
 io_wqe_worker+0x158/0x520
 ? __kthread_parkme+0x96/0xc0
 ? io_worker_handle_work+0x830/0x830
 kthread+0x134/0x180
 ? kthread_create_worker_on_cpu+0x90/0x90
 ret_from_fork+0x1f/0x30
Modules linked in:
CR2: 0000000000000000
---[ end trace c358ca86af95b1e7 ]---

I suspect the following case triggers the panic above: two threads
operate on different io_uring ctxs but share the same sqthread identity.
When one thread exits, io_uring_cancel_task_requests() sets
task->io_uring->identity->files to NULL in sqpoll mode, and the other
ctx, which still uses the same identity, then panics.
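
The suspected sequence can be modelled in user space. This is only a
sketch under assumptions: files_model, identity_model and the *_model
functions are hypothetical stand-ins rather than kernel code, and the
"panic" is reported as a return value of -1 instead of an actual NULL
dereference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for files_struct and the io_uring identity. */
struct files_model { int count; };
struct identity_model { struct files_model *files; };

/*
 * Models put_files_struct(): returns -1 where the kernel would
 * dereference a NULL pointer, 0 after dropping a reference.
 */
static int put_files_model(struct files_model *files)
{
	if (!files)
		return -1;
	files->count--;
	return 0;
}

/* Models the hunk removed by this patch: clear the shared pointer. */
static void cancel_task_requests_model(struct identity_model *id,
				       const struct files_model *files)
{
	assert(id != NULL);
	if (id->files == files)
		id->files = NULL;
}

/*
 * One thread exits (optionally clearing identity->files), then a
 * request belonging to the other ctx is cleaned up through the same
 * shared identity.
 */
static int simulate_shared_identity(int clear_on_exit)
{
	struct files_model files = { .count = 2 };
	struct identity_model shared = { .files = &files };

	if (clear_on_exit)
		cancel_task_requests_model(&shared, &files);

	return put_files_model(shared.files);
}
```

In this model, simulate_shared_identity(1) hits the NULL case, while
simulate_shared_identity(0), which leaves the shared pointer alone,
does not.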

In fact, we don't need to clear task->io_uring->identity->files here.
io_grab_identity() should already handle changes to identity->files: if
task->io_uring->identity->files is not equal to current->files,
io_cow_identity() should handle the change.
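
The compare-then-copy idea can be sketched in user space as follows.
This is an illustration under assumptions, not the kernel
implementation: id_model and the *_model helpers are hypothetical
names, and the real io_grab_identity()/io_cow_identity() are more
involved than a single pointer comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the io_uring identity. */
struct id_model {
	int refs;
	const void *files;
};

/* Models the grab step: succeeds only if the files still match. */
static bool grab_identity_model(const struct id_model *id,
				const void *current_files)
{
	return id->files == current_files;
}

/* Models the copy-on-write step: the shared identity is not modified. */
static struct id_model *cow_identity_model(const struct id_model *shared,
					   const void *current_files)
{
	struct id_model *copy = malloc(sizeof(*copy));

	if (!copy)
		return NULL;
	*copy = *shared;
	copy->refs = 1;
	copy->files = current_files;
	return copy;
}

/* Reuse the shared identity when it matches, otherwise take a copy. */
static struct id_model *prepare_identity_model(struct id_model *shared,
					       const void *current_files)
{
	if (grab_identity_model(shared, current_files)) {
		shared->refs++;
		return shared;
	}
	return cow_identity_model(shared, current_files);
}
```

The point of the sketch is that a mismatch produces a private copy, so
other users of the shared identity never see its files pointer change.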

Reported-by: Abaci Robot <[email protected]>
Signed-off-by: Xiaoguang Wang <[email protected]>
---
 fs/io_uring.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 38c6cbe1ab38..5d3348d66f06 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8982,12 +8982,6 @@ static void io_uring_cancel_task_requests(struct io_ring_ctx *ctx,
 
 	if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) {
 		atomic_dec(&task->io_uring->in_idle);
-		/*
-		 * If the files that are going away are the ones in the thread
-		 * identity, clear them out.
-		 */
-		if (task->io_uring->identity->files == files)
-			task->io_uring->identity->files = NULL;
 		io_sq_thread_unpark(ctx->sq_data);
 	}
 }
-- 
2.17.2



* Re: [PATCH] io_uring: don't modify identity's files uncess identity is cowed
  2021-02-04  9:20 [PATCH] io_uring: don't modify identity's files uncess identity is cowed Xiaoguang Wang
@ 2021-02-04 11:05 ` Pavel Begunkov
  2021-02-04 13:54   ` Pavel Begunkov
  2021-02-04 14:43 ` Jens Axboe
  2021-02-05  1:34 ` Joseph Qi
  2 siblings, 1 reply; 5+ messages in thread
From: Pavel Begunkov @ 2021-02-04 11:05 UTC (permalink / raw)
  To: Xiaoguang Wang, io-uring; +Cc: axboe, joseph.qi

On 04/02/2021 09:20, Xiaoguang Wang wrote:
> Abaci Robot reported the following panic:
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> PGD 800000010ef3f067 P4D 800000010ef3f067 PUD 10d9df067 PMD 0
> Oops: 0002 [#1] SMP PTI
> CPU: 0 PID: 1869 Comm: io_wqe_worker-0 Not tainted 5.11.0-rc3+ #1
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> RIP: 0010:put_files_struct+0x1b/0x120
> Code: 24 18 c7 00 f4 ff ff ff e9 4d fd ff ff 66 90 0f 1f 44 00 00 41 57 41 56 49 89 fe 41 55 41 54 55 53 48 83 ec 08 e8 b5 6b db ff  41 ff 0e 74 13 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 9c
> RSP: 0000:ffffc90002147d48 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffff88810d9a5300 RCX: 0000000000000000
> RDX: ffff88810d87c280 RSI: ffffffff8144ba6b RDI: 0000000000000000
> RBP: 0000000000000080 R08: 0000000000000001 R09: ffffffff81431500
> R10: ffff8881001be000 R11: 0000000000000000 R12: ffff88810ac2f800
> R13: ffff88810af38a00 R14: 0000000000000000 R15: ffff8881057130c0
> FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000010dbaa002 CR4: 00000000003706f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  __io_clean_op+0x10c/0x2a0
>  io_dismantle_req+0x3c7/0x600
>  __io_free_req+0x34/0x280
>  io_put_req+0x63/0xb0
>  io_worker_handle_work+0x60e/0x830
>  ? io_wqe_worker+0x135/0x520
>  io_wqe_worker+0x158/0x520
>  ? __kthread_parkme+0x96/0xc0
>  ? io_worker_handle_work+0x830/0x830
>  kthread+0x134/0x180
>  ? kthread_create_worker_on_cpu+0x90/0x90
>  ret_from_fork+0x1f/0x30
> Modules linked in:
> CR2: 0000000000000000
> ---[ end trace c358ca86af95b1e7 ]---
> 
> I suspect the following case triggers the panic above: two threads
> operate on different io_uring ctxs but share the same sqthread identity.
> When one thread exits, io_uring_cancel_task_requests() sets
> task->io_uring->identity->files to NULL in sqpoll mode, and the other
> ctx, which still uses the same identity, then panics.
> 
> In fact, we don't need to clear task->io_uring->identity->files here.
> io_grab_identity() should already handle changes to identity->files: if
> task->io_uring->identity->files is not equal to current->files,
> io_cow_identity() should handle the change.

I didn't look at the trace above, but the change looks good. I even made
the same change myself a couple of weeks ago, but it got dropped because
of an unrelated hassle.

I'll test/review a bit later.

> 
> Reported-by: Abaci Robot <[email protected]>
> Signed-off-by: Xiaoguang Wang <[email protected]>
> ---
>  fs/io_uring.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 38c6cbe1ab38..5d3348d66f06 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -8982,12 +8982,6 @@ static void io_uring_cancel_task_requests(struct io_ring_ctx *ctx,
>  
>  	if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) {
>  		atomic_dec(&task->io_uring->in_idle);
> -		/*
> -		 * If the files that are going away are the ones in the thread
> -		 * identity, clear them out.
> -		 */
> -		if (task->io_uring->identity->files == files)
> -			task->io_uring->identity->files = NULL;
>  		io_sq_thread_unpark(ctx->sq_data);
>  	}
>  }
> 

-- 
Pavel Begunkov


* Re: [PATCH] io_uring: don't modify identity's files uncess identity is cowed
  2021-02-04 11:05 ` Pavel Begunkov
@ 2021-02-04 13:54   ` Pavel Begunkov
  0 siblings, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2021-02-04 13:54 UTC (permalink / raw)
  To: Xiaoguang Wang, io-uring; +Cc: axboe, joseph.qi

On 04/02/2021 11:05, Pavel Begunkov wrote:
> On 04/02/2021 09:20, Xiaoguang Wang wrote:
>> Abaci Robot reported the following panic:
>> BUG: kernel NULL pointer dereference, address: 0000000000000000
>> PGD 800000010ef3f067 P4D 800000010ef3f067 PUD 10d9df067 PMD 0
>> Oops: 0002 [#1] SMP PTI
>> CPU: 0 PID: 1869 Comm: io_wqe_worker-0 Not tainted 5.11.0-rc3+ #1
>> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>> RIP: 0010:put_files_struct+0x1b/0x120
>> Code: 24 18 c7 00 f4 ff ff ff e9 4d fd ff ff 66 90 0f 1f 44 00 00 41 57 41 56 49 89 fe 41 55 41 54 55 53 48 83 ec 08 e8 b5 6b db ff  41 ff 0e 74 13 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 9c
>> RSP: 0000:ffffc90002147d48 EFLAGS: 00010293
>> RAX: 0000000000000000 RBX: ffff88810d9a5300 RCX: 0000000000000000
>> RDX: ffff88810d87c280 RSI: ffffffff8144ba6b RDI: 0000000000000000
>> RBP: 0000000000000080 R08: 0000000000000001 R09: ffffffff81431500
>> R10: ffff8881001be000 R11: 0000000000000000 R12: ffff88810ac2f800
>> R13: ffff88810af38a00 R14: 0000000000000000 R15: ffff8881057130c0
>> FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000000 CR3: 000000010dbaa002 CR4: 00000000003706f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>  __io_clean_op+0x10c/0x2a0
>>  io_dismantle_req+0x3c7/0x600
>>  __io_free_req+0x34/0x280
>>  io_put_req+0x63/0xb0
>>  io_worker_handle_work+0x60e/0x830
>>  ? io_wqe_worker+0x135/0x520
>>  io_wqe_worker+0x158/0x520
>>  ? __kthread_parkme+0x96/0xc0
>>  ? io_worker_handle_work+0x830/0x830
>>  kthread+0x134/0x180
>>  ? kthread_create_worker_on_cpu+0x90/0x90
>>  ret_from_fork+0x1f/0x30
>> Modules linked in:
>> CR2: 0000000000000000
>> ---[ end trace c358ca86af95b1e7 ]---
>>
>> I suspect the following case triggers the panic above: two threads
>> operate on different io_uring ctxs but share the same sqthread identity.
>> When one thread exits, io_uring_cancel_task_requests() sets
>> task->io_uring->identity->files to NULL in sqpoll mode, and the other
>> ctx, which still uses the same identity, then panics.
>>
>> In fact, we don't need to clear task->io_uring->identity->files here.
>> io_grab_identity() should already handle changes to identity->files: if
>> task->io_uring->identity->files is not equal to current->files,
>> io_cow_identity() should handle the change.
> 
> I didn't look at the trace above, but the change looks good. I even made
> the same change myself a couple of weeks ago, but it got dropped because
> of an unrelated hassle.
> 
> I'll test/review a bit later.

Reviewed-by: Pavel Begunkov <[email protected]>
Cc: [email protected] # 5.5+

> 
>>
>> Reported-by: Abaci Robot <[email protected]>
>> Signed-off-by: Xiaoguang Wang <[email protected]>
>> ---
>>  fs/io_uring.c | 6 ------
>>  1 file changed, 6 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 38c6cbe1ab38..5d3348d66f06 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -8982,12 +8982,6 @@ static void io_uring_cancel_task_requests(struct io_ring_ctx *ctx,
>>  
>>  	if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) {
>>  		atomic_dec(&task->io_uring->in_idle);
>> -		/*
>> -		 * If the files that are going away are the ones in the thread
>> -		 * identity, clear them out.
>> -		 */
>> -		if (task->io_uring->identity->files == files)
>> -			task->io_uring->identity->files = NULL;
>>  		io_sq_thread_unpark(ctx->sq_data);
>>  	}
>>  }
>>
> 

-- 
Pavel Begunkov


* Re: [PATCH] io_uring: don't modify identity's files uncess identity is cowed
  2021-02-04  9:20 [PATCH] io_uring: don't modify identity's files uncess identity is cowed Xiaoguang Wang
  2021-02-04 11:05 ` Pavel Begunkov
@ 2021-02-04 14:43 ` Jens Axboe
  2021-02-05  1:34 ` Joseph Qi
  2 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2021-02-04 14:43 UTC (permalink / raw)
  To: Xiaoguang Wang, io-uring; +Cc: asml.silence, joseph.qi

On 2/4/21 2:20 AM, Xiaoguang Wang wrote:
> Abaci Robot reported the following panic:
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> PGD 800000010ef3f067 P4D 800000010ef3f067 PUD 10d9df067 PMD 0
> Oops: 0002 [#1] SMP PTI
> CPU: 0 PID: 1869 Comm: io_wqe_worker-0 Not tainted 5.11.0-rc3+ #1
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> RIP: 0010:put_files_struct+0x1b/0x120
> Code: 24 18 c7 00 f4 ff ff ff e9 4d fd ff ff 66 90 0f 1f 44 00 00 41 57 41 56 49 89 fe 41 55 41 54 55 53 48 83 ec 08 e8 b5 6b db ff  41 ff 0e 74 13 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 9c
> RSP: 0000:ffffc90002147d48 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffff88810d9a5300 RCX: 0000000000000000
> RDX: ffff88810d87c280 RSI: ffffffff8144ba6b RDI: 0000000000000000
> RBP: 0000000000000080 R08: 0000000000000001 R09: ffffffff81431500
> R10: ffff8881001be000 R11: 0000000000000000 R12: ffff88810ac2f800
> R13: ffff88810af38a00 R14: 0000000000000000 R15: ffff8881057130c0
> FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000010dbaa002 CR4: 00000000003706f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  __io_clean_op+0x10c/0x2a0
>  io_dismantle_req+0x3c7/0x600
>  __io_free_req+0x34/0x280
>  io_put_req+0x63/0xb0
>  io_worker_handle_work+0x60e/0x830
>  ? io_wqe_worker+0x135/0x520
>  io_wqe_worker+0x158/0x520
>  ? __kthread_parkme+0x96/0xc0
>  ? io_worker_handle_work+0x830/0x830
>  kthread+0x134/0x180
>  ? kthread_create_worker_on_cpu+0x90/0x90
>  ret_from_fork+0x1f/0x30
> Modules linked in:
> CR2: 0000000000000000
> ---[ end trace c358ca86af95b1e7 ]---
> 
> I suspect the following case triggers the panic above: two threads
> operate on different io_uring ctxs but share the same sqthread identity.
> When one thread exits, io_uring_cancel_task_requests() sets
> task->io_uring->identity->files to NULL in sqpoll mode, and the other
> ctx, which still uses the same identity, then panics.
> 
> In fact, we don't need to clear task->io_uring->identity->files here.
> io_grab_identity() should already handle changes to identity->files: if
> task->io_uring->identity->files is not equal to current->files,
> io_cow_identity() should handle the change.

Applied, thanks.

-- 
Jens Axboe



* Re: [PATCH] io_uring: don't modify identity's files uncess identity is cowed
  2021-02-04  9:20 [PATCH] io_uring: don't modify identity's files uncess identity is cowed Xiaoguang Wang
  2021-02-04 11:05 ` Pavel Begunkov
  2021-02-04 14:43 ` Jens Axboe
@ 2021-02-05  1:34 ` Joseph Qi
  2 siblings, 0 replies; 5+ messages in thread
From: Joseph Qi @ 2021-02-05  1:34 UTC (permalink / raw)
  To: Xiaoguang Wang, io-uring; +Cc: axboe, asml.silence

There is a typo in the subject: 'uncess' -> 'unless'?

Thanks,
Joseph

On 2/4/21 5:20 PM, Xiaoguang Wang wrote:
> Abaci Robot reported the following panic:
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> PGD 800000010ef3f067 P4D 800000010ef3f067 PUD 10d9df067 PMD 0
> Oops: 0002 [#1] SMP PTI
> CPU: 0 PID: 1869 Comm: io_wqe_worker-0 Not tainted 5.11.0-rc3+ #1
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> RIP: 0010:put_files_struct+0x1b/0x120
> Code: 24 18 c7 00 f4 ff ff ff e9 4d fd ff ff 66 90 0f 1f 44 00 00 41 57 41 56 49 89 fe 41 55 41 54 55 53 48 83 ec 08 e8 b5 6b db ff  41 ff 0e 74 13 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 9c
> RSP: 0000:ffffc90002147d48 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffff88810d9a5300 RCX: 0000000000000000
> RDX: ffff88810d87c280 RSI: ffffffff8144ba6b RDI: 0000000000000000
> RBP: 0000000000000080 R08: 0000000000000001 R09: ffffffff81431500
> R10: ffff8881001be000 R11: 0000000000000000 R12: ffff88810ac2f800
> R13: ffff88810af38a00 R14: 0000000000000000 R15: ffff8881057130c0
> FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000010dbaa002 CR4: 00000000003706f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  __io_clean_op+0x10c/0x2a0
>  io_dismantle_req+0x3c7/0x600
>  __io_free_req+0x34/0x280
>  io_put_req+0x63/0xb0
>  io_worker_handle_work+0x60e/0x830
>  ? io_wqe_worker+0x135/0x520
>  io_wqe_worker+0x158/0x520
>  ? __kthread_parkme+0x96/0xc0
>  ? io_worker_handle_work+0x830/0x830
>  kthread+0x134/0x180
>  ? kthread_create_worker_on_cpu+0x90/0x90
>  ret_from_fork+0x1f/0x30
> Modules linked in:
> CR2: 0000000000000000
> ---[ end trace c358ca86af95b1e7 ]---
> 
> I suspect the following case triggers the panic above: two threads
> operate on different io_uring ctxs but share the same sqthread identity.
> When one thread exits, io_uring_cancel_task_requests() sets
> task->io_uring->identity->files to NULL in sqpoll mode, and the other
> ctx, which still uses the same identity, then panics.
> 
> In fact, we don't need to clear task->io_uring->identity->files here.
> io_grab_identity() should already handle changes to identity->files: if
> task->io_uring->identity->files is not equal to current->files,
> io_cow_identity() should handle the change.
> 
> Reported-by: Abaci Robot <[email protected]>
> Signed-off-by: Xiaoguang Wang <[email protected]>
> ---
>  fs/io_uring.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 38c6cbe1ab38..5d3348d66f06 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -8982,12 +8982,6 @@ static void io_uring_cancel_task_requests(struct io_ring_ctx *ctx,
>  
>  	if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) {
>  		atomic_dec(&task->io_uring->in_idle);
> -		/*
> -		 * If the files that are going away are the ones in the thread
> -		 * identity, clear them out.
> -		 */
> -		if (task->io_uring->identity->files == files)
> -			task->io_uring->identity->files = NULL;
>  		io_sq_thread_unpark(ctx->sq_data);
>  	}
>  }
> 
