public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH] io_uring: fix error clear of ->file_table in io_sqe_files_register()
@ 2019-11-10 15:46 Jens Axboe
  2019-11-10 23:44 ` Bob Liu
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2019-11-10 15:46 UTC (permalink / raw)
  To: io-uring

syzbot reports that when using failslab and friends, we can get a double
free in io_sqe_files_unregister():

BUG: KASAN: double-free or invalid-free in
io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185

CPU: 1 PID: 8819 Comm: syz-executor452 Not tainted 5.4.0-rc6-next-20191108
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x197/0x210 lib/dump_stack.c:118
  print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
  kasan_report_invalid_free+0x65/0xa0 mm/kasan/report.c:468
  __kasan_slab_free+0x13a/0x150 mm/kasan/common.c:450
  kasan_slab_free+0xe/0x10 mm/kasan/common.c:480
  __cache_free mm/slab.c:3426 [inline]
  kfree+0x10a/0x2c0 mm/slab.c:3757
  io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
  io_ring_ctx_free fs/io_uring.c:3998 [inline]
  io_ring_ctx_wait_and_kill+0x348/0x700 fs/io_uring.c:4060
  io_uring_release+0x42/0x50 fs/io_uring.c:4068
  __fput+0x2ff/0x890 fs/file_table.c:280
  ____fput+0x16/0x20 fs/file_table.c:313
  task_work_run+0x145/0x1c0 kernel/task_work.c:113
  exit_task_work include/linux/task_work.h:22 [inline]
  do_exit+0x904/0x2e60 kernel/exit.c:817
  do_group_exit+0x135/0x360 kernel/exit.c:921
  __do_sys_exit_group kernel/exit.c:932 [inline]
  __se_sys_exit_group kernel/exit.c:930 [inline]
  __x64_sys_exit_group+0x44/0x50 kernel/exit.c:930
  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x43f2c8
Code: 31 b8 c5 f7 ff ff 48 8b 5c 24 28 48 8b 6c 24 30 4c 8b 64 24 38 4c 8b
6c 24 40 4c 8b 74 24 48 4c 8b 7c 24 50 48 83 c4 58 c3 66 <0f> 1f 84 00 00
00 00 00 48 8d 35 59 ca 00 00 0f b6 d2 48 89 fb 48
RSP: 002b:00007ffd5b976008 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043f2c8
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004bf0a8 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000

This happens if we fail allocating the file tables. For that case we do
free the file table correctly, but we forget to set it to NULL. This
means that ring teardown will see it as being non-NULL, and attempt to
free it again.

Fix this by clearing the file_table pointer if we free the table.

Reported-by: [email protected]
Fixes: 65e19f54d29c ("io_uring: support for larger fixed file sets")
Signed-off-by: Jens Axboe <[email protected]>

---

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7fc3a72e1e1e..5c10b34ceb24 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3488,6 +3488,7 @@ static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
 
 	if (io_sqe_alloc_file_tables(ctx, nr_tables, nr_args)) {
 		kfree(ctx->file_table);
+		ctx->file_table = NULL;
 		return -ENOMEM;
 	}
 
-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] io_uring: fix error clear of ->file_table in io_sqe_files_register()
  2019-11-10 15:46 [PATCH] io_uring: fix error clear of ->file_table in io_sqe_files_register() Jens Axboe
@ 2019-11-10 23:44 ` Bob Liu
  2019-11-11  3:54   ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Bob Liu @ 2019-11-10 23:44 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 11/10/19 11:46 PM, Jens Axboe wrote:
> syzbot reports that when using failslab and friends, we can get a double
> free in io_sqe_files_unregister():
> 
> BUG: KASAN: double-free or invalid-free in
> io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
> 
> CPU: 1 PID: 8819 Comm: syz-executor452 Not tainted 5.4.0-rc6-next-20191108
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0x197/0x210 lib/dump_stack.c:118
>   print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
>   kasan_report_invalid_free+0x65/0xa0 mm/kasan/report.c:468
>   __kasan_slab_free+0x13a/0x150 mm/kasan/common.c:450
>   kasan_slab_free+0xe/0x10 mm/kasan/common.c:480
>   __cache_free mm/slab.c:3426 [inline]
>   kfree+0x10a/0x2c0 mm/slab.c:3757
>   io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
>   io_ring_ctx_free fs/io_uring.c:3998 [inline]
>   io_ring_ctx_wait_and_kill+0x348/0x700 fs/io_uring.c:4060
>   io_uring_release+0x42/0x50 fs/io_uring.c:4068
>   __fput+0x2ff/0x890 fs/file_table.c:280
>   ____fput+0x16/0x20 fs/file_table.c:313
>   task_work_run+0x145/0x1c0 kernel/task_work.c:113
>   exit_task_work include/linux/task_work.h:22 [inline]
>   do_exit+0x904/0x2e60 kernel/exit.c:817
>   do_group_exit+0x135/0x360 kernel/exit.c:921
>   __do_sys_exit_group kernel/exit.c:932 [inline]
>   __se_sys_exit_group kernel/exit.c:930 [inline]
>   __x64_sys_exit_group+0x44/0x50 kernel/exit.c:930
>   do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x43f2c8
> Code: 31 b8 c5 f7 ff ff 48 8b 5c 24 28 48 8b 6c 24 30 4c 8b 64 24 38 4c 8b
> 6c 24 40 4c 8b 74 24 48 4c 8b 7c 24 50 48 83 c4 58 c3 66 <0f> 1f 84 00 00
> 00 00 00 48 8d 35 59 ca 00 00 0f b6 d2 48 89 fb 48
> RSP: 002b:00007ffd5b976008 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043f2c8
> RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> RBP: 00000000004bf0a8 R08: 00000000000000e7 R09: ffffffffffffffd0
> R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
> R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
> 
> This happens if we fail allocating the file tables. For that case we do
> free the file table correctly, but we forget to set it to NULL. This
> means that ring teardown will see it as being non-NULL, and attempt to
> free it again.
> 
> Fix this by clearing the file_table pointer if we free the table.
> 
> Reported-by: [email protected]
> Fixes: 65e19f54d29c ("io_uring: support for larger fixed file sets")
> Signed-off-by: Jens Axboe <[email protected]>
> 

Reviewed-by: Bob Liu <[email protected]>

By the way, there are many place(besides io_uring.c) which need to set pointer to NULL after free.
I saw similar fix from time to time.

Do you think a safe_free() is worth? e.g
#define SAFE_FREE(p) { if (p) { free(p); (p)=NULL; } }

> ---
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 7fc3a72e1e1e..5c10b34ceb24 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3488,6 +3488,7 @@ static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
>  
>  	if (io_sqe_alloc_file_tables(ctx, nr_tables, nr_args)) {
>  		kfree(ctx->file_table);
> +		ctx->file_table = NULL;
>  		return -ENOMEM;
>  	}
>  
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] io_uring: fix error clear of ->file_table in io_sqe_files_register()
  2019-11-10 23:44 ` Bob Liu
@ 2019-11-11  3:54   ` Jens Axboe
  2019-11-11  4:02     ` Jackie Liu
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2019-11-11  3:54 UTC (permalink / raw)
  To: Bob Liu, io-uring

On 11/10/19 4:44 PM, Bob Liu wrote:
> On 11/10/19 11:46 PM, Jens Axboe wrote:
>> syzbot reports that when using failslab and friends, we can get a double
>> free in io_sqe_files_unregister():
>>
>> BUG: KASAN: double-free or invalid-free in
>> io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
>>
>> CPU: 1 PID: 8819 Comm: syz-executor452 Not tainted 5.4.0-rc6-next-20191108
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>>    __dump_stack lib/dump_stack.c:77 [inline]
>>    dump_stack+0x197/0x210 lib/dump_stack.c:118
>>    print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
>>    kasan_report_invalid_free+0x65/0xa0 mm/kasan/report.c:468
>>    __kasan_slab_free+0x13a/0x150 mm/kasan/common.c:450
>>    kasan_slab_free+0xe/0x10 mm/kasan/common.c:480
>>    __cache_free mm/slab.c:3426 [inline]
>>    kfree+0x10a/0x2c0 mm/slab.c:3757
>>    io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
>>    io_ring_ctx_free fs/io_uring.c:3998 [inline]
>>    io_ring_ctx_wait_and_kill+0x348/0x700 fs/io_uring.c:4060
>>    io_uring_release+0x42/0x50 fs/io_uring.c:4068
>>    __fput+0x2ff/0x890 fs/file_table.c:280
>>    ____fput+0x16/0x20 fs/file_table.c:313
>>    task_work_run+0x145/0x1c0 kernel/task_work.c:113
>>    exit_task_work include/linux/task_work.h:22 [inline]
>>    do_exit+0x904/0x2e60 kernel/exit.c:817
>>    do_group_exit+0x135/0x360 kernel/exit.c:921
>>    __do_sys_exit_group kernel/exit.c:932 [inline]
>>    __se_sys_exit_group kernel/exit.c:930 [inline]
>>    __x64_sys_exit_group+0x44/0x50 kernel/exit.c:930
>>    do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>>    entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x43f2c8
>> Code: 31 b8 c5 f7 ff ff 48 8b 5c 24 28 48 8b 6c 24 30 4c 8b 64 24 38 4c 8b
>> 6c 24 40 4c 8b 74 24 48 4c 8b 7c 24 50 48 83 c4 58 c3 66 <0f> 1f 84 00 00
>> 00 00 00 48 8d 35 59 ca 00 00 0f b6 d2 48 89 fb 48
>> RSP: 002b:00007ffd5b976008 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043f2c8
>> RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
>> RBP: 00000000004bf0a8 R08: 00000000000000e7 R09: ffffffffffffffd0
>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
>> R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
>>
>> This happens if we fail allocating the file tables. For that case we do
>> free the file table correctly, but we forget to set it to NULL. This
>> means that ring teardown will see it as being non-NULL, and attempt to
>> free it again.
>>
>> Fix this by clearing the file_table pointer if we free the table.
>>
>> Reported-by: [email protected]
>> Fixes: 65e19f54d29c ("io_uring: support for larger fixed file sets")
>> Signed-off-by: Jens Axboe <[email protected]>
>>
> 
> Reviewed-by: Bob Liu <[email protected]>

Thanks, added.

> By the way, there are many place(besides io_uring.c) which need to set
> pointer to NULL after free.  I saw similar fix from time to time.
> 
> Do you think a safe_free() is worth? e.g
> #define SAFE_FREE(p) { if (p) { free(p); (p)=NULL; } }

Hmm not sure, and would probably be better as:

kfree_safe(&ptr);

or something instead. I seem to recall discussions about that ages ago,
probably worth while to try and search and see if you can find those. I
suspect Linus hates it, reasons not remembered ;-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] io_uring: fix error clear of ->file_table in io_sqe_files_register()
  2019-11-11  3:54   ` Jens Axboe
@ 2019-11-11  4:02     ` Jackie Liu
  2019-11-11  4:09       ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Jackie Liu @ 2019-11-11  4:02 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bob Liu, io-uring



> 2019年11月11日 11:54,Jens Axboe <[email protected]> 写道:
> 
> On 11/10/19 4:44 PM, Bob Liu wrote:
>> On 11/10/19 11:46 PM, Jens Axboe wrote:
>>> syzbot reports that when using failslab and friends, we can get a double
>>> free in io_sqe_files_unregister():
>>> 
>>> BUG: KASAN: double-free or invalid-free in
>>> io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
>>> 
>>> CPU: 1 PID: 8819 Comm: syz-executor452 Not tainted 5.4.0-rc6-next-20191108
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>> Google 01/01/2011
>>> Call Trace:
>>>   __dump_stack lib/dump_stack.c:77 [inline]
>>>   dump_stack+0x197/0x210 lib/dump_stack.c:118
>>>   print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
>>>   kasan_report_invalid_free+0x65/0xa0 mm/kasan/report.c:468
>>>   __kasan_slab_free+0x13a/0x150 mm/kasan/common.c:450
>>>   kasan_slab_free+0xe/0x10 mm/kasan/common.c:480
>>>   __cache_free mm/slab.c:3426 [inline]
>>>   kfree+0x10a/0x2c0 mm/slab.c:3757
>>>   io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
>>>   io_ring_ctx_free fs/io_uring.c:3998 [inline]
>>>   io_ring_ctx_wait_and_kill+0x348/0x700 fs/io_uring.c:4060
>>>   io_uring_release+0x42/0x50 fs/io_uring.c:4068
>>>   __fput+0x2ff/0x890 fs/file_table.c:280
>>>   ____fput+0x16/0x20 fs/file_table.c:313
>>>   task_work_run+0x145/0x1c0 kernel/task_work.c:113
>>>   exit_task_work include/linux/task_work.h:22 [inline]
>>>   do_exit+0x904/0x2e60 kernel/exit.c:817
>>>   do_group_exit+0x135/0x360 kernel/exit.c:921
>>>   __do_sys_exit_group kernel/exit.c:932 [inline]
>>>   __se_sys_exit_group kernel/exit.c:930 [inline]
>>>   __x64_sys_exit_group+0x44/0x50 kernel/exit.c:930
>>>   do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>>>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> RIP: 0033:0x43f2c8
>>> Code: 31 b8 c5 f7 ff ff 48 8b 5c 24 28 48 8b 6c 24 30 4c 8b 64 24 38 4c 8b
>>> 6c 24 40 4c 8b 74 24 48 4c 8b 7c 24 50 48 83 c4 58 c3 66 <0f> 1f 84 00 00
>>> 00 00 00 48 8d 35 59 ca 00 00 0f b6 d2 48 89 fb 48
>>> RSP: 002b:00007ffd5b976008 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043f2c8
>>> RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
>>> RBP: 00000000004bf0a8 R08: 00000000000000e7 R09: ffffffffffffffd0
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
>>> R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
>>> 
>>> This happens if we fail allocating the file tables. For that case we do
>>> free the file table correctly, but we forget to set it to NULL. This
>>> means that ring teardown will see it as being non-NULL, and attempt to
>>> free it again.
>>> 
>>> Fix this by clearing the file_table pointer if we free the table.
>>> 
>>> Reported-by: [email protected]
>>> Fixes: 65e19f54d29c ("io_uring: support for larger fixed file sets")
>>> Signed-off-by: Jens Axboe <[email protected]>
>>> 
>> 
>> Reviewed-by: Bob Liu <[email protected]>
> 
> Thanks, added.
> 
>> By the way, there are many place(besides io_uring.c) which need to set
>> pointer to NULL after free.  I saw similar fix from time to time.
>> 
>> Do you think a safe_free() is worth? e.g
>> #define SAFE_FREE(p) { if (p) { free(p); (p)=NULL; } }
> 
> Hmm not sure, and would probably be better as:
> 
> kfree_safe(&ptr);
> 
> or something instead. I seem to recall discussions about that ages ago,
> probably worth while to try and search and see if you can find those. I
> suspect Linus hates it, reasons not remembered ;-)
> 

I think this may be a worthwhile solution, but kfree can handle NULL, we can
set it to NULL directly after free is finished.

void kfree_safe(const void *ptr)
{
	kfree(ptr);
	ptr = NULL;
}

--
BR, Jackie Liu




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] io_uring: fix error clear of ->file_table in io_sqe_files_register()
  2019-11-11  4:02     ` Jackie Liu
@ 2019-11-11  4:09       ` Jens Axboe
  2019-11-11  4:19         ` Jackie Liu
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2019-11-11  4:09 UTC (permalink / raw)
  To: Jackie Liu; +Cc: Bob Liu, io-uring

On 11/10/19 9:02 PM, Jackie Liu wrote:
> 
> 
>> 2019年11月11日 11:54,Jens Axboe <[email protected]> 写道:
>>
>> On 11/10/19 4:44 PM, Bob Liu wrote:
>>> On 11/10/19 11:46 PM, Jens Axboe wrote:
>>>> syzbot reports that when using failslab and friends, we can get a double
>>>> free in io_sqe_files_unregister():
>>>>
>>>> BUG: KASAN: double-free or invalid-free in
>>>> io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
>>>>
>>>> CPU: 1 PID: 8819 Comm: syz-executor452 Not tainted 5.4.0-rc6-next-20191108
>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>>> Google 01/01/2011
>>>> Call Trace:
>>>>    __dump_stack lib/dump_stack.c:77 [inline]
>>>>    dump_stack+0x197/0x210 lib/dump_stack.c:118
>>>>    print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
>>>>    kasan_report_invalid_free+0x65/0xa0 mm/kasan/report.c:468
>>>>    __kasan_slab_free+0x13a/0x150 mm/kasan/common.c:450
>>>>    kasan_slab_free+0xe/0x10 mm/kasan/common.c:480
>>>>    __cache_free mm/slab.c:3426 [inline]
>>>>    kfree+0x10a/0x2c0 mm/slab.c:3757
>>>>    io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
>>>>    io_ring_ctx_free fs/io_uring.c:3998 [inline]
>>>>    io_ring_ctx_wait_and_kill+0x348/0x700 fs/io_uring.c:4060
>>>>    io_uring_release+0x42/0x50 fs/io_uring.c:4068
>>>>    __fput+0x2ff/0x890 fs/file_table.c:280
>>>>    ____fput+0x16/0x20 fs/file_table.c:313
>>>>    task_work_run+0x145/0x1c0 kernel/task_work.c:113
>>>>    exit_task_work include/linux/task_work.h:22 [inline]
>>>>    do_exit+0x904/0x2e60 kernel/exit.c:817
>>>>    do_group_exit+0x135/0x360 kernel/exit.c:921
>>>>    __do_sys_exit_group kernel/exit.c:932 [inline]
>>>>    __se_sys_exit_group kernel/exit.c:930 [inline]
>>>>    __x64_sys_exit_group+0x44/0x50 kernel/exit.c:930
>>>>    do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>>>>    entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>> RIP: 0033:0x43f2c8
>>>> Code: 31 b8 c5 f7 ff ff 48 8b 5c 24 28 48 8b 6c 24 30 4c 8b 64 24 38 4c 8b
>>>> 6c 24 40 4c 8b 74 24 48 4c 8b 7c 24 50 48 83 c4 58 c3 66 <0f> 1f 84 00 00
>>>> 00 00 00 48 8d 35 59 ca 00 00 0f b6 d2 48 89 fb 48
>>>> RSP: 002b:00007ffd5b976008 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
>>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043f2c8
>>>> RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
>>>> RBP: 00000000004bf0a8 R08: 00000000000000e7 R09: ffffffffffffffd0
>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
>>>> R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
>>>>
>>>> This happens if we fail allocating the file tables. For that case we do
>>>> free the file table correctly, but we forget to set it to NULL. This
>>>> means that ring teardown will see it as being non-NULL, and attempt to
>>>> free it again.
>>>>
>>>> Fix this by clearing the file_table pointer if we free the table.
>>>>
>>>> Reported-by: [email protected]
>>>> Fixes: 65e19f54d29c ("io_uring: support for larger fixed file sets")
>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>>
>>>
>>> Reviewed-by: Bob Liu <[email protected]>
>>
>> Thanks, added.
>>
>>> By the way, there are many place(besides io_uring.c) which need to set
>>> pointer to NULL after free.  I saw similar fix from time to time.
>>>
>>> Do you think a safe_free() is worth? e.g
>>> #define SAFE_FREE(p) { if (p) { free(p); (p)=NULL; } }
>>
>> Hmm not sure, and would probably be better as:
>>
>> kfree_safe(&ptr);
>>
>> or something instead. I seem to recall discussions about that ages ago,
>> probably worth while to try and search and see if you can find those. I
>> suspect Linus hates it, reasons not remembered ;-)
>>
> 
> I think this may be a worthwhile solution, but kfree can handle NULL, we can
> set it to NULL directly after free is finished.
> 
> void kfree_safe(const void *ptr)
> {
> 	kfree(ptr);
> 	ptr = NULL;
> }

Sure, but that doesn't change the ptr in the caller, which was my point.
You need to pass in a pointer to the pointer for that, otherwise
clearing it in kfree_safe() is pointless:

void kfree_safe(const void **ptr)
{
	kfree(*ptr);
	*ptr = NULL;
}

and then you run into all sorts of fun, since void ** isn't the same as
'struct foo **'.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] io_uring: fix error clear of ->file_table in io_sqe_files_register()
  2019-11-11  4:09       ` Jens Axboe
@ 2019-11-11  4:19         ` Jackie Liu
  0 siblings, 0 replies; 6+ messages in thread
From: Jackie Liu @ 2019-11-11  4:19 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bob Liu, io-uring



> 2019年11月11日 12:09,Jens Axboe <[email protected]> 写道:
> 
> On 11/10/19 9:02 PM, Jackie Liu wrote:
>> 
>> 
>>> 2019年11月11日 11:54,Jens Axboe <[email protected]> 写道:
>>> 
>>> On 11/10/19 4:44 PM, Bob Liu wrote:
>>>> On 11/10/19 11:46 PM, Jens Axboe wrote:
>>>>> syzbot reports that when using failslab and friends, we can get a double
>>>>> free in io_sqe_files_unregister():
>>>>> 
>>>>> BUG: KASAN: double-free or invalid-free in
>>>>> io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
>>>>> 
>>>>> CPU: 1 PID: 8819 Comm: syz-executor452 Not tainted 5.4.0-rc6-next-20191108
>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>>>> Google 01/01/2011
>>>>> Call Trace:
>>>>>   __dump_stack lib/dump_stack.c:77 [inline]
>>>>>   dump_stack+0x197/0x210 lib/dump_stack.c:118
>>>>>   print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
>>>>>   kasan_report_invalid_free+0x65/0xa0 mm/kasan/report.c:468
>>>>>   __kasan_slab_free+0x13a/0x150 mm/kasan/common.c:450
>>>>>   kasan_slab_free+0xe/0x10 mm/kasan/common.c:480
>>>>>   __cache_free mm/slab.c:3426 [inline]
>>>>>   kfree+0x10a/0x2c0 mm/slab.c:3757
>>>>>   io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
>>>>>   io_ring_ctx_free fs/io_uring.c:3998 [inline]
>>>>>   io_ring_ctx_wait_and_kill+0x348/0x700 fs/io_uring.c:4060
>>>>>   io_uring_release+0x42/0x50 fs/io_uring.c:4068
>>>>>   __fput+0x2ff/0x890 fs/file_table.c:280
>>>>>   ____fput+0x16/0x20 fs/file_table.c:313
>>>>>   task_work_run+0x145/0x1c0 kernel/task_work.c:113
>>>>>   exit_task_work include/linux/task_work.h:22 [inline]
>>>>>   do_exit+0x904/0x2e60 kernel/exit.c:817
>>>>>   do_group_exit+0x135/0x360 kernel/exit.c:921
>>>>>   __do_sys_exit_group kernel/exit.c:932 [inline]
>>>>>   __se_sys_exit_group kernel/exit.c:930 [inline]
>>>>>   __x64_sys_exit_group+0x44/0x50 kernel/exit.c:930
>>>>>   do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>>>>>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>>> RIP: 0033:0x43f2c8
>>>>> Code: 31 b8 c5 f7 ff ff 48 8b 5c 24 28 48 8b 6c 24 30 4c 8b 64 24 38 4c 8b
>>>>> 6c 24 40 4c 8b 74 24 48 4c 8b 7c 24 50 48 83 c4 58 c3 66 <0f> 1f 84 00 00
>>>>> 00 00 00 48 8d 35 59 ca 00 00 0f b6 d2 48 89 fb 48
>>>>> RSP: 002b:00007ffd5b976008 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
>>>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043f2c8
>>>>> RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
>>>>> RBP: 00000000004bf0a8 R08: 00000000000000e7 R09: ffffffffffffffd0
>>>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
>>>>> R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
>>>>> 
>>>>> This happens if we fail allocating the file tables. For that case we do
>>>>> free the file table correctly, but we forget to set it to NULL. This
>>>>> means that ring teardown will see it as being non-NULL, and attempt to
>>>>> free it again.
>>>>> 
>>>>> Fix this by clearing the file_table pointer if we free the table.
>>>>> 
>>>>> Reported-by: [email protected]
>>>>> Fixes: 65e19f54d29c ("io_uring: support for larger fixed file sets")
>>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>>> 
>>>> 
>>>> Reviewed-by: Bob Liu <[email protected]>
>>> 
>>> Thanks, added.
>>> 
>>>> By the way, there are many place(besides io_uring.c) which need to set
>>>> pointer to NULL after free.  I saw similar fix from time to time.
>>>> 
>>>> Do you think a safe_free() is worth? e.g
>>>> #define SAFE_FREE(p) { if (p) { free(p); (p)=NULL; } }
>>> 
>>> Hmm not sure, and would probably be better as:
>>> 
>>> kfree_safe(&ptr);
>>> 
>>> or something instead. I seem to recall discussions about that ages ago,
>>> probably worth while to try and search and see if you can find those. I
>>> suspect Linus hates it, reasons not remembered ;-)
>>> 
>> 
>> I think this may be a worthwhile solution, but kfree can handle NULL, we can
>> set it to NULL directly after free is finished.
>> 
>> void kfree_safe(const void *ptr)
>> {
>> 	kfree(ptr);
>> 	ptr = NULL;
>> }
> 
> Sure, but that doesn't change the ptr in the caller, which was my point.
> You need to pass in a pointer to the pointer for that, otherwise
> clearing it in kfree_safe() is pointless:
> 
> void kfree_safe(const void **ptr)
> {
> 	kfree(*ptr);
> 	*ptr = NULL;
> }

Yes, you are right. If it is set to NULL directly, if it is wrong, it can be
panic immediately, which is convenient for debugging.

> 
> and then you run into all sorts of fun, since void ** isn't the same as
> 'struct foo **'.
> 


--
BR, Jackie Liu




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-11-11  4:19 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-11-10 15:46 [PATCH] io_uring: fix error clear of ->file_table in io_sqe_files_register() Jens Axboe
2019-11-10 23:44 ` Bob Liu
2019-11-11  3:54   ` Jens Axboe
2019-11-11  4:02     ` Jackie Liu
2019-11-11  4:09       ` Jens Axboe
2019-11-11  4:19         ` Jackie Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox