* [PATCH] io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()
@ 2025-06-10 11:17 Penglei Jiang
2025-06-10 14:33 ` Jens Axboe
0 siblings, 1 reply; 3+ messages in thread
From: Penglei Jiang @ 2025-06-10 11:17 UTC (permalink / raw)
To: axboe; +Cc: io-uring, linux-kernel, syzbot+531502bbbe51d2f769f4,
Penglei Jiang
[BUG]
[ 84.375406] ==================================================================
[ 84.378543] BUG: KASAN: slab-use-after-free in getrusage+0x1109/0x1a60
[ 84.381058] Read of size 8 at addr ffff88810de2d2c8 by task a.out/304
[ 84.382977]
[ 84.383767] CPU: 0 UID: 0 PID: 304 Comm: a.out Not tainted 6.16.0-rc1 #1 PREEMPT(voluntary)
[ 84.383788] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 84.383819] Call Trace:
[ 84.383841] <TASK>
[ 84.383842] dump_stack_lvl+0x53/0x70
[ 84.383957] print_report+0xd0/0x670
[ 84.384008] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 84.384057] ? getrusage+0x1109/0x1a60
[ 84.384060] kasan_report+0xce/0x100
[ 84.384063] ? getrusage+0x1109/0x1a60
[ 84.384066] getrusage+0x1109/0x1a60
[ 84.384070] ? __pfx_getrusage+0x10/0x10
[ 84.384073] __io_uring_show_fdinfo+0x9fe/0x1790
[ 84.384117] ? ksys_read+0xf7/0x1c0
[ 84.384133] ? do_syscall_64+0xa4/0x260
[ 84.384144] ? vsnprintf+0x591/0x1100
[ 84.384155] ? __pfx___io_uring_show_fdinfo+0x10/0x10
[ 84.384157] ? __pfx_vsnprintf+0x10/0x10
[ 84.384175] ? mutex_trylock+0xcf/0x130
[ 84.384185] ? __pfx_mutex_trylock+0x10/0x10
[ 84.384198] ? __pfx_show_fd_locks+0x10/0x10
[ 84.384219] ? io_uring_show_fdinfo+0x57/0x80
[ 84.384222] io_uring_show_fdinfo+0x57/0x80
[ 84.384224] seq_show+0x38c/0x690
[ 84.384257] seq_read_iter+0x3f7/0x1180
[ 84.384279] ? inode_set_ctime_current+0x160/0x4b0
[ 84.384296] seq_read+0x271/0x3e0
[ 84.384298] ? __pfx_seq_read+0x10/0x10
[ 84.384300] ? __pfx__raw_spin_lock+0x10/0x10
[ 84.384303] ? __mark_inode_dirty+0x402/0x810
[ 84.384313] ? selinux_file_permission+0x368/0x500
[ 84.384385] ? file_update_time+0x10f/0x160
[ 84.384388] vfs_read+0x177/0xa40
[ 84.384393] ? __pfx___handle_mm_fault+0x10/0x10
[ 84.384440] ? __pfx_vfs_read+0x10/0x10
[ 84.384443] ? mutex_lock+0x81/0xe0
[ 84.384446] ? __pfx_mutex_lock+0x10/0x10
[ 84.384449] ? fdget_pos+0x24d/0x4b0
[ 84.384452] ksys_read+0xf7/0x1c0
[ 84.384455] ? __pfx_ksys_read+0x10/0x10
[ 84.384458] ? do_user_addr_fault+0x43b/0x9c0
[ 84.384486] do_syscall_64+0xa4/0x260
[ 84.384489] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 84.384528] RIP: 0033:0x7f0f74170fc9
[ 84.384560] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 8
[ 84.384563] RSP: 002b:00007fffece049e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
[ 84.384588] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0f74170fc9
[ 84.384611] RDX: 0000000000001000 RSI: 00007fffece049f0 RDI: 0000000000000004
[ 84.384613] RBP: 00007fffece05ad0 R08: 0000000000000000 R09: 00007fffece04d90
[ 84.384615] R10: 0000000000000000 R11: 0000000000000206 R12: 00005651720a1100
[ 84.384617] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 84.384619] </TASK>
[ 84.384620]
[ 84.461314] Allocated by task 298:
[ 84.462019] kasan_save_stack+0x33/0x60
[ 84.462598] kasan_save_track+0x14/0x30
[ 84.463213] __kasan_slab_alloc+0x6e/0x70
[ 84.463853] kmem_cache_alloc_node_noprof+0xe8/0x330
[ 84.465483] copy_process+0x376/0x5e00
[ 84.466798] create_io_thread+0xab/0xf0
[ 84.468355] io_sq_offload_create+0x9ed/0xf20
[ 84.470323] io_uring_setup+0x12b0/0x1cc0
[ 84.471830] do_syscall_64+0xa4/0x260
[ 84.473255] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 84.474952]
[ 84.475758] Freed by task 22:
[ 84.476878] kasan_save_stack+0x33/0x60
[ 84.478400] kasan_save_track+0x14/0x30
[ 84.479709] kasan_save_free_info+0x3b/0x60
[ 84.480777] __kasan_slab_free+0x37/0x50
[ 84.481299] kmem_cache_free+0xc4/0x360
[ 84.481945] rcu_core+0x5ff/0x19f0
[ 84.483038] handle_softirqs+0x18c/0x530
[ 84.485407] run_ksoftirqd+0x20/0x30
[ 84.487146] smpboot_thread_fn+0x287/0x6c0
[ 84.488552] kthread+0x30d/0x630
[ 84.489856] ret_from_fork+0xef/0x1a0
[ 84.491239] ret_from_fork_asm+0x1a/0x30
[ 84.492366]
[ 84.493099] Last potentially related work creation:
[ 84.495175] kasan_save_stack+0x33/0x60
[ 84.496416] kasan_record_aux_stack+0x8c/0xa0
[ 84.498060] __call_rcu_common.constprop.0+0x68/0x940
[ 84.499326] __schedule+0xff2/0x2930
[ 84.499970] __cond_resched+0x4c/0x80
[ 84.501305] mutex_lock+0x5c/0xe0
[ 84.502374] io_uring_del_tctx_node+0xe1/0x2b0
[ 84.504251] io_uring_clean_tctx+0xb7/0x160
[ 84.505377] io_uring_cancel_generic+0x34e/0x760
[ 84.507126] do_exit+0x240/0x2350
[ 84.508667] do_group_exit+0xab/0x220
[ 84.509927] __x64_sys_exit_group+0x39/0x40
[ 84.511223] x64_sys_call+0x1243/0x1840
[ 84.512795] do_syscall_64+0xa4/0x260
[ 84.514044] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 84.515792]
[ 84.516374] The buggy address belongs to the object at ffff88810de2cb00
[ 84.516374] which belongs to the cache task_struct of size 3712
[ 84.521449] The buggy address is located 1992 bytes inside of
[ 84.521449] freed 3712-byte region [ffff88810de2cb00, ffff88810de2d980)
[ 84.524996]
[ 84.525514] The buggy address belongs to the physical page:
[ 84.527383] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10de28
[ 84.530907] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 84.533573] flags: 0x200000000000040(head|node=0|zone=2)
[ 84.535366] page_type: f5(slab)
[ 84.536410] raw: 0200000000000040 ffff8881001acdc0 dead000000000122 0000000000000000
[ 84.538807] raw: 0000000000000000 0000000000080008 00000000f5000000 0000000000000000
[ 84.542033] head: 0200000000000040 ffff8881001acdc0 dead000000000122 0000000000000000
[ 84.545121] head: 0000000000000000 0000000000080008 00000000f5000000 0000000000000000
[ 84.547663] head: 0200000000000003 ffffea0004378a01 00000000ffffffff 00000000ffffffff
[ 84.549959] head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 84.552180] page dumped because: kasan: bad access detected
[ 84.554098]
[ 84.554798] Memory state around the buggy address:
[ 84.556798] ffff88810de2d180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 84.559258] ffff88810de2d200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 84.562112] >ffff88810de2d280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 84.564710] ^
[ 84.567076] ffff88810de2d300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 84.569900] ffff88810de2d380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 84.573018] ==================================================================
[CAUSE]
The task_struct pointed to by sq->thread is released while it is being
used in the function __io_uring_show_fdinfo(), because holding the lock
ctx->uring_lock does not prevent the release of sq->thread.
[FIX]
The fix is to increase the reference count of the task_struct pointed
to by sq->thread and to use RCU to ensure that it is not released while
being used.
Reported-by: syzbot+531502bbbe51d2f769f4@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/682b06a5.a70a0220.3849cf.00b3.GAE@google.com
Fixes: a0d45c3f596b ("io_uring/fdinfo: remove need for sqpoll lock for thread/pid retrieval")
Signed-off-by: Penglei Jiang <superman.xpt@gmail.com>
---
io_uring/fdinfo.c | 8 ++++++--
io_uring/sqpoll.c | 9 ++++-----
2 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
index e9355276ab5d..2911352bbae1 100644
--- a/io_uring/fdinfo.c
+++ b/io_uring/fdinfo.c
@@ -141,19 +141,23 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m)
if (ctx->flags & IORING_SETUP_SQPOLL) {
struct io_sq_data *sq = ctx->sq_data;
+ struct task_struct *tsk;
+ rcu_read_lock();
+ tsk = rcu_dereference(sq->thread);
/*
* sq->thread might be NULL if we raced with the sqpoll
* thread termination.
*/
- if (sq->thread) {
+ if (tsk) {
sq_pid = sq->task_pid;
sq_cpu = sq->sq_cpu;
- getrusage(sq->thread, RUSAGE_SELF, &sq_usage);
+ getrusage(tsk, RUSAGE_SELF, &sq_usage);
sq_total_time = (sq_usage.ru_stime.tv_sec * 1000000
+ sq_usage.ru_stime.tv_usec);
sq_work_time = sq->work_time;
}
+ rcu_read_unlock();
}
seq_printf(m, "SqThread:\t%d\n", sq_pid);
diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
index 03c699493b5a..0625a421626f 100644
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -270,7 +270,8 @@ static int io_sq_thread(void *data)
/* offload context creation failed, just exit */
if (!current->io_uring) {
mutex_lock(&sqd->lock);
- sqd->thread = NULL;
+ rcu_assign_pointer(sqd->thread, NULL);
+ put_task_struct(current);
mutex_unlock(&sqd->lock);
goto err_out;
}
@@ -379,7 +380,8 @@ static int io_sq_thread(void *data)
io_sq_tw(&retry_list, UINT_MAX);
io_uring_cancel_generic(true, sqd);
- sqd->thread = NULL;
+ rcu_assign_pointer(sqd->thread, NULL);
+ put_task_struct(current);
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
atomic_or(IORING_SQ_NEED_WAKEUP, &ctx->rings->sq_flags);
io_run_task_work();
@@ -495,9 +497,6 @@ __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
ret = -EINVAL;
goto err;
}
-
- if (task_to_put)
- put_task_struct(task_to_put);
return 0;
err_sqpoll:
complete(&ctx->sq_data->exited);
--
2.17.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()
2025-06-10 11:17 [PATCH] io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo() Penglei Jiang
@ 2025-06-10 14:33 ` Jens Axboe
2025-06-10 14:50 ` Jens Axboe
0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2025-06-10 14:33 UTC (permalink / raw)
To: Penglei Jiang; +Cc: io-uring, linux-kernel, syzbot+531502bbbe51d2f769f4
On 6/10/25 5:17 AM, Penglei Jiang wrote:
> diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
> index e9355276ab5d..2911352bbae1 100644
> --- a/io_uring/fdinfo.c
> +++ b/io_uring/fdinfo.c
> @@ -141,19 +141,23 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m)
>
> if (ctx->flags & IORING_SETUP_SQPOLL) {
> struct io_sq_data *sq = ctx->sq_data;
> + struct task_struct *tsk;
>
> + rcu_read_lock();
> + tsk = rcu_dereference(sq->thread);
> /*
> * sq->thread might be NULL if we raced with the sqpoll
> * thread termination.
> */
> - if (sq->thread) {
> + if (tsk) {
> sq_pid = sq->task_pid;
> sq_cpu = sq->sq_cpu;
> - getrusage(sq->thread, RUSAGE_SELF, &sq_usage);
> + getrusage(tsk, RUSAGE_SELF, &sq_usage);
> sq_total_time = (sq_usage.ru_stime.tv_sec * 1000000
> + sq_usage.ru_stime.tv_usec);
> sq_work_time = sq->work_time;
> }
> + rcu_read_unlock();
> }
Don't think this will work, if we're racing with the mmput and then end
up doing that inside an RCU read locked region...
> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
> index 03c699493b5a..0625a421626f 100644
> --- a/io_uring/sqpoll.c
> +++ b/io_uring/sqpoll.c
> @@ -270,7 +270,8 @@ static int io_sq_thread(void *data)
> /* offload context creation failed, just exit */
> if (!current->io_uring) {
> mutex_lock(&sqd->lock);
> - sqd->thread = NULL;
> + rcu_assign_pointer(sqd->thread, NULL);
> + put_task_struct(current);
> mutex_unlock(&sqd->lock);
> goto err_out;
> }
You do this in both spots, why the put_task_struct(current)? That seems
like that would be very wrong and instantly break. Did you run this
patch?
I do agree that there's an issue here though, let me take a closer look
at it. This would have been introduced with the getrusage change though,
no? This one:
ommit 3fcb9d17206e31630f802a3ab52081d1342b8ed9
Author: Xiaobing Li <xiaobing.li@samsung.com>
Date: Wed Feb 28 17:12:51 2024 +0800
io_uring/sqpoll: statistics of the true utilization of sq threads
as that is the one that added prodding at sq->thread from fdinfo.
--
Jens Axboe
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()
2025-06-10 14:33 ` Jens Axboe
@ 2025-06-10 14:50 ` Jens Axboe
0 siblings, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2025-06-10 14:50 UTC (permalink / raw)
To: Penglei Jiang; +Cc: io-uring, linux-kernel, syzbot+531502bbbe51d2f769f4
On 6/10/25 8:33 AM, Jens Axboe wrote:
> On 6/10/25 5:17 AM, Penglei Jiang wrote:
>> diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
>> index e9355276ab5d..2911352bbae1 100644
>> --- a/io_uring/fdinfo.c
>> +++ b/io_uring/fdinfo.c
>> @@ -141,19 +141,23 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m)
>>
>> if (ctx->flags & IORING_SETUP_SQPOLL) {
>> struct io_sq_data *sq = ctx->sq_data;
>> + struct task_struct *tsk;
>>
>> + rcu_read_lock();
>> + tsk = rcu_dereference(sq->thread);
>> /*
>> * sq->thread might be NULL if we raced with the sqpoll
>> * thread termination.
>> */
>> - if (sq->thread) {
>> + if (tsk) {
>> sq_pid = sq->task_pid;
>> sq_cpu = sq->sq_cpu;
>> - getrusage(sq->thread, RUSAGE_SELF, &sq_usage);
>> + getrusage(tsk, RUSAGE_SELF, &sq_usage);
>> sq_total_time = (sq_usage.ru_stime.tv_sec * 1000000
>> + sq_usage.ru_stime.tv_usec);
>> sq_work_time = sq->work_time;
>> }
>> + rcu_read_unlock();
>> }
>
> Don't think this will work, if we're racing with the mmput and then end
> up doing that inside an RCU read locked region...
I think it needs to look more like:
rcu_read_lock();
tsk = rcu_dereference(sq->thread);
if (tsk) {
get_task_struct(tsk);
rcu_read_unlock();
[...] /* getrusage stuff */
put_task_struct(tsk);
} else {
rcu_read_unlock();
}
and then it would be sane and avoid calling getrusage() with the rcu
read lock held.
If you agree, can you send a v2 with those changes?
>> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
>> index 03c699493b5a..0625a421626f 100644
>> --- a/io_uring/sqpoll.c
>> +++ b/io_uring/sqpoll.c
>> @@ -270,7 +270,8 @@ static int io_sq_thread(void *data)
>> /* offload context creation failed, just exit */
>> if (!current->io_uring) {
>> mutex_lock(&sqd->lock);
>> - sqd->thread = NULL;
>> + rcu_assign_pointer(sqd->thread, NULL);
>> + put_task_struct(current);
>> mutex_unlock(&sqd->lock);
>> goto err_out;
>> }
>
> You do this in both spots, why the put_task_struct(current)? That seems
> like that would be very wrong and instantly break. Did you run this
> patch?
For some reason I read this as the io_uring.c hunk - the puts here look
fine, ignore the noise on that part...
> ommit 3fcb9d17206e31630f802a3ab52081d1342b8ed9
> Author: Xiaobing Li <xiaobing.li@samsung.com>
> Date: Wed Feb 28 17:12:51 2024 +0800
>
> io_uring/sqpoll: statistics of the true utilization of sq threads
>
> as that is the one that added prodding at sq->thread from fdinfo.
This should still be the Fixes target though, imho.
--
Jens Axboe
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-06-10 14:50 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-10 11:17 [PATCH] io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo() Penglei Jiang
2025-06-10 14:33 ` Jens Axboe
2025-06-10 14:50 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox