public inbox for io-uring@vger.kernel.org
From: Jens Axboe <axboe@kernel.dk>
To: rtm@csail.mit.edu, Pavel Begunkov <asml.silence@gmail.com>,
	io-uring@vger.kernel.org
Subject: Re: use-after-free if killed while in IORING_OP_FUTEX_WAIT
Date: Wed, 4 Jun 2025 10:22:58 -0600
Message-ID: <6b5b368b-310b-41ca-9ce8-cb54e6c5b8f3@kernel.dk>
In-Reply-To: <83793604-3f0e-4496-a7c2-75f318219bee@kernel.dk>

On 6/4/25 8:12 AM, Jens Axboe wrote:
> On 6/4/25 7:58 AM, rtm@csail.mit.edu wrote:
>> If a process is killed while in IORING_OP_FUTEX_WAIT, do_exit()'s call
>> to exit_mm() causes the futex_private_hash to be freed, along with its
>> buckets' locks, while the iouring request still exists. When (a little
>> later in do_exit()) the iouring fd is fput(), the resulting
>> futex_unqueue() tries to use the freed memory that
>> req->async_data->lock_ptr points to.
>>
>> I've attached a demo:
>>
>> # cc uring46b.c
>> # ./a.out
>> killing child
>> BUG: spinlock bad magic on CPU#0, kworker/u4:1/26
>> Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b711b
>> Current kworker/u4:1 pgtable: 4K pagesize, 39-bit VAs, pgdp=0x000000008202a000
>> [6b6b6b6b6b6b711b] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
>> Oops [#1]
>> Modules linked in:
>> CPU: 0 UID: 0 PID: 26 Comm: kworker/u4:1 Not tainted 6.15.0-11192-ga82d78bc13a8 #553 NONE 
>> Hardware name: riscv-virtio,qemu (DT)
>> Workqueue: iou_exit io_ring_exit_work
>> epc : spin_dump+0x38/0x6e
>>  ra : spin_dump+0x30/0x6e
>> epc : ffffffff80003354 ra : ffffffff8000334c sp : ffffffc600113b60
>> ...
>> status: 0000000200000120 badaddr: 6b6b6b6b6b6b711b cause: 000000000000000d
>> [<ffffffff80003354>] spin_dump+0x38/0x6e
>> [<ffffffff8009b78a>] do_raw_spin_lock+0x10a/0x126
>> [<ffffffff811e6552>] _raw_spin_lock+0x1a/0x22
>> [<ffffffff800eb80c>] futex_unqueue+0x2a/0x76
>> [<ffffffff8069e366>] __io_futex_cancel+0x72/0x88
>> [<ffffffff806982fe>] io_cancel_remove_all+0x50/0x74
>> [<ffffffff8069e4ac>] io_futex_remove_all+0x1a/0x22
>> [<ffffffff80010a7e>] io_uring_try_cancel_requests+0x2e2/0x36e
>> [<ffffffff80010bf6>] io_ring_exit_work+0xec/0x3f0
>> [<ffffffff80057f0a>] process_one_work+0x132/0x2fe
>> [<ffffffff8005888c>] worker_thread+0x21e/0x2fe
>> [<ffffffff80060428>] kthread+0xe8/0x1ba
>> [<ffffffff80022fb0>] ret_from_fork_kernel+0xe/0x5e
>> [<ffffffff811e8566>] ret_from_fork_kernel_asm+0x16/0x18
>> Code: 4517 018b 0513 ca05 00ef 3b60 2603 0049 2601 c491 (a703) 5b04 
>> ---[ end trace 0000000000000000 ]---
>> Kernel panic - not syncing: Fatal exception
>> ---[ end Kernel panic - not syncing: Fatal exception ]---
> 
> Thanks, I'll take a look!

I think this would be the least intrusive fix, and it also avoids fiddling
with mmget() for the PRIVATE case. I'll write a test case for this and
send it out as a real patch.


diff --git a/io_uring/futex.c b/io_uring/futex.c
index 383e0d99ad27..246bfb862db9 100644
--- a/io_uring/futex.c
+++ b/io_uring/futex.c
@@ -148,6 +148,8 @@ int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	    !futex_validate_input(iof->futex_flags, iof->futex_mask))
 		return -EINVAL;
 
+	/* Mark as inflight, so file exit cancelation will find it */
+	io_req_track_inflight(req);
 	return 0;
 }
 
@@ -194,6 +196,8 @@ int io_futexv_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		return ret;
 	}
 
+	/* Mark as inflight, so file exit cancelation will find it */
+	io_req_track_inflight(req);
 	iof->futexv_unqueued = 0;
 	req->flags |= REQ_F_ASYNC_DATA;
 	req->async_data = ifd;
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index c7a9cecf528e..cf759c172083 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -408,7 +408,12 @@ static void io_clean_op(struct io_kiocb *req)
 	req->flags &= ~IO_REQ_CLEAN_FLAGS;
 }
 
-static inline void io_req_track_inflight(struct io_kiocb *req)
+/*
+ * Mark the request as inflight, so that file cancelation will find it.
+ * Can be used if the file is an io_uring instance, or if the request itself
+ * relies on ->mm being alive for the duration of the request.
+ */
+inline void io_req_track_inflight(struct io_kiocb *req)
 {
 	if (!(req->flags & REQ_F_INFLIGHT)) {
 		req->flags |= REQ_F_INFLIGHT;
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 0ea7a435d1de..d59c12277d58 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -83,6 +83,7 @@ void io_add_aux_cqe(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags)
 bool io_req_post_cqe(struct io_kiocb *req, s32 res, u32 cflags);
 void __io_commit_cqring_flush(struct io_ring_ctx *ctx);
 
+void io_req_track_inflight(struct io_kiocb *req);
 struct file *io_file_get_normal(struct io_kiocb *req, int fd);
 struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
 			       unsigned issue_flags);
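
For readers following along, here is a rough user-space sketch of why
marking the futex requests as inflight should help. It is not kernel code
and not part of the patch. All toy_* names are hypothetical, the counter
field is an assumption about what the truncated body of
io_req_track_inflight() goes on to do, and the matching rule in
toy_cancel_on_exit() is a simplified reading of the "file exit
cancelation will find it" comment: the task-exit pass only picks up
requests that carry the inflight flag, while the later ring teardown
matches everything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for REQ_F_INFLIGHT; NOT the kernel's definition. */
#define TOY_REQ_F_INFLIGHT	(1u << 0)

/* Hypothetical per-task context holding an inflight counter (assumed). */
struct toy_tctx {
	int inflight_tracked;
};

/* Hypothetical request: flags, owning task context, cancel state. */
struct toy_req {
	unsigned int flags;
	struct toy_tctx *tctx;
	bool canceled;
};

/* Same shape as the patched helper: set the flag once, count it once. */
static void toy_req_track_inflight(struct toy_req *req)
{
	if (!(req->flags & TOY_REQ_F_INFLIGHT)) {
		req->flags |= TOY_REQ_F_INFLIGHT;
		req->tctx->inflight_tracked++;
	}
}

/*
 * Simplified model of a cancel pass. With cancel_all == false (the pass
 * run while the task, and hence its mm, is still alive) only requests
 * marked inflight are matched. With cancel_all == true (ring teardown)
 * every pending request is matched. Returns how many were canceled.
 */
static int toy_cancel_on_exit(struct toy_req *reqs, size_t nr, bool cancel_all)
{
	int canceled = 0;

	assert(reqs != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		if (reqs[i].canceled)
			continue;
		if (!cancel_all && !(reqs[i].flags & TOY_REQ_F_INFLIGHT))
			continue;
		reqs[i].canceled = true;
		canceled++;
	}
	return canceled;
}
```

In this model, a request that was never passed to
toy_req_track_inflight() survives the first pass and is only reached by
the cancel_all pass. That corresponds to the io_ring_exit_work path in
the trace above, which runs after exit_mm() has already freed the futex
hash. A request that was marked is canceled in the first pass instead.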

-- 
Jens Axboe
