public inbox for [email protected]
* futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
@ 2025-01-10 22:26 Jann Horn
  2025-01-11  3:33 ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Jann Horn @ 2025-01-10 22:26 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, André Almeida, kernel list, Jens Axboe,
	Pavel Begunkov, io-uring

Hi!

I think there is some brittle interaction between futex and io_uring;
but to be clear, I don't think that there is actually a bug here.

In io_uring, when an IORING_OP_FUTEX_WAIT SQE is submitted with
IOSQE_ASYNC, an io_uring worker thread can queue up futex waiters via
the following path:

ret_from_fork -> io_wq_worker -> io_worker_handle_work ->
io_wq_submit_work[called as ->do_work] -> io_issue_sqe ->
io_futex_wait[called as .issue] -> futex_queue -> __futex_queue

futex_q instances normally describe synchronously waiting tasks, and
__futex_queue() records the identity of the calling task (which is
normally the waiter) in futex_q::task. But io_uring waits on futexes
asynchronously instead; from io_uring's perspective, a pending futex
wait is not tied to the task that called into futex_queue(), it is
just tied to the userspace task on behalf of which the io_uring worker
is acting (I think). So when a futex wait operation is started by an
io_uring worker task, I think that worker task could go away while the
futex_q is still queued up on the futex, and so I think we can end up
with a futex_q whose "task" member points to a freed task_struct.
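To illustrate the pattern (with made-up userspace types here, not the real
kernel structures), it is roughly:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for task_struct and futex_q; names are made up. */
struct task_model { int alive; };

struct futex_q_model {
	/* who enqueued us; dangles once that thread exits */
	struct task_model *task;
};

/* Models __futex_queue(): unconditionally records the enqueueing
 * thread, which for a synchronous futex() syscall is also the waiter. */
static void model_queue(struct futex_q_model *q, struct task_model *enqueuer)
{
	q->task = enqueuer;
}

/* Models the io-wq case: a worker issues the wait on behalf of a user
 * task and may then exit while the futex_q stays queued. */
static void model_async_issue(struct futex_q_model *q,
			      struct task_model *worker)
{
	model_queue(q, worker);	/* q->task = current (the worker)  */
	worker->alive = 0;	/* worker exits; in the kernel, the
				 * task_struct can now be freed     */
}
```

After model_async_issue() returns, q->task still points at the exited
worker, which is the stale-pointer situation described above.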

The good part is that (from what I understand) that "task" member is
only used for two purposes:

1. futexes that are created through the normal futex syscalls
use futex_wake_mark as their .wake callback, which needs the task
pointer to know which task should be woken.
2. PI futexes use it for priority inheritance magic (and AFAICS there
is no way for io_uring to interface with PI futexes)
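In model form (again with made-up types; the real counterparts are
futex_wake_mark() and io_uring's own wake callback), the distinction
between the two wake paths is roughly:

```c
#include <assert.h>
#include <stddef.h>

struct task_model { int woken; };

struct futex_q_model {
	struct task_model *task;		/* unused on the async path */
	void (*wake)(struct futex_q_model *q);
};

/* Sync model: like futex_wake_mark(), the callback must dereference
 * q->task to know which task to wake, so it must not be dangling. */
static void sync_wake(struct futex_q_model *q)
{
	q->task->woken = 1;
}

/* Async model: completion is posted elsewhere (for io_uring, a CQE on
 * the ring); q->task is never dereferenced, so even NULL would be fine. */
static int completions_posted;
static void async_wake(struct futex_q_model *q)
{
	(void)q;		/* deliberately does not touch q->task */
	completions_posted++;
}
```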

I'm not sure what the best thing to do here is - maybe the current
situation is fine, and I should just send a patch that adds a comment
describing this to the definition of the "task" member? Or maybe it
would be better for robustness to ensure that the "task" member is
NULLed out in those cases, though that would probably make the
generated machine code a little bit more ugly? (Or maybe I totally
misunderstand what's going on and there isn't actually a dangling
pointer...)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
  2025-01-10 22:26 futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine) Jann Horn
@ 2025-01-11  3:33 ` Jens Axboe
  2025-01-13 13:53   ` Jann Horn
  2025-01-13 14:38   ` Peter Zijlstra
  0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2025-01-11  3:33 UTC (permalink / raw)
  To: Jann Horn, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Davidlohr Bueso, André Almeida, kernel list,
	Pavel Begunkov, io-uring

On 1/10/25 3:26 PM, Jann Horn wrote:
> Hi!
> 
> I think there is some brittle interaction between futex and io_uring;
> but to be clear, I don't think that there is actually a bug here.
> 
> In io_uring, when an IORING_OP_FUTEX_WAIT SQE is submitted with
> IOSQE_ASYNC, an io_uring worker thread can queue up futex waiters via
> the following path:
> 
> ret_from_fork -> io_wq_worker -> io_worker_handle_work ->
> io_wq_submit_work[called as ->do_work] -> io_issue_sqe ->
> io_futex_wait[called as .issue] -> futex_queue -> __futex_queue
> 
> futex_q instances normally describe synchronously waiting tasks, and
> __futex_queue() records the identity of the calling task (which is
> normally the waiter) in futex_q::task. But io_uring waits on futexes
> asynchronously instead; from io_uring's perspective, a pending futex
> wait is not tied to the task that called into futex_queue(), it is
> just tied to the userspace task on behalf of which the io_uring worker
> is acting (I think). So when a futex wait operation is started by an
> io_uring worker task, I think that worker task could go away while the
> futex_q is still queued up on the futex, and so I think we can end up
> with a futex_q whose "task" member points to a freed task_struct.
> 
> The good part is that (from what I understand) that "task" member is
> only used for two purposes:
> 
> 1. futexes that are created through the normal futex syscalls
> use futex_wake_mark as their .wake callback, which needs the task
> pointer to know which task should be woken.
> 2. PI futexes use it for priority inheritance magic (and AFAICS there
> is no way for io_uring to interface with PI futexes)
> 
> I'm not sure what the best thing to do here is - maybe the current
> situation is fine, and I should just send a patch that adds a comment
> describing this to the definition of the "task" member? Or maybe it
> would be better for robustness to ensure that the "task" member is
> NULLed out in those cases, though that would probably make the
> generated machine code a little bit more ugly? (Or maybe I totally
> misunderstand what's going on and there isn't actually a dangling
> pointer...)

Good find. And yes, the io-wq task could go away if there's more than
one of them.

While it isn't an issue, dangling pointers make me nervous. We could do
something like the (totally untested) below patch, where io_uring just
uses __futex_queue() and makes __futex_queue() take a task_struct as
well. Other callers pass in 'current'.

It's quite possible that it'd be safe to just use __futex_queue() and
clear ->task after the queue, but if we pass in NULL from the get-go,
then there's definitely not a risk of anything being dangling.


diff --git a/io_uring/futex.c b/io_uring/futex.c
index 30139cc150f2..985ad10cc6d5 100644
--- a/io_uring/futex.c
+++ b/io_uring/futex.c
@@ -338,7 +338,12 @@ int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags)
 		hlist_add_head(&req->hash_node, &ctx->futex_list);
 		io_ring_submit_unlock(ctx, issue_flags);
 
-		futex_queue(&ifd->q, hb);
+		/*
+		 * Don't pass in current as the task associated with the
+		 * futex queue, that only makes sense for sync syscalls.
+		 */
+		__futex_queue(&ifd->q, hb, NULL);
+		spin_unlock(&hb->lock);
 		return IOU_ISSUE_SKIP_COMPLETE;
 	}
 
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index ebdd76b4ecbb..3db8567f5a44 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -532,7 +532,8 @@ void futex_q_unlock(struct futex_hash_bucket *hb)
 	futex_hb_waiters_dec(hb);
 }
 
-void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
+void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,
+		   struct task_struct *task)
 {
 	int prio;
 
@@ -548,7 +549,7 @@ void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
 
 	plist_node_init(&q->list, prio);
 	plist_add(&q->list, &hb->chain);
-	q->task = current;
+	q->task = task;
 }
 
 /**
diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index 99b32e728c4a..2d3f1d53c854 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -285,7 +285,8 @@ static inline int futex_get_value_locked(u32 *dest, u32 __user *from)
 }
 
 extern void __futex_unqueue(struct futex_q *q);
-extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb);
+extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,
+				struct task_struct *task);
 extern int futex_unqueue(struct futex_q *q);
 
 /**
@@ -303,7 +304,7 @@ extern int futex_unqueue(struct futex_q *q);
 static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
 	__releases(&hb->lock)
 {
-	__futex_queue(q, hb);
+	__futex_queue(q, hb, current);
 	spin_unlock(&hb->lock);
 }
 
diff --git a/kernel/futex/pi.c b/kernel/futex/pi.c
index d62cca5ed8f4..635c7d5d4222 100644
--- a/kernel/futex/pi.c
+++ b/kernel/futex/pi.c
@@ -982,7 +982,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags, ktime_t *time, int tryl
 	/*
 	 * Only actually queue now that the atomic ops are done:
 	 */
-	__futex_queue(&q, hb);
+	__futex_queue(&q, hb, current);
 
 	if (trylock) {
 		ret = rt_mutex_futex_trylock(&q.pi_state->pi_mutex);

-- 
Jens Axboe


* Re: futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
  2025-01-11  3:33 ` Jens Axboe
@ 2025-01-13 13:53   ` Jann Horn
  2025-01-13 14:38   ` Peter Zijlstra
  1 sibling, 0 replies; 10+ messages in thread
From: Jann Horn @ 2025-01-13 13:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, André Almeida, kernel list, Pavel Begunkov,
	io-uring

On Sat, Jan 11, 2025 at 4:33 AM Jens Axboe <[email protected]> wrote:
> On 1/10/25 3:26 PM, Jann Horn wrote:
> > Hi!
> >
> > I think there is some brittle interaction between futex and io_uring;
> > but to be clear, I don't think that there is actually a bug here.
> >
> > In io_uring, when an IORING_OP_FUTEX_WAIT SQE is submitted with
> > IOSQE_ASYNC, an io_uring worker thread can queue up futex waiters via
> > the following path:
> >
> > ret_from_fork -> io_wq_worker -> io_worker_handle_work ->
> > io_wq_submit_work[called as ->do_work] -> io_issue_sqe ->
> > io_futex_wait[called as .issue] -> futex_queue -> __futex_queue
> >
> > futex_q instances normally describe synchronously waiting tasks, and
> > __futex_queue() records the identity of the calling task (which is
> > normally the waiter) in futex_q::task. But io_uring waits on futexes
> > asynchronously instead; from io_uring's perspective, a pending futex
> > wait is not tied to the task that called into futex_queue(), it is
> > just tied to the userspace task on behalf of which the io_uring worker
> > is acting (I think). So when a futex wait operation is started by an
> > io_uring worker task, I think that worker task could go away while the
> > futex_q is still queued up on the futex, and so I think we can end up
> > with a futex_q whose "task" member points to a freed task_struct.
> >
> > The good part is that (from what I understand) that "task" member is
> > only used for two purposes:
> >
> > 1. futexes that are created through the normal futex syscalls
> > use futex_wake_mark as their .wake callback, which needs the task
> > pointer to know which task should be woken.
> > 2. PI futexes use it for priority inheritance magic (and AFAICS there
> > is no way for io_uring to interface with PI futexes)
> >
> > I'm not sure what the best thing to do here is - maybe the current
> > situation is fine, and I should just send a patch that adds a comment
> > describing this to the definition of the "task" member? Or maybe it
> > would be better for robustness to ensure that the "task" member is
> > NULLed out in those cases, though that would probably make the
> > generated machine code a little bit more ugly? (Or maybe I totally
> > misunderstand what's going on and there isn't actually a dangling
> > pointer...)
>
> Good find. And yes, the io-wq task could go away if there's more than
> one of them.
>
> While it isn't an issue, dangling pointers make me nervous. We could do
> something like the (totally untested) below patch, where io_uring just
> uses __futex_queue() and makes __futex_queue() take a task_struct as
> well. Other callers pass in 'current'.
>
> It's quite possible that it'd be safe to just use __futex_queue() and
> clear ->task after the queue, but if we pass in NULL from the get-go,
> then there's definitely not a risk of anything being dangling.

Yeah, this looks like a nice cleanup that makes this safer, thanks!


* Re: futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
  2025-01-11  3:33 ` Jens Axboe
  2025-01-13 13:53   ` Jann Horn
@ 2025-01-13 14:38   ` Peter Zijlstra
  2025-01-13 14:41     ` Jens Axboe
  2025-01-15 10:20     ` Thomas Gleixner
  1 sibling, 2 replies; 10+ messages in thread
From: Peter Zijlstra @ 2025-01-13 14:38 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jann Horn, Thomas Gleixner, Ingo Molnar, Darren Hart,
	Davidlohr Bueso, André Almeida, kernel list, Pavel Begunkov,
	io-uring

On Fri, Jan 10, 2025 at 08:33:34PM -0700, Jens Axboe wrote:

> @@ -548,7 +549,7 @@ void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
>  
>  	plist_node_init(&q->list, prio);
>  	plist_add(&q->list, &hb->chain);
> -	q->task = current;
> +	q->task = task;
>  }
>  
>  /**

The alternative is, I suppose, to move the q->task assignment out to
these two callsites instead. Thomas, any opinions?
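Roughly (made-up names, just to sketch it), that alternative would keep
__futex_queue() from touching q->task at all and have each synchronous
callsite assign it itself, still under hb->lock:

```c
#include <assert.h>
#include <stddef.h>

struct task_model { int id; };
struct hb_model { int locked; };	/* stands in for hb->lock */
struct futex_q_model { struct task_model *task; int queued; };

/* Sketch of __futex_queue() with the q->task assignment removed. */
static void model_queue(struct futex_q_model *q, struct hb_model *hb)
{
	assert(hb->locked);	/* caller must hold hb->lock */
	q->queued = 1;		/* plist_add() stand-in      */
}

/* A synchronous callsite assigns q->task itself, under the same lock,
 * so the queued entry and its task pointer become visible together. */
static void model_sync_callsite(struct futex_q_model *q, struct hb_model *hb,
				struct task_model *cur)
{
	hb->locked = 1;		/* spin_lock(&hb->lock)         */
	model_queue(q, hb);
	q->task = cur;		/* moved out of __futex_queue() */
	hb->locked = 0;		/* spin_unlock(&hb->lock)       */
}
```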

> @@ -303,7 +304,7 @@ extern int futex_unqueue(struct futex_q *q);
>  static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
>  	__releases(&hb->lock)
>  {
> -	__futex_queue(q, hb);
> +	__futex_queue(q, hb, current);
>  	spin_unlock(&hb->lock);
>  }
>  
> diff --git a/kernel/futex/pi.c b/kernel/futex/pi.c
> index d62cca5ed8f4..635c7d5d4222 100644
> --- a/kernel/futex/pi.c
> +++ b/kernel/futex/pi.c
> @@ -982,7 +982,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags, ktime_t *time, int tryl
>  	/*
>  	 * Only actually queue now that the atomic ops are done:
>  	 */
> -	__futex_queue(&q, hb);
> +	__futex_queue(&q, hb, current);
>  
>  	if (trylock) {
>  		ret = rt_mutex_futex_trylock(&q.pi_state->pi_mutex);
> 
> -- 
> Jens Axboe


* Re: futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
  2025-01-13 14:38   ` Peter Zijlstra
@ 2025-01-13 14:41     ` Jens Axboe
  2025-01-15 10:20     ` Thomas Gleixner
  1 sibling, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2025-01-13 14:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jann Horn, Thomas Gleixner, Ingo Molnar, Darren Hart,
	Davidlohr Bueso, André Almeida, kernel list, Pavel Begunkov,
	io-uring

On 1/13/25 7:38 AM, Peter Zijlstra wrote:
> On Fri, Jan 10, 2025 at 08:33:34PM -0700, Jens Axboe wrote:
> 
>> @@ -548,7 +549,7 @@ void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
>>  
>>  	plist_node_init(&q->list, prio);
>>  	plist_add(&q->list, &hb->chain);
>> -	q->task = current;
>> +	q->task = task;
>>  }
>>  
>>  /**
> 
> The alternative is, I suppose, to move the q->task assignment out to
> these two callsites instead. Thomas, any opinions?

That suggestion was posed in my reply as well, but I didn't spend the
time checking if it was safe/feasible. If you/Thomas think it's safe, we
can certainly do that instead. But I'd much rather you make that call
than me :-)

-- 
Jens Axboe


* Re: futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
  2025-01-13 14:38   ` Peter Zijlstra
  2025-01-13 14:41     ` Jens Axboe
@ 2025-01-15 10:20     ` Thomas Gleixner
  2025-01-15 15:23       ` Jens Axboe
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2025-01-15 10:20 UTC (permalink / raw)
  To: Peter Zijlstra, Jens Axboe
  Cc: Jann Horn, Ingo Molnar, Darren Hart, Davidlohr Bueso,
	André Almeida, kernel list, Pavel Begunkov, io-uring

On Mon, Jan 13 2025 at 15:38, Peter Zijlstra wrote:
> On Fri, Jan 10, 2025 at 08:33:34PM -0700, Jens Axboe wrote:
>
>> @@ -548,7 +549,7 @@ void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
>>  
>>  	plist_node_init(&q->list, prio);
>>  	plist_add(&q->list, &hb->chain);
>> -	q->task = current;
>> +	q->task = task;
>>  }
>>  
>>  /**
>
> The alternative is, I suppose, to move the q->task assignment out to
> these two callsites instead. Thomas, any opinions?

That's fine as long as hb->lock is held, but the explicit argument makes
all of this simpler to understand.

Though I'm not really a fan of this part:

> +		__futex_queue(&ifd->q, hb, NULL);
> +		spin_unlock(&hb->lock);

Can we please add that @task argument to futex_queue() and keep the
internals in the futex code instead of pulling more stuff into io_uring?

Thanks,

        tglx




* Re: futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
  2025-01-15 10:20     ` Thomas Gleixner
@ 2025-01-15 15:23       ` Jens Axboe
  2025-01-15 15:32         ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2025-01-15 15:23 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: Jann Horn, Ingo Molnar, Darren Hart, Davidlohr Bueso,
	André Almeida, kernel list, Pavel Begunkov, io-uring

On 1/15/25 3:20 AM, Thomas Gleixner wrote:
> On Mon, Jan 13 2025 at 15:38, Peter Zijlstra wrote:
>> On Fri, Jan 10, 2025 at 08:33:34PM -0700, Jens Axboe wrote:
>>
>>> @@ -548,7 +549,7 @@ void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
>>>  
>>>  	plist_node_init(&q->list, prio);
>>>  	plist_add(&q->list, &hb->chain);
>>> -	q->task = current;
>>> +	q->task = task;
>>>  }
>>>  
>>>  /**
>>
>> The alternative is, I suppose, to move the q->task assignment out to
>> these two callsites instead. Thomas, any opinions?
> 
> That's fine as long as hb->lock is held, but the explicit argument makes
> all of this simpler to understand.
> 
> Though I'm not really a fan of this part:
> 
>> +		__futex_queue(&ifd->q, hb, NULL);
>> +		spin_unlock(&hb->lock);
> 
> Can we please add that @task argument to futex_queue() and keep the
> internals in the futex code instead of pulling more stuff into io_uring?

Sure, I was trying to keep the change more minimal, but we can certainly
add it to futex_queue() instead rather than needing to work around it on
the io_uring side.

I'll be happy to send out a patch for that.

-- 
Jens Axboe


* Re: futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
  2025-01-15 15:23       ` Jens Axboe
@ 2025-01-15 15:32         ` Jens Axboe
  2025-01-15 17:05           ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2025-01-15 15:32 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: Jann Horn, Ingo Molnar, Darren Hart, Davidlohr Bueso,
	André Almeida, kernel list, Pavel Begunkov, io-uring

On 1/15/25 8:23 AM, Jens Axboe wrote:
> On 1/15/25 3:20 AM, Thomas Gleixner wrote:
>> On Mon, Jan 13 2025 at 15:38, Peter Zijlstra wrote:
>>> On Fri, Jan 10, 2025 at 08:33:34PM -0700, Jens Axboe wrote:
>>>
>>>> @@ -548,7 +549,7 @@ void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
>>>>  
>>>>  	plist_node_init(&q->list, prio);
>>>>  	plist_add(&q->list, &hb->chain);
>>>> -	q->task = current;
>>>> +	q->task = task;
>>>>  }
>>>>  
>>>>  /**
>>>
>>> The alternative is, I suppose, to move the q->task assignment out to
>>> these two callsites instead. Thomas, any opinions?
>>
>> That's fine as long as hb->lock is held, but the explicit argument makes
>> all of this simpler to understand.
>>
>> Though I'm not really a fan of this part:
>>
>>> +		__futex_queue(&ifd->q, hb, NULL);
>>> +		spin_unlock(&hb->lock);
>>
>> Can we please add that @task argument to futex_queue() and keep the
>> internals in the futex code instead of pulling more stuff into io_uring?
> 
> Sure, was trying to keep the change more minimal, but we can certainly
> add it to futex_queue() instead rather than needing to work around it on
> the io_uring side.
> 
> I'll be happy to send out a patch for that.

Here's the raw patch. Should've done this initially rather than just
tackle __futex_queue(), for some reason I thought/assumed that
futex_queue() was more widely used.

What do you think?


diff --git a/io_uring/futex.c b/io_uring/futex.c
index e29662f039e1..f108da4ff863 100644
--- a/io_uring/futex.c
+++ b/io_uring/futex.c
@@ -349,7 +349,7 @@ int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags)
 		hlist_add_head(&req->hash_node, &ctx->futex_list);
 		io_ring_submit_unlock(ctx, issue_flags);
 
-		futex_queue(&ifd->q, hb);
+		futex_queue(&ifd->q, hb, NULL);
 		return IOU_ISSUE_SKIP_COMPLETE;
 	}
 
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index ebdd76b4ecbb..3db8567f5a44 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -532,7 +532,8 @@ void futex_q_unlock(struct futex_hash_bucket *hb)
 	futex_hb_waiters_dec(hb);
 }
 
-void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
+void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,
+		   struct task_struct *task)
 {
 	int prio;
 
@@ -548,7 +549,7 @@ void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
 
 	plist_node_init(&q->list, prio);
 	plist_add(&q->list, &hb->chain);
-	q->task = current;
+	q->task = task;
 }
 
 /**
diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index 618ce1fe870e..11de6405c4e3 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -285,13 +285,15 @@ static inline int futex_get_value_locked(u32 *dest, u32 __user *from)
 }
 
 extern void __futex_unqueue(struct futex_q *q);
-extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb);
+extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,
+				struct task_struct *task);
 extern int futex_unqueue(struct futex_q *q);
 
 /**
  * futex_queue() - Enqueue the futex_q on the futex_hash_bucket
  * @q:	The futex_q to enqueue
  * @hb:	The destination hash bucket
+ * @task: Task queueing this futex
  *
  * The hb->lock must be held by the caller, and is released here. A call to
  * futex_queue() is typically paired with exactly one call to futex_unqueue().  The
@@ -299,11 +301,14 @@ extern int futex_unqueue(struct futex_q *q);
  * or nothing if the unqueue is done as part of the wake process and the unqueue
  * state is implicit in the state of woken task (see futex_wait_requeue_pi() for
  * an example).
+ *
+ * Note that @task may be NULL, for async usage of futexes.
  */
-static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)
+static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,
+			       struct task_struct *task)
 	__releases(&hb->lock)
 {
-	__futex_queue(q, hb);
+	__futex_queue(q, hb, task);
 	spin_unlock(&hb->lock);
 }
 
diff --git a/kernel/futex/pi.c b/kernel/futex/pi.c
index d62cca5ed8f4..635c7d5d4222 100644
--- a/kernel/futex/pi.c
+++ b/kernel/futex/pi.c
@@ -982,7 +982,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags, ktime_t *time, int tryl
 	/*
 	 * Only actually queue now that the atomic ops are done:
 	 */
-	__futex_queue(&q, hb);
+	__futex_queue(&q, hb, current);
 
 	if (trylock) {
 		ret = rt_mutex_futex_trylock(&q.pi_state->pi_mutex);
diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c
index 3a10375d9521..a9056acb75ee 100644
--- a/kernel/futex/waitwake.c
+++ b/kernel/futex/waitwake.c
@@ -350,7 +350,7 @@ void futex_wait_queue(struct futex_hash_bucket *hb, struct futex_q *q,
 	 * access to the hash list and forcing another memory barrier.
 	 */
 	set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);
-	futex_queue(q, hb);
+	futex_queue(q, hb, current);
 
 	/* Arm the timer */
 	if (timeout)
@@ -461,7 +461,7 @@ int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *woken)
 			 * next futex. Queue each futex at this moment so hb can
 			 * be unlocked.
 			 */
-			futex_queue(q, hb);
+			futex_queue(q, hb, current);
 			continue;
 		}
 

-- 
Jens Axboe


* Re: futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
  2025-01-15 15:32         ` Jens Axboe
@ 2025-01-15 17:05           ` Thomas Gleixner
  2025-01-15 17:07             ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2025-01-15 17:05 UTC (permalink / raw)
  To: Jens Axboe, Peter Zijlstra
  Cc: Jann Horn, Ingo Molnar, Darren Hart, Davidlohr Bueso,
	André Almeida, kernel list, Pavel Begunkov, io-uring

On Wed, Jan 15 2025 at 08:32, Jens Axboe wrote:
> Here's the raw patch. Should've done this initially rather than just
> tackle __futex_queue(), for some reason I thought/assumed that
> futex_queue() was more widely used.

'git grep' is pretty useful to validate such assumptions :)

> What do you think?

Looks about right.

Thanks,

        tglx


* Re: futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine)
  2025-01-15 17:05           ` Thomas Gleixner
@ 2025-01-15 17:07             ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2025-01-15 17:07 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: Jann Horn, Ingo Molnar, Darren Hart, Davidlohr Bueso,
	André Almeida, kernel list, Pavel Begunkov, io-uring

On 1/15/25 10:05 AM, Thomas Gleixner wrote:
> On Wed, Jan 15 2025 at 08:32, Jens Axboe wrote:
>> Here's the raw patch. Should've done this initially rather than just
>> tackle __futex_queue(), for some reason I thought/assumed that
>> futex_queue() was more widely used.
> 
> 'git grep' is pretty useful to validate such assumptions :)

It would not be a good assumption if it was backed by fact checking ;-)

>> What do you think?
> 
> Looks about right.

OK thanks, fwiw I did send it out as a proper patch as well.

-- 
Jens Axboe



end of thread, other threads:[~2025-01-15 17:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-10 22:26 futex+io_uring: futex_q::task can maybe be dangling (but is not actually accessed, so it's fine) Jann Horn
2025-01-11  3:33 ` Jens Axboe
2025-01-13 13:53   ` Jann Horn
2025-01-13 14:38   ` Peter Zijlstra
2025-01-13 14:41     ` Jens Axboe
2025-01-15 10:20     ` Thomas Gleixner
2025-01-15 15:23       ` Jens Axboe
2025-01-15 15:32         ` Jens Axboe
2025-01-15 17:05           ` Thomas Gleixner
2025-01-15 17:07             ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox