public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH 1/1] io_uring: prevent reg-wait speculations
@ 2024-11-19  1:29 Pavel Begunkov
  2024-11-19  1:29 ` Jens Axboe
  2024-11-19  1:38 ` Pavel Begunkov
  0 siblings, 2 replies; 6+ messages in thread
From: Pavel Begunkov @ 2024-11-19  1:29 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, jannh

With *ENTER_EXT_ARG_REG instead of passing a user pointer with arguments
for the waiting loop the user can specify an offset into a pre-mapped
region of memory, in which case the
[offset, offset + sizeof(io_uring_reg_wait)) will be intepreted as the
argument.

As we address a kernel array using a user given index, it'd be a subject
to speculation type of exploits.

Fixes: d617b3147d54c ("io_uring: restore back registered wait arguments")
Fixes: aa00f67adc2c0 ("io_uring: add support for fixed wait regions")
Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/io_uring.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index da8fd460977b..3a3e4fca1545 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3207,6 +3207,7 @@ static struct io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx,
 		     end > ctx->cq_wait_size))
 		return ERR_PTR(-EFAULT);
 
+	barrier_nospec();
 	return ctx->cq_wait_arg + offset;
 }
 
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] io_uring: prevent reg-wait speculations
  2024-11-19  1:29 [PATCH 1/1] io_uring: prevent reg-wait speculations Pavel Begunkov
@ 2024-11-19  1:29 ` Jens Axboe
  2024-11-19  1:43   ` Pavel Begunkov
  2024-11-19  1:38 ` Pavel Begunkov
  1 sibling, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2024-11-19  1:29 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: jannh

On 11/18/24 6:29 PM, Pavel Begunkov wrote:
> With *ENTER_EXT_ARG_REG instead of passing a user pointer with arguments
> for the waiting loop the user can specify an offset into a pre-mapped
> region of memory, in which case the
> [offset, offset + sizeof(io_uring_reg_wait)) will be intepreted as the
> argument.
> 
> As we address a kernel array using a user given index, it'd be a subject
> to speculation type of exploits.
> 
> Fixes: d617b3147d54c ("io_uring: restore back registered wait arguments")
> Fixes: aa00f67adc2c0 ("io_uring: add support for fixed wait regions")
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
>  io_uring/io_uring.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index da8fd460977b..3a3e4fca1545 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -3207,6 +3207,7 @@ static struct io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx,
>  		     end > ctx->cq_wait_size))
>  		return ERR_PTR(-EFAULT);
>  
> +	barrier_nospec();
>  	return ctx->cq_wait_arg + offset;

We need something better than that, barrier_nospec() is a big slow
hammer...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] io_uring: prevent reg-wait speculations
  2024-11-19  1:29 [PATCH 1/1] io_uring: prevent reg-wait speculations Pavel Begunkov
  2024-11-19  1:29 ` Jens Axboe
@ 2024-11-19  1:38 ` Pavel Begunkov
  1 sibling, 0 replies; 6+ messages in thread
From: Pavel Begunkov @ 2024-11-19  1:38 UTC (permalink / raw)
  To: io-uring; +Cc: jannh

On 11/19/24 01:29, Pavel Begunkov wrote:
> With *ENTER_EXT_ARG_REG instead of passing a user pointer with arguments
> for the waiting loop the user can specify an offset into a pre-mapped
> region of memory, in which case the
> [offset, offset + sizeof(io_uring_reg_wait)) will be intepreted as the
> argument.

Jann, do mind taking a look? I hope there is some clever trick with
masks we can use instead of the barrier, it seems expensive.

The byte offset user pases is 0 based and we add it to the base
kernel address:

if (unlikely(check_add_overflow(offset, sizeof(struct ...), &end) ||
	     end > ctx->cq_wait_size))
	return ERR_PTR(-EFAULT);

barrier_nospec();
return ctx->cq_wait_arg + offset;

Here in particular we know the structure size, but I also wonder how
to do it right if size is variable.


> As we address a kernel array using a user given index, it'd be a subject
> to speculation type of exploits.
> 
> Fixes: d617b3147d54c ("io_uring: restore back registered wait arguments")
> Fixes: aa00f67adc2c0 ("io_uring: add support for fixed wait regions")
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
>   io_uring/io_uring.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index da8fd460977b..3a3e4fca1545 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -3207,6 +3207,7 @@ static struct io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx,
>   		     end > ctx->cq_wait_size))
>   		return ERR_PTR(-EFAULT);
>   
> +	barrier_nospec();
>   	return ctx->cq_wait_arg + offset;
>   }
>   

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] io_uring: prevent reg-wait speculations
  2024-11-19  1:29 ` Jens Axboe
@ 2024-11-19  1:43   ` Pavel Begunkov
  2024-11-19  1:59     ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2024-11-19  1:43 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: jannh

On 11/19/24 01:29, Jens Axboe wrote:
> On 11/18/24 6:29 PM, Pavel Begunkov wrote:
>> With *ENTER_EXT_ARG_REG instead of passing a user pointer with arguments
>> for the waiting loop the user can specify an offset into a pre-mapped
>> region of memory, in which case the
>> [offset, offset + sizeof(io_uring_reg_wait)) will be intepreted as the
>> argument.
>>
>> As we address a kernel array using a user given index, it'd be a subject
>> to speculation type of exploits.
>>
>> Fixes: d617b3147d54c ("io_uring: restore back registered wait arguments")
>> Fixes: aa00f67adc2c0 ("io_uring: add support for fixed wait regions")
>> Signed-off-by: Pavel Begunkov <[email protected]>
>> ---
>>   io_uring/io_uring.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index da8fd460977b..3a3e4fca1545 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -3207,6 +3207,7 @@ static struct io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx,
>>   		     end > ctx->cq_wait_size))
>>   		return ERR_PTR(-EFAULT);
>>   
>> +	barrier_nospec();
>>   	return ctx->cq_wait_arg + offset;
> 
> We need something better than that, barrier_nospec() is a big slow
> hammer...

Right, more of a discussion opener. I wonder if Jann can help here
(see the other reply). I don't like back and forth like that, but if
nothing works there is an option of returning back to reg-wait array
indexes. Trivial to change, but then we're committing to not expanding
the structure or complicating things if we do.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] io_uring: prevent reg-wait speculations
  2024-11-19  1:43   ` Pavel Begunkov
@ 2024-11-19  1:59     ` Jens Axboe
  2024-11-21  0:13       ` Pavel Begunkov
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2024-11-19  1:59 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: jannh

On 11/18/24 6:43 PM, Pavel Begunkov wrote:
> On 11/19/24 01:29, Jens Axboe wrote:
>> On 11/18/24 6:29 PM, Pavel Begunkov wrote:
>>> With *ENTER_EXT_ARG_REG instead of passing a user pointer with arguments
>>> for the waiting loop the user can specify an offset into a pre-mapped
>>> region of memory, in which case the
>>> [offset, offset + sizeof(io_uring_reg_wait)) will be intepreted as the
>>> argument.
>>>
>>> As we address a kernel array using a user given index, it'd be a subject
>>> to speculation type of exploits.
>>>
>>> Fixes: d617b3147d54c ("io_uring: restore back registered wait arguments")
>>> Fixes: aa00f67adc2c0 ("io_uring: add support for fixed wait regions")
>>> Signed-off-by: Pavel Begunkov <[email protected]>
>>> ---
>>>   io_uring/io_uring.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>> index da8fd460977b..3a3e4fca1545 100644
>>> --- a/io_uring/io_uring.c
>>> +++ b/io_uring/io_uring.c
>>> @@ -3207,6 +3207,7 @@ static struct io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx,
>>>                end > ctx->cq_wait_size))
>>>           return ERR_PTR(-EFAULT);
>>>   +    barrier_nospec();
>>>       return ctx->cq_wait_arg + offset;
>>
>> We need something better than that, barrier_nospec() is a big slow
>> hammer...
> 
> Right, more of a discussion opener. I wonder if Jann can help here
> (see the other reply). I don't like back and forth like that, but if
> nothing works there is an option of returning back to reg-wait array
> indexes. Trivial to change, but then we're committing to not expanding
> the structure or complicating things if we do.

Then I think it should've been marked as a discussion point, because we
definitely can't do this. Soliciting input is perfectly fine. And yeah,
was thinking the same thing, if this is an issue then we just go back to
indexing again. At least both the problem and solution is well known
there. The original aa00f67adc2c0 just needed an array_index_nospec()
and it would've been fine.

Not a huge deal in terms of timing, either way.

I suspect we can do something similar here, with just clamping the
indexing offset. But let's hear what Jann thinks.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] io_uring: prevent reg-wait speculations
  2024-11-19  1:59     ` Jens Axboe
@ 2024-11-21  0:13       ` Pavel Begunkov
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Begunkov @ 2024-11-21  0:13 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: jannh

On 11/19/24 01:59, Jens Axboe wrote:
> On 11/18/24 6:43 PM, Pavel Begunkov wrote:
>> On 11/19/24 01:29, Jens Axboe wrote:
>>> On 11/18/24 6:29 PM, Pavel Begunkov wrote:
>>>> With *ENTER_EXT_ARG_REG instead of passing a user pointer with arguments
>>>> for the waiting loop the user can specify an offset into a pre-mapped
>>>> region of memory, in which case the
>>>> [offset, offset + sizeof(io_uring_reg_wait)) will be intepreted as the
>>>> argument.
>>>>
>>>> As we address a kernel array using a user given index, it'd be a subject
>>>> to speculation type of exploits.
>>>>
>>>> Fixes: d617b3147d54c ("io_uring: restore back registered wait arguments")
>>>> Fixes: aa00f67adc2c0 ("io_uring: add support for fixed wait regions")
>>>> Signed-off-by: Pavel Begunkov <[email protected]>
>>>> ---
>>>>    io_uring/io_uring.c | 1 +
>>>>    1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>>> index da8fd460977b..3a3e4fca1545 100644
>>>> --- a/io_uring/io_uring.c
>>>> +++ b/io_uring/io_uring.c
>>>> @@ -3207,6 +3207,7 @@ static struct io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx,
>>>>                 end > ctx->cq_wait_size))
>>>>            return ERR_PTR(-EFAULT);
>>>>    +    barrier_nospec();
>>>>        return ctx->cq_wait_arg + offset;
>>>
>>> We need something better than that, barrier_nospec() is a big slow
>>> hammer...
>>
>> Right, more of a discussion opener. I wonder if Jann can help here
>> (see the other reply). I don't like back and forth like that, but if
>> nothing works there is an option of returning back to reg-wait array
>> indexes. Trivial to change, but then we're committing to not expanding
>> the structure or complicating things if we do.
> 
> Then I think it should've been marked as a discussion point, because we
> definitely can't do this. Soliciting input is perfectly fine. And yeah,
> was thinking the same thing, if this is an issue then we just go back to
> indexing again. At least both the problem and solution is well known
> there. The original aa00f67adc2c0 just needed an array_index_nospec()
> and it would've been fine.
> 
> Not a huge deal in terms of timing, either way.
> 
> I suspect we can do something similar here, with just clamping the
> indexing offset. But let's hear what Jann thinks.

That what I hope for, but I can't say I entirely understand it. E.g.
why can_do_masked_user_access() exists and guards mask_user_address().

IIRC, with invalid argument the mask turns the index into 0. A complete
speculation from my side of how it works is that you then able to
"inspect" or what's the right word the value of array[0] but not a
address of memory of choice. Then in our case, considering that
mappings are page sized, array_index_nospec() would clamp it to either
first 32 bytes of the first page or to absolute addresses [0, 32)
in case size==0 and the mapping is NULL. But that could be just my
fantasy.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-11-21  0:12 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-19  1:29 [PATCH 1/1] io_uring: prevent reg-wait speculations Pavel Begunkov
2024-11-19  1:29 ` Jens Axboe
2024-11-19  1:43   ` Pavel Begunkov
2024-11-19  1:59     ` Jens Axboe
2024-11-21  0:13       ` Pavel Begunkov
2024-11-19  1:38 ` Pavel Begunkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox