public inbox for [email protected]
* [PATCH 1/1] io_uring: more graceful request alloc OOM
@ 2023-05-19 14:05 Pavel Begunkov
  2023-05-20  1:57 ` Jens Axboe
  2023-05-20  9:38 ` yang lan
  0 siblings, 2 replies; 7+ messages in thread
From: Pavel Begunkov @ 2023-05-19 14:05 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence, yang lan

It's OK for io_uring request allocation to fail; however, there are
reports that it starts killing tasks instead of just returning an
error to userspace. Add __GFP_NORETRY so it doesn't trigger the OOM killer.

Cc: [email protected]
Fixes: 2b188cc1bb857 ("Add io_uring IO interface")
Reported-by: yang lan <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/io_uring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index dab09f568294..ad34a4320dab 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1073,7 +1073,7 @@ static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,
 __cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
 	__must_hold(&ctx->uring_lock)
 {
-	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
+	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
 	void *reqs[IO_REQ_ALLOC_BATCH];
 	int ret, i;
 
-- 
2.40.0
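
For context, the sketch below illustrates the allocation pattern the
patched function follows: attempt a bulk allocation with the now-failable
GFP mask, fall back to a single allocation, and report failure to the
caller rather than pressing on. It is a simplified illustration, not the
upstream function verbatim; io_req_refill_sketch and its parameters are
stand-ins for the real io_uring symbols.

/*
 * Simplified sketch of the refill path touched by the patch.
 * __GFP_NORETRY makes the allocator give up instead of retrying
 * hard and invoking the OOM killer; __GFP_NOWARN suppresses the
 * allocation-failure warning. Failure propagates to the submitter.
 */
#include <linux/gfp.h>
#include <linux/slab.h>

static bool io_req_refill_sketch(struct kmem_cache *req_cachep,
                                 void *reqs[], unsigned int batch)
{
        gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
        int ret;

        /* Bulk alloc returns the number of objects obtained, 0 on failure. */
        ret = kmem_cache_alloc_bulk(req_cachep, gfp, batch, reqs);
        if (unlikely(ret <= 0)) {
                /* Fall back to a single object before giving up. */
                reqs[0] = kmem_cache_alloc(req_cachep, gfp);
                if (!reqs[0])
                        return false;   /* caller fails the submission */
                ret = 1;
        }

        /* ... move the ret objects into the submission-side cache ... */
        return true;
}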



* Re: [PATCH 1/1] io_uring: more graceful request alloc OOM
  2023-05-19 14:05 [PATCH 1/1] io_uring: more graceful request alloc OOM Pavel Begunkov
@ 2023-05-20  1:57 ` Jens Axboe
  2023-05-20  9:38 ` yang lan
  1 sibling, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2023-05-20  1:57 UTC (permalink / raw)
  To: io-uring, Pavel Begunkov; +Cc: yang lan


On Fri, 19 May 2023 15:05:14 +0100, Pavel Begunkov wrote:
> It's OK for io_uring request allocation to fail; however, there are
> reports that it starts killing tasks instead of just returning an
> error to userspace. Add __GFP_NORETRY so it doesn't trigger the OOM killer.
> 
> 

Applied, thanks!

[1/1] io_uring: more graceful request alloc OOM
      (no commit info)

Best regards,
-- 
Jens Axboe





* Re: [PATCH 1/1] io_uring: more graceful request alloc OOM
  2023-05-19 14:05 [PATCH 1/1] io_uring: more graceful request alloc OOM Pavel Begunkov
  2023-05-20  1:57 ` Jens Axboe
@ 2023-05-20  9:38 ` yang lan
  2023-05-22  0:40   ` Pavel Begunkov
  1 sibling, 1 reply; 7+ messages in thread
From: yang lan @ 2023-05-20  9:38 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: io-uring, Jens Axboe

Hi,

Thanks for your response.

However, I applied this patch to LTS kernel 5.10.180, and the bug can still be triggered.

--- io_uring/io_uring.c.back    2023-05-20 17:11:25.870550438 +0800
+++ io_uring/io_uring.c 2023-05-20 16:35:24.265846283 +0800
@@ -1970,7 +1970,7 @@
static struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
        __must_hold(&ctx->uring_lock)
 {
        struct io_submit_state *state = &ctx->submit_state;
-       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
+       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
        int ret, i;

        BUILD_BUG_ON(ARRAY_SIZE(state->reqs) < IO_REQ_ALLOC_BATCH);

io_uring.c.back is the original file.
Did I apply this patch incorrectly?

Regards,

Yang

Pavel Begunkov <[email protected]> 于2023年5月19日周五 22:06写道:
>
> It's OK for io_uring request allocation to fail; however, there are
> reports that it starts killing tasks instead of just returning an
> error to userspace. Add __GFP_NORETRY so it doesn't trigger the OOM killer.
>
> Cc: [email protected]
> Fixes: 2b188cc1bb857 ("Add io_uring IO interface")
> Reported-by: yang lan <[email protected]>
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
>  io_uring/io_uring.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index dab09f568294..ad34a4320dab 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -1073,7 +1073,7 @@ static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,
>  __cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
>         __must_hold(&ctx->uring_lock)
>  {
> -       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> +       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
>         void *reqs[IO_REQ_ALLOC_BATCH];
>         int ret, i;
>
> --
> 2.40.0
>


* Re: [PATCH 1/1] io_uring: more graceful request alloc OOM
  2023-05-20  9:38 ` yang lan
@ 2023-05-22  0:40   ` Pavel Begunkov
  2023-05-22  7:55     ` yang lan
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Begunkov @ 2023-05-22  0:40 UTC (permalink / raw)
  To: yang lan; +Cc: io-uring, Jens Axboe

On 5/20/23 10:38, yang lan wrote:
> Hi,
> 
> Thanks for your response.
> 
> However, I applied this patch to LTS kernel 5.10.180, and the bug can still be triggered.
> 
> --- io_uring/io_uring.c.back    2023-05-20 17:11:25.870550438 +0800
> +++ io_uring/io_uring.c 2023-05-20 16:35:24.265846283 +0800
> @@ -1970,7 +1970,7 @@
> static struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
>          __must_hold(&ctx->uring_lock)
>   {
>          struct io_submit_state *state = &ctx->submit_state;
> -       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> +       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
>          int ret, i;
> 
>          BUILD_BUG_ON(ARRAY_SIZE(state->reqs) < IO_REQ_ALLOC_BATCH);
> 
> io_uring.c.back is the original file.
> Did I apply this patch incorrectly?

The patch looks fine. I ran a self-written test on 6.4 before
sending, and it worked as expected. I need to run the syz test;
maybe the failure shifted to another spot, e.g. into provided
buffers.

-- 
Pavel Begunkov
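
The syzkaller reproducer itself wasn't posted in this thread, but for
anyone wanting to exercise the path, below is a hypothetical pressure
generator (an illustrative sketch, not the actual repro; liburing is
assumed installed, built with: cc repro-sketch.c -o repro -luring). It
pends unbounded reads on an empty pipe, so every submitted request pins
a kernel-side request allocation; run it inside a cgroup with a low
memory.max.

/* repro-sketch.c: hypothetical pressure generator, not the syzkaller
 * reproducer from the report. Reads on an empty pipe never complete,
 * so each submitted request holds a request allocation in the kernel. */
#include <liburing.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        struct io_uring ring;
        static char buf[64];    /* shared buffer is fine: reads never complete */
        int pipefd[2];
        long pending = 0;

        if (pipe(pipefd) || io_uring_queue_init(256, &ring, 0))
                return 1;

        for (;;) {
                struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

                if (!sqe) {
                        /* SQ ring is full: push the batch to the kernel. */
                        int ret = io_uring_submit(&ring);

                        if (ret < 0) {
                                /* With the patch, a failed request alloc
                                 * surfaces here (e.g. -EAGAIN) instead of
                                 * the task being OOM-killed. */
                                fprintf(stderr, "submit: %d after %ld reqs\n",
                                        ret, pending);
                                break;
                        }
                        pending += ret;
                        continue;
                }
                io_uring_prep_read(sqe, pipefd[0], buf, sizeof(buf), 0);
        }
        io_uring_queue_exit(&ring);
        return 0;
}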


* Re: [PATCH 1/1] io_uring: more graceful request alloc OOM
  2023-05-22  0:40   ` Pavel Begunkov
@ 2023-05-22  7:55     ` yang lan
  2023-05-23 12:08       ` Pavel Begunkov
  0 siblings, 1 reply; 7+ messages in thread
From: yang lan @ 2023-05-22  7:55 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: io-uring, Jens Axboe

Hi,

Thanks. I'm also analyzing the root cause of this bug.

By the way, can I apply for a CVE? And should I submit a request to
an official organization, such as Openwall?

Regards,

Yang

Pavel Begunkov <[email protected]> 于2023年5月22日周一 08:45写道:
>
> On 5/20/23 10:38, yang lan wrote:
> > Hi,
> >
> > Thanks for your response.
> >
> > However, I applied this patch to LTS kernel 5.10.180, and the bug can still be triggered.
> >
> > --- io_uring/io_uring.c.back    2023-05-20 17:11:25.870550438 +0800
> > +++ io_uring/io_uring.c 2023-05-20 16:35:24.265846283 +0800
> > @@ -1970,7 +1970,7 @@
> > static struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
> >          __must_hold(&ctx->uring_lock)
> >   {
> >          struct io_submit_state *state = &ctx->submit_state;
> > -       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> > +       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
> >          int ret, i;
> >
> >          BUILD_BUG_ON(ARRAY_SIZE(state->reqs) < IO_REQ_ALLOC_BATCH);
> >
> > io_uring.c.back is the original file.
> > Did I apply this patch incorrectly?
>
> The patch looks fine. I ran a self-written test on 6.4 before
> sending, and it worked as expected. I need to run the syz test;
> maybe the failure shifted to another spot, e.g. into provided
> buffers.
>
> --
> Pavel Begunkov


* Re: [PATCH 1/1] io_uring: more graceful request alloc OOM
  2023-05-22  7:55     ` yang lan
@ 2023-05-23 12:08       ` Pavel Begunkov
  2023-05-24  3:15         ` yang lan
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Begunkov @ 2023-05-23 12:08 UTC (permalink / raw)
  To: yang lan; +Cc: io-uring, Jens Axboe

On 5/22/23 08:55, yang lan wrote:
> Hi,
> 
> Thanks. I'm also analyzing the root cause of this bug.

The repro indeed triggers, this time in another place. Though when
I patch all of those spots, it fails somewhere else instead, like in
ext4 on a page fault.

We could add a couple more of those "don't OOM, just fail" spots and
some niceness around them, but I think it's fine as it is, since the
allocations are accounted to cgroups. If the admin cares about
collisions between users, there should be cgroups, so that one user's
allocations don't affect another's. And if a user pushes its own
limit and OOMs itself, that should be fine.
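
To make the containment point concrete, below is a hypothetical sketch,
assuming cgroup v2 mounted at /sys/fs/cgroup and root privileges; the
group name and 256 MiB limit are illustrative, and request allocations
are cgroup-accounted as noted above.

/* cap-sketch.c: create a memory-capped cgroup and move the current
 * process into it, so io_uring request allocations are charged to
 * this group and exhaustion stays contained. Error handling is
 * trimmed for brevity. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

static int write_str(const char *path, const char *val)
{
        int fd = open(path, O_WRONLY);
        ssize_t n;

        if (fd < 0)
                return -1;
        n = write(fd, val, strlen(val));
        close(fd);
        return n < 0 ? -1 : 0;
}

int main(void)
{
        char pid[32];

        /* New child group with a 256 MiB hard memory limit. */
        mkdir("/sys/fs/cgroup/uring-capped", 0755);
        write_str("/sys/fs/cgroup/uring-capped/memory.max", "268435456");

        /* Move ourselves in; one group's allocations can then OOM only
         * itself, not neighbouring users. */
        snprintf(pid, sizeof(pid), "%d", getpid());
        write_str("/sys/fs/cgroup/uring-capped/cgroup.procs", pid);

        /* ... exec the io_uring workload here ... */
        return 0;
}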

> By the way, can I apply for a CVE? And should I submit a request to
> an official organization, such as Openwall?

Sorry, but we cannot help you here. We don't deal with CVEs.

That aside, I'm not even sure it's CVE-able, because it shouldn't be
exploitable in a properly configured environment (unless it is). But
I'm not an expert in that area, so I might be wrong.



> Pavel Begunkov <[email protected]> wrote on Mon, May 22, 2023 at 08:45:
>>
>> On 5/20/23 10:38, yang lan wrote:
>>> Hi,
>>>
>>> Thanks for your response.
>>>
>>> However, I applied this patch to LTS kernel 5.10.180, and the bug can still be triggered.
>>>
>>> --- io_uring/io_uring.c.back    2023-05-20 17:11:25.870550438 +0800
>>> +++ io_uring/io_uring.c 2023-05-20 16:35:24.265846283 +0800
>>> @@ -1970,7 +1970,7 @@
>>> static struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
>>>           __must_hold(&ctx->uring_lock)
>>>    {
>>>           struct io_submit_state *state = &ctx->submit_state;
>>> -       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
>>> +       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
>>>           int ret, i;
>>>
>>>           BUILD_BUG_ON(ARRAY_SIZE(state->reqs) < IO_REQ_ALLOC_BATCH);
>>>
>>> io_uring.c.back is the original file.
>>> Did I apply this patch incorrectly?
>>
>> The patch looks fine. I ran a self-written test on 6.4 before
>> sending, and it worked as expected. I need to run the syz test;
>> maybe the failure shifted to another spot, e.g. into provided
>> buffers.
>>
>> --
>> Pavel Begunkov

-- 
Pavel Begunkov


* Re: [PATCH 1/1] io_uring: more graceful request alloc OOM
  2023-05-23 12:08       ` Pavel Begunkov
@ 2023-05-24  3:15         ` yang lan
  0 siblings, 0 replies; 7+ messages in thread
From: yang lan @ 2023-05-24  3:15 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: io-uring, Jens Axboe

Hi,

Thanks.

Regards,

Yang

Pavel Begunkov <[email protected]> 于2023年5月23日周二 20:15写道:
>
> On 5/22/23 08:55, yang lan wrote:
> > Hi,
> >
> > Thanks. I'm also analyzing the root cause of this bug.
>
> The repro indeed triggers, this time in another place. Though when
> I patch all of those spots, it fails somewhere else instead, like in
> ext4 on a page fault.
>
> We could add a couple more of those "don't OOM, just fail" spots and
> some niceness around them, but I think it's fine as it is, since the
> allocations are accounted to cgroups. If the admin cares about
> collisions between users, there should be cgroups, so that one user's
> allocations don't affect another's. And if a user pushes its own
> limit and OOMs itself, that should be fine.
>
> > By the way, can I apply for a CVE? And should I submit a request to
> > an official organization, such as Openwall?
>
> Sorry, but we cannot help you here. We don't deal with CVEs.
>
> That aside, I'm not even sure it's CVE-able, because it shouldn't be
> exploitable in a properly configured environment (unless it is). But
> I'm not an expert in that area, so I might be wrong.
>
>
>
> > Pavel Begunkov <[email protected]> wrote on Mon, May 22, 2023 at 08:45:
> >>
> >> On 5/20/23 10:38, yang lan wrote:
> >>> Hi,
> >>>
> >>> Thanks for your response.
> >>>
> >>> However, I applied this patch to LTS kernel 5.10.180, and the bug can still be triggered.
> >>>
> >>> --- io_uring/io_uring.c.back    2023-05-20 17:11:25.870550438 +0800
> >>> +++ io_uring/io_uring.c 2023-05-20 16:35:24.265846283 +0800
> >>> @@ -1970,7 +1970,7 @@
> >>> static struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
> >>>           __must_hold(&ctx->uring_lock)
> >>>    {
> >>>           struct io_submit_state *state = &ctx->submit_state;
> >>> -       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> >>> +       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
> >>>           int ret, i;
> >>>
> >>>           BUILD_BUG_ON(ARRAY_SIZE(state->reqs) < IO_REQ_ALLOC_BATCH);
> >>>
> >>> io_uring.c.back is the original file.
> >>> Did I apply this patch incorrectly?
> >>
> >> The patch looks fine. I ran a self-written test on 6.4 before
> >> sending, and it worked as expected. I need to run the syz test;
> >> maybe the failure shifted to another spot, e.g. into provided
> >> buffers.
> >>
> >> --
> >> Pavel Begunkov
>
> --
> Pavel Begunkov
