public inbox for [email protected]
 help / color / mirror / Atom feed
From: David Rientjes <[email protected]>
To: Shakeel Butt <[email protected]>
Cc: Jens Axboe <[email protected]>,
	Pavel Begunkov <[email protected]>,
	Andrew Morton <[email protected]>,
	Linux MM <[email protected]>,
	LKML <[email protected]>,
	[email protected]
Subject: Re: [PATCH] mm: io_uring: allow oom-killer from io_uring_setup
Date: Tue, 25 Jan 2022 17:42:01 -0800 (PST)	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <CALvZod5B_nQdU_MHmcYyOpHhGrv5YUnMY-rBPE11Tou6XU_mSA@mail.gmail.com>

On Tue, 25 Jan 2022, Shakeel Butt wrote:

> > > On an overcommitted system which is running multiple workloads of
> > > varying priorities, it is preferred to trigger an oom-killer to kill a
> > > low priority workload than to let the high priority workload receiving
> > > ENOMEMs. On our memory overcommitted systems, we are seeing a lot of
> > > ENOMEMs instead of oom-kills because io_uring_setup callchain is using
> > > __GFP_NORETRY gfp flag which avoids the oom-killer. Let's remove it and
> > > allow the oom-killer to kill a lower priority job.
> > >
> >
> > What is the size of the allocations that io_mem_alloc() is doing?
> >
> > If get_order(size) > PAGE_ALLOC_COSTLY_ORDER, then this will fail even
> > without the __GFP_NORETRY.  To make the guarantee that workloads are not
> > receiving ENOMEM, it seems like we'd need to guarantee that allocations
> > going through io_mem_alloc() are sufficiently small.
> >
> > (And if we're really serious about it, then even something like a
> > BUILD_BUG_ON().)
> >
> 
> The test case provided to me for which the user was seeing ENOMEMs was
> io_uring_setup() with 64 entries (nothing else).
> 
> If I understand rings_size() calculations correctly then the 0 order
> allocation was requested in io_mem_alloc().
> 
> For order > PAGE_ALLOC_COSTLY_ORDER, maybe we can use
> __GFP_RETRY_MAYFAIL. It will at least do more aggressive reclaim
> though I think that is a separate discussion. For this issue, we are
> seeing ENOMEMs even for order 0 allocations.
> 

Ah, gotcha, thanks for the background.  IIUC, io_uring_setup() can be done 
with anything with CAP_SYS_NICE so my only concern would be whether this 
could be used maliciously on a system not using memcg, but in that case we 
can already fork many small processes that consume all memory and oom kill 
everything else on the system already.

  reply	other threads:[~2022-01-26  1:42 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-25  5:17 [PATCH] mm: io_uring: allow oom-killer from io_uring_setup Shakeel Butt
2022-01-25 18:35 ` David Rientjes
2022-01-25 22:57   ` Shakeel Butt
2022-01-26  1:42     ` David Rientjes [this message]
2022-02-05  6:32 ` Shakeel Butt
2022-02-07 15:44 ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox