From: Dmitry Kadashev <[email protected]>
To: Pavel Begunkov <[email protected]>
Cc: Jens Axboe <[email protected]>, Josef <[email protected]>,
	Norman Maurer <[email protected]>,
	io-uring <[email protected]>
Subject: Re: "Cannot allocate memory" on ring creation (not RLIMIT_MEMLOCK)
Date: Tue, 22 Dec 2020 20:13:21 +0700
Message-ID: <CAOKbgA5UD7ZMF1YzyCyrKXTObm4uqho1OKY2=HU2aiiBNjfBJQ@mail.gmail.com>
In-Reply-To: <CAOKbgA5xZSpMWGfDpetXqVck4fvC9xkmKuWYV8nrpOBqPmCfAQ@mail.gmail.com>

On Tue, Dec 22, 2020 at 6:06 PM Dmitry Kadashev <[email protected]> wrote:
>
> On Tue, Dec 22, 2020 at 6:04 PM Dmitry Kadashev <[email protected]> wrote:
> >
> > On Tue, Dec 22, 2020 at 11:11 AM Pavel Begunkov <[email protected]> wrote:
> > >
> > > On 22/12/2020 03:35, Pavel Begunkov wrote:
> > > > On 21/12/2020 11:00, Dmitry Kadashev wrote:
> > > > [snip]
> > > >>> We do not share rings between processes. Our rings are accessible from different
> > > >>> threads (under locks), but nothing fancy.
> > > >>>
> > > >>>> In other words, if you kill all your io_uring applications, does it
> > > >>>> go back to normal?
> > > >>>
> > > >>> I'm pretty sure it does not, the only fix is to reboot the box. But I'll find an
> > > >>> affected box and double check just in case.
> > > >
> > > > > I can't spot any misaccounting, but I wonder if it can be that your memory is
> > > > > getting fragmented enough to be unable to make an allocation of 16 __contiguous__
> > > > > pages, i.e. sizeof(sqe) * 1024
> > > >
> > > > That's how it's allocated internally:
> > > >
> > > > static void *io_mem_alloc(size_t size)
> > > > {
> > > >       gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP |
> > > >                               __GFP_NORETRY;
> > > >
> > > >       return (void *) __get_free_pages(gfp_flags, get_order(size));
> > > > }
> > > >
> > > > > What about smaller rings? Can you check what SQ size io_uring can still allocate?
> > > > > That can be a separate program, e.g. a slightly modified liburing/test/nop.
> > >
> > > Even better to allocate N smaller rings, where N = 1024 / SQ_size
> > >
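> > > > #include <stdio.h>
> > > > #include <liburing.h>
> > > >
> > > > static int try_size(int sq_size)
> > > > {
> > > >         int ret = 0, i, n = 1024 / sq_size;
> > > >         /* n reaches 512 when sq_size == 2, so size the array for that */
> > > >         static struct io_uring rings[512];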
> > >
> > >         for (i = 0; i < n; ++i) {
> > >                 if (io_uring_queue_init(sq_size, &rings[i], 0) < 0) {
> > >                         ret = -1;
> > >                         break;
> > >                 }
> > >         }
> > >         for (i -= 1; i >= 0; i--)
> > >                 io_uring_queue_exit(&rings[i]);
> > >         return ret;
> > > }
> > >
> > > int main()
> > > {
> > >         int size;
> > >
> > >         for (size = 1024; size >= 2; size /= 2) {
> > >                 if (!try_size(size)) {
> > >                         printf("max size %i\n", size);
> > >                         return 0;
> > >                 }
> > >         }
> > >
> > >         printf("can't allocate %i\n", size);
> > >         return 0;
> > > }
> >
> > Unfortunately I've rebooted the box I've used for tests yesterday, so I can't
> > try this there. Also I was not able to come up with an isolated reproducer for
> > this yet.
> >
> > > The good news is I've found a relatively easy way to provoke this on a test VM
> > > using our software. Our app runs with "admin" user perms (plus some
> > > capabilities), and it bumps RLIMIT_MEMLOCK to infinity on start. I've also
> > > created a user called 'ioutest' to run the ring size check as a different user.
> > >
> > > I've modified the test program slightly to show the number of rings
> > > successfully created on each iteration and the actual error message (to debug
> > > a problem I was having with it, but I've kept this after that). Here is the
> > > output:
> >
> > # sudo -u admin bash -c 'ulimit -a' | grep locked
> > max locked memory       (kbytes, -l) 1024
> >
> > # sudo -u ioutest bash -c 'ulimit -a' | grep locked
> > max locked memory       (kbytes, -l) 1024
> >
> > # sudo -u admin ./iou-test1
> > Failed after 0 rings with 1024 size: Cannot allocate memory
> > Failed after 0 rings with 512 size: Cannot allocate memory
> > Failed after 0 rings with 256 size: Cannot allocate memory
> > Failed after 0 rings with 128 size: Cannot allocate memory
> > Failed after 0 rings with 64 size: Cannot allocate memory
> > Failed after 0 rings with 32 size: Cannot allocate memory
> > Failed after 0 rings with 16 size: Cannot allocate memory
> > Failed after 0 rings with 8 size: Cannot allocate memory
> > Failed after 0 rings with 4 size: Cannot allocate memory
> > Failed after 0 rings with 2 size: Cannot allocate memory
> > can't allocate 1
> >
> > # sudo -u ioutest ./iou-test1
> > max size 1024
> >
> > # ps ax | grep wq
> >     8 ?        I<     0:00 [mm_percpu_wq]
> >   121 ?        I<     0:00 [tpm_dev_wq]
> >   124 ?        I<     0:00 [devfreq_wq]
> > 20593 pts/1    S+     0:00 grep --color=auto wq
>
> This was on kernel 5.6.7, I'm going to try this on 5.10.1 now.

Curious. It seems to be much harder to reproduce on 5.9 and 5.10. I'm 100% sure
it still happens on 5.9 though, since it has happened in production quite a few
times. The method I used to reproduce it on 5.6 worked twice there, and
quite quickly, but the same approach does not seem to work on 5.9 and 5.10.
I'll give it some more time and will keep trying to come up with
a synthetic reproducer.

-- 
Dmitry Kadashev
