From: Salvatore Bonaccorso <[email protected]>
To: Daniel Black <[email protected]>
Cc: Jens Axboe <[email protected]>,
Pavel Begunkov <[email protected]>,
[email protected], [email protected]
Subject: Re: uring regression - lost write request
Date: Fri, 12 Nov 2021 20:19:47 +0100
Message-ID: <YY6+Uxm+cV/[email protected]>
In-Reply-To: <CABVffEOEayBow2Oot7_jNHbXL0CQq9SZCWmiWEJjbT6gVC7WKg@mail.gmail.com>

Daniel,

On Fri, Nov 12, 2021 at 05:25:31PM +1100, Daniel Black wrote:
> On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <[email protected]> wrote:
> >
> > On 11/11/21 10:28 AM, Jens Axboe wrote:
> > > On 11/11/21 9:55 AM, Jens Axboe wrote:
> > >> On 11/11/21 9:19 AM, Jens Axboe wrote:
> > >>> On 11/11/21 8:29 AM, Jens Axboe wrote:
> > >>>> On 11/11/21 7:58 AM, Jens Axboe wrote:
> > >>>>> On 11/11/21 7:30 AM, Jens Axboe wrote:
> > >>>>>> On 11/10/21 11:52 PM, Daniel Black wrote:
> > >>>>>>>> Would it be possible to turn this into a full reproducer script?
> > >>>>>>>> Something that someone that knows nothing about mysqld/mariadb can just
> > >>>>>>>> run and have it reproduce. If I install the 10.6 packages from above,
> > >>>>>>>> then it doesn't seem to use io_uring or be linked against liburing.
> > >>>>>>>
> > >>>>>>> Sorry Jens.
> > >>>>>>>
> > >>>>>>> Hope containers are ok.
> > >>>>>>
> > >>>>>> Don't think I have a way to run that, don't even know what podman is
> > >>>>>> and nor does my distro. I'll google a bit and see if I can get this
> > >>>>>> running.
> > >>>>>>
> > >>>>>> I'm fine building from source and running from there, as long as I
> > >>>>>> know what to do. Would that make it any easier? It definitely would
> > >>>>>> for me :-)
> > >>>>>
> > >>>>> The podman approach seemed to work,
>
> Thanks for bearing with it.
>
> > >>>>> and I was able to run all three
> > >>>>> steps. Didn't see any hangs. I'm going to try again dropping down
> > >>>>> the innodb pool size (box only has 32G of RAM).
> > >>>>>
> > >>>>> The storage can do a lot more than 5k IOPS, I'm going to try ramping
> > >>>>> that up.
>
> Good.
>
> > >>>>>
> > >>>>> Does your reproducer box have multiple NUMA nodes, or is it a single
> > >>>>> socket/node box?
>
> It was NUMA. Pre 5.14.14 I could produce it on a simpler test on a single node.
>
> > >>>>
> > >>>> Doesn't seem to reproduce for me on current -git. What file system are
> > >>>> you using?
>
> Yes ext4.
>
> > >>>
> > >>> I seem to be able to hit it with ext4, guessing it has more cases that
> > >>> punt to buffered IO. As I initially suspected, I think this is a race
> > >>> with buffered file write hashing. I have a debug patch that just turns
> > >>> a regular non-numa box into multi nodes, may or may not be needed
> > >>> to hit this, but I definitely can now. Looks like this:
> > >>>
> > >>> Node7 DUMP
> > >>> index=0, nr_w=1, max=128, r=0, f=1, h=0
> > >>> w=ffff8f5e8b8470c0, hashed=1/0, flags=2
> > >>> w=ffff8f5e95a9b8c0, hashed=1/0, flags=2
> > >>> index=1, nr_w=0, max=127877, r=0, f=0, h=0
> > >>> free_list
> > >>> worker=ffff8f5eaf2e0540
> > >>> all_list
> > >>> worker=ffff8f5eaf2e0540
> > >>>
> > >>> where we see node7 in this case having two work items pending, but the
> > >>> worker state is stalled on hash.
> > >>>
> > >>> The hash logic was rewritten as part of the io-wq worker threads being
> > >>> changed for 5.11 iirc, which is why that was my initial suspicion here.
> > >>>
> > >>> I'll take a look at this and make a test patch. Looks like you are able
> > >>> to test self-built kernels, is that correct?
>
> I've been libreating prebuilt kernels, however on the path to self-built again.
>
> Just searching for the holy penguin pee (from yaboot da(ze|ys)) to
> peesign(sic) EFI kernels.
> jk, working through docs:
> https://docs.fedoraproject.org/en-US/quick-docs/kernel/build-custom-kernel/
>
> > >> Can you try with this patch? It's against -git, but it will apply to
> > >> 5.15 as well.
> > >
> > > I think that one covered one potential gap, but I just managed to
> > > reproduce a stall even with it. So hang on testing that one, I'll send
> > > you something more complete when I have confidence in it.
> >
> > Alright, give this one a go if you can. Against -git, but will apply to
> > 5.15 as well.
>
> Applied, built, attempting to boot....

If you want to do the same on a Debian-based system, the following
might help to get the package built:

https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s4.2.2
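
Roughly, and only as a sketch of what that handbook section describes,
the steps could look like the script below. The patch file name and the
RUN dry-run switch are made up for illustration, and whether the patch
applies cleanly depends on the kernel version of the packaged source.

```shell
#!/bin/sh
# Sketch only: PATCH and the RUN dry-run switch are assumptions made for
# illustration; the handbook section above is the authoritative reference.
PATCH="${PATCH-../io-wq-fix.patch}"  # hypothetical file name for the saved patch
RUN="${RUN-echo}"                    # default: only print the commands; use RUN= to execute

build_patched_kernel() {
    # Build tools, plus the build dependencies of the Debian linux package
    $RUN sudo apt-get install build-essential fakeroot devscripts || return
    $RUN sudo apt-get build-dep linux || return
    # Unpack the packaged kernel source into ./linux-<version>/
    $RUN apt-get source linux || return
    # test-patches has to be started from the top of the unpacked tree
    if [ -z "$RUN" ]; then cd linux-*/ || return; fi
    $RUN bash debian/bin/test-patches "$PATCH"
}

build_patched_kernel
```

As written this only prints the commands; running it with RUN set to
the empty string executes them. "apt-get build-dep" and "apt-get source"
need deb-src entries in the APT sources.
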

Otherwise I might be able to provide you with a prebuilt package with
the patch (unsigned, though; it is best if you build and test it
directly).

Regards,
Salvatore