From: Jens Axboe <[email protected]>
To: Miklos Szeredi <[email protected]>
Cc: [email protected]
Subject: Re: io_uring_prep_openat_direct() and link/drain
Date: Fri, 1 Apr 2022 09:36:25 -0600
Message-ID: <[email protected]>
In-Reply-To: <CAJfpegvM3LQ8nsJf=LsWjQznpOzC+mZFXB5xkZgZHR2tXXjxLQ@mail.gmail.com>

On 4/1/22 2:40 AM, Miklos Szeredi wrote:
> On Wed, 30 Mar 2022 at 19:49, Jens Axboe <[email protected]> wrote:
>>
>> On 3/30/22 9:53 AM, Jens Axboe wrote:
>>> On 3/30/22 9:17 AM, Jens Axboe wrote:
>>>> On 3/30/22 9:12 AM, Miklos Szeredi wrote:
>>>>> On Wed, 30 Mar 2022 at 17:05, Jens Axboe <[email protected]> wrote:
>>>>>>
>>>>>> On 3/30/22 8:58 AM, Miklos Szeredi wrote:
>>>>>>> Next issue: seems like file slot reuse is not working correctly.
>>>>>>> Attached program compares reads using io_uring with plain reads of
>>>>>>> proc files.
>>>>>>>
>>>>>>> In the below example it is using two slots alternately but the number
>>>>>>> of slots does not seem to matter, read is apparently always using a
>>>>>>> stale file (the prior one to the most recent open on that slot). See
>>>>>>> how the sizes of the files lag by two lines:
>>>>>>>
>>>>>>> root@kvm:~# ./procreads
>>>>>>> procreads: /proc/1/stat: ok (313)
>>>>>>> procreads: /proc/2/stat: ok (149)
>>>>>>> procreads: /proc/3/stat: read size mismatch 313/150
>>>>>>> procreads: /proc/4/stat: read size mismatch 149/154
>>>>>>> procreads: /proc/5/stat: read size mismatch 150/161
>>>>>>> procreads: /proc/6/stat: read size mismatch 154/171
>>>>>>> ...
>>>>>>>
>>>>>>> Any ideas?
>>>>>>
>>>>>> Didn't look at your code yet, but with the current tree, this is the
>>>>>> behavior when a fixed file is used:
>>>>>>
>>>>>> At prep time, if the slot is valid it is used. If it isn't valid,
>>>>>> assignment is deferred until the request is issued.
>>>>>>
>>>>>> Which granted is a bit weird. It means that if you do:
>>>>>>
>>>>>> <open fileA into slot 1, slot 1 currently unused><read slot 1>
>>>>>>
>>>>>> the read will read from fileA. But for:
>>>>>>
>>>>>> <open fileB into slot 1, slot 1 is fileA currently><read slot 1>
>>>>>>
>>>>>> since slot 1 is already valid at prep time for the read, the read will
>>>>>> be from fileA again.
>>>>>>
>>>>>> Is this what you are seeing? It's definitely a bit confusing, and the
>>>>>> only reason why I didn't change it is because it could potentially break
>>>>>> applications. Don't think there's a high risk of that, however, so may
>>>>>> indeed be worth it to just bite the bullet and make the assignment
>>>>>> consistent (eg always done from the perspective of the previous
>>>>>> dependent request having completed).
>>>>>>
>>>>>> Is this what you are seeing?
>>>>>
>>>>> Right, this explains it. Then the only workaround would be to wait
>>>>> for the open to finish before submitting the read, but that would
>>>>> defeat the whole point of using io_uring for this purpose.
>>>>
>>>> Honestly, I think we should just change it during this round, making it
>>>> consistent with the "slot is unused" use case. The old use case is
>>>> more of a "it happened to work" vs the newer consistent behavior of "we
>>>> always assign the file when execution starts on the request".
>>>>
>>>> Let me spin a patch, would be great if you could test.
>>>
>>> Something like this on top of the current tree should work. Can you
>>> test?
>>
>> You can also just re-pull for-5.18/io_uring, it has been updated. A last
>> minute edit made a 0 return from io_assign_file() which should've been
>> 'true'...
>
> Yep, this works now.
>
> Next issue: will get ENFILE even though there are just 40 slots.
> When running as root, then it will get as far as invoking the OOM
> killer, which is really bad.
>
> There's no leak, this apparently only happens when the worker doing
> the fputs can't keep up. Simple solution: do the fput() of the
> previous file synchronously with the open_direct operation; fput
> shouldn't be expensive... Is there a reason why this wouldn't work?

I take it you're continually reusing those slots? If you have a test
case that'd be ideal. Agree that it sounds like we just need an
appropriate breather to allow fput/task_work to run. Or it could be the
deferred free of the fixed slot.
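
For reference, the prep-time vs issue-time file assignment discussed
earlier in the thread can be sketched as a small userspace model. This is
not kernel code and it does not use io_uring at all; the names slot_table,
open_then_read() and count_stale_reads() are made up for illustration, and
file ids stand in for real files. It assumes each submission is a linked
<open into slot><read slot> pair with both requests prepped before either
one is issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 2
#define EMPTY  0   /* hypothetical marker for an unused slot */

/* Toy fixed-file table: each slot holds a file id, or EMPTY. */
struct slot_table {
    int file[NSLOTS];
};

/*
 * Model one linked <open file_id into slot><read slot> pair and return
 * the file id the read ends up using.
 *
 * assign_at_issue == false: old behavior; the read binds its file at prep
 *   time if the slot is already valid, otherwise at issue time.
 * assign_at_issue == true: new behavior; the read always binds at issue
 *   time, i.e. after the linked open has completed.
 */
static int open_then_read(struct slot_table *t, int slot, int file_id,
                          bool assign_at_issue)
{
    assert(slot >= 0 && slot < NSLOTS);

    /* Prep phase: both requests are prepared before either one runs. */
    int bound = EMPTY;
    if (!assign_at_issue && t->file[slot] != EMPTY)
        bound = t->file[slot];      /* grabs the previous occupant */

    /* Issue phase, request 1: the open installs the new file. */
    t->file[slot] = file_id;

    /* Issue phase, request 2: deferred assignment, if still unbound. */
    if (bound == EMPTY)
        bound = t->file[slot];

    return bound;
}

/* Run files 1..n through the slots alternately; count stale reads. */
static int count_stale_reads(int n, bool assign_at_issue)
{
    struct slot_table t = { { EMPTY } };
    int stale = 0;

    for (int id = 1; id <= n; id++) {
        int got = open_then_read(&t, (id - 1) % NSLOTS, id, assign_at_issue);
        if (got != id)
            stale++;
    }
    return stale;
}
```

With two slots and the old behavior, the first two reads see the right
file and every later read sees the file opened two submissions earlier,
which is the same lag as in the procreads output quoted above. With
assignment always done at issue time, no read is stale.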
--
Jens Axboe