From: Jens Axboe <[email protected]>
To: Miklos Szeredi <[email protected]>
Cc: [email protected]
Subject: Re: io_uring_prep_openat_direct() and link/drain
Date: Fri, 1 Apr 2022 09:36:25 -0600
Message-ID: <[email protected]>
In-Reply-To: <CAJfpegvM3LQ8nsJf=LsWjQznpOzC+mZFXB5xkZgZHR2tXXjxLQ@mail.gmail.com>

On 4/1/22 2:40 AM, Miklos Szeredi wrote:
> On Wed, 30 Mar 2022 at 19:49, Jens Axboe <[email protected]> wrote:
>>
>> On 3/30/22 9:53 AM, Jens Axboe wrote:
>>> On 3/30/22 9:17 AM, Jens Axboe wrote:
>>>> On 3/30/22 9:12 AM, Miklos Szeredi wrote:
>>>>> On Wed, 30 Mar 2022 at 17:05, Jens Axboe <[email protected]> wrote:
>>>>>>
>>>>>> On 3/30/22 8:58 AM, Miklos Szeredi wrote:
>>>>>>> Next issue:  seems like file slot reuse is not working correctly.
>>>>>>> Attached program compares reads using io_uring with plain reads of
>>>>>>> proc files.
>>>>>>>
>>>>>>> In the below example it is using two slots alternately but the number
>>>>>>> of slots does not seem to matter, read is apparently always using a
>>>>>>> stale file (the prior one to the most recent open on that slot).  See
>>>>>>> how the sizes of the files lag by two lines:
>>>>>>>
>>>>>>> root@kvm:~# ./procreads
>>>>>>> procreads: /proc/1/stat: ok (313)
>>>>>>> procreads: /proc/2/stat: ok (149)
>>>>>>> procreads: /proc/3/stat: read size mismatch 313/150
>>>>>>> procreads: /proc/4/stat: read size mismatch 149/154
>>>>>>> procreads: /proc/5/stat: read size mismatch 150/161
>>>>>>> procreads: /proc/6/stat: read size mismatch 154/171
>>>>>>> ...
>>>>>>>
>>>>>>> Any ideas?
>>>>>>
>>>>>> Didn't look at your code yet, but with the current tree, this is the
>>>>>> behavior when a fixed file is used:
>>>>>>
>>>>>> At prep time, if the slot is valid it is used. If it isn't valid,
>>>>>> assignment is deferred until the request is issued.
>>>>>>
>>>>>> Which granted is a bit weird. It means that if you do:
>>>>>>
>>>>>> <open fileA into slot 1, slot 1 currently unused><read slot 1>
>>>>>>
>>>>>> the read will read from fileA. But for:
>>>>>>
>>>>>> <open fileB into slot 1, slot 1 is fileA currently><read slot 1>
>>>>>>
>>>>>> since slot 1 is already valid at prep time for the read, the read will
>>>>>> be from fileA again.
>>>>>>
>>>>>> Is this what you are seeing? It's definitely a bit confusing, and the
>>>>>> only reason why I didn't change it is because it could potentially break
>>>>>> applications. Don't think there's a high risk of that, however, so it
>>>>>> may indeed be worth it to just bite the bullet and make the assignment
>>>>>> consistent (eg always done from the perspective of the previous
>>>>>> dependent request having completed).
>>>>>
>>>>> Right, this explains it.   Then the only workaround would be to wait
>>>>> for the open to finish before submitting the read, but that would
>>>>> defeat the whole point of using io_uring for this purpose.
>>>>
>>>> Honestly, I think we should just change it during this round, making it
>>>> consistent with the "slot is unused" use case. The old use case is
>>>> more of a "it happened to work" vs the newer consistent behavior of "we
>>>> always assign the file when execution starts on the request".
>>>>
>>>> Let me spin a patch, would be great if you could test.
>>>
>>> Something like this on top of the current tree should work. Can you
>>> test?
>>
>> You can also just re-pull for-5.18/io_uring, it has been updated. A
>> last-minute edit made io_assign_file() return 0 where it should've
>> returned 'true'...
> 
> Yep, this works now.
> 
> Next issue:  I get ENFILE even though there are just 40 slots.
> When running as root, it gets as far as invoking the OOM
> killer, which is really bad.
> 
> There's no leak, this apparently only happens when the worker doing
> the fputs can't keep up.  Simple solution:  do the fput() of the
> previous file synchronously with the open_direct operation; fput
> shouldn't be expensive...  Is there a reason why this wouldn't work?

I take it you're continually reusing those slots? If you have a test
case, that'd be ideal. Agree that it sounds like we just need an
appropriate breather to allow fput/task_work to run. Or it could be the
deferred free of the fixed slot.

-- 
Jens Axboe


