From: Hao_Xu <[email protected]>
To: Jens Axboe <[email protected]>,
Matthew Wilcox <[email protected]>,
[email protected]
Cc: Johannes Weiner <[email protected]>
Subject: Re: Loophole in async page I/O
Date: Wed, 14 Oct 2020 03:50:00 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 2020/10/14 1:50 AM, Jens Axboe wrote:
> On 10/12/20 11:31 PM, Hao_Xu wrote:
>> On 2020/10/13 6:08 AM, Jens Axboe wrote:
>>> On 10/12/20 3:13 PM, Matthew Wilcox wrote:
>>>> This one's pretty unlikely, but there's a case in buffered reads where
>>>> an IOCB_WAITQ read can end up sleeping.
>>>>
>>>> generic_file_buffered_read():
>>>>         page = find_get_page(mapping, index);
>>>> ...
>>>>         if (!PageUptodate(page)) {
>>>> ...
>>>>             if (iocb->ki_flags & IOCB_WAITQ) {
>>>> ...
>>>>                 error = wait_on_page_locked_async(page,
>>>>                                                   iocb->ki_waitq);
>>>> wait_on_page_locked_async():
>>>>     if (!PageLocked(page))
>>>>         return 0;
>>>> (back to generic_file_buffered_read):
>>>>             if (!mapping->a_ops->is_partially_uptodate(page,
>>>>                                         offset, iter->count))
>>>>                 goto page_not_up_to_date_locked;
>>>>
>>>> page_not_up_to_date_locked:
>>>>         if (iocb->ki_flags & (IOCB_NOIO | IOCB_NOWAIT)) {
>>>>             unlock_page(page);
>>>>             put_page(page);
>>>>             goto would_block;
>>>>         }
>>>> ...
>>>>         error = mapping->a_ops->readpage(filp, page);
>>>> (will unlock page on I/O completion)
>>>>         if (!PageUptodate(page)) {
>>>>             error = lock_page_killable(page);
>>>>
>>>> So if we have IOCB_WAITQ set but IOCB_NOWAIT clear, we'll call ->readpage()
>>>> and wait for the I/O to complete. I can't quite figure out if this is
>>>> intentional -- I think not; if I understand the semantics right, we
>>>> should be returning -EIOCBQUEUED and punting to an I/O thread to
>>>> kick off the I/O and wait.
>>>>
>>>> I think the right fix is to return -EIOCBQUEUED from
>>>> wait_on_page_locked_async() if the page isn't locked. ie this:
>>>>
>>>> @@ -1258,7 +1258,7 @@ static int wait_on_page_locked_async(struct page *page,
>>>>                                       struct wait_page_queue *wait)
>>>>  {
>>>>      if (!PageLocked(page))
>>>> -        return 0;
>>>> +        return -EIOCBQUEUED;
>>>>      return __wait_on_page_locked_async(compound_head(page), wait, false);
>>>>  }
>>>>
>>>> But as I said, I'm not sure what the semantics are supposed to be.
>>>
>>> If NOWAIT isn't set, then the issue attempt is from the helper thread
>>> already, and IOCB_WAITQ shouldn't be set either (the latter doesn't
>>> matter for this discussion). So it's totally fine and expected to block
>>> at that point.
>>>
>>> Hmm actually, I believe that:
>>>
>>> commit c8d317aa1887b40b188ec3aaa6e9e524333caed1
>>> Author: Hao Xu <[email protected]>
>>> Date: Tue Sep 29 20:00:45 2020 +0800
>>>
>>> io_uring: fix async buffered reads when readahead is disabled
>>>
>>> maybe messed up that case, so we could block off the retry-path. I'll
>>> take a closer look, looks like that can be the case if read-ahead is
>>> disabled.
>>>
>>> In general, we can only return -EIOCBQUEUED if the IO has been started
>>> or is in progress already. That means we can safely rely on being told
>>> when it's unlocked/done. If we need to block, we should be returning
>>> -EAGAIN, which would punt to a worker thread.
>>>
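
To make the return-code rule above concrete, here is a rough userspace sketch. It is only a toy model, not kernel code: struct toy_page and toy_nonblocking_read() are hypothetical names invented for illustration, and EIOCBQUEUED is defined locally because it is a kernel-internal code that userspace <errno.h> does not provide.

```c
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal error code; not exported to userspace, so define it here. */
#ifndef EIOCBQUEUED
#define EIOCBQUEUED 529
#endif

/* Hypothetical, heavily simplified page state (illustration only). */
struct toy_page {
    bool uptodate;  /* page contents are valid */
    bool locked;    /* I/O is in flight and holds the page lock */
};

/*
 * Toy model of what a non-blocking buffered read attempt may report:
 *   0             data is already there
 *  -EIOCBQUEUED   I/O has been started; the unlock will wake the waiter
 *  -EAGAIN        going further would block, so punt to a worker thread
 */
int toy_nonblocking_read(const struct toy_page *page)
{
    if (page->uptodate)
        return 0;
    if (page->locked)
        return -EIOCBQUEUED;
    /* No I/O in flight: nobody would ever deliver a wakeup. */
    return -EAGAIN;
}
```

In this model, returning -EIOCBQUEUED for a page that is not locked would leave the request waiting for an unlock that never comes, which is why that case maps to -EAGAIN instead.
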
>> Hi Jens,
>> My understanding of the io_uring buffered read process after commit
>> c8d317aa1887b40b188ec3aaa6e9e524333caed1 was merged is:
>> the first io_uring I/O attempt uses IOCB_NOWAIT, and the retry in the
>> same context uses IOCB_WAITQ without IOCB_NOWAIT.
>> So in Matthew's case, lock_page_async() will be called after calling
>> mapping->a_ops->readpage(), so it won't end up sleeping.
>> Actually, this case is what happens when readahead is disabled or somehow
>> skipped, for example because blk_cgroup_congested() returns true. This
>> is the case my commit c8d317aa1887b40b188e is for.
>
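
The two-attempt flow described above can be sketched as a small userspace model. Everything here is an assumption made for illustration: the TOY_IOCB_* values and the toy_* helpers are hypothetical stand-ins, not the kernel's definitions, and the IOCB_NOIO half of the real check is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's IOCB_* bits (values are arbitrary). */
#define TOY_IOCB_NOWAIT (1u << 0)
#define TOY_IOCB_WAITQ  (1u << 1)

/* Attempt 0 is the first inline try; attempt 1 is the same-context retry. */
unsigned int toy_attempt_flags(int attempt)
{
    return attempt == 0 ? TOY_IOCB_NOWAIT : TOY_IOCB_WAITQ;
}

/*
 * Models the check at page_not_up_to_date_locked: with NOWAIT set the read
 * bails out to would_block instead of calling ->readpage().
 */
bool toy_may_start_readpage(unsigned int flags)
{
    return !(flags & TOY_IOCB_NOWAIT);
}

/*
 * After ->readpage(): with WAITQ set the page is locked via the async
 * variant, which queues a wait entry instead of sleeping; otherwise
 * lock_page_killable() is used and may sleep.
 */
bool toy_lock_may_sleep(unsigned int flags)
{
    return !(flags & TOY_IOCB_WAITQ);
}
```

Under these assumptions the first attempt never reaches ->readpage(), and the retry starts the I/O but takes the non-sleeping lock path.
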
> Well, try the patches. I agree it's not going to sleep with the previous
> fix, but we're definitely driving a lower utilization by not utilizing
> read-ahead even if disabled.
>
> Re-run your previous tests with these two applied and see what you get.
>
Sure, I agree. Looks good to me. I'll run the tests with the new code.
Thanks