public inbox for [email protected]
From: Hao_Xu <[email protected]>
To: Jens Axboe <[email protected]>,
	Matthew Wilcox <[email protected]>,
	[email protected]
Cc: Johannes Weiner <[email protected]>
Subject: Re: Loophole in async page I/O
Date: Tue, 13 Oct 2020 13:31:17 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 2020/10/13 6:08 AM, Jens Axboe wrote:
> On 10/12/20 3:13 PM, Matthew Wilcox wrote:
>> This one's pretty unlikely, but there's a case in buffered reads where
>> an IOCB_WAITQ read can end up sleeping.
>>
>> generic_file_buffered_read():
>>                  page = find_get_page(mapping, index);
>> ...
>>                  if (!PageUptodate(page)) {
>> ...
>>                          if (iocb->ki_flags & IOCB_WAITQ) {
>> ...
>>                                  error = wait_on_page_locked_async(page,
>>                                                                  iocb->ki_waitq);
>> wait_on_page_locked_async():
>>          if (!PageLocked(page))
>>                  return 0;
>> (back to generic_file_buffered_read):
>>                          if (!mapping->a_ops->is_partially_uptodate(page,
>>                                                          offset, iter->count))
>>                                  goto page_not_up_to_date_locked;
>>
>> page_not_up_to_date_locked:
>>                  if (iocb->ki_flags & (IOCB_NOIO | IOCB_NOWAIT)) {
>>                          unlock_page(page);
>>                          put_page(page);
>>                          goto would_block;
>>                  }
>> ...
>>                  error = mapping->a_ops->readpage(filp, page);
>> (will unlock page on I/O completion)
>>                  if (!PageUptodate(page)) {
>>                          error = lock_page_killable(page);
>>
>> So if we have IOCB_WAITQ set but IOCB_NOWAIT clear, we'll call ->readpage()
>> and wait for the I/O to complete.  I can't quite figure out if this is
>> intentional -- I think not; if I understand the semantics right, we
>> should be returning -EIOCBQUEUED and punting to an I/O thread to
>> kick off the I/O and wait.
>>
>> I think the right fix is to return -EIOCBQUEUED from
>> wait_on_page_locked_async() if the page isn't locked.  ie this:
>>
>> @@ -1258,7 +1258,7 @@ static int wait_on_page_locked_async(struct page *page,
>>                                       struct wait_page_queue *wait)
>>   {
>>          if (!PageLocked(page))
>> -               return 0;
>> +               return -EIOCBQUEUED;
>>          return __wait_on_page_locked_async(compound_head(page), wait, false);
>>   }
>>   
>> But as I said, I'm not sure what the semantics are supposed to be.
> 
> If NOWAIT isn't set, then the issue attempt is from the helper thread
> already, and IOCB_WAITQ shouldn't be set either (the latter doesn't
> matter for this discussion). So it's totally fine and expected to block
> at that point.
> 
> Hmm actually, I believe that:
> 
> commit c8d317aa1887b40b188ec3aaa6e9e524333caed1
> Author: Hao Xu <[email protected]>
> Date:   Tue Sep 29 20:00:45 2020 +0800
> 
>      io_uring: fix async buffered reads when readahead is disabled
> 
> maybe messed up that case, so we could block off the retry-path. I'll
> take a closer look, looks like that can be the case if read-ahead is
> disabled.
> 
> In general, we can only return -EIOCBQUEUED if the IO has been started
> or is in progress already. That means we can safely rely on being told
> when it's unlocked/done. If we need to block, we should be returning
> -EAGAIN, which would punt to a worker thread.
> 
Hi Jens,
My understanding of the io_uring buffered read path after commit
c8d317aa1887b40b188ec3aaa6e9e524333caed1 was merged is as follows:
the first io_uring attempt is made with IOCB_NOWAIT set, and the retry
in the same context is made with IOCB_WAITQ set but IOCB_NOWAIT clear.
So in Matthew's case, lock_page_async() is called after
mapping->a_ops->readpage(), and the read won't end up sleeping.
This is in fact what happens when readahead is disabled, or is skipped
for some reason such as blk_cgroup_congested() returning true, and it is
the case my commit c8d317aa1887b40b188e was written to handle.

Regards,
Hao

