public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH] btrfs: Prevent nowait or async read from doing sync IO
@ 2021-01-08  0:02 Martin Raiber
  2021-01-12 15:36 ` David Sterba
  2021-02-26 17:00 ` David Sterba
  0 siblings, 2 replies; 6+ messages in thread
From: Martin Raiber @ 2021-01-08  0:02 UTC (permalink / raw)
  To: linux-btrfs; +Cc: io-uring, Martin Raiber

When reading from btrfs file via io_uring I get following
call traces:

[<0>] wait_on_page_bit+0x12b/0x270
[<0>] read_extent_buffer_pages+0x2ad/0x360
[<0>] btree_read_extent_buffer_pages+0x97/0x110
[<0>] read_tree_block+0x36/0x60
[<0>] read_block_for_search.isra.0+0x1a9/0x360
[<0>] btrfs_search_slot+0x23d/0x9f0
[<0>] btrfs_lookup_csum+0x75/0x170
[<0>] btrfs_lookup_bio_sums+0x23d/0x630
[<0>] btrfs_submit_data_bio+0x109/0x180
[<0>] submit_one_bio+0x44/0x70
[<0>] extent_readahead+0x37a/0x3a0
[<0>] read_pages+0x8e/0x1f0
[<0>] page_cache_ra_unbounded+0x1aa/0x1f0
[<0>] generic_file_buffered_read+0x3eb/0x830
[<0>] io_iter_do_read+0x1a/0x40
[<0>] io_read+0xde/0x350
[<0>] io_issue_sqe+0x5cd/0xed0
[<0>] __io_queue_sqe+0xf9/0x370
[<0>] io_submit_sqes+0x637/0x910
[<0>] __x64_sys_io_uring_enter+0x22e/0x390
[<0>] do_syscall_64+0x33/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Prevent those by setting IOCB_NOIO before calling
generic_file_buffered_read.

Async read has the same problem. So disable that by removing
FMODE_BUF_RASYNC. This was added with commit
8730f12b7962b21ea9ad2756abce1e205d22db84 ("btrfs: flag files as
supporting buffered async reads") with 5.9. Io_uring will read
the data via worker threads if it can't be read without sync IO
this way.

Signed-off-by: Martin Raiber <[email protected]>
---
 fs/btrfs/file.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0e41459b8..8bb561f6d 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -3589,7 +3589,7 @@ static loff_t btrfs_file_llseek(struct file *file, loff_t offset, int whence)
 
 static int btrfs_file_open(struct inode *inode, struct file *filp)
 {
-	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC;
+	filp->f_mode |= FMODE_NOWAIT;
 	return generic_file_open(inode, filp);
 }
 
@@ -3639,7 +3639,18 @@ static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			return ret;
 	}
 
-	return generic_file_buffered_read(iocb, to, ret);
+	if (iocb->ki_flags & IOCB_NOWAIT)
+		iocb->ki_flags |= IOCB_NOIO;
+
+	ret = generic_file_buffered_read(iocb, to, ret);
+
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		iocb->ki_flags &= ~IOCB_NOIO;
+		if (ret == 0)
+			ret = -EAGAIN;
+	}
+
+	return ret;
 }
 
 const struct file_operations btrfs_file_operations = {
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: Prevent nowait or async read from doing sync IO
  2021-01-08  0:02 [PATCH] btrfs: Prevent nowait or async read from doing sync IO Martin Raiber
@ 2021-01-12 15:36 ` David Sterba
  2021-01-12 17:01   ` Pavel Begunkov
  2021-02-26 17:00 ` David Sterba
  1 sibling, 1 reply; 6+ messages in thread
From: David Sterba @ 2021-01-12 15:36 UTC (permalink / raw)
  To: Martin Raiber; +Cc: linux-btrfs, io-uring

On Fri, Jan 08, 2021 at 12:02:48AM +0000, Martin Raiber wrote:
> When reading from btrfs file via io_uring I get following
> call traces:

Is there a way to reproduce by common tools (fio) or is a specialized
one needed?

> [<0>] wait_on_page_bit+0x12b/0x270
> [<0>] read_extent_buffer_pages+0x2ad/0x360
> [<0>] btree_read_extent_buffer_pages+0x97/0x110
> [<0>] read_tree_block+0x36/0x60
> [<0>] read_block_for_search.isra.0+0x1a9/0x360
> [<0>] btrfs_search_slot+0x23d/0x9f0
> [<0>] btrfs_lookup_csum+0x75/0x170
> [<0>] btrfs_lookup_bio_sums+0x23d/0x630
> [<0>] btrfs_submit_data_bio+0x109/0x180
> [<0>] submit_one_bio+0x44/0x70
> [<0>] extent_readahead+0x37a/0x3a0
> [<0>] read_pages+0x8e/0x1f0
> [<0>] page_cache_ra_unbounded+0x1aa/0x1f0
> [<0>] generic_file_buffered_read+0x3eb/0x830
> [<0>] io_iter_do_read+0x1a/0x40
> [<0>] io_read+0xde/0x350
> [<0>] io_issue_sqe+0x5cd/0xed0
> [<0>] __io_queue_sqe+0xf9/0x370
> [<0>] io_submit_sqes+0x637/0x910
> [<0>] __x64_sys_io_uring_enter+0x22e/0x390
> [<0>] do_syscall_64+0x33/0x80
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Prevent those by setting IOCB_NOIO before calling
> generic_file_buffered_read.
> 
> Async read has the same problem. So disable that by removing
> FMODE_BUF_RASYNC. This was added with commit
> 8730f12b7962b21ea9ad2756abce1e205d22db84 ("btrfs: flag files as

Oh yeah that's the commit that went to btrfs code out-of-band. I am not
familiar with the io_uring support and have no good idea what the new
flag was supposed to do.

> supporting buffered async reads") with 5.9. Io_uring will read
> the data via worker threads if it can't be read without sync IO
> this way.

What are the implications of that? Like more context switching (due to
the worker threads) or other potential performance related problems?

> 
> Signed-off-by: Martin Raiber <[email protected]>
> ---
>  fs/btrfs/file.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 0e41459b8..8bb561f6d 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -3589,7 +3589,7 @@ static loff_t btrfs_file_llseek(struct file *file, loff_t offset, int whence)
>  
>  static int btrfs_file_open(struct inode *inode, struct file *filp)
>  {
> -	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC;
> +	filp->f_mode |= FMODE_NOWAIT;
>  	return generic_file_open(inode, filp);
>  }
>  
> @@ -3639,7 +3639,18 @@ static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>  			return ret;
>  	}
>  

> -	return generic_file_buffered_read(iocb, to, ret);
> +	if (iocb->ki_flags & IOCB_NOWAIT)
> +		iocb->ki_flags |= IOCB_NOIO;
> +
> +	ret = generic_file_buffered_read(iocb, to, ret);
> +
> +	if (iocb->ki_flags & IOCB_NOWAIT) {
> +		iocb->ki_flags &= ~IOCB_NOIO;

This should probably use the original value of iocb->ki_flags, as the
NOIO flag is set unconditionally and if it were set initially, now it
would be lost. I haven't checked if this is actually possible as the
iocb code is inside kernel, but just from the safety POV it would be
better to use original value.

> +		if (ret == 0)
> +			ret = -EAGAIN;
> +	}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: Prevent nowait or async read from doing sync IO
  2021-01-12 15:36 ` David Sterba
@ 2021-01-12 17:01   ` Pavel Begunkov
  2021-01-24 19:09     ` Martin Raiber
  0 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2021-01-12 17:01 UTC (permalink / raw)
  To: dsterba, Martin Raiber, linux-btrfs, io-uring

On 12/01/2021 15:36, David Sterba wrote:
> On Fri, Jan 08, 2021 at 12:02:48AM +0000, Martin Raiber wrote:
>> When reading from btrfs file via io_uring I get following
>> call traces:
> 
> Is there a way to reproduce by common tools (fio) or is a specialized
> one needed?

I'm not familiar with this particular issue, but:
should _probably_ be reproducible with fio with io_uring engine
or fio/t/io_uring tool.

> 
>> [<0>] wait_on_page_bit+0x12b/0x270
>> [<0>] read_extent_buffer_pages+0x2ad/0x360
>> [<0>] btree_read_extent_buffer_pages+0x97/0x110
>> [<0>] read_tree_block+0x36/0x60
>> [<0>] read_block_for_search.isra.0+0x1a9/0x360
>> [<0>] btrfs_search_slot+0x23d/0x9f0
>> [<0>] btrfs_lookup_csum+0x75/0x170
>> [<0>] btrfs_lookup_bio_sums+0x23d/0x630
>> [<0>] btrfs_submit_data_bio+0x109/0x180
>> [<0>] submit_one_bio+0x44/0x70
>> [<0>] extent_readahead+0x37a/0x3a0
>> [<0>] read_pages+0x8e/0x1f0
>> [<0>] page_cache_ra_unbounded+0x1aa/0x1f0
>> [<0>] generic_file_buffered_read+0x3eb/0x830
>> [<0>] io_iter_do_read+0x1a/0x40
>> [<0>] io_read+0xde/0x350
>> [<0>] io_issue_sqe+0x5cd/0xed0
>> [<0>] __io_queue_sqe+0xf9/0x370
>> [<0>] io_submit_sqes+0x637/0x910
>> [<0>] __x64_sys_io_uring_enter+0x22e/0x390
>> [<0>] do_syscall_64+0x33/0x80
>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> Prevent those by setting IOCB_NOIO before calling
>> generic_file_buffered_read.
>>
>> Async read has the same problem. So disable that by removing
>> FMODE_BUF_RASYNC. This was added with commit
>> 8730f12b7962b21ea9ad2756abce1e205d22db84 ("btrfs: flag files as
> 
> Oh yeah that's the commit that went to btrfs code out-of-band. I am not
> familiar with the io_uring support and have no good idea what the new
> flag was supposed to do.

iirc, Jens did make buffered IO asynchronous by waiting on a page
with wait_page_queue, but don't remember well enough.

> 
>> supporting buffered async reads") with 5.9. Io_uring will read
>> the data via worker threads if it can't be read without sync IO
>> this way.
> 
> What are the implications of that? Like more context switching (due to
> the worker threads) or other potential performance related problems?

io_uring splits submission and completion steps and usually expect
submissions to happen quick and not block (at least for long),
otherwise it can't submit other requests, that reduces QD and so
forth. In the worst case it can serialise it to QD1. I guess the
same can be applied to AIO.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: Prevent nowait or async read from doing sync IO
  2021-01-12 17:01   ` Pavel Begunkov
@ 2021-01-24 19:09     ` Martin Raiber
  0 siblings, 0 replies; 6+ messages in thread
From: Martin Raiber @ 2021-01-24 19:09 UTC (permalink / raw)
  To: Pavel Begunkov, dsterba, linux-btrfs, io-uring

On 12.01.2021 18:01 Pavel Begunkov wrote:
> On 12/01/2021 15:36, David Sterba wrote:
>> On Fri, Jan 08, 2021 at 12:02:48AM +0000, Martin Raiber wrote:
>>> When reading from btrfs file via io_uring I get following
>>> call traces:
>> Is there a way to reproduce by common tools (fio) or is a specialized
>> one needed?
> I'm not familiar with this particular issue, but:
> should _probably_ be reproducible with fio with io_uring engine
> or fio/t/io_uring tool.
>
>>> [<0>] wait_on_page_bit+0x12b/0x270
>>> [<0>] read_extent_buffer_pages+0x2ad/0x360
>>> [<0>] btree_read_extent_buffer_pages+0x97/0x110
>>> [<0>] read_tree_block+0x36/0x60
>>> [<0>] read_block_for_search.isra.0+0x1a9/0x360
>>> [<0>] btrfs_search_slot+0x23d/0x9f0
>>> [<0>] btrfs_lookup_csum+0x75/0x170
>>> [<0>] btrfs_lookup_bio_sums+0x23d/0x630
>>> [<0>] btrfs_submit_data_bio+0x109/0x180
>>> [<0>] submit_one_bio+0x44/0x70
>>> [<0>] extent_readahead+0x37a/0x3a0
>>> [<0>] read_pages+0x8e/0x1f0
>>> [<0>] page_cache_ra_unbounded+0x1aa/0x1f0
>>> [<0>] generic_file_buffered_read+0x3eb/0x830
>>> [<0>] io_iter_do_read+0x1a/0x40
>>> [<0>] io_read+0xde/0x350
>>> [<0>] io_issue_sqe+0x5cd/0xed0
>>> [<0>] __io_queue_sqe+0xf9/0x370
>>> [<0>] io_submit_sqes+0x637/0x910
>>> [<0>] __x64_sys_io_uring_enter+0x22e/0x390
>>> [<0>] do_syscall_64+0x33/0x80
>>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> Prevent those by setting IOCB_NOIO before calling
>>> generic_file_buffered_read.
>>>
>>> Async read has the same problem. So disable that by removing
>>> FMODE_BUF_RASYNC. This was added with commit
>>> 8730f12b7962b21ea9ad2756abce1e205d22db84 ("btrfs: flag files as
>> Oh yeah that's the commit that went to btrfs code out-of-band. I am not
>> familiar with the io_uring support and have no good idea what the new
>> flag was supposed to do.
> iirc, Jens did make buffered IO asynchronous by waiting on a page
> with wait_page_queue, but don't remember well enough.
>
>>> supporting buffered async reads") with 5.9. Io_uring will read
>>> the data via worker threads if it can't be read without sync IO
>>> this way.
>> What are the implications of that? Like more context switching (due to
>> the worker threads) or other potential performance related problems?
> io_uring splits submission and completion steps and usually expect
> submissions to happen quick and not block (at least for long),
> otherwise it can't submit other requests, that reduces QD and so
> forth. In the worst case it can serialise it to QD1. I guess the
> same can be applied to AIO.

Io_submit historically had the problem that it is truely async only for 
certain operations. That's why everyone only uses it only for async 
direct I/O with preallocated files (and even then e.g. Mysql has 
innodb_use_native_aio as tuning option that replaces io_submit with a 
userspace thread pool). Io_uring is fixing that by making everything 
async, so the thread calling io_uring_enter never should do any io (only 
read from page cache etc.). The idea is that one can build e.g. a web 
server that uses only one thread and does all (former blocking) syscalls 
via io_uring and handles a large amount of connections. If btrfs does 
blocking io in this one thread this web server wouldn't work well with 
btrfs since the blocking call would e.g. delay accepting new connections.

Specifically w.r.t. read() io_uring has following logic:

  * Try read_iter with RWF_NOWAIT/IOCB_NOWAIT.
  * If read_iter returns -EAGAIN. Look at the FMODE_BUF_RASYNC flag. If
    set do the read with IOCB_WAITQ and callback set (AIO).
  * If FMODE_BUF_RASYNC is not set, sync read in a io_uring worker thread.

My guess is that since btrfs needs to do the checksum calculations in a 
worker anyway, that it's best/simpler to not support the AIO submission 
(so not set the FMODE_BUF_RASYNC).

W.r.t. RWF_NOWAIT the problem is that it synchronously reads the csum 
before async submitting the page reads. When reading randomly from a 
(large) file this means at least one synchronous read per io_uring 
submission. I guess the same happens for preadv2 with RWF_NOWAIT (and 
io_submit), man page:

> *RWF_NOWAIT *(since Linux 4.14)
>                Do not wait for data which is not immediately available.
>                If this flag is specified, the*preadv2*() system call will
>                return instantly if it would have to read data from the
>                backing storage or wait for a lock.  If some data was
>                successfully read, it will return the number of bytes
>                read.  If no bytes were read, it will return -1 and set
>                /errno <https://man7.org/linux/man-pages/man3/errno.3.html>/  to*EAGAIN*.  Currently, this flag is meaningful only
>                for*preadv2*().

I haven't tested this, but the same would probably happen if it doesn't 
have the extents in cache, though that might happen seldom enough that 
it's not worth fixing (for now).

I did also look at how ext4 with fs-verity handles this and it does it 
about the same as btrfs (synchronously reading the csum/hash before 
submitting the read), so same problem there.

One could probably set the FMODE_BUF_RASYNC for files without checksums, 
though idk if compressed extents are a problem.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: Prevent nowait or async read from doing sync IO
  2021-01-08  0:02 [PATCH] btrfs: Prevent nowait or async read from doing sync IO Martin Raiber
  2021-01-12 15:36 ` David Sterba
@ 2021-02-26 17:00 ` David Sterba
  2021-03-08 19:03   ` Martin Raiber
  1 sibling, 1 reply; 6+ messages in thread
From: David Sterba @ 2021-02-26 17:00 UTC (permalink / raw)
  To: Martin Raiber; +Cc: linux-btrfs, io-uring

On Fri, Jan 08, 2021 at 12:02:48AM +0000, Martin Raiber wrote:
> When reading from btrfs file via io_uring I get following
> call traces:
> 
> [<0>] wait_on_page_bit+0x12b/0x270
> [<0>] read_extent_buffer_pages+0x2ad/0x360
> [<0>] btree_read_extent_buffer_pages+0x97/0x110
> [<0>] read_tree_block+0x36/0x60
> [<0>] read_block_for_search.isra.0+0x1a9/0x360
> [<0>] btrfs_search_slot+0x23d/0x9f0
> [<0>] btrfs_lookup_csum+0x75/0x170
> [<0>] btrfs_lookup_bio_sums+0x23d/0x630
> [<0>] btrfs_submit_data_bio+0x109/0x180
> [<0>] submit_one_bio+0x44/0x70
> [<0>] extent_readahead+0x37a/0x3a0
> [<0>] read_pages+0x8e/0x1f0
> [<0>] page_cache_ra_unbounded+0x1aa/0x1f0
> [<0>] generic_file_buffered_read+0x3eb/0x830
> [<0>] io_iter_do_read+0x1a/0x40
> [<0>] io_read+0xde/0x350
> [<0>] io_issue_sqe+0x5cd/0xed0
> [<0>] __io_queue_sqe+0xf9/0x370
> [<0>] io_submit_sqes+0x637/0x910
> [<0>] __x64_sys_io_uring_enter+0x22e/0x390
> [<0>] do_syscall_64+0x33/0x80
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Prevent those by setting IOCB_NOIO before calling
> generic_file_buffered_read.
> 
> Async read has the same problem. So disable that by removing
> FMODE_BUF_RASYNC. This was added with commit
> 8730f12b7962b21ea9ad2756abce1e205d22db84 ("btrfs: flag files as
> supporting buffered async reads") with 5.9. Io_uring will read
> the data via worker threads if it can't be read without sync IO
> this way.
> 
> Signed-off-by: Martin Raiber <[email protected]>
> ---
>  fs/btrfs/file.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 0e41459b8..8bb561f6d 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -3589,7 +3589,7 @@ static loff_t btrfs_file_llseek(struct file *file, loff_t offset, int whence)
>  
>  static int btrfs_file_open(struct inode *inode, struct file *filp)
>  {
> -	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC;
> +	filp->f_mode |= FMODE_NOWAIT;
>  	return generic_file_open(inode, filp);
>  }
>  
> @@ -3639,7 +3639,18 @@ static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>  			return ret;
>  	}
>  
> -	return generic_file_buffered_read(iocb, to, ret);
> +	if (iocb->ki_flags & IOCB_NOWAIT)
> +		iocb->ki_flags |= IOCB_NOIO;
> +
> +	ret = generic_file_buffered_read(iocb, to, ret);
> +
> +	if (iocb->ki_flags & IOCB_NOWAIT) {
> +		iocb->ki_flags &= ~IOCB_NOIO;
> +		if (ret == 0)
> +			ret = -EAGAIN;
> +	}

Christoph has some doubts about the code,
https://lore.kernel.org/lkml/[email protected]/

The patch has been in for-next but as I'm not sure it's correct and
don't have a reproducer, I'll remove it again. We do want to fix the
warning, maybe there's only something trivial missing but we need to be
sure, I don't have enough expertise here.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: Prevent nowait or async read from doing sync IO
  2021-02-26 17:00 ` David Sterba
@ 2021-03-08 19:03   ` Martin Raiber
  0 siblings, 0 replies; 6+ messages in thread
From: Martin Raiber @ 2021-03-08 19:03 UTC (permalink / raw)
  To: dsterba, linux-btrfs, io-uring

On 26.02.2021 18:00 David Sterba wrote:
> On Fri, Jan 08, 2021 at 12:02:48AM +0000, Martin Raiber wrote:
>> When reading from btrfs file via io_uring I get following
>> call traces:
>>
>> [<0>] wait_on_page_bit+0x12b/0x270
>> [<0>] read_extent_buffer_pages+0x2ad/0x360
>> [<0>] btree_read_extent_buffer_pages+0x97/0x110
>> [<0>] read_tree_block+0x36/0x60
>> [<0>] read_block_for_search.isra.0+0x1a9/0x360
>> [<0>] btrfs_search_slot+0x23d/0x9f0
>> [<0>] btrfs_lookup_csum+0x75/0x170
>> [<0>] btrfs_lookup_bio_sums+0x23d/0x630
>> [<0>] btrfs_submit_data_bio+0x109/0x180
>> [<0>] submit_one_bio+0x44/0x70
>> [<0>] extent_readahead+0x37a/0x3a0
>> [<0>] read_pages+0x8e/0x1f0
>> [<0>] page_cache_ra_unbounded+0x1aa/0x1f0
>> [<0>] generic_file_buffered_read+0x3eb/0x830
>> [<0>] io_iter_do_read+0x1a/0x40
>> [<0>] io_read+0xde/0x350
>> [<0>] io_issue_sqe+0x5cd/0xed0
>> [<0>] __io_queue_sqe+0xf9/0x370
>> [<0>] io_submit_sqes+0x637/0x910
>> [<0>] __x64_sys_io_uring_enter+0x22e/0x390
>> [<0>] do_syscall_64+0x33/0x80
>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> Prevent those by setting IOCB_NOIO before calling
>> generic_file_buffered_read.
>>
>> Async read has the same problem. So disable that by removing
>> FMODE_BUF_RASYNC. This was added with commit
>> 8730f12b7962b21ea9ad2756abce1e205d22db84 ("btrfs: flag files as
>> supporting buffered async reads") with 5.9. Io_uring will read
>> the data via worker threads if it can't be read without sync IO
>> this way.
>>
>> Signed-off-by: Martin Raiber <[email protected]>
>> ---
>>  fs/btrfs/file.c | 15 +++++++++++++--
>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>> index 0e41459b8..8bb561f6d 100644
>> --- a/fs/btrfs/file.c
>> +++ b/fs/btrfs/file.c
>> @@ -3589,7 +3589,7 @@ static loff_t btrfs_file_llseek(struct file *file, loff_t offset, int whence)
>>  
>>  static int btrfs_file_open(struct inode *inode, struct file *filp)
>>  {
>> -	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC;
>> +	filp->f_mode |= FMODE_NOWAIT;
>>  	return generic_file_open(inode, filp);
>>  }
>>  
>> @@ -3639,7 +3639,18 @@ static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>  			return ret;
>>  	}
>>  
>> -	return generic_file_buffered_read(iocb, to, ret);
>> +	if (iocb->ki_flags & IOCB_NOWAIT)
>> +		iocb->ki_flags |= IOCB_NOIO;
>> +
>> +	ret = generic_file_buffered_read(iocb, to, ret);
>> +
>> +	if (iocb->ki_flags & IOCB_NOWAIT) {
>> +		iocb->ki_flags &= ~IOCB_NOIO;
>> +		if (ret == 0)
>> +			ret = -EAGAIN;
>> +	}
> Christoph has some doubts about the code,
> https://lore.kernel.org/lkml/[email protected]/
>
> The patch has been in for-next but as I'm not sure it's correct and
> don't have a reproducer, I'll remove it again. We do want to fix the
> warning, maybe there's only something trivial missing but we need to be
> sure, I don't have enough expertise here.

The general gist of the critism is kind of correct. It is generic_file_buffered_read/filemap_read that handles the IOCB_NOIO, however. It is only used from gfs2 since 5.8 and IOCB_NOIO was added to 5.8 with 41da51bce36f44eefc1e3d0f47d18841cbd065ba ....

However, I cannot see how to find out if readahead was called with IOCB_NOWAIT from extent_readahead/btrfs_readahead/readahead_control. So add an additional parameter to address_space_operations.readahead ? As mentioned, not too relevant to btrfs (because of the CRC calculation), but making readahead async in all cases (incl. IOCB_WAITQ) would be the proper solution.

W.r.t. testing: The most low-effort way I can think of is to add an io_uring switch to xfs_io, so that xfstests can be run using io_uring (where possible). Then check via tracing/perf that there aren't any call stacks with both io_uring_enter and wait_on_page_bit (or any other blocking call) in them.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-03-08 19:04 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-01-08  0:02 [PATCH] btrfs: Prevent nowait or async read from doing sync IO Martin Raiber
2021-01-12 15:36 ` David Sterba
2021-01-12 17:01   ` Pavel Begunkov
2021-01-24 19:09     ` Martin Raiber
2021-02-26 17:00 ` David Sterba
2021-03-08 19:03   ` Martin Raiber

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox