public inbox for [email protected]
 help / color / mirror / Atom feed
From: Pavel Begunkov <[email protected]>
To: Mark Harmstone <[email protected]>,
	[email protected], [email protected]
Subject: Re: [PATCH 5/6] btrfs: move wait out of btrfs_encoded_read
Date: Fri, 6 Sep 2024 16:11:16 +0100	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 8/23/24 17:27, Mark Harmstone wrote:
> Makes it so that if btrfs_encoded_read has launched a bio, rather than
> waiting for it to complete it leaves the extent and inode locked and returns
> -EIOCBQUEUED. The caller is responsible for waiting on the bio, and
> calling the completion code in the new btrfs_encoded_read_regular_end.
> 
> Signed-off-by: Mark Harmstone <[email protected]>
> ---
>   fs/btrfs/btrfs_inode.h |  1 +
>   fs/btrfs/inode.c       | 81 +++++++++++++++++++++++-------------------
>   fs/btrfs/ioctl.c       |  8 +++++
>   3 files changed, 54 insertions(+), 36 deletions(-)
> 
> diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
> index f4d77c3bb544..a5d786c6d7d4 100644
> --- a/fs/btrfs/btrfs_inode.h
> +++ b/fs/btrfs/btrfs_inode.h
> @@ -623,6 +623,7 @@ struct btrfs_encoded_read_private {
>   };
>   
>   ssize_t btrfs_encoded_read(struct btrfs_encoded_read_private *priv);
> +ssize_t btrfs_encoded_read_regular_end(struct btrfs_encoded_read_private *priv);
>   ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from,
>   			       const struct btrfs_ioctl_encoded_io_args *encoded);
>   
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 1e53977a4854..1bd4c74e8c51 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -8999,7 +8999,7 @@ static ssize_t btrfs_encoded_read_inline(
>   				struct extent_state **cached_state,
>   				u64 extent_start, size_t count,
>   				struct btrfs_ioctl_encoded_io_args *encoded,
> -				bool *unlocked)
> +				bool *need_unlock)
>   {
>   	struct btrfs_root *root = inode->root;
>   	struct btrfs_fs_info *fs_info = root->fs_info;
> @@ -9067,7 +9067,7 @@ static ssize_t btrfs_encoded_read_inline(
>   	btrfs_release_path(path);
>   	unlock_extent(io_tree, start, lockend, cached_state);
>   	btrfs_inode_unlock(inode, BTRFS_ILOCK_SHARED);
> -	*unlocked = true;
> +	*need_unlock = false;
>   
>   	ret = copy_to_iter(tmp, count, iter);
>   	if (ret != count)
> @@ -9155,42 +9155,26 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
>   	return blk_status_to_errno(READ_ONCE(priv.status));
>   }
>   
> -static ssize_t btrfs_encoded_read_regular(struct btrfs_encoded_read_private *priv,
> -					  u64 start, u64 lockend,
> -					  u64 disk_bytenr, u64 disk_io_size,
> -					  bool *unlocked)
> +ssize_t btrfs_encoded_read_regular_end(struct btrfs_encoded_read_private *priv)
> -	struct btrfs_inode *inode = BTRFS_I(file_inode(priv->file));
> -	struct extent_io_tree *io_tree = &inode->io_tree;
> +	u64 cur, start, lockend;
>   	unsigned long i;
> -	u64 cur;
>   	size_t page_offset;
> -	ssize_t ret;
> -
> -	priv->nr_pages = DIV_ROUND_UP(disk_io_size, PAGE_SIZE);
> -	priv->pages = kcalloc(priv->nr_pages, sizeof(struct page *), GFP_NOFS);
> -	if (!priv->pages) {
> -		priv->nr_pages = 0;
> -		return -ENOMEM;
> -	}
> -	ret = btrfs_alloc_page_array(priv->nr_pages, priv->pages, false);
> -	if (ret)
> -		return -ENOMEM;
> +	struct btrfs_inode *inode = BTRFS_I(file_inode(priv->file));
> +	struct btrfs_fs_info *fs_info = inode->root->fs_info;
> +	struct extent_io_tree *io_tree = &inode->io_tree;
> +	int ret;
> -	_btrfs_encoded_read_regular_fill_pages(inode, start, disk_bytenr,
> -					       disk_io_size, priv);
> +	start = ALIGN_DOWN(priv->args.offset, fs_info->sectorsize);
> +	lockend = start + BTRFS_MAX_UNCOMPRESSED - 1;
>   
> -	if (atomic_dec_return(&priv->pending))
> -		io_wait_event(priv->wait, !atomic_read(&priv->pending));
> +	unlock_extent(io_tree, start, lockend, &priv->cached_state);
> +	btrfs_inode_unlock(inode, BTRFS_ILOCK_SHARED);

AFAIS, btrfs_encoded_read_regular_end is here to be used asynchronously
in a later patch, specifically via bio callback -> io_uring tw callback,
and that io_uring tw callback might not get run in any reasonable amount
of time as it's controlled by the user space.

So, it takes the inode lock in one context (task executing the
request) and releases it in another, it's usually is not allowed.
And it also holds the locks potentially semi indefinitely (guaranteed
to be executed when io_uring and/or task die). There is also
unlock_extent() which I'd assume we don't want to hold locked
for too long.

It's in general a bad practice having is_locked variables, especially
when it's not on stack, even more so when it's implicit like

regular_read() {
	(ret == -EIOCBQUEUED)
		need_unlock = false; // delay it
}
btrfs_encoded_read_finish() {
	if (ret == -EIOCBQUEUED)
		unlock();
}

That just too easy to miss when you do anything with the code
afterwards.

I hope Josef and btrfs folks can comment what would be a way
here, does the inode lock has to be held for the entire
duration of IO?

>   
>   	ret = blk_status_to_errno(READ_ONCE(priv->status));
>   	if (ret)
>   		return ret;
>   
> -	unlock_extent(io_tree, start, lockend, &priv->cached_state);
> -	btrfs_inode_unlock(inode, BTRFS_ILOCK_SHARED);
> -	*unlocked = true;
> -
>   	if (priv->args.compression) {
>   		i = 0;
>   		page_offset = 0;
> @@ -9215,6 +9199,30 @@ static ssize_t btrfs_encoded_read_regular(struct btrfs_encoded_read_private *pri
>   	return priv->count;
>   }

-- 
Pavel Begunkov

  reply	other threads:[~2024-09-06 15:10 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-23 16:27 [PATCH 0/6] btrfs: add io_uring for encoded reads Mark Harmstone
2024-08-23 16:27 ` [PATCH 1/6] btrfs: remove iocb from btrfs_encoded_read Mark Harmstone
2024-08-27  1:12   ` David Sterba
2024-08-23 16:27 ` [PATCH 2/6] btrfs: store encoded read state in struct btrfs_encoded_read_private Mark Harmstone
2024-08-26 15:22   ` David Sterba
2024-08-27  1:03   ` David Sterba
2024-09-06 15:19   ` Pavel Begunkov
2024-08-23 16:27 ` [PATCH 3/6] btrfs: add btrfs_encoded_read_finish Mark Harmstone
2024-08-23 16:27 ` [PATCH 4/6] btrfs: add btrfs_prepare_encoded_read Mark Harmstone
2024-08-23 16:27 ` [PATCH 5/6] btrfs: move wait out of btrfs_encoded_read Mark Harmstone
2024-09-06 15:11   ` Pavel Begunkov [this message]
2024-08-23 16:27 ` [PATCH 6/6] btrfs: add io_uring interface for encoded reads Mark Harmstone
2024-09-06 14:41   ` Pavel Begunkov
2024-09-06 15:33   ` Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox