From: Christoph Hellwig <[email protected]>
To: Dave Chinner <[email protected]>
Cc: Pavel Begunkov <[email protected]>,
Christian Brauner <[email protected]>,
[email protected], [email protected],
"Darrick J . Wong" <[email protected]>,
[email protected], wu lei <[email protected]>
Subject: Re: [PATCH v2 1/1] iomap: propagate nowait to block layer
Date: Wed, 5 Mar 2025 06:10:59 -0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On Wed, Mar 05, 2025 at 12:19:46PM +1100, Dave Chinner wrote:
> I really don't care about what io_uring thinks or does. If the block
> layer REQ_NOWAIT semantics are unusable for non-blocking IO
> submission, then that's the problem that needs fixing. This isn't a
> problem we can (or should) try to work around in the iomap layer.

Agreed. The problem is the block layer semantics. iomap/xfs really
is just the messenger here.

> For example: we have RAID5 with a 64kB chunk size, so max REQ_NOWAIT
> io size is 64kB according to the queue limits. However, if we do a
> 64kB IO at a 60kB chunk offset, that bio is going to be split into a
> 4kB bio and a 60kB bio because they are issued to different physical
> devices.....
>
> There is no way the bio submitter can know that this behaviour will
> occur, nor should they even be attempting to predict when/if such
> splitting may occur.

And for something that has a real block allocator it could also be
entirely dynamic. But I'm not sure if dm-thinp or bcache do anything
like that at the moment.
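
To make the arithmetic in the quoted RAID5 example concrete, here is a
minimal user-space sketch of how an I/O gets cut at chunk boundaries.
It is not kernel code: split_at_chunks() and struct io_piece are
hypothetical names invented for this illustration, and real stacking
drivers split on more than one limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration type, not a kernel structure. */
struct io_piece {
	unsigned long long offset;	/* byte offset on the logical device */
	unsigned long long len;		/* length of this piece in bytes */
};

/*
 * Cut the range [offset, offset + len) at every multiple of 'chunk' and
 * store the resulting pieces in 'out'.  Returns the number of pieces
 * written, which is capped at max_out.
 */
static size_t split_at_chunks(unsigned long long offset,
			      unsigned long long len,
			      unsigned long long chunk,
			      struct io_piece *out, size_t max_out)
{
	size_t n = 0;

	assert(chunk > 0);

	while (len > 0 && n < max_out) {
		/* Bytes left before the next chunk boundary. */
		unsigned long long room = chunk - (offset % chunk);
		unsigned long long take = len < room ? len : room;

		out[n].offset = offset;
		out[n].len = take;
		n++;

		offset += take;
		len -= take;
	}
	return n;
}
```

With a 64kB chunk, a 64kB I/O starting 60kB into a chunk comes out as a
4kB piece followed by a 60kB piece, even though the total size is within
the advertised limit. The submitter only sees the 64kB limit, not the
alignment that triggers the split.
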
> > Are you only concerned about the size being too restrictive or do you
> > see any other problems?
>
> I'm concerned about the fact that REQ_NOWAIT is not usable as it
> stands. We've identified bio chaining as an issue, now bio splitting
> is an issue, and I'm sure if we look further there will be other
> cases that are issues (e.g. bounce buffers).
>
> The underlying problem here is that bio submission errors are
> reported through bio completion mechanisms, not directly back to the
> submitting context. Fix that problem in the block layer API, and
> then iomap can use REQ_NOWAIT without having to care about what the
> block layer is doing under the covers.

Exactly. Either they need to be reported synchronously, or maybe we
need a block layer hook in bio_endio that retries the given bio on a
workqueue without ever bubbling up to the caller. Allowing delayed
BLK_STS_AGAIN is going to mess up any non-trivial caller, and even
for the plain block device it will cause duplicate I/O where some
blocks have already been read/written and then will get resubmitted.
I'm not sure that breaks any atomicity assumptions as we don't really
give explicit ones for block devices (except maybe for the new
RWF_ATOMIC flag?), but it certainly is unexpected and suboptimal.
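
The duplicate I/O can be sketched with a small user-space model. This
is only an illustration under stated assumptions: enum frag_status,
struct frag and duplicated_bytes() are hypothetical stand-ins rather
than block layer types, and the caller is assumed to react to any late
"again" status by resubmitting the whole original range.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for per-fragment completion status, not blk_status_t. */
enum frag_status { FRAG_DONE, FRAG_AGAIN };

/* One fragment of an I/O that was split below the submitter. */
struct frag {
	unsigned long long len;		/* fragment length in bytes */
	enum frag_status status;	/* how this fragment completed */
};

/*
 * Bytes that had already been transferred but are issued a second time
 * when the caller resubmits the whole I/O because at least one fragment
 * completed with FRAG_AGAIN.  Returns 0 if no retry is needed.
 */
static unsigned long long duplicated_bytes(const struct frag *frags, size_t n)
{
	unsigned long long done = 0;
	int need_retry = 0;
	size_t i;

	assert(frags != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (frags[i].status == FRAG_AGAIN)
			need_retry = 1;
		else
			done += frags[i].len;
	}
	return need_retry ? done : 0;
}
```

For a 4kB fragment that completed and a 60kB fragment that came back
with "again", a whole-range retry reissues the 4kB that had already
been read or written. Synchronous error reporting, or a retry of only
the failed bio below the caller, would avoid that.
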