public inbox for [email protected]
* [PATCH v2 1/1] iomap: propagate nowait to block layer
@ 2025-03-04 12:18 Pavel Begunkov
  2025-03-04 16:07 ` Christoph Hellwig
  2025-03-04 21:11 ` Dave Chinner
  0 siblings, 2 replies; 22+ messages in thread
From: Pavel Begunkov @ 2025-03-04 12:18 UTC (permalink / raw)
  To: Christian Brauner, linux-fsdevel, Dave Chinner
  Cc: io-uring, Darrick J . Wong, linux-xfs, wu lei, asml.silence

There are reports of high io_uring submission latency for ext4 and xfs.
The cause is that iomap does not propagate the nowait flag to the block
layer, so submission can end up waiting for IO during tag allocation.

Because of how errors are propagated back, we can't set REQ_NOWAIT for
multi-bio IO. In that case, return -EAGAIN and let the caller handle
it; for example, it can reissue the IO from a blocking context. This
matches how raw bdev direct IO handles it.

Cc: [email protected]
Link: https://github.com/axboe/liburing/issues/826#issuecomment-2674131870
Reported-by: wu lei <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---

v2:
	Fail multi-bio nowait submissions
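
For illustration only, the "return -EAGAIN and let the caller reissue
from a blocking context" contract can be sketched from userspace with
preadv2() and RWF_NOWAIT. The helper name read_nowait_then_block() and
its retried out-parameter are hypothetical and not part of this patch;
io_uring performs an equivalent fallback internally. The sketch also
treats EOPNOTSUPP as a reason to fall back, which is an assumption made
here for files that do not support nowait at all.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: attempt the read with
 * RWF_NOWAIT first. If the kernel reports that it would have to block
 * (EAGAIN), or that nowait is unsupported for this file (EOPNOTSUPP),
 * reissue the same read without the flag, i.e. from a context that is
 * allowed to block. *retried is set to 1 when the fallback was taken.
 */
static ssize_t read_nowait_then_block(int fd, void *buf, size_t len,
				      off_t off, int *retried)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t ret;

	*retried = 0;
	ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
	if (ret < 0 && (errno == EAGAIN || errno == EOPNOTSUPP)) {
		*retried = 1;
		ret = preadv2(fd, &iov, 1, off, 0);
	}
	return ret;
}
```

With this patch, a large O_DIRECT request that would need more than one
bio is expected to take the fallback path instead of stalling the
submitter inside the block layer.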

 fs/iomap/direct-io.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index b521eb15759e..07c336fdf4f0 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -363,9 +363,14 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	 */
 	if (need_zeroout ||
 	    ((dio->flags & IOMAP_DIO_NEED_SYNC) && !use_fua) ||
-	    ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode)))
+	    ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
 		dio->flags &= ~IOMAP_DIO_CALLER_COMP;
 
+		if (!is_sync_kiocb(dio->iocb) &&
+		    (dio->iocb->ki_flags & IOCB_NOWAIT))
+			return -EAGAIN;
+	}
+
 	/*
 	 * The rules for polled IO completions follow the guidelines as the
 	 * ones we set for inline and deferred completions. If none of those
@@ -374,6 +379,23 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	if (!(dio->flags & (IOMAP_DIO_INLINE_COMP|IOMAP_DIO_CALLER_COMP)))
 		dio->iocb->ki_flags &= ~IOCB_HIPRI;
 
+	bio_opf = iomap_dio_bio_opflags(dio, iomap, use_fua, atomic);
+
+	if (!is_sync_kiocb(dio->iocb) && (dio->iocb->ki_flags & IOCB_NOWAIT)) {
+		/*
+		 * This is nonblocking IO, and we might need to allocate
+		 * multiple bios. In this case, as we cannot guarantee that
+		 * one of the sub bios will not fail getting issued FOR NOWAIT
+		 * and as error results are coalesced across all of them, ask
+		 * for a retry of this from blocking context.
+		 */
+		if (bio_iov_vecs_to_alloc(dio->submit.iter, BIO_MAX_VECS + 1) >
+					  BIO_MAX_VECS)
+			return -EAGAIN;
+
+		bio_opf |= REQ_NOWAIT;
+	}
+
 	if (need_zeroout) {
 		/* zero out from the start of the block to the write offset */
 		pad = pos & (fs_block_size - 1);
@@ -383,8 +405,6 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 			goto out;
 	}
 
-	bio_opf = iomap_dio_bio_opflags(dio, iomap, use_fua, atomic);
-
 	nr_pages = bio_iov_vecs_to_alloc(dio->submit.iter, BIO_MAX_VECS);
 	do {
 		size_t n;
-- 
2.48.1
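
As a reading aid, the nowait decision added to iomap_dio_bio_iter() can
be modelled in plain userspace C. This is a simplified sketch, not the
kernel code: every sketch_* name is hypothetical, the page size is
assumed to be 4096, SKETCH_REQ_NOWAIT is a placeholder bit rather than
the kernel's value, and the vector count is modelled for one contiguous
user buffer only.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE	4096u		/* assumed page size */
#define SKETCH_BIO_MAX_VECS	256u		/* mirrors BIO_MAX_VECS */
#define SKETCH_REQ_NOWAIT	(1u << 0)	/* placeholder, not the kernel bit */

/*
 * Number of pages spanned by [addr, addr + len), capped at max. This
 * loosely models bio_iov_vecs_to_alloc() for a single user buffer.
 */
static unsigned int sketch_vecs_to_alloc(size_t addr, size_t len,
					 unsigned int max)
{
	size_t first, last, n;

	if (!len)
		return 0;
	first = addr / SKETCH_PAGE_SIZE;
	last = (addr + len - 1) / SKETCH_PAGE_SIZE;
	n = last - first + 1;
	return n > max ? max : (unsigned int)n;
}

/*
 * For async nowait IO: return -EAGAIN if the buffer cannot fit in one
 * bio, otherwise OR the nowait flag into *opf. Other IO is untouched.
 */
static int sketch_apply_nowait(bool async_nowait, size_t addr, size_t len,
			       unsigned int *opf)
{
	if (!async_nowait)
		return 0;
	if (sketch_vecs_to_alloc(addr, len, SKETCH_BIO_MAX_VECS + 1) >
	    SKETCH_BIO_MAX_VECS)
		return -EAGAIN;
	*opf |= SKETCH_REQ_NOWAIT;
	return 0;
}
```

Asking for at most BIO_MAX_VECS + 1 vectors is what lets the check tell
"exactly fills one bio" apart from "needs a second bio". In this model,
an unaligned buffer of 256 pages' worth of data spans 257 pages and is
therefore rejected.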



end of thread, other threads:[~2025-03-05 14:10 UTC | newest]

Thread overview: 22+ messages
2025-03-04 12:18 [PATCH v2 1/1] iomap: propagate nowait to block layer Pavel Begunkov
2025-03-04 16:07 ` Christoph Hellwig
2025-03-04 16:41   ` Pavel Begunkov
2025-03-04 16:59     ` Christoph Hellwig
2025-03-04 17:36       ` Jens Axboe
2025-03-04 23:26         ` Christoph Hellwig
2025-03-04 23:43           ` Jens Axboe
2025-03-04 23:49             ` Christoph Hellwig
2025-03-05  0:14               ` Pavel Begunkov
2025-03-05  0:18                 ` Pavel Begunkov
2025-03-04 17:54       ` Pavel Begunkov
2025-03-04 23:28         ` Christoph Hellwig
2025-03-04 19:22     ` Darrick J. Wong
2025-03-04 20:35       ` Pavel Begunkov
2025-03-05  0:01         ` Christoph Hellwig
2025-03-05  0:45           ` Pavel Begunkov
2025-03-05  1:34             ` Christoph Hellwig
2025-03-04 21:11 ` Dave Chinner
2025-03-04 22:47   ` Pavel Begunkov
2025-03-04 23:40     ` Christoph Hellwig
2025-03-05  1:19     ` Dave Chinner
2025-03-05 14:10       ` Christoph Hellwig
