public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH] io_uring/rw: transform single vector readv/writev into ubuf
@ 2023-03-24 14:35 Jens Axboe
  2023-03-24 22:41 ` Ming Lei
  2023-03-24 23:54 ` Keith Busch
  0 siblings, 2 replies; 7+ messages in thread
From: Jens Axboe @ 2023-03-24 14:35 UTC (permalink / raw)
  To: io-uring

It's very common to have applications that use vectored reads or writes,
even if they only pass in a single segment. Obviously they should be
using read/write at that point, but...

Vectored IO comes with the downside of needing to retain iovec state,
and hence they require and allocation and state copy if they end up
getting deferred. Additionally, they also require extra cleanup when
completed as the memory as the allocated state memory has to be freed.

Automatically transform single segment IORING_OP_{READV,WRITEV} into
IORING_OP_{READ,WRITE}, and hence into an ITER_UBUF. Outside of being
more efficient if needing deferral, ITER_UBUF is also more efficient
for normal processing compared to ITER_IOVEC, as they don't require
iteration. The latter is apparent when running peak testing, where
using IORING_OP_READV to randomly read 24 drives previously scored:

IOPS=72.54M, BW=35.42GiB/s, IOS/call=32/31
IOPS=73.35M, BW=35.81GiB/s, IOS/call=32/31
IOPS=72.71M, BW=35.50GiB/s, IOS/call=32/31
IOPS=73.29M, BW=35.78GiB/s, IOS/call=32/32
IOPS=73.45M, BW=35.86GiB/s, IOS/call=32/32
IOPS=73.19M, BW=35.74GiB/s, IOS/call=31/32
IOPS=72.89M, BW=35.59GiB/s, IOS/call=32/31
IOPS=73.07M, BW=35.68GiB/s, IOS/call=32/32

and after the change we get:

IOPS=77.31M, BW=37.75GiB/s, IOS/call=32/31
IOPS=77.32M, BW=37.75GiB/s, IOS/call=32/32
IOPS=77.45M, BW=37.81GiB/s, IOS/call=31/31
IOPS=77.47M, BW=37.83GiB/s, IOS/call=32/32
IOPS=77.14M, BW=37.67GiB/s, IOS/call=32/32
IOPS=77.14M, BW=37.66GiB/s, IOS/call=31/31
IOPS=77.37M, BW=37.78GiB/s, IOS/call=32/32
IOPS=77.25M, BW=37.72GiB/s, IOS/call=32/32

which is a nice win as well.

Signed-off-by: Jens Axboe <[email protected]>

---

diff --git a/io_uring/rw.c b/io_uring/rw.c
index 4c233910e200..5c998754cb17 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -402,7 +402,22 @@ static struct iovec *__io_import_iovec(int ddir, struct io_kiocb *req,
 			      req->ctx->compat);
 	if (unlikely(ret < 0))
 		return ERR_PTR(ret);
-	return iovec;
+	if (iter->nr_segs != 1)
+		return iovec;
+	/*
+	 * Convert to non-vectored request if we have a single segment. If we
+	 * need to defer the request, then we no longer have to allocate and
+	 * maintain a struct io_async_rw. Additionally, we won't have cleanup
+	 * to do at completion time
+	 */
+	rw->addr = (unsigned long) iter->iov[0].iov_base;
+	rw->len = iter->iov[0].iov_len;
+	iov_iter_ubuf(iter, ddir, iter->iov[0].iov_base, rw->len);
+	/* readv -> read distance is the same as writev -> write */
+	BUILD_BUG_ON((IORING_OP_READ - IORING_OP_READV) !=
+			(IORING_OP_WRITE - IORING_OP_WRITEV));
+	req->opcode += (IORING_OP_READ - IORING_OP_READV);
+	return NULL;
 }
 
 static inline int io_import_iovec(int rw, struct io_kiocb *req,

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-03-27 11:46 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-24 14:35 [PATCH] io_uring/rw: transform single vector readv/writev into ubuf Jens Axboe
2023-03-24 22:41 ` Ming Lei
2023-03-24 23:06   ` Jens Axboe
2023-03-25  0:24     ` Ming Lei
2023-03-27 11:45       ` Pavel Begunkov
2023-03-24 23:54 ` Keith Busch
2023-03-25  1:06   ` Keith Busch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox