From: Bijan Mottahedeh <[email protected]>
To: Jens Axboe <[email protected]>
Cc: [email protected]
Subject: io_uring performance with block sizes > 128k
Date: Mon, 2 Mar 2020 15:55:46 -0800
Message-ID: <[email protected]>
I'm seeing a sizeable drop in perf with polled fio tests for block sizes
> 128k:
filename=/dev/nvme0n1
rw=randread
direct=1
time_based=1
randrepeat=1
gtod_reduce=1
fio --readonly --ioengine=io_uring --iodepth 1024 --fixedbufs --hipri
--numjobs=16
fio --readonly --ioengine=pvsync2 --iodepth 1024 --hipri --numjobs=16
Compared with the pvsync2 engine, the only major difference I could see
was the direct I/O path: io_uring goes through __blkdev_direct_IO()
while pvsync2 goes through __blkdev_direct_IO_simple(), because of the
is_sync_kiocb() check in blkdev_direct_IO():
static ssize_t
blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
{
	...
	if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
		return __blkdev_direct_IO_simple(iocb, iter, nr_pages);

	return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
}
Just as an experiment, I hacked the io_uring code to force it through the
_simple() path by clearing ki_complete so that is_sync_kiocb() returns
true (diff below, after the numbers). I get better numbers, though the
variance is fairly high, but the baseline drop at bs > 128k seems
consistent:
# baseline
READ: bw=3167MiB/s (3321MB/s), 186MiB/s-208MiB/s (196MB/s-219MB/s) #128k
READ: bw=898MiB/s (941MB/s), 51.2MiB/s-66.1MiB/s (53.7MB/s-69.3MB/s) #144k
READ: bw=1576MiB/s (1652MB/s), 81.8MiB/s-109MiB/s (85.8MB/s-114MB/s) #256k
# hack
READ: bw=2705MiB/s (2836MB/s), 157MiB/s-174MiB/s (165MB/s-183MB/s) #128k
READ: bw=2901MiB/s (3042MB/s), 174MiB/s-194MiB/s (183MB/s-204MB/s) #144k
READ: bw=4194MiB/s (4398MB/s), 252MiB/s-271MiB/s (265MB/s-284MB/s) #256k
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1972,12 +1972,12 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			return -EOPNOTSUPP;
 
 		kiocb->ki_flags |= IOCB_HIPRI;
-		kiocb->ki_complete = io_complete_rw_iopoll;
+		kiocb->ki_complete = NULL;
 		req->result = 0;
 	} else {
 		if (kiocb->ki_flags & IOCB_HIPRI)
 			return -EINVAL;
-		kiocb->ki_complete = io_complete_rw;
+		kiocb->ki_complete = NULL;
 	}
 
 	req->rw.addr = READ_ONCE(sqe->addr);
@@ -2005,7 +2005,12 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
 		ret = -EINTR;
 		/* fall through */
 	default:
-		kiocb->ki_complete(kiocb, ret, 0);
+		if (kiocb->ki_complete)
+			kiocb->ki_complete(kiocb, ret, 0);
+		else if (kiocb->ki_flags & IOCB_HIPRI)
+			io_complete_rw_iopoll(kiocb, ret, 0);
+		else
+			io_complete_rw(kiocb, ret, 0);
 	}
 }
With the baseline version, perf top shows a significant amount of time
for lock contention. I *think* it is nvmeq->sq_lock.
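For context, every command submitted to the device funnels through
nvme_submit_cmd() in drivers/nvme/host/pci.c, which serializes on that
per-queue lock; roughly the following (paraphrasing from memory, the
exact code varies by kernel version):

static void nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd,
			    bool write_sq)
{
	spin_lock(&nvmeq->sq_lock);
	/* copy the command into the SQ and advance the tail */
	memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes),
	       cmd, sizeof(*cmd));
	if (++nvmeq->sq_tail == nvmeq->q_depth)
		nvmeq->sq_tail = 0;
	nvme_write_sq_db(nvmeq, write_sq);	/* ring the SQ doorbell */
	spin_unlock(&nvmeq->sq_lock);
}

With 16 jobs hammering the same device, time spent spinning on that lock
would show up exactly the way perf top reports it.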
Does that make sense? I do realize the hack defeats the purpose of
io_uring, but I thought it might provide some clues as to what is going
on. Let me know if there is something else I can try.
Thanks.
--bijan