* [Question] testing results of support async buffered reads feature
From: Hao_Xu @ 2020-10-10 9:39 UTC
To: io-uring, Jens Axboe
Hi Jens,
I've done some testing of io_uring's async buffered reads with fio, and
found results that look strange to me:
- when readahead is turned off entirely, async buffered reads deliver
lower IOPS than the io-wq offload method.
- when readahead is on, async buffered reads perform better, but the
size of the improvement seems to depend on the readahead size.
I'm wondering why.
My environment:
server: physical server
kernel: mainline 5.9.0-rc8+, latest commit 6f2f486d57c4d562cdf4
fs: ext4
device: nvme
fio: 3.20
I toggled the feature by setting or commenting out the line:
filp->f_mode |= FMODE_BUF_RASYNC;
in ext4_file_open() in fs/ext4/file.c.
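For context, here is roughly where that line sits (an abbreviated sketch
only; the surrounding function body may differ from the exact 5.9 tree):

static int ext4_file_open(struct inode *inode, struct file *filp)
{
	/* ... existing open-time checks elided ... */

	/*
	 * With FMODE_BUF_RASYNC set, io_uring attempts buffered reads
	 * inline via page-wait callbacks; with it cleared, reads that
	 * would block are punted to the io-wq thread pool.
	 */
	filp->f_mode |= FMODE_BUF_RASYNC;

	return dquot_file_open(inode, filp);
}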
The IOPS under each condition are below.
with blockdev --setra 0 /dev/nvme0n1:
QD     FMODE_BUF_RASYNC set    FMODE_BUF_RASYNC not set
1      12.9k                   11.0k
2      32.4k                   29.7k
4      65.8k                   62.1k
8      123k                    116k
16     211k                    208k
32     235k                    296k
64     241k                    328k
128    229k                    312k
With readahead off, async buffered reads fall behind io-wq at QD >= 32.
with blockdev --setra 64 /dev/nvme0n1:
QD     FMODE_BUF_RASYNC set    FMODE_BUF_RASYNC not set
1      11.4k                   12.2k
2      23.8k                   30.0k
4      52.7k                   61.7k
8      122k                    114k
16     208k                    181k
32     237k                    199k
64     260k                    185k
128    231k                    201k
for QD=64 (260-185)/185 = 40.5%
with blockdev --setra 128 /dev/nvme0n1:
QD     FMODE_BUF_RASYNC set    FMODE_BUF_RASYNC not set
1      11.4k                   10.8k
2      23.9k                   22.7k
4      53.1k                   46.5k
8      122k                    106k
16     204k                    182k
32     212k                    200k
64     242k                    202k
128    229k                    188k
for QD=64 (242-202)/202 = 19.8%
with blockdev --setra 256 /dev/nvme0n1:
QD     FMODE_BUF_RASYNC set    FMODE_BUF_RASYNC not set
1      11.5k                   12.2k
2      23.8k                   29.7k
4      52.9k                   61.9k
8      121k                    117k
16     207k                    186k
32     229k                    204k
64     230k                    211k
128    240k                    203k
for QD=64 (230-211)/211 = 9.0%
The fio arguments I use are:
fio_test.sh:
blockdev --setra $2 /dev/nvme0n1
fio -filename=/mnt/nvme0n1/fio_read_test.txt \
-buffered=1 \
-iodepth $1 \
-rw=randread \
-ioengine=io_uring \
-randseed=89 \
-runtime=10s \
-norandommap \
-direct=0 \
-bs=4k \
-size=4G \
-name=rand_read_4k
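In this script $1 is passed to fio as the iodepth and $2 to blockdev as
the readahead value, so a run like ./fio_test.sh 64 128 corresponds to
the QD=64 row of the setra-128 table above.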
* Re: [Question] testing results of support async buffered reads feature
From: Jens Axboe @ 2020-10-10 19:17 UTC
To: Hao_Xu, io-uring
On 10/10/20 3:39 AM, Hao_Xu wrote:
> Hi Jens,
> I've done some testing of io_uring's async buffered reads with fio, and
> found results that look strange to me:
> - when readahead is turned off entirely, async buffered reads deliver
> lower IOPS than the io-wq offload method.
> - when readahead is on, async buffered reads perform better, but the
> size of the improvement seems to depend on the readahead size.
> I'm wondering why.
I don't think these are necessarily unexpected. By and large, the async
buffered reads are faster, have lower latencies, and are a lot more
efficient in terms of CPU usage. But there are cases where the old
thread offload will be quicker, as you're essentially spreading the
copying over more cores and can get higher bandwidth that way.
If you're utilizing a single ring for your application, then there might
be gains to be had at the higher end of the IOPS or bandwidth spectrum
by selectively using IOSQE_ASYNC for a (small) subset of the issued
reads.
--
Jens Axboe
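To make that suggestion concrete, here is a minimal liburing sketch of
selectively applying IOSQE_ASYNC (not from the thread; the 1-in-8 punt
ratio, buffer handling, and helper name are illustrative assumptions):

#include <liburing.h>

#define BS          4096
#define ASYNC_EVERY 8   /* assumption: punt 1 in 8 reads to io-wq */

/*
 * Queue n buffered reads on the ring, forcing a small subset to io-wq
 * via IOSQE_ASYNC so the memory copies spread across cores, while the
 * rest stay on the inline (async buffered) path.
 */
static int submit_reads(struct io_uring *ring, int fd, char **bufs,
			const off_t *offs, unsigned n)
{
	unsigned i;

	for (i = 0; i < n; i++) {
		struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

		if (!sqe)
			break;
		io_uring_prep_read(sqe, fd, bufs[i], BS, offs[i]);
		if (i % ASYNC_EVERY == 0)
			sqe->flags |= IOSQE_ASYNC;  /* hand straight to io-wq */
	}
	return io_uring_submit(ring);
}

Tuning the ratio trades inline-path CPU efficiency against the extra
copy bandwidth the worker threads can provide at the high end.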