From: Jens Axboe <[email protected]>
To: Hao_Xu <[email protected]>,
Matthew Wilcox <[email protected]>,
[email protected]
Cc: Johannes Weiner <[email protected]>,
Andrew Morton <[email protected]>
Subject: Re: Loophole in async page I/O
Date: Wed, 14 Oct 2020 14:57:11 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 10/14/20 2:31 PM, Hao_Xu wrote:
> Hi Jens,
> I've done some tests of the new fix code with readahead disabled from
> userspace. Here are some results.
> As for the perf reports, since I'm new to kernel stuff I'm still
> investigating them. I'll keep digging into what causes the difference
> among the four perf reports (copy_user_enhanced_fast_string() in
> particular catches my eye).
>
> My environment is:
> server: physical server
> kernel: mainline 5.9.0-rc8+ latest commit 6f2f486d57c4d562cdf4
> fs: ext4
> device: NVMe SSD
> fio: 3.20
>
> I did the tests by setting and then commenting out the line:
>     filp->f_mode |= FMODE_BUF_RASYNC;
> in ext4_file_open() in fs/ext4/file.c
You don't have to modify the kernel. If you use a newer fio, you can
essentially just add:
--force_async=1
after setting the engine to io_uring to get the same effect. Just a
heads up, as that might make it easier for you.
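
For reference, something like the below should do it (a rough sketch;
the filename, size, and queue depth are just placeholders):

  # sketch only -- adjust the path, size, and depth to match your setup
  fio --name=randread --ioengine=io_uring --force_async=1 \
      --rw=randread --bs=4k --iodepth=32 --size=1g --direct=0 \
      --filename=/path/to/ext4/testfile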
> the IOPS with readahead disabled from userspace are below:
>
> with the new fix code (force readahead):
> QD/Test   FMODE_BUF_RASYNC set   FMODE_BUF_RASYNC not set
> 1         10.8k                  10.3k
> 2         21.2k                  20.1k
> 4         41.1k                  39.1k
> 8         76.1k                  72.2k
> 16        133k                   126k
> 32        169k                   147k
> 64        176k                   160k
> 128       (1) 187k               (2) 156k
>
> Now the async buffered reads feature looks better in terms of IOPS,
> but it still looks similar to the async buffered reads feature in the
> mainline code.
I'd say it looks better all around. And what you're completely
forgetting here is that when FMODE_BUF_RASYNC isn't set, then you're
using QD number of async workers to achieve that result. Hence you have
1..128 threads potentially running on that one, vs having a _single_
process running with FMODE_BUF_RASYNC.
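
If you want to see that for yourself, watch the thread count while the
!FMODE_BUF_RASYNC run is going. Roughly something like the below should
show them, though the exact io-wq worker thread names may differ by
kernel version:

  # list the io-wq kernel worker threads handling the punted reads
  # (the "io_wq" name match is a guess, adjust as needed)
  ps -eLo pid,comm | grep io_wq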
> with mainline code (the fix code in commit c8d317aa1887 ("io_uring: fix
> async buffered reads when readahead is disabled")):
> QD/Test   FMODE_BUF_RASYNC set   FMODE_BUF_RASYNC not set
> 1         10.9k                  10.2k
> 2         21.6k                  20.2k
> 4         41.0k                  39.9k
> 8         79.7k                  75.9k
> 16        141k                   138k
> 32        169k                   237k
> 64        190k                   316k
> 128       (3) 195k               (4) 315k
>
> Considering the numbers at (1)(2)(3)(4), the new fix doesn't seem to
> fix the slowdown; it just makes number (4) drop to number (2).
Not sure why there would be a difference between (2) and (4); that does
seem odd. I'll see if I can reproduce that. More questions below.
> the perf reports for situations (1)(2)(3)(4) are:
> (1)
> # Overhead  Command  Shared Object     Symbol
> # ........  .......  ................  ....................................
> #
>     10.19%  fio      [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
>      8.53%  fio      fio               [.] clock_thread_fn
>      4.67%  fio      [kernel.vmlinux]  [k] xas_load
>      2.18%  fio      [kernel.vmlinux]  [k] clear_page_erms
>      2.02%  fio      libc-2.24.so      [.] __memset_avx2_erms
>      1.55%  fio      [kernel.vmlinux]  [k] mutex_unlock
>      1.51%  fio      [kernel.vmlinux]  [k] shmem_getpage_gfp
>      1.48%  fio      [kernel.vmlinux]  [k] native_irq_return_iret
>      1.48%  fio      [kernel.vmlinux]  [k] get_page_from_freelist
>      1.46%  fio      [kernel.vmlinux]  [k] generic_file_buffered_read
>      1.45%  fio      [nvme]            [k] nvme_irq
>      1.25%  fio      [kernel.vmlinux]  [k] __list_del_entry_valid
>      1.22%  fio      [kernel.vmlinux]  [k] free_pcppages_bulk
>      1.15%  fio      [kernel.vmlinux]  [k] _raw_spin_lock
>      1.12%  fio      fio               [.] get_io_u
>      0.81%  fio      [ext4]            [k] ext4_mpage_readpages
>      0.78%  fio      fio               [.] fio_gettime
>      0.76%  fio      [kernel.vmlinux]  [k] find_get_entries
>      0.75%  fio      [vdso]            [.] __vdso_clock_gettime
>      0.73%  fio      [kernel.vmlinux]  [k] release_pages
>      0.68%  fio      [kernel.vmlinux]  [k] find_get_entry
>      0.68%  fio      fio               [.] io_u_queued_complete
>      0.67%  fio      [kernel.vmlinux]  [k] io_async_buf_func
>      0.65%  fio      [kernel.vmlinux]  [k] io_submit_sqes
These profiles are of marginal use, as you're only profiling fio itself,
not all of the async workers that are running for !FMODE_BUF_RASYNC.
How long does the test run? It looks suspect that clock_thread_fn shows
up in the profiles at all.
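
FWIW, a system-wide profile would also catch those workers. Roughly
something like this (the job file name is just a placeholder):

  # profile the whole system, with call graphs, for the duration of the run
  perf record -a -g -- fio ./job.fio
  # break the samples down per task, so fio and the workers show up separately
  perf report --sort comm,dso,symbol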
And is it actually doing IO, or are you using shm/tmpfs for this test?
Isn't ext4 hosting the file? I see a lot of shmem_getpage_gfp(), which
makes me a little confused.
--
Jens Axboe