From: Kanchan Joshi <[email protected]>
To: Pavel Begunkov <[email protected]>, Keith Busch <[email protected]>
Cc: [email protected], [email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
Anuj Gupta <[email protected]>
Subject: Re: [PATCH v6 06/10] io_uring/rw: add support to send metadata along with read/write
Date: Sun, 10 Nov 2024 23:11:55 +0530
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 11/7/2024 10:53 PM, Pavel Begunkov wrote:
>>> 1. SQE128 makes it big for all requests, intermixing with requests that
>>> don't need additional space wastes space. SQE128 is fine to use but at
>>> the same time we should be mindful about it and try to avoid enabling it
>>> if feasible.
>>
>> Right. And initial versions of this series did not use SQE128. But as we
>> moved towards passing more comprehensive PI information, first SQE was
>> not enough. And we thought to make use of SQE128 rather than taking
>> copy_from_user cost.
>
> Do we have any data how expensive it is? I don't think I've ever
> tried to profile it. And where the overhead comes from? speculation
> prevention?
We did measure this for nvme passthru commands in the past (and that was
the motivation for building SQE128). The perf profile showed about 3%
overhead for the copy [*].
> If it's indeed costly, we can add sth to io_uring like pre-mapping
> memory to optimise it, which would be useful in other places as
> well.
But why operate as if SQE128 does not exist?
Reads/writes, at this point, are clearly not using about 20 bytes in the
first SQE and the entire second SQE. Not using the second SQE at all does
not seem like the best way to protect it from being used by future users.
Pre-mapping may be better for opcodes for which copy_from_user has already
been done. For something new (like this), why start in a suboptimal way
and later put the burden of jumping through hoops on userspace, just to
reach the same level it can get to by simply passing a flag at the time
of ring setup?
[*]
perf record -a fio -iodepth=256 -rw=randread -ioengine=io_uring -bs=512
-numjobs=1 -size=50G -group_reporting -iodepth_batch_submit=64
-iodepth_batch_complete_min=1 -iodepth_batch_complete_max=64
-fixedbufs=1 -hipri=1 -sqthread_poll=0 -filename=/dev/ng0n1
-name=io_uring_1 -uring_cmd=1
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  ..............................
#
    14.37%  fio      fio               [.] axmap_isset
     6.30%  fio      fio               [.] __fio_gettime
     3.69%  fio      fio               [.] get_io_u
     3.16%  fio      [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
     2.61%  fio      [kernel.vmlinux]  [k] io_submit_sqes
     1.99%  fio      [kernel.vmlinux]  [k] fget
     1.96%  fio      [nvme_core]       [k] nvme_alloc_request
     1.82%  fio      [nvme]            [k] nvme_poll
     1.79%  fio      fio               [.] add_clat_sample
     1.69%  fio      fio               [.] fio_ioring_prep
     1.59%  fio      fio               [.] thread_main
     1.59%  fio      [nvme]            [k] nvme_queue_rqs
     1.56%  fio      [kernel.vmlinux]  [k] io_issue_sqe
     1.52%  fio      [kernel.vmlinux]  [k] __put_user_nocheck_8
     1.44%  fio      fio               [.] account_io_completion
     1.37%  fio      fio               [.] get_next_rand_block
     1.37%  fio      fio               [.] __get_next_rand_offset.isra.0
     1.34%  fio      fio               [.] io_completed
     1.34%  fio      fio               [.] td_io_queue
     1.27%  fio      [kernel.vmlinux]  [k] blk_mq_alloc_request
     1.27%  fio      [nvme_core]       [k] nvme_user_cmd64