From: "Martin K. Petersen" <[email protected]>
To: Bart Van Assche <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>,
Nitesh Shetty <[email protected]>,
Javier Gonzalez <[email protected]>,
Matthew Wilcox <[email protected]>,
Keith Busch <[email protected]>, Christoph Hellwig <[email protected]>,
Keith Busch <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>
Subject: Re: [PATCHv10 0/9] write hints with nvme fdp, scsi streams
Date: Wed, 27 Nov 2024 15:14:09 -0500
Message-ID: <[email protected]>
In-Reply-To: <[email protected]> (Bart Van Assche's message of "Wed, 27 Nov 2024 10:42:34 -0800")
Bart,
> Submitting a copy operation as two bios or two requests means that
> there is a risk that one of the two operations never reaches the block
> driver at the bottom of the storage stack and hence that a deadlock
> occurs. I prefer not to introduce any mechanisms that can cause a
> deadlock.
How do you copy a block range without offload? You perform a READ to
read the data into memory. And once the READ completes, you do a WRITE
of the data to the new location.
Token-based copy offload works exactly the same way. You do a POPULATE
TOKEN which is identical to a READ except you get a cookie instead of
the actual data. And then once you have the cookie, you perform a WRITE
USING TOKEN to perform the write operation. Semantically, it's exactly
the same as a normal copy except for the lack of data movement. That's
the whole point!
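
As a rough illustration of the analogy above (not code from this thread or from the kernel), the two copy paths can be modelled against a toy in-memory device. Every name here (`toy_dev`, `toy_token`, `toy_copy_plain`, `toy_populate_token`, `toy_write_using_token`) is hypothetical, and the token simply records a block range, which is a simplification of what a real device would hand back. The sketch only aims to show that both paths are "first step, then second step", with the token path moving no data through host memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCKS     16
#define TOY_BLOCK_SIZE 8

/* Hypothetical in-memory "device"; stands in for real storage. */
struct toy_dev {
	unsigned char data[TOY_BLOCKS * TOY_BLOCK_SIZE];
	size_t bytes_moved_to_host;	/* data that crossed into host memory */
};

/* Hypothetical cookie returned instead of data (assumption: just a range). */
struct toy_token {
	size_t src_lba;
	size_t nr_blocks;
};

static int toy_range_ok(size_t lba, size_t nr)
{
	return nr <= TOY_BLOCKS && lba <= TOY_BLOCKS - nr;
}

/* Copy without offload: READ into a host buffer, then WRITE it back out. */
int toy_copy_plain(struct toy_dev *d, size_t src, size_t dst, size_t nr)
{
	unsigned char buf[TOY_BLOCKS * TOY_BLOCK_SIZE];

	assert(d != NULL);
	if (!toy_range_ok(src, nr) || !toy_range_ok(dst, nr))
		return -1;
	/* Step 1: READ */
	memcpy(buf, d->data + src * TOY_BLOCK_SIZE, nr * TOY_BLOCK_SIZE);
	d->bytes_moved_to_host += nr * TOY_BLOCK_SIZE;
	/* Step 2: WRITE, only after the READ has completed */
	memcpy(d->data + dst * TOY_BLOCK_SIZE, buf, nr * TOY_BLOCK_SIZE);
	return 0;
}

/* Step 1 of the token path: like a READ, but yields a cookie, not data. */
int toy_populate_token(const struct toy_dev *d, size_t src, size_t nr,
		       struct toy_token *t)
{
	assert(d != NULL && t != NULL);
	if (!toy_range_ok(src, nr))
		return -1;
	t->src_lba = src;
	t->nr_blocks = nr;
	return 0;
}

/* Step 2 of the token path: the device copies internally using the cookie. */
int toy_write_using_token(struct toy_dev *d, const struct toy_token *t,
			  size_t dst)
{
	assert(d != NULL && t != NULL);
	if (!toy_range_ok(dst, t->nr_blocks))
		return -1;
	memmove(d->data + dst * TOY_BLOCK_SIZE,
		d->data + t->src_lba * TOY_BLOCK_SIZE,
		t->nr_blocks * TOY_BLOCK_SIZE);
	return 0;
}
```

In this model the caller sequences the two token steps exactly as it would sequence a read and a write; the only observable difference is that `bytes_moved_to_host` stays at zero.
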
Once I had support for token-based copy offload working, it became clear
to me that this approach is much simpler than pointer matching, bio
pairs, etc. The REQ_OP_COPY_IN operation and the REQ_OP_COPY_OUT
operation are never in flight at the same time. There are no
synchronization hassles, no lifetimes, no lookup tables in the sd
driver, no nonsense. Semantically, it's a read followed by a write.
For devices that implement single-command copy offload, the
REQ_OP_COPY_IN operation only serves as a validation that no splitting
took place. Once the bio reaches the ULD, the I/O is completed without
ever sending a command to the device. blk-lib then issues a
REQ_OP_COPY_OUT which gets turned into EXTENDED COPY or NVMe Copy and
sent to the destination device.
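
The single-command flow described above might be sketched roughly as follows. This is a userspace toy, not the actual block-layer or sd implementation: `sc_dev`, `sc_copy`, `sc_copy_in` and `sc_copy_out` are hypothetical names, and modelling "no splitting took place" as a length comparison is an assumption made purely for illustration. The point it tries to capture is that the COPY_IN step completes without any device command, and the one real command is issued by the COPY_OUT step afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SC_DEV_BYTES 64

/* Hypothetical device that counts the commands it actually receives. */
struct sc_dev {
	unsigned char data[SC_DEV_BYTES];
	unsigned int commands_sent;
};

/* Hypothetical per-copy state carried from the COPY_IN step to COPY_OUT. */
struct sc_copy {
	size_t src, len;	/* source range requested by the submitter */
	int in_done;		/* set once COPY_IN completed unsplit */
};

/*
 * COPY_IN step: seen_len is the length that arrived at the driver.
 * Assumption: a mismatch with the requested length means the I/O was
 * split on the way down. No device command is sent here.
 */
int sc_copy_in(struct sc_copy *c, size_t seen_len)
{
	assert(c != NULL);
	c->in_done = 0;
	if (seen_len != c->len)
		return -1;
	if (c->len > SC_DEV_BYTES || c->src > SC_DEV_BYTES - c->len)
		return -1;
	c->in_done = 1;
	return 0;
}

/*
 * COPY_OUT step: issued only after COPY_IN has completed. This is the
 * one point where a (toy) copy command reaches the device.
 */
int sc_copy_out(struct sc_dev *d, struct sc_copy *c, size_t dst)
{
	assert(d != NULL && c != NULL);
	if (!c->in_done)
		return -1;
	if (dst > SC_DEV_BYTES - c->len)
		return -1;
	memmove(d->data + dst, d->data + c->src, c->len);
	d->commands_sent++;
	c->in_done = 0;		/* state is consumed by the write step */
	return 0;
}
```

Because the second step cannot start until the first has completed, the two steps are never outstanding at the same time in this model.
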
Aside from making things trivially simple, the COPY_IN/COPY_OUT semantic
is a *requirement* for token-based offload devices. Why would we even
consider having two incompatible sets of copy offload semantics coexist
in the block layer?
--
Martin K. Petersen Oracle Linux Engineering