public inbox for [email protected]
 help / color / mirror / Atom feed
From: Damien Le Moal <[email protected]>
To: "Martin K. Petersen" <[email protected]>,
	Bart Van Assche <[email protected]>
Cc: Nitesh Shetty <[email protected]>,
	Javier Gonzalez <[email protected]>,
	Matthew Wilcox <[email protected]>,
	Keith Busch <[email protected]>, Christoph Hellwig <[email protected]>,
	Keith Busch <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>
Subject: Re: [PATCHv10 0/9] write hints with nvme fdp, scsi streams
Date: Thu, 28 Nov 2024 17:51:52 +0900	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 11/28/24 11:09, Martin K. Petersen wrote:
> 
> Bart,
> 
>> What if the source LBA range does not require splitting but the
>> destination LBA range requires splitting, e.g. because it crosses a
>> chunk_sectors boundary? Will the REQ_OP_COPY_IN operation succeed in
>> this case and the REQ_OP_COPY_OUT operation fail?
> 
> Yes.
> 
> I experimented with approaching splitting in an iterative fashion. And
> thus, if there was a split halfway through the COPY_IN I/O, we'd issue a
> corresponding COPY_OUT up to the split point and hope that the write
> subsequently didn't need a split. And then deal with the next segment.
> 
> However, given that copy offload offers diminishing returns for small
> I/Os, it was not worth the hassle for the devices I used for
> development. It was cleaner and faster to just fall back to regular
> read/write when a split was required.
> 
>> Does this mean that a third operation is needed to cancel
>> REQ_OP_COPY_IN operations if the REQ_OP_COPY_OUT operation fails?
> 
> No. The device times out the token.
> 
>> Additionally, how to handle bugs in REQ_OP_COPY_* submitters where a
>> large number of REQ_OP_COPY_IN operations is submitted without
>> corresponding REQ_OP_COPY_OUT operation? Is perhaps a mechanism
>> required to discard unmatched REQ_OP_COPY_IN operations after a
>> certain time?
> 
> See above.
> 
> For your EXTENDED COPY use case there is no token and thus the COPY_IN
> completes immediately.
> 
> And for the token case, if you populate a million tokens and don't use
> them before they time out, it sounds like your submitting code is badly
> broken. But it doesn't matter because there are no I/Os in flight and
> thus nothing to discard.
> 
>> Hmm ... we may each have a different opinion about whether or not the
>> COPY_IN/COPY_OUT semantics are a requirement for token-based copy
>> offloading.
> 
> Maybe. But you'll have a hard time convincing me to add any kind of
> state machine or bio matching magic to the SCSI stack when the simplest
> solution is to treat copying like a read followed by a write. There is
> no concurrency, no kernel state, no dependency between two commands, nor
> two scsi_disk/scsi_device object lifetimes to manage.

And that also would allow supporting a fake copy offload with regular read/write
BIOs very easily, I think. So all block devices can be presented as supporting
"copy offload". That is nice for FSes.

> 
>> Additionally, I'm not convinced that implementing COPY_IN/COPY_OUT for
>> ODX devices is that simple. The COPY_IN and COPY_OUT operations have
>> to be translated into three SCSI commands, isn't it? I'm referring to
>> the POPULATE TOKEN, RECEIVE ROD TOKEN INFORMATION and WRITE USING
>> TOKEN commands. What is your opinion about how to translate the two
>> block layer operations into these three SCSI commands?
> 
> COPY_IN is translated to a NOP for devices implementing EXTENDED COPY
> and a POPULATE TOKEN for devices using tokens.
> 
> COPY_OUT is translated to an EXTENDED COPY (or NVMe Copy) for devices
> using a single command approach and WRITE USING TOKEN for devices using
> tokens.

ATA WRITE GATHERED command is also a single copy command. That matches and while
I have not checked SAT, translation would likely work.

While I was initially worried that the 2 BIO based approach would be overly
complicated, it seems that I was wrong :)

> 
> There is no need for RECEIVE ROD TOKEN INFORMATION.
> 
> I am not aware of UFS devices using the token-based approach. And for
> EXTENDED COPY there is only a single command sent to the device. If you
> want to do power management while that command is being processed,
> please deal with that in UFS. The block layer doesn't deal with the
> async variants of any of the other SCSI commands either...
>


-- 
Damien Le Moal
Western Digital Research

  reply	other threads:[~2024-11-28  8:51 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-29 15:19 [PATCHv10 0/9] write hints with nvme fdp, scsi streams Keith Busch
2024-10-29 15:19 ` [PATCHv10 1/9] block: use generic u16 for write hints Keith Busch
2024-10-29 17:21   ` Bart Van Assche
2024-10-29 15:19 ` [PATCHv10 2/9] block: introduce max_write_hints queue limit Keith Busch
2024-10-29 15:19 ` [PATCHv10 3/9] statx: add write hint information Keith Busch
2024-10-29 15:19 ` [PATCHv10 4/9] block: allow ability to limit partition write hints Keith Busch
2024-10-29 15:23   ` Christoph Hellwig
2024-10-29 17:25   ` Bart Van Assche
2024-10-30  4:46     ` Christoph Hellwig
2024-10-30 20:11       ` Keith Busch
2024-10-30 20:26         ` Bart Van Assche
2024-10-30 20:37           ` Keith Busch
2024-10-30 21:15             ` Bart Van Assche
2024-10-29 15:19 ` [PATCHv10 5/9] block, fs: add write hint to kiocb Keith Busch
2024-10-29 15:19 ` [PATCHv10 6/9] io_uring: enable per-io hinting capability Keith Busch
2024-11-07  2:09   ` Jens Axboe
2024-10-29 15:19 ` [PATCHv10 7/9] block: export placement hint feature Keith Busch
2024-10-29 15:19 ` [PATCHv10 8/9] nvme: enable FDP support Keith Busch
2024-10-30  0:24   ` Chaitanya Kulkarni
2024-10-29 15:19 ` [PATCHv10 9/9] scsi: set permanent stream count in block limits Keith Busch
2024-10-29 15:26   ` Christoph Hellwig
2024-10-29 15:34     ` Keith Busch
2024-10-29 15:37       ` Christoph Hellwig
2024-10-29 15:38         ` Keith Busch
2024-10-29 15:53           ` Christoph Hellwig
2024-10-29 16:22             ` Keith Busch
2024-10-30  4:55               ` Christoph Hellwig
2024-10-30 15:41                 ` Keith Busch
2024-10-30 15:45                   ` Christoph Hellwig
2024-10-30 15:48                     ` Keith Busch
2024-10-30 15:50                       ` Christoph Hellwig
2024-10-30 16:42                         ` Keith Busch
2024-10-30 16:57                           ` Christoph Hellwig
2024-10-30 17:05                             ` Keith Busch
2024-10-30 17:15                               ` Christoph Hellwig
2024-10-30 17:23                             ` Keith Busch
2024-10-30 22:32                             ` Keith Busch
2024-10-31  8:19                               ` Hans Holmberg
2024-10-31 13:02                                 ` Christoph Hellwig
2024-10-31 14:06                                 ` Keith Busch
2024-11-01  7:16                                   ` Hans Holmberg
2024-11-01  8:19                                     ` Javier González
2024-11-01 14:49                                     ` Keith Busch
2024-11-06 14:26                                       ` Hans Holmberg
2024-10-30 16:59                 ` Bart Van Assche
2024-10-30 17:14                   ` Christoph Hellwig
2024-10-30 17:44                     ` Bart Van Assche
2024-11-01  1:03                       ` Jaegeuk Kim
2024-10-29 17:18     ` Bart Van Assche
2024-10-30  5:42       ` Christoph Hellwig
2024-10-29 15:24 ` [PATCHv10 0/9] write hints with nvme fdp, scsi streams Christoph Hellwig
2024-11-05 15:50 ` Christoph Hellwig
2024-11-06 18:36   ` Keith Busch
2024-11-07 20:36   ` Keith Busch
2024-11-08 14:18     ` Christoph Hellwig
2024-11-08 15:51       ` Keith Busch
2024-11-08 16:54         ` Matthew Wilcox
2024-11-08 17:43           ` Javier Gonzalez
2024-11-08 18:51             ` Bart Van Assche
2024-11-11  9:31               ` Javier Gonzalez
2024-11-11 17:45                 ` Bart Van Assche
2024-11-12 13:52                   ` Nitesh Shetty
2024-11-19  2:03                     ` Martin K. Petersen
2024-11-25 23:21                       ` Bart Van Assche
2024-11-27  2:54                         ` Martin K. Petersen
2024-11-27 18:42                           ` Bart Van Assche
2024-11-27 20:14                             ` Martin K. Petersen
2024-11-27 21:06                               ` Bart Van Assche
2024-11-28  2:09                                 ` Martin K. Petersen
2024-11-28  8:51                                   ` Damien Le Moal [this message]
2024-11-29  6:19                                     ` Christoph Hellwig
2024-11-29  6:23                                       ` Damien Le Moal
2024-11-28  3:24                               ` Christoph Hellwig
2024-11-28 15:21                               ` Keith Busch
2024-11-28 16:40                                 ` Christoph Hellwig
2024-11-11  6:51             ` Christoph Hellwig
2024-11-11  9:30               ` Javier Gonzalez
2024-11-11  9:37                 ` Johannes Thumshirn
2024-11-11  9:41                   ` Javier Gonzalez
2024-11-11  9:42                     ` hch
2024-11-11  9:43                     ` Johannes Thumshirn
2024-11-11 10:37                       ` Javier Gonzalez
2024-11-11  6:49           ` Christoph Hellwig
2024-11-11  6:48         ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox