public inbox for [email protected]
From: Pavel Begunkov <[email protected]>
To: Christoph Hellwig <[email protected]>
Cc: [email protected], Jens Axboe <[email protected]>,
	Conrad Meyer <[email protected]>,
	[email protected], [email protected]
Subject: Re: [PATCH v4 8/8] block: implement async write zero pages command
Date: Tue, 10 Sep 2024 21:10:34 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 9/10/24 15:20, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 01:17:48PM +0100, Pavel Begunkov wrote:
>>>> Add a command that writes the zero page to the drive. Apart from passing
>>>> the zero page instead of actual data it uses the normal write path and
>>>> doesn't do any further acceleration, nor does it require any special
>>>> hardware support. The intended use is to have a fallback when
>>>> BLOCK_URING_CMD_WRITE_ZEROES is not supported.
>>>
>>> That's just a horrible API.  The user should not have to care if the
>>> kernel is using different kinds of implementations.
>>
>> It's rather not a good api when instead of issuing a presumably low
>> overhead fast command the user expects sending a good bunch of actual
>> writes with different performance characteristics.
> 
> The normal use case (at least the ones I've been involved with) are
> simply zero these blocks or the entire device, and please do it as
> good as you can.  Needing asynchronous error handling in userspace
> for that is extremely counter productive.

If we expect any error handling from user space at all (and we do),
it will have to be asynchronous: these are async commands on io_uring.
Asking the user to reissue a command in some form is normal.
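
A minimal sketch of what such a reissue could look like on the user
side. Everything here is hypothetical: the command enum, the submit
callback and the stub device are illustration only, not the uapi from
this series, and it assumes an unsupported write zeroes completes with
-EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command identifiers, not the real uapi values. */
enum zero_cmd { ZERO_CMD_WRITE_ZEROES, ZERO_CMD_WRITE_ZERO_PAGES };

/*
 * Hypothetical submit-and-wait hook. Returns the completion result:
 * 0 on success, a negative errno on failure.
 */
typedef int (*zero_submit_fn)(enum zero_cmd cmd, unsigned long long off,
			      unsigned long long len, void *ctx);

/*
 * Try the offloaded command first. Reissue as explicit zero page
 * writes only if the first attempt reports "not supported" and the
 * caller opted in to the fallback.
 */
static int zero_range(zero_submit_fn submit, void *ctx,
		      unsigned long long off, unsigned long long len,
		      int allow_fallback)
{
	int res;

	assert(submit != NULL);

	res = submit(ZERO_CMD_WRITE_ZEROES, off, len, ctx);
	if (res != -EOPNOTSUPP || !allow_fallback)
		return res;

	return submit(ZERO_CMD_WRITE_ZERO_PAGES, off, len, ctx);
}

/* Stub device without write zeroes support; counts submissions in ctx. */
static int stub_no_offload(enum zero_cmd cmd, unsigned long long off,
			   unsigned long long len, void *ctx)
{
	int *calls = ctx;

	(void)off;
	(void)len;
	(*calls)++;
	return cmd == ZERO_CMD_WRITE_ZEROES ? -EOPNOTSUPP : 0;
}
```

With the fallback decision in the caller's hands, an application that
cares about bandwidth can set allow_fallback to 0 and handle the
-EOPNOTSUPP itself.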

>> In my experience,
>> such fallbacks cause more pain when a more explicit approach is
>> possible. And let me note that it's already exposed via fallocate, even
>> though in a bit different way.
> 
> Do you mean the FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE case in
> blkdev_fallocate?  As far as I can tell this is actually a really bad
> example, as even a hardware offloaded write zeroes can and often does
> write physical zeroes to the media, and does so from a firmware path
> that is often slower than the kernel loop.

That's a shame, I agree, which is why I call it "presumably" faster,
but that actually gives more reasons why you might want this cmd
separately from write zeroes, considering the user might know
their hardware and the kernel doesn't try to choose which approach
is faster.

> But if you have an actual use case where you want to send a write zeroes
> command but never a loop of writes, it would be good to document that
> and add a flag for it.  And if we don't have that case it would still

Users who know more about their hw and e.g. prefer writes with the
zero page, as per above. Users with lots of devices who care about
PCIe / memory bandwidth (there are enough of those) might want to do
something different, like adjusting algorithms and throttling.
Better/easier testing, though that is of lesser importance.

Those I made up just now on the spot, but the reporter did
specifically ask about some way to differentiate fallbacks.

> be good to have a reserved flags field to add it later if needed.

if (unlikely(sqe->ioprio || sqe->__pad1 || sqe->len ||
	     sqe->rw_flags || sqe->file_index))
	return -EINVAL;

There is a good bunch of sqe fields that can be used for that later.
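
To illustrate why rejecting non-zero values now keeps those fields
usable later, here is a self-contained sketch. struct cmd_fields is a
hypothetical stand-in for the sqe fields checked above, and
CMD_F_NO_FALLBACK is a made-up flag, not something this series defines.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the sqe fields used by the check above. */
struct cmd_fields {
	uint16_t ioprio;
	uint32_t len;
	uint32_t rw_flags;
	uint32_t file_index;
	uint32_t pad1;
};

/* Made-up flag that a later kernel could define in rw_flags. */
#define CMD_F_NO_FALLBACK	(1u << 0)

/* Today: every unused field must be zero, otherwise -EINVAL. */
static int check_reserved(const struct cmd_fields *f)
{
	if (f->ioprio || f->pad1 || f->len || f->rw_flags || f->file_index)
		return -EINVAL;
	return 0;
}

/*
 * Later: rw_flags gains one known bit. Unknown bits and the other
 * unused fields are still rejected, so old binaries that passed
 * zeroes keep working and new flags remain possible.
 */
static int check_reserved_v2(const struct cmd_fields *f)
{
	if (f->ioprio || f->pad1 || f->len || f->file_index)
		return -EINVAL;
	if (f->rw_flags & ~CMD_F_NO_FALLBACK)
		return -EINVAL;
	return 0;
}
```

A request that sets the new bit fails with -EINVAL on the older check,
which also gives user space a way to probe for support.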

> Btw, do you have API documentation (e.g. in the form of a man page)
> for these new calls somewhere?

Mentioned in the cover:

tests and docs:
https://github.com/isilence/liburing.git discard-cmd
man page specifically:
https://github.com/isilence/liburing/commit/a6fa2bc2400bf7fcb80496e322b5db4c8b3191f0

I'll send them once the kernel side is in place.

-- 
Pavel Begunkov
