From: Ferry Meng <[email protected]>
To: Jason Wang <[email protected]>
Cc: "Michael S . Tsirkin" <[email protected]>,
	[email protected], Jens Axboe <[email protected]>,
	[email protected], [email protected],
	[email protected], Joseph Qi <[email protected]>,
	Jeffle Xu <[email protected]>
Subject: Re: [PATCH 0/3][RFC] virtio-blk: add io_uring passthrough support for virtio-blk
Date: Tue, 17 Dec 2024 14:04:04 +0800
Message-ID: <[email protected]>
In-Reply-To: <CACGkMEu4=nt0R1pmTauuK_vcc_fbObmyWqe0TO0HhuexmZWHJQ@mail.gmail.com>


On 12/17/24 10:08 AM, Jason Wang wrote:
> On Mon, Dec 16, 2024 at 8:07 PM Ferry Meng <[email protected]> wrote:
>>
>> On 12/16/24 3:38 PM, Jason Wang wrote:
>>> On Mon, Dec 16, 2024 at 10:01 AM Ferry Meng <[email protected]> wrote:
>>>> On 12/3/24 8:14 PM, Ferry Meng wrote:
>>>>> We seek to develop a more flexible way to use virtio-blk and bypass the block
>>>>> layer logic in order to accomplish certain performance optimizations. As a
>>>>> result, we referred to the implementation of io_uring passthrough in NVMe
>>>>> and implemented it in the virtio-blk driver. This patch series adds io_uring
>>>>> passthrough support for virtio-blk devices, resulting in lower submit latency
>>>>> and increased flexibility when utilizing virtio-blk.
>>>>>
>>>>> To test this patch series, I changed fio's code:
>>>>> 1. Added virtio-blk support to engines/io_uring.c.
>>>>> 2. Added virtio-blk support to the t/io_uring.c testing tool.
>>>>> Link: https://github.com/jdmfr/fio
>>>>>
>>>>> Using t/io_uring-vblk, the performance of virtio-blk based on uring-cmd
>>>>> scales better than block device access. (such as below, Virtio-Blk with QEMU,
>>>>> 1-depth fio)
>>>>> (passthru) read: IOPS=17.2k, BW=67.4MiB/s (70.6MB/s)
>>>>> slat (nsec): min=2907, max=43592, avg=3981.87, stdev=595.10
>>>>> clat (usec): min=38, max=285, avg=53.47, stdev= 8.28
>>>>> lat (usec): min=44, max=288, avg=57.45, stdev= 8.28
>>>>> (block) read: IOPS=15.3k, BW=59.8MiB/s (62.7MB/s)
>>>>> slat (nsec): min=3408, max=35366, avg=5102.17, stdev=790.79
>>>>> clat (usec): min=35, max=343, avg=59.63, stdev=10.26
>>>>> lat (usec): min=43, max=349, avg=64.73, stdev=10.21
>>>>>
>>>>> Testing the virtio-blk device with fio using 'engines=io_uring_cmd'
>>>>> and 'engines=io_uring' also demonstrates improvements in submit latency.
>>>>> (passthru) taskset -c 0 t/io_uring-vblk -b4096 -d8 -c4 -s4 -p0 -F1 -B0 -O0 -n1 -u1 /dev/vdcc0
>>>>> IOPS=189.80K, BW=741MiB/s, IOS/call=4/3
>>>>> IOPS=187.68K, BW=733MiB/s, IOS/call=4/3
>>>>> (block) taskset -c 0 t/io_uring-vblk -b4096 -d8 -c4 -s4 -p0 -F1 -B0 -O0 -n1 -u0 /dev/vdc
>>>>> IOPS=101.51K, BW=396MiB/s, IOS/call=4/3
>>>>> IOPS=100.01K, BW=390MiB/s, IOS/call=4/4
>>>>>
>>>>> The performance overhead of submitting IO can be decreased by 25% overall
>>>>> with this patch series. The implementation primarily references 'nvme io_uring
>>>>> passthrough', supporting io_uring_cmd through a separate character interface
>>>>> (temporarily named /dev/vdXc0). Since this is an early version, many
>>>>> details need to be taken into account and redesigned, like:
>>>>> ● Currently, only READ/WRITE scenarios are considered; more complex operations
>>>>> such as discard or zone ops are not included. (Normal sqe64 is sufficient, in my
>>>>> opinion; following upgrades, sqe128 and cqe32 might not be needed.)
>>>>> ● ......
>>>>>
>>>>> I would appreciate any useful recommendations.
>>>>>
>>>>> Ferry Meng (3):
>>>>>      virtio-blk: add virtio-blk chardev support.
>>>>>      virtio-blk: add uring_cmd support for I/O passthru on chardev.
>>>>>      virtio-blk: add uring_cmd iopoll support.
>>>>>
>>>>>     drivers/block/virtio_blk.c      | 325 +++++++++++++++++++++++++++++++-
>>>>>     include/uapi/linux/virtio_blk.h |  16 ++
>>>>>     2 files changed, 336 insertions(+), 5 deletions(-)
>>>> Hi Michael & Jason,
>>>>
>>>> What is your opinion, as the virtio-blk maintainers? Looking forward to
>>>> your reply.
>>>>
>>>> Thanks
>>> If I understand this correctly, this proposal wants to make io_uring a
>>> transport for virtio-blk commands. So the application doesn't need
>>> to worry about compatibility etc. This seems to be fine.
>>>
>>> But I wonder about the security considerations: for example, do we
>>> allow all virtio-blk commands to be passed through, and why?
>> About 'security consideration', the generic char-dev belongs to root, so
>> only root can use this passthrough path.
> This seems like a restriction. A lot of applications want to be run
> without privilege to be safe.
>
Sorry, my previous explanation may have caused some misunderstanding.
The generic cdev file's default group is 'root', but we can simply use
'chgrp' to change it to whatever group we want.

After that, applications can use it just as they would a regular file.
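
For illustration, a minimal sketch of that step (run as root). The group
name 'vblk' and the node /dev/vdac0 are placeholders made up here,
following the temporary /dev/vdXc0 naming; the 660 mode is likewise an
assumption, not something the patches set:

```shell
# Sketch only: hand a char device node over to a non-root group.
# Usage: grant_group_access GROUP NODE
# Both arguments are chosen by the admin; nothing here is virtio-blk specific.
grant_group_access() {
    group=$1
    node=$2
    # Change the owning group, then allow owner and group to read/write.
    chgrp "$group" "$node" && chmod 660 "$node"
}

# Hypothetical invocation (as root):
#   grant_group_access vblk /dev/vdac0
```

Such a change would not survive a reboot or a device re-probe, so a udev
rule would likely be the usual way to make it persistent.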

>> On the other hand, as far as I know, virtio-blk commands are all related
>> to 'I/O operations', so we can support all of those opcodes while bypassing
>> the VFS and block layers (if we want). I have only implemented the most basic
>> read/write in this RFC patch series; the others will be considered later.
>>
>>> Thanks
>>>
> Thanks

