public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	Harris James R <james.r.harris@intel.com>,
	io-uring@vger.kernel.org,
	Gabriel Krisman Bertazi <krisman@collabora.com>,
	ZiyangZhang <ZiyangZhang@linux.alibaba.com>,
	Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	ming.lei@redhat.com
Subject: Re: [PATCH V2 0/1] ubd: add io_uring based userspace block driver
Date: Thu, 19 May 2022 21:33:34 +0800	[thread overview]
Message-ID: <YoZHLrxE87t6T+Tz@T590> (raw)
In-Reply-To: <20220517055358.3164431-1-ming.lei@redhat.com>

On Tue, May 17, 2022 at 01:53:57PM +0800, Ming Lei wrote:
> Hello Guys,
> 
> ubd driver is one kernel driver for implementing generic userspace block
> device/driver, which delivers io request from ubd block device(/dev/ubdbN) into
> ubd server[1] which is the userspace part of ubd for communicating
> with ubd driver and handling specific io logic by its target module.
> 
> Another thing ubd driver handles is to copy data between user space buffer
> and request/bio's pages, or take zero copy if mm is ready for support it in
> future. ubd driver doesn't handle any IO logic of the specific driver, so
> it is small/simple, and all io logics are done by the target code in ubdserver.
> 
> The above two are main jobs done by ubd driver.
> 
> ubd driver can help to move IO logic into userspace, in which the
> development work is easier/more effective than doing in kernel, such as,
> ubd-loop takes < 200 lines of loop specific code to get basically same 
> function with kernel loop block driver, meantime the performance is
> still good. ubdsrv[1] provide built-in test for comparing both by running
> "make test T=loop".
> 
> Another example is high performance qcow2 support[2], which could be built with
> ubd framework more easily than doing it inside kernel.
> 
> Also there are more people who express interests on userspace block driver[3],
> Gabriel Krisman Bertazi proposes this topic in lsf/mm/ebpf 2022 and mentioned
> requirement from Google. Ziyang Zhang from Alibaba said they "plan to
> replace TCMU by UBD as a new choice" because UBD can get better throughput than
> TCMU even with single queue[4], meantime UBD is simple. Also there is userspace
> storage service for providing storage to containers.
> 
> It is io_uring based: io request is delivered to userspace via new added
> io_uring command which has been proved as very efficient for making nvme
> passthrough IO to get better IOPS than io_uring(READ/WRITE). Meantime one
> shared/mmap buffer is used for sharing io descriptor to userspace, the
> buffer is readonly for userspace, each IO just takes 24bytes so far.
> It is suggested to use io_uring in userspace(target part of ubd server)
> to handle IO request too. And it is still easy for ubdserver to support
> io handling by non-io_uring, and this work isn't done yet, but can be
> supported easily with help o eventfd.
> 
> This way is efficient since no extra io command copy is required, no sleep
> is needed in transferring io command to userspace. Meantime the communication
> protocol is simple and efficient, one single command of
> UBD_IO_COMMIT_AND_FETCH_REQ can handle both fetching io request desc and commit
> command result in one trip. IO handling is often batched after single
> io_uring_enter() returns, both IO requests from ubd server target and
> IO commands could be handled as a whole batch.
> 
> Remove RFC now because ubd driver codes gets lots of cleanup, enhancement and
> bug fixes since V1:
> 
> - cleanup uapi: remove ubd specific error code,  switch to linux error code,
> remove one command op, remove one field from cmd_desc
> 
> - add monitor mechanism to handle ubq_daemon being killed, ubdsrv[1]
>   includes builtin tests for covering heavy IO with deleting ubd / killing
>   ubq_daemon at the same time, and V2 pass all the two tests(make test T=generic),
>   and the abort/stop mechanism is simple
> 
> - fix MQ command buffer mmap bug, and now 'xfstetests -g auto' works well on
>   MQ ubd-loop devices(test/scratch)
> 
> - improve batching submission as suggested by Jens
> 
> - improve handling for starting device, replace random wait/poll with
> completion
> 
> - all kinds of cleanup, bug fix,..
> 
> And the patch by patch change since V1 can be found in the following
> tree:
> 
> https://github.com/ming1/linux/commits/my_for-5.18-ubd-devel_v2

BTW, a one-line fix[1] is added to above branch, which fixes performance
obviously on small BS(< 128k) test. If anyone run performance test,
please include this fix.

[1] https://github.com/ming1/linux/commit/fa91354b418e83953304a3efad4ee6ac40ea6110

Thanks,
Ming


      parent reply	other threads:[~2022-05-19 13:34 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-17  5:53 [PATCH V2 0/1] ubd: add io_uring based userspace block driver Ming Lei
2022-05-17  5:53 ` [PATCH V2 1/1] " Ming Lei
2022-05-17 10:00   ` Ziyang Zhang
2022-05-17 12:55     ` Ming Lei
2022-05-18  5:53       ` Ziyang Zhang
2022-05-17  8:01 ` [PATCH V2 0/1] " Christoph Hellwig
2022-05-17 14:06 ` Stefan Hajnoczi
2022-05-18  7:09   ` Ming Lei
2022-05-18 10:45     ` Stefan Hajnoczi
2022-05-18 12:53       ` Ming Lei
2022-05-18 15:49         ` Stefan Hajnoczi
2022-05-19  2:42           ` Ming Lei
2022-05-19  9:46             ` Stefan Hajnoczi
2022-05-18  6:38 ` Liu Xiaodong
2022-05-18 13:18   ` Ming Lei
2022-05-23 14:56     ` Liu Xiaodong
2022-05-24  2:59       ` Ming Lei
2022-05-18  9:26 ` Stefan Hajnoczi
2022-05-19 13:33 ` Ming Lei [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YoZHLrxE87t6T+Tz@T590 \
    --to=ming.lei@redhat.com \
    --cc=ZiyangZhang@linux.alibaba.com \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=james.r.harris@intel.com \
    --cc=krisman@collabora.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stefanha@redhat.com \
    --cc=xiaoguang.wang@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox