From: Ziyang Zhang <[email protected]>
To: Ming Lei <[email protected]>
Cc: [email protected], [email protected],
[email protected], Jens Axboe <[email protected]>,
Gabriel Krisman Bertazi <[email protected]>,
Xiaoguang Wang <[email protected]>
Subject: Re: [PATCH V5 1/2] ublk_drv: add io_uring based userspace block driver
Date: Thu, 14 Jul 2022 18:20:38 +0800 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 2022/7/13 22:07, Ming Lei wrote:
> This is the driver part of userspace block driver(ublk driver), the other
> part is userspace daemon part(ublksrv)[1].
>
> The two parts communicate by io_uring's IORING_OP_URING_CMD with one
> shared cmd buffer for storing io command, and the buffer is read only for
> ublksrv, each io command is indexed by io request tag directly, and
> is written by ublk driver.
>
> For example, when one READ io request is submitted to ublk block driver, ublk
> driver stores the io command into cmd buffer first, then completes one
> IORING_OP_URING_CMD for notifying ublksrv, and the URING_CMD is issued to
> ublk driver beforehand by ublksrv for getting notification of any new io request,
> and each URING_CMD is associated with one io request by tag.
>
> After ublksrv gets the io command, it translates and handles the ublk io
> request, such as, for the ublk-loop target, ublksrv translates the request
> into same request on another file or disk, like the kernel loop block
> driver. In ublksrv's implementation, the io is still handled by io_uring,
> and share same ring with IORING_OP_URING_CMD command. When the target io
> request is done, the same IORING_OP_URING_CMD is issued to ublk driver for
> both committing io request result and getting future notification of new
> io request.
>
> Another thing done by ublk driver is to copy data between kernel io
> request and ublksrv's io buffer:
>
> 1) before ubsrv handles WRITE request, copy the request's data into
> ublksrv's userspace io buffer, so that ublksrv can handle the write
> request
>
> 2) after ubsrv handles READ request, copy ublksrv's userspace io buffer
> into this READ request, then ublk driver can complete the READ request
>
> Zero copy may be switched if mm is ready to support it.
>
> ublk driver doesn't handle any logic of the specific user space driver,
> so it is small/simple enough.
>
> [1] ublksrv
>
> https://github.com/ming1/ubdsrv
>
> Signed-off-by: Ming Lei <[email protected]>
> ---
Hi, Ming
I find that a big change from v4 to v5 is the simplification of locks.
In v5 you remove ubq->abort_lock, and I want to ask why it is OK to remove it?
If you have time, could you explain how ublk deals with potential race on:
1)queue_rq 2)ublk_abort_queue 3) ublk_ctrl_stop_dev 4) ublk_rq_task_work.
(Lock in ublk really confuses me...)
[...]
> +
> +/*
> + * __ublk_fail_req() may be called from abort context or ->ubq_daemon
> + * context during exiting, so lock is required.
> + *
> + * Also aborting may not be started yet, keep in mind that one failed
> + * request may be issued by block layer again.
> + */
> +static void __ublk_fail_req(struct ublk_io *io, struct request *req)
> +{
> + WARN_ON_ONCE(io->flags & UBLK_IO_FLAG_ACTIVE);
> +
> + if (!(io->flags & UBLK_IO_FLAG_ABORTED)) {
> + io->flags |= UBLK_IO_FLAG_ABORTED;
> + blk_mq_end_request(req, BLK_STS_IOERR);
> + }
> +}
> +
[...]
> +
> +/*
> + * When ->ubq_daemon is exiting, either new request is ended immediately,
> + * or any queued io command is drained, so it is safe to abort queue
> + * lockless
> + */
> +static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
> +{
> + int i;
> +
> + if (!ublk_get_device(ub))
> + return;
> +
> + for (i = 0; i < ubq->q_depth; i++) {
> + struct ublk_io *io = &ubq->ios[i];
> +
> + if (!(io->flags & UBLK_IO_FLAG_ACTIVE)) {
> + struct request *rq;
> +
> + /*
> + * Either we fail the request or ublk_rq_task_work_fn
> + * will do it
> + */
> + rq = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], i);
> + if (rq)
> + __ublk_fail_req(io, rq);
> + }
> + }
> + ublk_put_device(ub);
> +}
> +
Another problem:
1) comment of __ublk_fail_req(): "so lock is required"
2) comment of ublk_abort_queue(): "so it is safe to abort queue lockless"
3) ublk_abort_queue() calls _ublk_fail_req() on all ubqs.
Perhaps you need to update the comments?
Regards,
Zhang
next prev parent reply other threads:[~2022-07-14 10:20 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-13 14:07 [PATCH V5 0/2] ublk: add io_uring based userspace block driver Ming Lei
2022-07-13 14:07 ` [PATCH V5 1/2] ublk_drv: " Ming Lei
2022-07-14 10:20 ` Ziyang Zhang [this message]
2022-07-14 10:48 ` Ming Lei
2022-07-14 13:23 ` Ziyang Zhang
2022-07-15 2:04 ` Ming Lei
2022-07-15 6:07 ` Ziyang Zhang
2022-07-13 14:07 ` [PATCH V5 2/2] ublk_drv: support to complete io command via task_work_add Ming Lei
2022-07-13 20:25 ` [PATCH V5 0/2] ublk: add io_uring based userspace block driver Jens Axboe
2022-07-14 0:19 ` Ming Lei
2022-07-14 2:54 ` Jens Axboe
2022-07-14 2:59 ` Jens Axboe
2022-07-14 5:30 ` Ming Lei
2022-07-19 10:15 ` Geert Uytterhoeven
2022-07-14 2:54 ` Jens Axboe
2022-07-14 14:41 ` Gabriel Krisman Bertazi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a4249561-84a0-a314-c377-b96d28b7b20b@linux.alibaba.com \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox