From: Ming Lei <ming.lei@redhat.com>
To: Keith Busch <kbusch@meta.com>
Cc: asml.silence@gmail.com, axboe@kernel.dk,
linux-block@vger.kernel.org, io-uring@vger.kernel.org,
bernd@bsbernd.com, Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCHv2 4/6] ublk: zc register/unregister bvec
Date: Wed, 12 Feb 2025 10:49:15 +0800 [thread overview]
Message-ID: <Z6wMK5WxvS_MzLh3@fedora> (raw)
In-Reply-To: <20250211005646.222452-5-kbusch@meta.com>
On Mon, Feb 10, 2025 at 04:56:44PM -0800, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Provide new operations for the user to request mapping an active request
> to an io uring instance's buf_table. The user has to provide the index
> it wants to install the buffer.
>
> A reference count is taken on the request to ensure it can't be
> completed while it is active in a ring's buf_table.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/block/ublk_drv.c | 145 +++++++++++++++++++++++++---------
> include/uapi/linux/ublk_cmd.h | 4 +
> 2 files changed, 113 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 529085181f355..ccfda7b2c24da 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -51,6 +51,9 @@
> /* private ioctl command mirror */
> #define UBLK_CMD_DEL_DEV_ASYNC _IOC_NR(UBLK_U_CMD_DEL_DEV_ASYNC)
>
> +#define UBLK_IO_REGISTER_IO_BUF _IOC_NR(UBLK_U_IO_REGISTER_IO_BUF)
> +#define UBLK_IO_UNREGISTER_IO_BUF _IOC_NR(UBLK_U_IO_UNREGISTER_IO_BUF)
UBLK_IO_REGISTER_IO_BUF command may be completed, and buffer isn't used
by RW_FIXED yet in the following cases:
- application doesn't submit any RW_FIXED consumer OP
- io_uring_enter() only issued UBLK_IO_REGISTER_IO_BUF, and the other
OPs can't be issued because of out of resource
...
Then io_uring_enter() returns, and the application is panic or killed,
how to avoid buffer leak?
It need to deal with in io_uring cancel code for calling ->release() if
the kbuffer node isn't released.
UBLK_IO_UNREGISTER_IO_BUF still need to call ->release() if the node
buffer isn't used.
> +
> /* All UBLK_F_* have to be included into UBLK_F_ALL */
> #define UBLK_F_ALL (UBLK_F_SUPPORT_ZERO_COPY \
> | UBLK_F_URING_CMD_COMP_IN_TASK \
> @@ -76,6 +79,9 @@ struct ublk_rq_data {
> struct llist_node node;
>
> struct kref ref;
> +
> +#define UBLK_ZC_REGISTERED 0
> + unsigned long flags;
> };
>
> struct ublk_uring_cmd_pdu {
> @@ -201,7 +207,7 @@ static inline struct ublksrv_io_desc *ublk_get_iod(struct ublk_queue *ubq,
> int tag);
> static inline bool ublk_dev_is_user_copy(const struct ublk_device *ub)
> {
> - return ub->dev_info.flags & UBLK_F_USER_COPY;
> + return ub->dev_info.flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY);
> }
>
> static inline bool ublk_dev_is_zoned(const struct ublk_device *ub)
> @@ -581,7 +587,7 @@ static void ublk_apply_params(struct ublk_device *ub)
>
> static inline bool ublk_support_user_copy(const struct ublk_queue *ubq)
> {
> - return ubq->flags & UBLK_F_USER_COPY;
> + return ubq->flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY);
> }
>
> static inline bool ublk_need_req_ref(const struct ublk_queue *ubq)
> @@ -1747,6 +1753,102 @@ static inline void ublk_prep_cancel(struct io_uring_cmd *cmd,
> io_uring_cmd_mark_cancelable(cmd, issue_flags);
> }
>
> +static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
> + struct ublk_queue *ubq, int tag, size_t offset)
> +{
> + struct request *req;
> +
> + if (!ublk_need_req_ref(ubq))
> + return NULL;
> +
> + req = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], tag);
> + if (!req)
> + return NULL;
> +
> + if (!ublk_get_req_ref(ubq, req))
> + return NULL;
> +
> + if (unlikely(!blk_mq_request_started(req) || req->tag != tag))
> + goto fail_put;
> +
> + if (!ublk_rq_has_data(req))
> + goto fail_put;
> +
> + if (offset > blk_rq_bytes(req))
> + goto fail_put;
> +
> + return req;
> +fail_put:
> + ublk_put_req_ref(ubq, req);
> + return NULL;
> +}
> +
> +static void ublk_io_release(void *priv)
> +{
> + struct request *rq = priv;
> + struct ublk_queue *ubq = rq->mq_hctx->driver_data;
> +
> + ublk_put_req_ref(ubq, rq);
> +}
It isn't enough to just get & put request reference here between registering
buffer and freeing the registered node buf, because the same reference can be
dropped from ublk_commit_completion() which is from queueing
UBLK_IO_COMMIT_AND_FETCH_REQ, and buggy app may queue this command multiple
times for freeing the request.
One solution is to not allow request completion until the ->release() is
returned.
Thanks,
Ming
next prev parent reply other threads:[~2025-02-12 2:49 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-11 0:56 [PATCHv2 0/6] ublk zero-copy support Keith Busch
2025-02-11 0:56 ` [PATCHv2 1/6] io_uring: use node for import Keith Busch
2025-02-11 0:56 ` [PATCHv2 2/6] io_uring: create resource release callback Keith Busch
2025-02-13 1:31 ` Pavel Begunkov
2025-02-13 1:58 ` Keith Busch
2025-02-13 13:06 ` Pavel Begunkov
2025-02-11 0:56 ` [PATCHv2 3/6] io_uring: add support for kernel registered bvecs Keith Busch
2025-02-13 1:33 ` Pavel Begunkov
2025-02-14 3:30 ` Ming Lei
2025-02-14 15:26 ` Keith Busch
2025-02-15 1:34 ` Ming Lei
2025-02-18 20:34 ` Keith Busch
2025-02-11 0:56 ` [PATCHv2 4/6] ublk: zc register/unregister bvec Keith Busch
2025-02-12 2:49 ` Ming Lei [this message]
2025-02-12 4:11 ` Keith Busch
2025-02-12 9:24 ` Ming Lei
2025-02-12 14:59 ` Keith Busch
2025-02-13 2:12 ` Pavel Begunkov
2025-02-11 0:56 ` [PATCHv2 5/6] io_uring: add abstraction for buf_table rsrc data Keith Busch
2025-02-11 0:56 ` [PATCHv2 6/6] io_uring: cache nodes and mapped buffers Keith Busch
2025-02-11 16:47 ` Keith Busch
2025-02-12 2:29 ` [PATCHv2 0/6] ublk zero-copy support Ming Lei
2025-02-12 15:28 ` Keith Busch
2025-02-12 16:06 ` Pavel Begunkov
2025-02-13 1:52 ` Ming Lei
2025-02-13 15:12 ` lizetao
2025-02-13 16:06 ` Keith Busch
2025-02-14 3:39 ` lizetao
2025-02-14 2:41 ` Ming Lei
2025-02-14 4:21 ` lizetao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z6wMK5WxvS_MzLh3@fedora \
--to=ming.lei@redhat.com \
--cc=asml.silence@gmail.com \
--cc=axboe@kernel.dk \
--cc=bernd@bsbernd.com \
--cc=io-uring@vger.kernel.org \
--cc=kbusch@kernel.org \
--cc=kbusch@meta.com \
--cc=linux-block@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox