public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Keith Busch <kbusch@meta.com>
Cc: asml.silence@gmail.com, axboe@kernel.dk,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	bernd@bsbernd.com, Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCHv2 4/6] ublk: zc register/unregister bvec
Date: Wed, 12 Feb 2025 10:49:15 +0800	[thread overview]
Message-ID: <Z6wMK5WxvS_MzLh3@fedora> (raw)
In-Reply-To: <20250211005646.222452-5-kbusch@meta.com>

On Mon, Feb 10, 2025 at 04:56:44PM -0800, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> Provide new operations for the user to request mapping an active request
> to an io uring instance's buf_table. The user has to provide the index
> it wants to install the buffer.
> 
> A reference count is taken on the request to ensure it can't be
> completed while it is active in a ring's buf_table.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  drivers/block/ublk_drv.c      | 145 +++++++++++++++++++++++++---------
>  include/uapi/linux/ublk_cmd.h |   4 +
>  2 files changed, 113 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 529085181f355..ccfda7b2c24da 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -51,6 +51,9 @@
>  /* private ioctl command mirror */
>  #define UBLK_CMD_DEL_DEV_ASYNC	_IOC_NR(UBLK_U_CMD_DEL_DEV_ASYNC)
>  
> +#define UBLK_IO_REGISTER_IO_BUF		_IOC_NR(UBLK_U_IO_REGISTER_IO_BUF)
> +#define UBLK_IO_UNREGISTER_IO_BUF	_IOC_NR(UBLK_U_IO_UNREGISTER_IO_BUF)

UBLK_IO_REGISTER_IO_BUF command may be completed, and buffer isn't used
by RW_FIXED yet in the following cases:

- application doesn't submit any RW_FIXED consumer OP

- io_uring_enter() only issued UBLK_IO_REGISTER_IO_BUF, and the other
  OPs can't be issued because of out of resource 

...

Then io_uring_enter() returns, and the application is panic or killed,
how to avoid buffer leak?

It need to deal with in io_uring cancel code for calling ->release() if
the kbuffer node isn't released.

UBLK_IO_UNREGISTER_IO_BUF still need to call ->release() if the node
buffer isn't used.

> +
>  /* All UBLK_F_* have to be included into UBLK_F_ALL */
>  #define UBLK_F_ALL (UBLK_F_SUPPORT_ZERO_COPY \
>  		| UBLK_F_URING_CMD_COMP_IN_TASK \
> @@ -76,6 +79,9 @@ struct ublk_rq_data {
>  	struct llist_node node;
>  
>  	struct kref ref;
> +
> +#define UBLK_ZC_REGISTERED 0
> +	unsigned long flags;
>  };
>  
>  struct ublk_uring_cmd_pdu {
> @@ -201,7 +207,7 @@ static inline struct ublksrv_io_desc *ublk_get_iod(struct ublk_queue *ubq,
>  						   int tag);
>  static inline bool ublk_dev_is_user_copy(const struct ublk_device *ub)
>  {
> -	return ub->dev_info.flags & UBLK_F_USER_COPY;
> +	return ub->dev_info.flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY);
>  }
>  
>  static inline bool ublk_dev_is_zoned(const struct ublk_device *ub)
> @@ -581,7 +587,7 @@ static void ublk_apply_params(struct ublk_device *ub)
>  
>  static inline bool ublk_support_user_copy(const struct ublk_queue *ubq)
>  {
> -	return ubq->flags & UBLK_F_USER_COPY;
> +	return ubq->flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY);
>  }
>  
>  static inline bool ublk_need_req_ref(const struct ublk_queue *ubq)
> @@ -1747,6 +1753,102 @@ static inline void ublk_prep_cancel(struct io_uring_cmd *cmd,
>  	io_uring_cmd_mark_cancelable(cmd, issue_flags);
>  }
>  
> +static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
> +		struct ublk_queue *ubq, int tag, size_t offset)
> +{
> +	struct request *req;
> +
> +	if (!ublk_need_req_ref(ubq))
> +		return NULL;
> +
> +	req = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], tag);
> +	if (!req)
> +		return NULL;
> +
> +	if (!ublk_get_req_ref(ubq, req))
> +		return NULL;
> +
> +	if (unlikely(!blk_mq_request_started(req) || req->tag != tag))
> +		goto fail_put;
> +
> +	if (!ublk_rq_has_data(req))
> +		goto fail_put;
> +
> +	if (offset > blk_rq_bytes(req))
> +		goto fail_put;
> +
> +	return req;
> +fail_put:
> +	ublk_put_req_ref(ubq, req);
> +	return NULL;
> +}
> +
> +static void ublk_io_release(void *priv)
> +{
> +	struct request *rq = priv;
> +	struct ublk_queue *ubq = rq->mq_hctx->driver_data;
> +
> +	ublk_put_req_ref(ubq, rq);
> +}

It isn't enough to just get & put request reference here between registering
buffer and freeing the registered node buf, because the same reference can be
dropped from ublk_commit_completion() which is from queueing
UBLK_IO_COMMIT_AND_FETCH_REQ, and buggy app may queue this command multiple
times for freeing the request.

One solution is to not allow request completion until the ->release() is
returned.

Thanks,
Ming


  reply	other threads:[~2025-02-12  2:49 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-11  0:56 [PATCHv2 0/6] ublk zero-copy support Keith Busch
2025-02-11  0:56 ` [PATCHv2 1/6] io_uring: use node for import Keith Busch
2025-02-11  0:56 ` [PATCHv2 2/6] io_uring: create resource release callback Keith Busch
2025-02-13  1:31   ` Pavel Begunkov
2025-02-13  1:58     ` Keith Busch
2025-02-13 13:06       ` Pavel Begunkov
2025-02-11  0:56 ` [PATCHv2 3/6] io_uring: add support for kernel registered bvecs Keith Busch
2025-02-13  1:33   ` Pavel Begunkov
2025-02-14  3:30   ` Ming Lei
2025-02-14 15:26     ` Keith Busch
2025-02-15  1:34       ` Ming Lei
2025-02-18 20:34         ` Keith Busch
2025-02-11  0:56 ` [PATCHv2 4/6] ublk: zc register/unregister bvec Keith Busch
2025-02-12  2:49   ` Ming Lei [this message]
2025-02-12  4:11     ` Keith Busch
2025-02-12  9:24       ` Ming Lei
2025-02-12 14:59         ` Keith Busch
2025-02-13  2:12   ` Pavel Begunkov
2025-02-11  0:56 ` [PATCHv2 5/6] io_uring: add abstraction for buf_table rsrc data Keith Busch
2025-02-11  0:56 ` [PATCHv2 6/6] io_uring: cache nodes and mapped buffers Keith Busch
2025-02-11 16:47   ` Keith Busch
2025-02-12  2:29 ` [PATCHv2 0/6] ublk zero-copy support Ming Lei
2025-02-12 15:28   ` Keith Busch
2025-02-12 16:06     ` Pavel Begunkov
2025-02-13  1:52       ` Ming Lei
2025-02-13 15:12 ` lizetao
2025-02-13 16:06   ` Keith Busch
2025-02-14  3:39     ` lizetao
2025-02-14  2:41   ` Ming Lei
2025-02-14  4:21     ` lizetao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z6wMK5WxvS_MzLh3@fedora \
    --to=ming.lei@redhat.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=bernd@bsbernd.com \
    --cc=io-uring@vger.kernel.org \
    --cc=kbusch@kernel.org \
    --cc=kbusch@meta.com \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox