public inbox for io-uring@vger.kernel.org
* [PATCH] io_uring/uring_cmd: be smarter about SQE copying
@ 2025-05-31 20:52 Jens Axboe
  2025-06-02 10:24 ` Pavel Begunkov
  2025-06-03 15:05 ` Caleb Sander Mateos
  0 siblings, 2 replies; 7+ messages in thread
From: Jens Axboe @ 2025-05-31 20:52 UTC (permalink / raw)
  To: io-uring; +Cc: Caleb Sander Mateos

uring_cmd currently copies the SQE unconditionally, which was introduced
as a work-around in commit:

d6211ebbdaa5 ("io_uring/uring_cmd: unconditionally copy SQEs at prep time")

because the checking for whether or not this command may have ->issue()
called from io-wq wasn't complete. Rectify that, ensuring that if the
request is marked explicitly async via REQ_F_FORCE_ASYNC or if it's
part of a link chain, then the SQE is copied upfront.

Always copying can be costly, particularly when dealing with SQE128
rings. But even a normal 64b SQE copy is noticeable at high enough
rates.

Reported-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 929cad6ee326..cb4b867a2656 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -181,29 +181,42 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, u64 res2,
 }
 EXPORT_SYMBOL_GPL(io_uring_cmd_done);
 
+static void io_uring_sqe_copy(struct io_kiocb *req, struct io_uring_cmd *ioucmd)
+{
+	struct io_async_cmd *ac = req->async_data;
+
+	if (ioucmd->sqe != ac->sqes) {
+		memcpy(ac->sqes, ioucmd->sqe, uring_sqe_size(req->ctx));
+		ioucmd->sqe = ac->sqes;
+	}
+}
+
 static int io_uring_cmd_prep_setup(struct io_kiocb *req,
 				   const struct io_uring_sqe *sqe)
 {
 	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
+	struct io_ring_ctx *ctx = req->ctx;
 	struct io_async_cmd *ac;
 
 	/* see io_uring_cmd_get_async_data() */
 	BUILD_BUG_ON(offsetof(struct io_async_cmd, data) != 0);
 
-	ac = io_uring_alloc_async_data(&req->ctx->cmd_cache, req);
+	ac = io_uring_alloc_async_data(&ctx->cmd_cache, req);
 	if (!ac)
 		return -ENOMEM;
 	ac->data.op_data = NULL;
 
 	/*
-	 * Unconditionally cache the SQE for now - this is only needed for
-	 * requests that go async, but prep handlers must ensure that any
-	 * sqe data is stable beyond prep. Since uring_cmd is special in
-	 * that it doesn't read in per-op data, play it safe and ensure that
-	 * any SQE data is stable beyond prep. This can later get relaxed.
+	 * Copy SQE now, if we know we're going async. Drain will set
+	 * FORCE_ASYNC, and assume links may cause it to go async. If not,
+	 * copy is deferred until issue time, if the request doesn't issue
+	 * or queue inline.
 	 */
-	memcpy(ac->sqes, sqe, uring_sqe_size(req->ctx));
-	ioucmd->sqe = ac->sqes;
+	ioucmd->sqe = sqe;
+	if (req->flags & (REQ_F_FORCE_ASYNC| REQ_F_LINK | REQ_F_HARDLINK) ||
+	    ctx->submit_state.link.head)
+		io_uring_sqe_copy(req, ioucmd);
+
 	return 0;
 }
 
@@ -259,6 +272,12 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
 	}
 
 	ret = file->f_op->uring_cmd(ioucmd, issue_flags);
+	if (ret == -EAGAIN) {
+		io_uring_sqe_copy(req, ioucmd);
+		return ret;
+	} else if (ret == -EIOCBQUEUED) {
+		return ret;
+	}
 	if (ret == -EAGAIN || ret == -EIOCBQUEUED)
 		return ret;
 	if (ret < 0)

-- 
Jens Axboe



* Re: [PATCH] io_uring/uring_cmd: be smarter about SQE copying
  2025-05-31 20:52 [PATCH] io_uring/uring_cmd: be smarter about SQE copying Jens Axboe
@ 2025-06-02 10:24 ` Pavel Begunkov
  2025-06-02 11:55   ` Jens Axboe
  2025-06-03 15:05 ` Caleb Sander Mateos
  1 sibling, 1 reply; 7+ messages in thread
From: Pavel Begunkov @ 2025-06-02 10:24 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: Caleb Sander Mateos

On 5/31/25 21:52, Jens Axboe wrote:
> uring_cmd currently copies the SQE unconditionally, which was introduced
...>   	/*
> -	 * Unconditionally cache the SQE for now - this is only needed for
> -	 * requests that go async, but prep handlers must ensure that any
> -	 * sqe data is stable beyond prep. Since uring_cmd is special in
> -	 * that it doesn't read in per-op data, play it safe and ensure that
> -	 * any SQE data is stable beyond prep. This can later get relaxed.
> +	 * Copy SQE now, if we know we're going async. Drain will set
> +	 * FORCE_ASYNC, and assume links may cause it to go async. If not,
> +	 * copy is deferred until issue time, if the request doesn't issue
> +	 * or queue inline.
>   	 */
> -	memcpy(ac->sqes, sqe, uring_sqe_size(req->ctx));
> -	ioucmd->sqe = ac->sqes;
> +	ioucmd->sqe = sqe;
> +	if (req->flags & (REQ_F_FORCE_ASYNC| REQ_F_LINK | REQ_F_HARDLINK) ||
> +	    ctx->submit_state.link.head)
> +		io_uring_sqe_copy(req, ioucmd);
> +

It'd be great if we can't crash the kernel (or do sth more nefarious with
that), and I'm 95% sure it's possible. The culprit is violation of
layering by poking into io_uring core bits that opcodes should not know
about, the flimsiness of attempts to infer the core io_uring behaviour
from opcode handlers, and leaving a potentially stale ->sqe pointer.

-- 
Pavel Begunkov



* Re: [PATCH] io_uring/uring_cmd: be smarter about SQE copying
  2025-06-02 10:24 ` Pavel Begunkov
@ 2025-06-02 11:55   ` Jens Axboe
  2025-06-02 12:10     ` Pavel Begunkov
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2025-06-02 11:55 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: Caleb Sander Mateos

On 6/2/25 4:24 AM, Pavel Begunkov wrote:
> On 5/31/25 21:52, Jens Axboe wrote:
>> uring_cmd currently copies the SQE unconditionally, which was introduced
> ...>       /*
>> -     * Unconditionally cache the SQE for now - this is only needed for
>> -     * requests that go async, but prep handlers must ensure that any
>> -     * sqe data is stable beyond prep. Since uring_cmd is special in
>> -     * that it doesn't read in per-op data, play it safe and ensure that
>> -     * any SQE data is stable beyond prep. This can later get relaxed.
>> +     * Copy SQE now, if we know we're going async. Drain will set
>> +     * FORCE_ASYNC, and assume links may cause it to go async. If not,
>> +     * copy is deferred until issue time, if the request doesn't issue
>> +     * or queue inline.
>>        */
>> -    memcpy(ac->sqes, sqe, uring_sqe_size(req->ctx));
>> -    ioucmd->sqe = ac->sqes;
>> +    ioucmd->sqe = sqe;
>> +    if (req->flags & (REQ_F_FORCE_ASYNC| REQ_F_LINK | REQ_F_HARDLINK) ||
>> +        ctx->submit_state.link.head)
>> +        io_uring_sqe_copy(req, ioucmd);
>> +
> 
> It'd be great if we can't crash the kernel (or do sth more nefarious with
> that), and I'm 95% sure it's possible. The culprit is violation of
> layering by poking into io_uring core bits that opcodes should not know
> about, the flimsiness of attempts to infer the core io_uring behaviour
> from opcode handlers, and leaving a potentially stale ->sqe pointer.

Sure, it's not a pretty solution. At all. Might be worth just having a
handler for this so that the core can call it, rather than have
uring_cmd attempt to cover all the cases here. That's the most offensive
part to me, the ->sqe pointer I'm not too worried about, in terms of
anything nefarious. If that's a problem, then the uring_cmd prep side is
borken anyway and needs fixing. Getting it wrong could obviously
reintroduce potential issues with data getting lost, however.

It's somewhat annoying to need to copy this upfront for the fairly rare
case of needing it.

-- 
Jens Axboe


* Re: [PATCH] io_uring/uring_cmd: be smarter about SQE copying
  2025-06-02 11:55   ` Jens Axboe
@ 2025-06-02 12:10     ` Pavel Begunkov
  2025-06-02 13:04       ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Begunkov @ 2025-06-02 12:10 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: Caleb Sander Mateos

On 6/2/25 12:55, Jens Axboe wrote:
> On 6/2/25 4:24 AM, Pavel Begunkov wrote:
>> On 5/31/25 21:52, Jens Axboe wrote:
>>> uring_cmd currently copies the SQE unconditionally, which was introduced
>> ...>       /*
>>> -     * Unconditionally cache the SQE for now - this is only needed for
>>> -     * requests that go async, but prep handlers must ensure that any
>>> -     * sqe data is stable beyond prep. Since uring_cmd is special in
>>> -     * that it doesn't read in per-op data, play it safe and ensure that
>>> -     * any SQE data is stable beyond prep. This can later get relaxed.
>>> +     * Copy SQE now, if we know we're going async. Drain will set
>>> +     * FORCE_ASYNC, and assume links may cause it to go async. If not,
>>> +     * copy is deferred until issue time, if the request doesn't issue
>>> +     * or queue inline.
>>>         */
>>> -    memcpy(ac->sqes, sqe, uring_sqe_size(req->ctx));
>>> -    ioucmd->sqe = ac->sqes;
>>> +    ioucmd->sqe = sqe;
>>> +    if (req->flags & (REQ_F_FORCE_ASYNC| REQ_F_LINK | REQ_F_HARDLINK) ||
>>> +        ctx->submit_state.link.head)
>>> +        io_uring_sqe_copy(req, ioucmd);
>>> +
>>
>> It'd be great if we can't crash the kernel (or do sth more nefarious with
>> that), and I'm 95% sure it's possible. The culprit is violation of
>> layering by poking into io_uring core bits that opcodes should not know
>> about, the flimsiness of attempts to infer the core io_uring behaviour
>> from opcode handlers, and leaving a potentially stale ->sqe pointer.
> 
> Sure, it's not a pretty solution. At all. Might be worth just having a
> handler for this so that the core can call it, rather than have
> uring_cmd attempt to cover all the cases here. That's the most offensive
> part to me, the ->sqe pointer I'm not too worried about, in terms of
> anything nefarious. If that's a problem, then the uring_cmd prep side is

If a cmd accesses it after going async while it still points to the SQ,
a lot of interesting things can happen: you just need to swap the SQ in
between so that the pointer ends up dangling, and there is a convenient
feature for that.

> borken anyway and needs fixing. Getting it wrong could obviously
> reintroduce potential issues with data getting lost, however.

It's flimsy as it depends on the file impl, especially with
-EIOCBQUEUED, but yeah, your choice.

> It's somewhat annoying to need to copy this upfront for the fairly rare
> case of needing it.
> 

-- 
Pavel Begunkov



* Re: [PATCH] io_uring/uring_cmd: be smarter about SQE copying
  2025-06-02 12:10     ` Pavel Begunkov
@ 2025-06-02 13:04       ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2025-06-02 13:04 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: Caleb Sander Mateos

On 6/2/25 6:10 AM, Pavel Begunkov wrote:
> On 6/2/25 12:55, Jens Axboe wrote:
>> On 6/2/25 4:24 AM, Pavel Begunkov wrote:
>>> On 5/31/25 21:52, Jens Axboe wrote:
>>>> uring_cmd currently copies the SQE unconditionally, which was introduced
>>> ...>       /*
>>>> -     * Unconditionally cache the SQE for now - this is only needed for
>>>> -     * requests that go async, but prep handlers must ensure that any
>>>> -     * sqe data is stable beyond prep. Since uring_cmd is special in
>>>> -     * that it doesn't read in per-op data, play it safe and ensure that
>>>> -     * any SQE data is stable beyond prep. This can later get relaxed.
>>>> +     * Copy SQE now, if we know we're going async. Drain will set
>>>> +     * FORCE_ASYNC, and assume links may cause it to go async. If not,
>>>> +     * copy is deferred until issue time, if the request doesn't issue
>>>> +     * or queue inline.
>>>>         */
>>>> -    memcpy(ac->sqes, sqe, uring_sqe_size(req->ctx));
>>>> -    ioucmd->sqe = ac->sqes;
>>>> +    ioucmd->sqe = sqe;
>>>> +    if (req->flags & (REQ_F_FORCE_ASYNC| REQ_F_LINK | REQ_F_HARDLINK) ||
>>>> +        ctx->submit_state.link.head)
>>>> +        io_uring_sqe_copy(req, ioucmd);
>>>> +
>>>
>>> It'd be great if we can't crash the kernel (or do sth more nefarious with
>>> that), and I'm 95% sure it's possible. The culprit is violation of
>>> layering by poking into io_uring core bits that opcodes should not know
>>> about, the flimsiness of attempts to infer the core io_uring behaviour
>>> from opcode handlers, and leaving a potentially stale ->sqe pointer.
>>
>> Sure, it's not a pretty solution. At all. Might be worth just having a
>> handler for this so that the core can call it, rather than have
>> uring_cmd attempt to cover all the cases here. That's the most offensive
>> part to me, the ->sqe pointer I'm not too worried about, in terms of
>> anything nefarious. If that's a problem, then the uring_cmd prep side is
> 
> If a cmd accesses it after going async while it still points to the SQ,
> a lot of interesting things can happen: you just need to swap the SQ in
> between so that the pointer ends up dangling, and there is a convenient
> feature for that.

Not sure why you aren't being explicit and just saying "resizing", but
yes, agreed that would be a problematic spot. The above comment was just
about swapping between two separately valid SQEs, not dead SQEs.

>> borken anyway and needs fixing. Getting it wrong could obviously
>> reintroduce potential issues with data getting lost, however.
> 
> It's flimsy as it depends on the file impl, especially with
> -EIOCBQUEUED, but yeah, your choice.

It is... Just consider it an RFC - I think we should do something about
it, and productive suggestions are more than welcome.

-- 
Jens Axboe


* Re: [PATCH] io_uring/uring_cmd: be smarter about SQE copying
  2025-05-31 20:52 [PATCH] io_uring/uring_cmd: be smarter about SQE copying Jens Axboe
  2025-06-02 10:24 ` Pavel Begunkov
@ 2025-06-03 15:05 ` Caleb Sander Mateos
  2025-06-03 15:11   ` Jens Axboe
  1 sibling, 1 reply; 7+ messages in thread
From: Caleb Sander Mateos @ 2025-06-03 15:05 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

On Sat, May 31, 2025 at 1:52 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> uring_cmd currently copies the SQE unconditionally, which was introduced
> as a work-around in commit:
>
> d6211ebbdaa5 ("io_uring/uring_cmd: unconditionally copy SQEs at prep time")
>
> because the checking for whether or not this command may have ->issue()
> called from io-wq wasn't complete. Rectify that, ensuring that if the
> request is marked explicitly async via REQ_F_FORCE_ASYNC or if it's
> part of a link chain, then the SQE is copied upfront.
>
> Always copying can be costly, particularly when dealing with SQE128
> rings. But even a normal 64b SQE copy is noticeable at high enough
> rates.
>
> Reported-by: Caleb Sander Mateos <csander@purestorage.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> ---
>
> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
> index 929cad6ee326..cb4b867a2656 100644
> --- a/io_uring/uring_cmd.c
> +++ b/io_uring/uring_cmd.c
> @@ -181,29 +181,42 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, u64 res2,
>  }
>  EXPORT_SYMBOL_GPL(io_uring_cmd_done);
>
> +static void io_uring_sqe_copy(struct io_kiocb *req, struct io_uring_cmd *ioucmd)
> +{
> +       struct io_async_cmd *ac = req->async_data;
> +
> +       if (ioucmd->sqe != ac->sqes) {
> +               memcpy(ac->sqes, ioucmd->sqe, uring_sqe_size(req->ctx));
> +               ioucmd->sqe = ac->sqes;
> +       }
> +}
> +
>  static int io_uring_cmd_prep_setup(struct io_kiocb *req,
>                                    const struct io_uring_sqe *sqe)
>  {
>         struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
> +       struct io_ring_ctx *ctx = req->ctx;
>         struct io_async_cmd *ac;
>
>         /* see io_uring_cmd_get_async_data() */
>         BUILD_BUG_ON(offsetof(struct io_async_cmd, data) != 0);
>
> -       ac = io_uring_alloc_async_data(&req->ctx->cmd_cache, req);
> +       ac = io_uring_alloc_async_data(&ctx->cmd_cache, req);
>         if (!ac)
>                 return -ENOMEM;
>         ac->data.op_data = NULL;
>
>         /*
> -        * Unconditionally cache the SQE for now - this is only needed for
> -        * requests that go async, but prep handlers must ensure that any
> -        * sqe data is stable beyond prep. Since uring_cmd is special in
> -        * that it doesn't read in per-op data, play it safe and ensure that
> -        * any SQE data is stable beyond prep. This can later get relaxed.
> +        * Copy SQE now, if we know we're going async. Drain will set
> +        * FORCE_ASYNC, and assume links may cause it to go async. If not,
> +        * copy is deferred until issue time, if the request doesn't issue
> +        * or queue inline.
>          */
> -       memcpy(ac->sqes, sqe, uring_sqe_size(req->ctx));
> -       ioucmd->sqe = ac->sqes;
> +       ioucmd->sqe = sqe;
> +       if (req->flags & (REQ_F_FORCE_ASYNC| REQ_F_LINK | REQ_F_HARDLINK) ||
> +           ctx->submit_state.link.head)

To check my understanding, io_init_req() will set REQ_F_FORCE_ASYNC on
any request with IOSQE_IO_DRAIN as well as all subsequent requests
until the IOSQE_IO_DRAIN request completes? Looks like this condition
should work then. I think you can drop REQ_F_LINK | REQ_F_HARDLINK,
though; the initial request of a linked chain will be issued
synchronously, and ctx->submit_state.link.head will be set for the
subsequent requests.
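
That is, something like this should be enough (untested, just to show the
simplified check):

	if (req->flags & REQ_F_FORCE_ASYNC || ctx->submit_state.link.head)
		io_uring_sqe_copy(req, ioucmd);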

I do share Pavel's concern that whether or not a request will be
initially issued asynchronously is up to the core io_uring code, so it
seems a bit fragile to make these assumptions in the uring_cmd layer.
I think I would prefer either passing a bool issue_async to the
->prep() handler, or adding an optional ->prep_async() hook called if
the initial issue may happen asynchronously.
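
Roughly, for the second option (hypothetical hook, nothing like this
exists today, just to sketch the shape):

	/* called by the core only if the first issue may happen async */
	int io_uring_cmd_prep_async(struct io_kiocb *req)
	{
		struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);

		io_uring_sqe_copy(req, ioucmd);
		return 0;
	}

That way uring_cmd wouldn't have to second-guess REQ_F_FORCE_ASYNC or the
link state at all.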

> +               io_uring_sqe_copy(req, ioucmd);
> +
>         return 0;
>  }
>
> @@ -259,6 +272,12 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
>         }
>
>         ret = file->f_op->uring_cmd(ioucmd, issue_flags);
> +       if (ret == -EAGAIN) {
> +               io_uring_sqe_copy(req, ioucmd);
> +               return ret;
> +       } else if (ret == -EIOCBQUEUED) {
> +               return ret;
> +       }
>         if (ret == -EAGAIN || ret == -EIOCBQUEUED)
>                 return ret;

This if condition is always false now, remove it?

Best,
Caleb

>         if (ret < 0)
>
> --
> Jens Axboe
>


* Re: [PATCH] io_uring/uring_cmd: be smarter about SQE copying
  2025-06-03 15:05 ` Caleb Sander Mateos
@ 2025-06-03 15:11   ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2025-06-03 15:11 UTC (permalink / raw)
  To: Caleb Sander Mateos; +Cc: io-uring

On 6/3/25 9:05 AM, Caleb Sander Mateos wrote:
> On Sat, May 31, 2025 at 1:52 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> uring_cmd currently copies the SQE unconditionally, which was introduced
>> as a work-around in commit:
>>
>> d6211ebbdaa5 ("io_uring/uring_cmd: unconditionally copy SQEs at prep time")
>>
>> because the checking for whether or not this command may have ->issue()
>> called from io-wq wasn't complete. Rectify that, ensuring that if the
>> request is marked explicitly async via REQ_F_FORCE_ASYNC or if it's
>> part of a link chain, then the SQE is copied upfront.
>>
>> Always copying can be costly, particularly when dealing with SQE128
>> rings. But even a normal 64b SQE copy is noticeable at high enough
>> rates.
>>
>> Reported-by: Caleb Sander Mateos <csander@purestorage.com>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> ---
>>
>> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
>> index 929cad6ee326..cb4b867a2656 100644
>> --- a/io_uring/uring_cmd.c
>> +++ b/io_uring/uring_cmd.c
>> @@ -181,29 +181,42 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, u64 res2,
>>  }
>>  EXPORT_SYMBOL_GPL(io_uring_cmd_done);
>>
>> +static void io_uring_sqe_copy(struct io_kiocb *req, struct io_uring_cmd *ioucmd)
>> +{
>> +       struct io_async_cmd *ac = req->async_data;
>> +
>> +       if (ioucmd->sqe != ac->sqes) {
>> +               memcpy(ac->sqes, ioucmd->sqe, uring_sqe_size(req->ctx));
>> +               ioucmd->sqe = ac->sqes;
>> +       }
>> +}
>> +
>>  static int io_uring_cmd_prep_setup(struct io_kiocb *req,
>>                                    const struct io_uring_sqe *sqe)
>>  {
>>         struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
>> +       struct io_ring_ctx *ctx = req->ctx;
>>         struct io_async_cmd *ac;
>>
>>         /* see io_uring_cmd_get_async_data() */
>>         BUILD_BUG_ON(offsetof(struct io_async_cmd, data) != 0);
>>
>> -       ac = io_uring_alloc_async_data(&req->ctx->cmd_cache, req);
>> +       ac = io_uring_alloc_async_data(&ctx->cmd_cache, req);
>>         if (!ac)
>>                 return -ENOMEM;
>>         ac->data.op_data = NULL;
>>
>>         /*
>> -        * Unconditionally cache the SQE for now - this is only needed for
>> -        * requests that go async, but prep handlers must ensure that any
>> -        * sqe data is stable beyond prep. Since uring_cmd is special in
>> -        * that it doesn't read in per-op data, play it safe and ensure that
>> -        * any SQE data is stable beyond prep. This can later get relaxed.
>> +        * Copy SQE now, if we know we're going async. Drain will set
>> +        * FORCE_ASYNC, and assume links may cause it to go async. If not,
>> +        * copy is deferred until issue time, if the request doesn't issue
>> +        * or queue inline.
>>          */
>> -       memcpy(ac->sqes, sqe, uring_sqe_size(req->ctx));
>> -       ioucmd->sqe = ac->sqes;
>> +       ioucmd->sqe = sqe;
>> +       if (req->flags & (REQ_F_FORCE_ASYNC| REQ_F_LINK | REQ_F_HARDLINK) ||
>> +           ctx->submit_state.link.head)
> 
> To check my understanding, io_init_req() will set REQ_F_FORCE_ASYNC on
> any request with IOSQE_IO_DRAIN as well as all subsequent requests
> until the IOSQE_IO_DRAIN request completes? Looks like this condition

Correct

> should work then. I think you can drop REQ_F_LINK | REQ_F_HARDLINK,
> though; the initial request of a linked chain will be issued
> synchronously, and ctx->submit_state.link.head will be set for the
> subsequent requests.
> 
> I do share Pavel's concern that whether or not a request will be
> initially issued asynchronously is up to the core io_uring code, so it
> seems a bit fragile to make these assumptions in the uring_cmd layer.
> I think I would prefer either passing a bool issue_async to the
> ->prep() handler, or adding an optional ->prep_async() hook called if
> the initial issue may happen asynchronously.

Yes, I do too, which is why I suggested we add a specific cold handler
for this kind of case. That leaves the core of io_uring calling it
appropriately, and eliminates the need for storing an sqe pointer - which
would've been fine before resizing, but isn't a great idea now. I'll do a
v2 of this with that in mind, I just haven't gotten around to it yet.
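
Something like the below is the rough shape I have in mind - sketch only,
hook name made up and completely untested:

	/*
	 * Cold path, called by the core when a request is about to be
	 * punted to io-wq or otherwise deferred.
	 */
	void io_uring_cmd_sqe_copy(struct io_kiocb *req)
	{
		struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
		struct io_async_cmd *ac = req->async_data;

		if (ioucmd->sqe != ac->sqes) {
			memcpy(ac->sqes, ioucmd->sqe, uring_sqe_size(req->ctx));
			ioucmd->sqe = ac->sqes;
		}
	}

with the opcode table growing a matching ->sqe_copy() style hook that the
core invokes, so uring_cmd never has to guess why it's going async.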

>> +               io_uring_sqe_copy(req, ioucmd);
>> +
>>         return 0;
>>  }
>>
>> @@ -259,6 +272,12 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
>>         }
>>
>>         ret = file->f_op->uring_cmd(ioucmd, issue_flags);
>> +       if (ret == -EAGAIN) {
>> +               io_uring_sqe_copy(req, ioucmd);
>> +               return ret;
>> +       } else if (ret == -EIOCBQUEUED) {
>> +               return ret;
>> +       }
>>         if (ret == -EAGAIN || ret == -EIOCBQUEUED)
>>                 return ret;
> 
> This if condition is always false now, remove it?

Heh yes, just forgot to remove those lines when adding the above change.

-- 
Jens Axboe
