From: Pavel Begunkov <[email protected]>
To: Ming Lei <[email protected]>
Cc: Jens Axboe <[email protected]>,
	[email protected], Kanchan Joshi <[email protected]>
Subject: Re: [PATCH] io_uring: complete request via task work in case of DEFER_TASKRUN
Date: Sun, 16 Apr 2023 00:15:20 +0100	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 4/14/23 16:42, Ming Lei wrote:
> On Fri, Apr 14, 2023 at 04:07:52PM +0100, Pavel Begunkov wrote:
>> On 4/14/23 14:53, Ming Lei wrote:
>>> On Fri, Apr 14, 2023 at 02:01:26PM +0100, Pavel Begunkov wrote:
>>>> On 4/14/23 08:53, Ming Lei wrote:
>>>>> So far io_req_complete_post() only covers DEFER_TASKRUN by completing
>>>>> the request via task work when the request is completed from IOWQ.
>>>>>
>>>>> However, a uring command can be completed from any context, and if the
>>>>> io_uring is set up with DEFER_TASKRUN, the command is required to be
>>>>> completed from the current context; otherwise a wait on
>>>>> IORING_ENTER_GETEVENTS can't be woken up and may hang forever.
>>>>
>>>> fwiw, there is one legit exception: when the task is half dead,
>>>> task_work will be executed by a kthread. It should be fine as it
>>>> locks the ctx down, but I can't help but wonder whether only
>>>> ublk_cancel_queue() is affected or there are more places in ublk?
>>>
>>> No, it isn't.
>>>
>>> It isn't triggered on nvme-pt just because the command is always done
>>> in task context.
>>>
>>> And we know more uring command cases are coming.
>>
>> Because all requests and cmds except ublk complete from the submitting
>> task; ublk is special in this regard, completing from another task.
> 
> Not sure that is true, since it is technically allowed to call
> io_uring_cmd_done() from another task. And it could be friendlier for
> drivers not to limit their callers to the task context, especially as
> we have another API, io_uring_cmd_complete_in_task().

I agree that the cmd io_uring API can do better.
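
For reference, my mental model of the two completion paths a driver has
today, as a rough sketch (the tw callback name here is made up, and the
signatures approximate the current tree rather than quote it):

    /* (a) Known context/locking state, e.g. completing straight from
     *     ->uring_cmd() with the issue_flags the core passed in: */
    io_uring_cmd_done(cmd, res, 0, issue_flags);

    /* (b) Arbitrary context: bounce to the submitter's task work
     *     first, then complete from the callback. */
    static void my_cmd_tw_cb(struct io_uring_cmd *cmd, unsigned issue_flags)
    {
            io_uring_cmd_done(cmd, 0, 0, issue_flags);
    }

    io_uring_cmd_complete_in_task(cmd, my_cmd_tw_cb);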


>> I have several more not-so-related questions:
>>
>> 1) Can requests be submitted by some other task than ->ubq_daemon?
> 
> Yeah, requests can be submitted by another task, but the ublk driver
> doesn't allow it, because the driver has no knowledge of when the
> io_uring context goes away, so it has to limit submissions to
> ->ubq_daemon only, then use that task's information to check whether
> the io_uring context is about to exit. When the io_uring context is
> dying, we need to abort these uring commands (which may never
> complete), see ublk_cancel_queue().
> 
> The only difference is that the uring command may never complete,
> because a uring cmd is only completed when the associated block request
> arrives. The situation could be improved by adding an API/callback to
> notify the driver of io_uring exit.

Got it. And it sounds like you can use IORING_SETUP_SINGLE_ISSUER
and possibly IORING_SETUP_DEFER_TASKRUN, if not already.
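
Something like this on the ublk server side would opt in; untested
userspace sketch with liburing (both flags need a recent kernel, and
DEFER_TASKRUN requires SINGLE_ISSUER):

    #include <liburing.h>
    #include <stdio.h>

    int main(void)
    {
            struct io_uring ring;
            int ret = io_uring_queue_init(64, &ring,
                                          IORING_SETUP_SINGLE_ISSUER |
                                          IORING_SETUP_DEFER_TASKRUN);

            if (ret < 0) {
                    /* e.g. -EINVAL on kernels without these flags */
                    fprintf(stderr, "queue_init: %d\n", ret);
                    return 1;
            }
            io_uring_queue_exit(&ring);
            return 0;
    }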


>> Looking at
>>
>> static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
>> {
>>      ...
>>      if (ubq->ubq_daemon && ubq->ubq_daemon != current)
>>         goto out;
>> }
>>
>> ublk_queue_cmd() avoiding the io_uring way of delivery and using
>> raw task_work doesn't seem great, especially with TWA_SIGNAL_NO_IPI.
> 
> Yeah, killing task work has been on my todo list. In ublk's early days,

I see

> task work just performed better than io_uring_cmd_complete_in_task(),
> but the gap has become pretty small, or even invisible, now.

It seems a bit strange: non-DEFER_TASKRUN tw is almost identical to what
you do, see __io_req_task_work_add(). Maybe it's the extra callbacks on
the execution side.

Did you try DEFER_TASKRUN? Not sure it suits your case as there are
limitations, but its queueing side, as well as execution and waiting,
are well optimised and should do better.
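
FWIW, killing the raw task_work path would roughly mean routing
everything through the cmd helper. Hypothetical sketch; I'm assuming a
helper like ublk_rq_task_work_fn() exists to do the actual completion:

    /* Runs in the ubq_daemon's task context via io_uring task work. */
    static void ublk_cmd_tw_cb(struct io_uring_cmd *cmd, unsigned issue_flags)
    {
            ublk_rq_task_work_fn(cmd, issue_flags);
    }

    static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
    {
            struct ublk_io *io = &ubq->ios[rq->tag];

            /* instead of task_work_add(ubq->ubq_daemon, ...) */
            io_uring_cmd_complete_in_task(io->cmd, ublk_cmd_tw_cb);
    }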


>> 2) What is the purpose of the two lines below? I see how
>> UBLK_F_URING_CMD_COMP_IN_TASK is used, but I don't understand
>> why it changes depending on whether ublk is built as a module or not.
> 
> task work isn't available when ublk is built as a module.

Ah, makes sense now, thanks

>> 3) The long comment in ublk_queue_cmd() seems quite scary.
>> If you have a cmd / io_uring request, it holds a ctx reference
>> and is always allowed to use io_uring's task_work infra like
>> io_uring_cmd_complete_in_task(). Why is it different for ublk?
> 
> The thing is that we don't know if there is an io_uring request for the
> incoming blk request. UBLK_IO_FLAG_ABORTED just means that the io_uring
> context is dead, and we can't use io_uring_cmd_complete_in_task() any
> more.

Roughly got it. IIUC, there might not be a (valid) io_uring
request backing this block request in the first place because of
this aborting thing.
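
So, if I read it right, the driver effectively has to gate on its own
"ring is dead" state before touching the io_uring tw infra; condensed
hypothetical sketch (reusing the made-up ublk_cmd_tw_cb from above):

    if (!(io->flags & UBLK_IO_FLAG_ABORTED))
            /* cmd is live, safe to use io_uring task work */
            io_uring_cmd_complete_in_task(io->cmd, ublk_cmd_tw_cb);
    else
            /* ring is dying, no cmd will come back; fail directly */
            blk_mq_end_request(rq, BLK_STS_IOERR);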


>>>> One more thing: cmds should not be setting issue_flags, only
>>>> forwarding what the core io_uring code passed; otherwise we'll
>>>> get tons of bugs in no time.
>>>
>>> io_uring_cmd_done() was only recently changed to work this way, and
>>> it could be another topic.
>>
>> And it's abused, but as you said, not particularly related
>> to this patch.
>>
>>
>>>> static void ublk_cancel_queue(struct ublk_queue *ubq)
>>>> {
>>>>       ...
>>>>       io_uring_cmd_done(io->cmd, UBLK_IO_RES_ABORT, 0,
>>>>                         IO_URING_F_UNLOCKED);
>>>> }
>>>>
>>>> Can we replace it with task_work? It should be cold, and I
>>>> assume ublk_cancel_queue() doesn't expect that all requests will
>>>> be put down by the end of the function, as io_uring_cmd_done()
>>>> can offload it in any case.
>>>
>>> But it isn't specific to ublk; any caller of io_uring_cmd_done()
>>> has this issue, since io_uring_cmd_done() is a generic API.
>>
>> Well, fair enough, considering that IO_URING_F_UNLOCKED was
>> just added (*still naively hoping it'll be cleaned up*)
> 
> IMO, it is reasonable for io_uring_cmd_done() to hide any io_uring
> implementation details, even the task context concept, but I'm not
> sure if that is doable.

I agree that there should be a function that does the right thing
without extra flags, i.e. completes via tw, and there should also
be a helper for more advanced, performance-sensitive cases, e.g.
when we know the context.
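
Purely hypothetical shape for that split (neither helper exists, names
made up; the res2 type follows the current prototype):

    /* Safe from any context: always completes via the submitter's
     * task work, nothing for the caller to get wrong. */
    void io_uring_cmd_done_safe(struct io_uring_cmd *cmd, ssize_t ret,
                                ssize_t res2);

    /* Fast path for callers that know their context, i.e. are running
     * off ->uring_cmd() and can forward the core's issue_flags. */
    void io_uring_cmd_done_fast(struct io_uring_cmd *cmd, ssize_t ret,
                                ssize_t res2, unsigned int issue_flags);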

-- 
Pavel Begunkov

