* Queston about io_uring_flush
@ 2021-02-04 9:31 Hao Xu
2021-02-04 11:00 ` Pavel Begunkov
0 siblings, 1 reply; 4+ messages in thread
From: Hao Xu @ 2021-02-04 9:31 UTC (permalink / raw)
To: io-uring, Jens Axboe; +Cc: Pavel Begunkov
Hi all,
Sorry for disturb all of you. Here comes my question.
When we close a uring file, we go into io_uring_flush(),
there is codes at the end:
if (!(ctx->flags & IORING_SETUP_SQPOLL) || ctx->sqo_task == current)
io_uring_del_task_file(file);
My understanding, this is to delete the ctx(associated with the uring
file) from current->io_uring->xa.
I'm thinking of this scenario: the task to close uring file is not the
one which created the uring file.
Then it doesn't make sense to delete the uring file from
current->io_uring->xa. It should be "delete uring file from
ctx->sqo_task->io_uring->xa" instead.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Queston about io_uring_flush
2021-02-04 9:31 Queston about io_uring_flush Hao Xu
@ 2021-02-04 11:00 ` Pavel Begunkov
2021-02-05 7:21 ` Hao Xu
0 siblings, 1 reply; 4+ messages in thread
From: Pavel Begunkov @ 2021-02-04 11:00 UTC (permalink / raw)
To: Hao Xu, io-uring, Jens Axboe
On 04/02/2021 09:31, Hao Xu wrote:
> Hi all,
> Sorry for disturb all of you. Here comes my question.
> When we close a uring file, we go into io_uring_flush(),
> there is codes at the end:
>
> if (!(ctx->flags & IORING_SETUP_SQPOLL) || ctx->sqo_task == current)
> io_uring_del_task_file(file);
>
> My understanding, this is to delete the ctx(associated with the uring
> file) from current->io_uring->xa.
> I'm thinking of this scenario: the task to close uring file is not the
> one which created the uring file.
> Then it doesn't make sense to delete the uring file from current->io_uring->xa. It should be "delete uring file from
> ctx->sqo_task->io_uring->xa" instead.
1. It's not only about created or not, look for
io_uring_add_task_file() call sites.
2. io_uring->xa is basically a map from task to used by it urings.
Every user task should clean only its own context (SQPOLL task is
a bit different), it'll be hell bunch of races otherwise.
3. If happens that it's closed by a task that has nothing to do
with this ctx, then it won't find anything in its
task->io_uring->xa, and so won't delete anything, and that's ok.
io_uring->xa of sqo_task will be cleaned by sqo_task, either
on another close() or on exit() (see io_uring_files_cancel).
4. There is a bunch of cases where that scheme doesn't behave
nice, but at least should not leak/fault when all related tasks
are killed.
--
Pavel Begunkov
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Queston about io_uring_flush
2021-02-04 11:00 ` Pavel Begunkov
@ 2021-02-05 7:21 ` Hao Xu
2021-02-05 9:56 ` Pavel Begunkov
0 siblings, 1 reply; 4+ messages in thread
From: Hao Xu @ 2021-02-05 7:21 UTC (permalink / raw)
To: Pavel Begunkov, io-uring, Jens Axboe
在 2021/2/4 下午7:00, Pavel Begunkov 写道:
> On 04/02/2021 09:31, Hao Xu wrote:
>> Hi all,
>> Sorry for disturb all of you. Here comes my question.
>> When we close a uring file, we go into io_uring_flush(),
>> there is codes at the end:
>>
>> if (!(ctx->flags & IORING_SETUP_SQPOLL) || ctx->sqo_task == current)
>> io_uring_del_task_file(file);
>>
>> My understanding, this is to delete the ctx(associated with the uring
>> file) from current->io_uring->xa.
>> I'm thinking of this scenario: the task to close uring file is not the
>> one which created the uring file.
>> Then it doesn't make sense to delete the uring file from current->io_uring->xa. It should be "delete uring file from
>> ctx->sqo_task->io_uring->xa" instead.
>
> 1. It's not only about created or not, look for
> io_uring_add_task_file() call sites.
>
> 2. io_uring->xa is basically a map from task to used by it urings.
> Every user task should clean only its own context (SQPOLL task is
> a bit different), it'll be hell bunch of races otherwise.
>
> 3. If happens that it's closed by a task that has nothing to do
> with this ctx, then it won't find anything in its
> task->io_uring->xa, and so won't delete anything, and that's ok.
> io_uring->xa of sqo_task will be cleaned by sqo_task, either
> on another close() or on exit() (see io_uring_files_cancel).
>
> 4. There is a bunch of cases where that scheme doesn't behave
> nice, but at least should not leak/fault when all related tasks
> are killed.
>
Thank you Pavel for the detail explanation. I got it, basically
just delay the clean work to sqo_task.
I have this question since I'm looking into the tctx->inflight, it
puzzles me a little bit. When a task exit(), it finally calls
__io_uring_task_cancel(), where we wait until tctx->inflight is 0.
What does tctx->inflight actually mean? I thought it stands for all
the inflight reqs of ctxs of this task. But in tctx_inflight():
/*
* If we have SQPOLL rings, then we need to iterate and find them, and
* add the pending count for those.
*/
xa_for_each(&tctx->xa, index, file) {
struct io_ring_ctx *ctx = file->private_data;
if (ctx->flags & IORING_SETUP_SQPOLL) {
struct io_uring_task *__tctx = ctx->sqo_task->io_uring;
inflight += percpu_counter_sum(&__tctx->inflight);
}
}
Why it adds ctx->sqo_task->io_uring->inflight.
In a scenario like this:
taskA->tctx: ctx0 ctx1
sqpoll normal
Since ctx0->sqo_task is taskA, so isn't taskA->io_uring->inflight
calculated twice?
In another hand, count of requests submited by sqthread will be added to
sqthread->io_uring, do we ommit this part?with that being said, should
taskA wait for sqes/reqs created by taskA but handled by sqthread?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Queston about io_uring_flush
2021-02-05 7:21 ` Hao Xu
@ 2021-02-05 9:56 ` Pavel Begunkov
0 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2021-02-05 9:56 UTC (permalink / raw)
To: Hao Xu, io-uring, Jens Axboe
On 05/02/2021 07:21, Hao Xu wrote:
> 在 2021/2/4 下午7:00, Pavel Begunkov 写道:
>> On 04/02/2021 09:31, Hao Xu wrote:
>>> Hi all,
>>> Sorry for disturb all of you. Here comes my question.
>>> When we close a uring file, we go into io_uring_flush(),
>>> there is codes at the end:
>>>
>>> if (!(ctx->flags & IORING_SETUP_SQPOLL) || ctx->sqo_task == current)
>>> io_uring_del_task_file(file);
>>>
>>> My understanding, this is to delete the ctx(associated with the uring
>>> file) from current->io_uring->xa.
>>> I'm thinking of this scenario: the task to close uring file is not the
>>> one which created the uring file.
>>> Then it doesn't make sense to delete the uring file from current->io_uring->xa. It should be "delete uring file from
>>> ctx->sqo_task->io_uring->xa" instead.
>>
>> 1. It's not only about created or not, look for
>> io_uring_add_task_file() call sites.
>>
>> 2. io_uring->xa is basically a map from task to used by it urings.
>> Every user task should clean only its own context (SQPOLL task is
>> a bit different), it'll be hell bunch of races otherwise.
>>
>> 3. If happens that it's closed by a task that has nothing to do
>> with this ctx, then it won't find anything in its
>> task->io_uring->xa, and so won't delete anything, and that's ok.
>> io_uring->xa of sqo_task will be cleaned by sqo_task, either
>> on another close() or on exit() (see io_uring_files_cancel).
>>
>> 4. There is a bunch of cases where that scheme doesn't behave
>> nice, but at least should not leak/fault when all related tasks
>> are killed.
>>
> Thank you Pavel for the detail explanation. I got it, basically
> just delay the clean work to sqo_task.
> I have this question since I'm looking into the tctx->inflight, it puzzles me a little bit. When a task exit(), it finally calls
> __io_uring_task_cancel(), where we wait until tctx->inflight is 0.
> What does tctx->inflight actually mean? I thought it stands for all
> the inflight reqs of ctxs of this task. But in tctx_inflight():
>
> /*
> * If we have SQPOLL rings, then we need to iterate and find them, and
> * add the pending count for those.
> */
> xa_for_each(&tctx->xa, index, file) {
> struct io_ring_ctx *ctx = file->private_data;
>
> if (ctx->flags & IORING_SETUP_SQPOLL) {
> struct io_uring_task *__tctx = ctx->sqo_task->io_uring;
>
> inflight += percpu_counter_sum(&__tctx->inflight);
> }
> }
>
> Why it adds ctx->sqo_task->io_uring->inflight.
> In a scenario like this:
> taskA->tctx: ctx0 ctx1
> sqpoll normal
>
> Since ctx0->sqo_task is taskA, so isn't taskA->io_uring->inflight calculated twice?
> In another hand, count of requests submited by sqthread will be added to sqthread->io_uring, do we ommit this part?with that being said, should taskA wait for sqes/reqs created by taskA but handled by sqthread?
Hah, yes it's known and tctx_inflight() always returns 0 in this
case :) I'm patching it for 5.12 because it's rather bulky, and
with some of recent 5.11 fixes for now the whole thing should do
what we want in terms of cancellations.
But good catch
--
Pavel Begunkov
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2021-02-05 10:05 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-02-04 9:31 Queston about io_uring_flush Hao Xu
2021-02-04 11:00 ` Pavel Begunkov
2021-02-05 7:21 ` Hao Xu
2021-02-05 9:56 ` Pavel Begunkov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox