public inbox for [email protected]
 help / color / mirror / Atom feed
From: Xiaoguang Wang <[email protected]>
To: Pavel Begunkov <[email protected]>, [email protected]
Cc: [email protected]
Subject: Re: [PATCH v3 2/3] io_uring: reduce frequent add_wait_queue() overhead for multi-shot poll request
Date: Fri, 29 Oct 2021 21:37:23 +0800	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

hi,

> On 10/29/21 03:57, Xiaoguang Wang wrote:
>> hi,
>>
>>> On 10/25/21 06:38, Xiaoguang Wang wrote:
>>>> Run echo_server to evaluate io_uring's multi-shot poll performance, 
>>>> perf
>>>> shows that add_wait_queue() has obvious overhead. Intruduce a new 
>>>> state
>>>> 'active' in io_poll_iocb to indicate whether io_poll_wake() should 
>>>> queue
>>>> a task_work. This new state will be set to true initially, be set 
>>>> to false
>>>> when starting to queue a task work, and be set to true again when a 
>>>> poll
>>>> cqe has been committed. One concern is that this method may lost 
>>>> waken-up
>>>> event, but seems it's ok.
>>>>
>>>>    io_poll_wake                io_poll_task_func
>>>> t1                       |
>>>> t2                       | WRITE_ONCE(req->poll.active, true);
>>>> t3                       |
>>>> t4                       |    io_commit_cqring(ctx);
>>>> t5                       |
>>>> t6                       |
>>>>
>>>> If waken-up events happens before or at t4, it's ok, user app will 
>>>> always
>>>> see a cqe. If waken-up events happens after t4 and IIUC, 
>>>> io_poll_wake()
>>>> will see the new req->poll.active value by using READ_ONCE().
>>>>
>>>> Echo_server codes can be cloned from:
>>>> https://codeup.openanolis.cn/codeup/storage/io_uring-echo-server.git,
>>>> branch is xiaoguangwang/io_uring_multishot.
>>>>
>>>> Without this patch, the tps in our test environment is 284116, with
>>>> this patch, the tps is 287832, about 1.3% reqs improvement, which
>>>> is indeed in accord with the saved add_wait_queue() cost.
>>>>
>>>> Signed-off-by: Xiaoguang Wang <[email protected]>
>>>> ---
>>>>   fs/io_uring.c | 57 
>>>> +++++++++++++++++++++++++++++++++------------------------
>>>>   1 file changed, 33 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>> index 18af9bb9a4bc..e4c779dac953 100644
>>>> --- a/fs/io_uring.c
>>>> +++ b/fs/io_uring.c
>>>> @@ -481,6 +481,7 @@ struct io_poll_iocb {
>>>>       __poll_t            events;
>>>>       bool                done;
>>>>       bool                canceled;
>>>> +    bool                active;
>>>>       struct wait_queue_entry        wait;
>>>>   };
>>>>   @@ -5233,8 +5234,6 @@ static inline int __io_async_wake(struct 
>>>> io_kiocb *req, struct io_poll_iocb *pol
>>>>   {
>>>>       trace_io_uring_task_add(req->ctx, req->opcode, 
>>>> req->user_data, mask);
>>>>   -    list_del_init(&poll->wait.entry);
>>>> -
>>>
>>> As I mentioned to Hao some time ago, we can't allow this function or in
>>> particular io_req_task_work_add() to happen twice before the first
>>> task work got executed, they use the same field in io_kiocb and those
>>> will corrupt the tw list.
>>>
>>> Looks that's what can happen here.
>> If I have understood scenario your described correctly, I think it 
>> won't happen :)
>> With this patch, if the first io_req_task_work_add() is issued, 
>> poll.active
>> will be set to false, then no new io_req_task_work_add() will be issued.
>> Only the first task_work installed by the first 
>> io_req_task_work_add() has
>> completed, poll.active will be set to true again.
>
> Ah, I see now, the active dance is in io_poll_wake(). That won't work
> with READ_ONCE/WRITE_ONCE though, you would need real atomics
>
> The easiest race to explain is:
>
> CPU1                        | CPU2
> io_poll_wake                | io_poll_wake
> if (p->active) return 0;    | if (p->active) return 0;
it's "if (!p->active) return 0;" in my patch :)
> // p->active is false in both cases, continue
> p->active = false;          | p->active = false;
> task_work_add()             | task_work_add()
io_poll_wake() is called with poll->head->lock, so there will no concurrent
io_poll_wake() calls.

Regards,
Xiaoguang Wang
>
>
> But there are more subtle races.
>


  reply	other threads:[~2021-10-29 13:37 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-25  5:38 [PATCH v3 0/3] improvements for multi-shot poll requests Xiaoguang Wang
2021-10-25  5:38 ` [PATCH v3 1/3] io_uring: refactor event check out of __io_async_wake() Xiaoguang Wang
2021-10-25  9:35   ` Praveen Kumar
2021-10-25  5:38 ` [PATCH v3 2/3] io_uring: reduce frequent add_wait_queue() overhead for multi-shot poll request Xiaoguang Wang
2021-10-28 19:21   ` Pavel Begunkov
2021-10-29  2:57     ` Xiaoguang Wang
2021-10-29 10:02       ` Pavel Begunkov
2021-10-29 13:37         ` Xiaoguang Wang [this message]
2021-10-29 13:47           ` Pavel Begunkov
2021-10-29 14:12           ` Pavel Begunkov
2021-10-29 14:34             ` Xiaoguang Wang
2021-10-25  5:38 ` [PATCH v3 3/3] io_uring: don't get completion_lock in io_poll_rewait() Xiaoguang Wang
2021-10-28 19:26   ` Pavel Begunkov
2021-10-29  5:59     ` Xiaoguang Wang
2021-10-28 18:19 ` [PATCH v3 0/3] improvements for multi-shot poll requests Jens Axboe
2021-10-29 18:29   ` Jens Axboe
2021-10-28 18:19 ` Jens Axboe
2021-10-28 19:01   ` Pavel Begunkov
2021-10-28 19:04     ` Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bede0dc1-dc6f-3674-9650-e0f3e60c3d01@linux.alibaba.com \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox