From: Jens Axboe <[email protected]>
To: Xuan Zhuo <[email protected]>,
io-uring <[email protected]>
Cc: [email protected]
Subject: Re: [RFC] io_commit_cqring __io_cqring_fill_event take up too much cpu
Date: Mon, 22 Jun 2020 08:50:20 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 6/22/20 7:29 AM, Xuan Zhuo wrote:
> Hi Jens,
> I found a problem that I think needs to be solved. The change may be
> relatively large, though, so I would like to ask you and everyone else for
> opinions, or for other ideas about this issue.
>
> Problem description:
> ===================
> I found that in sq thread mode, io_commit_cqring and __io_cqring_fill_event
> account for a relatively large share of CPU time. The reason is the large
> number of calls to smp_store_release and WRITE_ONCE. These two operations
> are relatively slow, and we need to call smp_store_release every time we
> post a cqe, so with this many calls the cost becomes very prominent.
>
> My test environment is qemu, using io_uring in sq thread mode to receive a
> large number of UDP packets at 800000 pps. I submitted 100 sqes to recv UDP
> packets when the application started, and every time I received a cqe, I
> submitted another sqe. The perf top result for the sq thread is as follows:
>
>
>
> 17.97% [kernel] [k] copy_user_generic_unrolled
> 13.92% [kernel] [k] io_commit_cqring
> 11.04% [kernel] [k] __io_cqring_fill_event
> 10.33% [kernel] [k] udp_recvmsg
> 5.94% [kernel] [k] skb_release_data
> 4.31% [kernel] [k] udp_rmem_release
> 2.68% [kernel] [k] __check_object_size
> 2.24% [kernel] [k] __slab_free
> 2.22% [kernel] [k] _raw_spin_lock_bh
> 2.21% [kernel] [k] kmem_cache_free
> 2.13% [kernel] [k] free_pcppages_bulk
> 1.83% [kernel] [k] io_submit_sqes
> 1.38% [kernel] [k] page_frag_free
> 1.31% [kernel] [k] inet_recvmsg
>
>
>
> It can be seen that io_commit_cqring and __io_cqring_fill_event account
> for 24.96%. This is too much. In general, the proportion of syscall may not
> be so high, so we must solve this problem.
>
>
> Solution:
> =================
> My idea is that when the nr passed to io_submit_sqes is large, we don't call
> io_cqring_add_event directly. Instead, we put each completed req on a queue,
> then call __io_cqring_fill_event for each queued req and call
> io_commit_cqring once at the end of io_submit_sqes. In my simple local test
> this looks good.

I think the solution here is to defer the cq ring filling + commit to the
caller instead of doing it deep down the stack. That's a nice win in general.
To do that, we need to be able to do it after io_submit_sqes() has been
called. We can either do that inline, by passing down a list or struct
that allows the caller to place the request there instead of filling
the event, or out-of-band by having e.g. a percpu struct that allows the
same thing. In both cases, the actual call site would do something a la:

	if (comp_list && successful_completion) {
		req->result = ret;
		list_add_tail(&req->list, comp_list);
	} else {
		io_cqring_add_event(req, ret);
		if (!successful_completion)
			req_set_fail_links(req);
		io_put_req(req);
	}

and then have the caller iterate the list and fill completions, if it's
non-empty on return.
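
To make the shape of that concrete, here is a minimal userspace sketch of
the inline variant. It is a model, not kernel code: struct ring, struct req,
struct comp_list, fill_event(), commit(), complete_req(),
flush_completions() and submit_batch() are all hypothetical stand-ins, the
release store is a C11 atomic rather than smp_store_release(), and the
failure-path work (req_set_fail_links(), io_put_req()) is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CQ_ENTRIES 8	/* hypothetical ring size, must be a power of two */

struct cqe { int res; };

struct ring {
	struct cqe cqes[CQ_ENTRIES];
	unsigned cached_tail;	/* producer-private tail */
	_Atomic unsigned tail;	/* tail as published to the consumer */
	unsigned commits;	/* number of release stores, for illustration */
};

struct req {
	int result;
	struct req *next;	/* stand-in for the kernel's list_head */
};

/* Singly linked FIFO of requests whose completions are deferred. */
struct comp_list { struct req *head, **tailp; };

static void comp_list_init(struct comp_list *cl)
{
	cl->head = NULL;
	cl->tailp = &cl->head;
}

/* Write one completion into the ring without publishing it. */
static void fill_event(struct ring *r, int res)
{
	r->cqes[r->cached_tail++ & (CQ_ENTRIES - 1)].res = res;
}

/* Publish everything filled so far with a single release store. */
static void commit(struct ring *r)
{
	atomic_store_explicit(&r->tail, r->cached_tail, memory_order_release);
	r->commits++;
}

/* Call site: defer if the caller passed a list, else fill + commit now. */
static void complete_req(struct ring *r, struct comp_list *cl,
			 struct req *req, int ret, int ok)
{
	assert(req != NULL);
	if (cl && ok) {
		req->result = ret;
		req->next = NULL;
		*cl->tailp = req;
		cl->tailp = &req->next;
	} else {
		fill_event(r, ret);
		commit(r);
	}
}

/* Caller side: fill every deferred completion, then commit once. */
static unsigned flush_completions(struct ring *r, struct comp_list *cl)
{
	unsigned n = 0;

	for (struct req *req = cl->head; req; req = req->next, n++)
		fill_event(r, req->result);
	if (n)
		commit(r);
	comp_list_init(cl);
	return n;
}

/* Complete nr requests inline and return how many commits that cost. */
static unsigned submit_batch(struct ring *r, struct req *reqs, unsigned nr,
			     int batched)
{
	struct comp_list cl;
	unsigned before = r->commits;

	comp_list_init(&cl);
	for (unsigned i = 0; i < nr; i++)
		complete_req(r, batched ? &cl : NULL, &reqs[i], (int)i, 1);
	if (cl.head)
		flush_completions(r, &cl);
	return r->commits - before;
}
```

In this model, calling submit_batch() with batched == 0 does one release
store per completed request, while batched == 1 does a single release store
for the whole batch. The sketch does no overflow checking, so nr has to stay
within the free space of the ring.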

I don't think this is necessarily hard, but to do it nicely it will
touch a bunch of code and hence be quite a bit of churn. I do think the
reward is worth it though, as this applies to the "normal" submission
path as well, not just the SQPOLL variant.

--
Jens Axboe