From: Hao Xu <[email protected]>
To: Olivier Langlois <[email protected]>,
Jens Axboe <[email protected]>,
[email protected]
Subject: Re: napi_busy_poll
Date: Mon, 21 Feb 2022 13:03:48 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 2022/2/19 at 3:02 PM, Olivier Langlois wrote:
> On Fri, 2022-02-18 at 15:41 +0800, Hao Xu wrote:
>>
>> Hi Olivier,
>>
>> Have you tried issuing just one recv/pollin request and observing the
>> napi_id?
>
> Hi Hao, not precisely but you are 100% right about where the
> association is done. It is when a packet is received that the
> association is made. This happens in a few places but the most likely
> place where it can happen with my NIC (Intel igb) is inside
> napi_gro_receive().
Yes: when a packet is received, skb->napi_id is set; when a batch of
them is received, the skbs are delivered to the protocol layer and
sk->napi_id is set.
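
A minimal user-space sketch of that two-step propagation, assuming toy
structures that carry only the napi_id field. All toy_* names are
hypothetical; in the kernel the corresponding helpers are, as far as I
know, skb_mark_napi_id() and sk_mark_napi_id() in
include/net/busy_poll.h, and they do more than this.

```c
/* Toy stand-ins for the kernel structures; only the field of interest. */
struct toy_napi { unsigned int napi_id; };
struct toy_skb  { unsigned int napi_id; };
struct toy_sock { unsigned int napi_id; };

/* Step 1 (driver RX path): stamp the packet with the id of the NAPI
 * context that received it. */
static void toy_mark_skb(struct toy_skb *skb, const struct toy_napi *napi)
{
    skb->napi_id = napi->napi_id;
}

/* Step 2 (protocol layer): copy the id from the packet to the socket,
 * so a later busy poll knows which queue to poll. */
static void toy_mark_sock(struct toy_sock *sk, const struct toy_skb *skb)
{
    if (sk->napi_id != skb->napi_id)
        sk->napi_id = skb->napi_id;
}
```

Until toy_mark_sock() has run at least once, the socket's id stays 0,
which is why the first recv request has nothing to poll on.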
>
> I do verify the socket napi_id once a WebSocket session is established.
> At that point a lot of packets going back and forth have been
> exchanged:
>
> TCP handshake
> TLS handshake
> HTTP request requesting a WS upgrade
>
> At that point, the napi_id has been assigned.
>
> My problem was only that my socket packets were routed on the loopback
> interface which has no napi devices associated to it.
>
> I did remove the local SOCKS proxy out of my setup and NAPI ids started
> to appear as expected.
>
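
For reference, one way to do this kind of check from user space is the
SO_INCOMING_NAPI_ID socket option. This is only a sketch: it assumes a
kernel built with CONFIG_NET_RX_BUSY_POLL, and query_napi_id() is a
hypothetical helper name, not something from the patch.

```c
#include <errno.h>
#include <sys/socket.h>

#ifndef SO_INCOMING_NAPI_ID
/* Assumption for older headers: 56 is the asm-generic value; a few
 * architectures use a different number. */
#define SO_INCOMING_NAPI_ID 56
#endif

/*
 * Hypothetical helper: store the socket's NAPI id in *id.
 * Returns 0 on success or a negative errno value on failure.
 * An id of 0 means no NAPI id has been recorded for the socket,
 * e.g. because its traffic only went through the loopback interface.
 */
static int query_napi_id(int fd, unsigned int *id)
{
    socklen_t len = sizeof(*id);

    if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_NAPI_ID, id, &len) < 0)
        return -errno;
    return 0;
}
```

Calling it right after connect() and again once some data has been
received should show the id going from 0 to a non-zero value on a NIC
with NAPI support.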
>> From my understanding of the network stack, the napi_id
>> of a socket won't be valid until it gets some packets, because before
>> that moment busy_poll doesn't know which hw queue to poll.
>> In other words, the idea of NAPI polling is: the packets of a socket
>> can come from any hw queue of a net adapter, but we only poll the
>> queue which just received some data. So to get this piece of info,
>> some data must arrive on one queue before doing the
>> busy_poll. Correct me if I'm wrong since I'm also a newbie at
>> network stuff...
>
> I am now getting what you mean here. So there are 2 possible
> approaches. You either:
>
> 1. add the napi id when you are sure that it is available after its
> setting in the sock layer but you are not sure if it will be needed
> again with future requests as it is too late to be of any use for the
> current request (unless it is a MULTISHOT poll) (the add is performed
> in io_poll_task_func() and io_apoll_task_func())
>
> 2. add the napi id when the request poll is armed where this knowledge
> could be leveraged to handle the current req knowing that we might fail
> getting the id if it is the initial recv request. (the add would be
> performed in __io_arm_poll_handler)
I explain this in the patch.
>
> TBH, I am not sure if there are arguments in favor of one approach over
> the other... Maybe option #1 is the only one to make napi busy poll
> work correctly with MULTISHOT requests...
>
> I'll let you think about this point... Your first choice might be the
> right one...
>
> the other thing to consider when choosing the call site is locking...
> when done from __io_arm_poll_handler(), uring_lock is acquired...
>
> I am not sure that this is always the case with
> io_poll_task_func/io_apoll_task_func...
>
> I'll post v1 of the patch. My testing is showing that it works fine.
> A race condition is not an issue when busy poll is performed by the sqpoll
> thread, because the list modification is performed exclusively by that
> thread too.
>
> but I think that there is a possible race condition where the napi_list
> could be used from io_cqring_wait() while another thread modifies the
> list. This is NOT done in my testing scenario but definitely something
> that could happen somewhere in the real world...
Will there be any issue if we do the access with
list_for_each_entry_safe? I think it is safe enough.
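
To make the question concrete, here is a user-space sketch of what the
_safe style of iteration provides. It is not the kernel macro: the
toy_* list below is a hypothetical stand-in for list_head. The point is
that the next pointer is saved before the loop body runs, so the
iterating code itself may delete the current entry. By itself that does
not serialize against another thread changing the list, so whether it
is enough depends on whether napi_list can be modified concurrently.

```c
#include <stddef.h>

/* Toy circular doubly linked list; the head is a sentinel entry. */
struct toy_entry {
    struct toy_entry *next, *prev;
    unsigned int napi_id;
};

static void toy_list_init(struct toy_entry *head)
{
    head->next = head->prev = head;
}

static void toy_list_add_tail(struct toy_entry *e, struct toy_entry *head)
{
    e->prev = head->prev;
    e->next = head;
    head->prev->next = e;
    head->prev = e;
}

static void toy_list_del(struct toy_entry *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e->prev = NULL; /* poison, so stale use is easy to spot */
}

/* Remove every entry with the given id; returns how many were removed.
 * 'n' caches the next entry before 'pos' may be deleted, which is the
 * same idea as list_for_each_entry_safe(). */
static int toy_remove_id(struct toy_entry *head, unsigned int id)
{
    struct toy_entry *pos, *n;
    int removed = 0;

    for (pos = head->next, n = pos->next; pos != head;
         pos = n, n = pos->next) {
        if (pos->napi_id == id) {
            toy_list_del(pos);
            removed++;
        }
    }
    return removed;
}

/* Count the entries currently on the list. */
static int toy_count(const struct toy_entry *head)
{
    const struct toy_entry *pos;
    int count = 0;

    for (pos = head->next; pos != head; pos = pos->next)
        count++;
    return count;
}
```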
>
>>
>> I was considering polling all the rx rings, but it seemed to be
>> inefficient according to some tests by my colleague.
>
> This is definitely the simplest implementation but I did not go as far
> as testing it. There are too many unknown variables for it to be viable IMHO. I
> am not too sure how many napi devices there can be in a typical server.
> I know that in my test machine, it has 2 NICs and one of them is just
> unconnected. If we were to loop through all the devices, we would be
> polling wastefully at least half of all the devices on the system. That
> does not sound like a very good approach.
>