public inbox for [email protected]
From: Hao Xu <[email protected]>
To: Olivier Langlois <[email protected]>,
	Jens Axboe <[email protected]>,
	[email protected]
Subject: Re: napi_busy_poll
Date: Mon, 21 Feb 2022 13:03:48 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 2022/2/19 3:02 PM, Olivier Langlois wrote:
> On Fri, 2022-02-18 at 15:41 +0800, Hao Xu wrote:
>>
>> Hi Olivier,
>>
>> Have you tried issuing just one recv/pollin request and observing the
>> napi_id?
> 
> Hi Hao, not precisely, but you are 100% right about where the
> association is done. It is when a packet is received that the
> association is made. This happens in a few places, but the most likely
> place where it can happen with my NIC (Intel igb) is inside
> napi_gro_receive().

Yes. When a packet is received, skb->napi_id is set. When a batch of
packets has been received, the skbs are delivered to the protocol layer
and sk->napi_id is set.
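
To make the two steps concrete, here is a minimal userspace model, not
kernel code. The model_* names and structs are hypothetical stand-ins
for struct sk_buff and struct sock, and treating 0 as "no id yet" is a
simplifying assumption of the sketch.

```c
/* Hypothetical stand-ins for struct sk_buff and struct sock. */
struct model_skb  { unsigned int napi_id; };
struct model_sock { unsigned int napi_id; };

/* Step 1: the receive path stamps the packet with the id of the
 * NAPI instance (hw queue) that it arrived on. */
static void model_skb_mark_napi_id(struct model_skb *skb, unsigned int napi_id)
{
    skb->napi_id = napi_id;
}

/* Step 2: when the packet is delivered to the protocol layer, the
 * socket inherits the packet's id. */
static void model_sk_mark_napi_id(struct model_sock *sk,
                                  const struct model_skb *skb)
{
    sk->napi_id = skb->napi_id;
}

/* In this model, busy poll only knows which queue to poll once the
 * socket carries a nonzero id (assumption: 0 means "not set"). */
static int model_sk_can_busy_poll(const struct model_sock *sk)
{
    return sk->napi_id != 0;
}
```

In this model a fresh socket cannot be busy polled until one stamped
packet has been delivered to it, and a packet that was never stamped
(as on a device without NAPI) leaves the socket id at 0.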
> 
> I do verify the socket napi_id once a WebSocket session is established.
> At that point a lot of packets going back and forth have been
> exchanged:
> 
> TCP handshake
> TLS handshake
> HTTP request requesting a WS upgrade
> 
> At that point, the napi_id has been assigned.
> 
> My problem was only that my socket packets were routed over the loopback
> interface, which has no napi devices associated with it.
> 
> Once I removed the local SOCKS proxy from my setup, NAPI ids started
> to appear as expected.
> 
>> From my understanding of the network stack, the napi_id of a socket
>> won't be valid until it gets some packets, because before that moment
>> busy_poll doesn't know which hw queue to poll.
>>
>> In other words, the idea of NAPI polling is: the packets of a socket
>> can come from any hw queue of a net adapter, but we only poll the
>> queue which just received some data. So to get this piece of info,
>> some data must arrive on one queue before doing the busy_poll.
>> Correct me if I'm wrong, since I'm also a newbie at network stuff...
> 
> I now see what you mean here. So there are 2 possible approaches. You
> either:
> 
> 1. add the napi id once you are sure that it is available, i.e. after
> it has been set in the sock layer. At that point it is too late to be of
> any use for the current request (unless it is a MULTISHOT poll), and you
> cannot be sure that it will be needed again by future requests. (The add
> is performed in io_poll_task_func() and io_apoll_task_func().)
> 
> 2. add the napi id when the poll request is armed, where this knowledge
> could be leveraged to handle the current req, knowing that we might fail
> to get the id if it is the initial recv request. (The add would be
> performed in __io_arm_poll_handler().)
I explain this in the patch.
> 
> TBH, I am not sure if there are arguments in favor of one approach over
> the other... Maybe option #1 is the only one to make napi busy poll
> work correctly with MULTISHOT requests...
> 
> I'll let you think about this point... Your first choice might be the
> right one...
> 
> the other thing to consider when choosing the call site is locking...
> when done from __io_arm_poll_handler(), uring_lock is acquired...
> 
> I am not sure that this is always the case with
> io_poll_task_func/io_apoll_task_func...
> 
> I'll post v1 of the patch. My testing shows that it works fine. The
> race condition is not an issue when busy poll is performed by the
> sqpoll thread, because the list modification is exclusively performed
> by that thread too.
> 
> But I think that there is a possible race condition where the napi_list
> could be used from io_cqring_wait() while another thread modifies the
> list. This does NOT happen in my testing scenario, but it is definitely
> something that could happen somewhere in the real world...

Will there be any issue if we do the access with
list_for_each_entry_safe? I think it is safe enough.
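
As a rough illustration of what the _safe iteration pattern gives us,
here is a userspace sketch. It does not use the kernel's list.h; the
list helpers and struct napi_entry (with its field names) are
hypothetical stand-ins. The loop saves the next pointer before the body
runs, so the body may unlink and free the current entry. That is the
only property shown here; the sketch says nothing about a second thread
changing the list at the same time.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical entry type; the field names are assumptions. */
struct napi_entry {
    unsigned int napi_id;
    struct list_head list;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Append a new entry at the tail; returns 0, or -1 if malloc() fails. */
static int napi_list_add(struct list_head *head, unsigned int napi_id)
{
    struct napi_entry *e = malloc(sizeof(*e));

    if (!e)
        return -1;
    e->napi_id = napi_id;
    e->list.prev = head->prev;
    e->list.next = head;
    head->prev->next = &e->list;
    head->prev = &e->list;
    return 0;
}

/* Recover the containing entry from its embedded list node. */
static struct napi_entry *entry_of(struct list_head *pos)
{
    return (struct napi_entry *)((char *)pos -
                                 offsetof(struct napi_entry, list));
}

/* Remove every entry with the given id, using the "safe" pattern:
 * tmp holds the next node before pos can be unlinked and freed. */
static int napi_list_remove(struct list_head *head, unsigned int napi_id)
{
    struct list_head *pos, *tmp;
    int removed = 0;

    for (pos = head->next, tmp = pos->next; pos != head;
         pos = tmp, tmp = pos->next) {
        struct napi_entry *e = entry_of(pos);

        if (e->napi_id != napi_id)
            continue;
        pos->prev->next = pos->next;
        pos->next->prev = pos->prev;
        free(e);
        removed++;
    }
    return removed;
}

/* Count the entries currently on the list. */
static int napi_list_count(const struct list_head *head)
{
    const struct list_head *pos;
    int n = 0;

    for (pos = head->next; pos != head; pos = pos->next)
        n++;
    return n;
}
```

With ids 1, 2, 2, 3 on the list, removing id 2 unlinks both matching
entries in a single pass and leaves the other two in place.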

> 
>>
>> I was considering polling all the rx rings, but that seemed
>> inefficient in some tests by my colleague.
> 
> This is definitely the simplest implementation, but I did not go as far
> as testing it. There are too many unknown variables for it to be viable
> IMHO. I am not sure how many napi devices there can be in a typical
> server. My test machine has 2 NICs and one of them is simply
> unconnected. If we were to loop through all the devices, we would be
> wastefully polling at least half of the devices on the system. That
> does not sound like a very good approach.
> 

