From: Olivier Langlois <[email protected]>
To: Pavel Begunkov <[email protected]>, [email protected]
Cc: [email protected]
Subject: Re: io_uring NAPI busy poll RCU is causing 50 context switches/second to my sqpoll thread
Date: Thu, 01 Aug 2024 18:02:49 -0400
Message-ID: <4dbbd36aa7ecd1ce7a6289600b5655563e4a5a74.camel@trillion01.com>
In-Reply-To: <[email protected]>

On Wed, 2024-07-31 at 01:33 +0100, Pavel Begunkov wrote:
> 
> You're seeing something that doesn't make much sense to me, and we
> need to understand what that is. There might be a bug _somewhere_,
> that's always a possibility, but before saying that let's get a bit
> more data.
> 
> While the app is working, can you grab a profile and run mpstat for
> the CPU on which you have the SQPOLL task?
> 
> perf record -g -C <CPU number> --all-kernel &
> mpstat -u -P <CPU number> 5 10 &
> 
> And then as usual, time it so that you have some activity going on,
> mpstat interval may need adjustments, and perf report it as before.
> 
First things first.

The other day, I put my foot in my mouth by saying that NAPI busy
poll was adding 50 context switches/second.

I caused that behavior myself with the rcu_nocb_poll kernel boot
parameter. I have removed the option and the context switches went
away...

I am clearly outside my comfort zone with this project. I am trying
things without fully understanding what I am doing, and I am making
errors and saying things that are incorrect.

On top of that, before mentioning io_uring RCU usage, I did not realize
that net/core was already using RCU massively, including in
napi_busy_poll. Therefore, the point that io_uring is using RCU before
calling napi_busy_poll does seem very moot.

This is what I did the other day, and I want to apologize for having
said something incorrect.

That being said, it does not remove the possible merit of what I
proposed.

I really think that the current io_uring implementation of the NAPI
device tracking strategy is overkill for a lot of scenarios...

If some sort of abstract interface were present, like a mini struct
net_device_ops with 3-4 function pointers, where the user could select
between the standard dynamic tracking and a manual lightweight
tracking, that would be very cool... so cool...
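
To make the idea concrete, here is a minimal user-space sketch of what
such an ops table might look like. Every name in it (napi_track_ops,
napi_tracker, dynamic_ops, manual_ops, STALE_AFTER) is hypothetical and
made up for illustration; none of it is existing io_uring or net/core
code, and the 60-tick staleness timeout is an arbitrary assumption.

```c
/* Hypothetical sketch only: these names are NOT an existing kernel API. */
#include <assert.h>
#include <stddef.h>

#define MAX_NAPI_IDS 8
#define STALE_AFTER  60UL   /* assumed timeout, in arbitrary ticks */

struct napi_track_ops;

struct napi_tracker {
	const struct napi_track_ops *ops;
	unsigned int  ids[MAX_NAPI_IDS];
	unsigned long last_seen[MAX_NAPI_IDS];  /* dynamic strategy only */
	size_t count;
};

struct napi_track_ops {
	/* a socket with this NAPI id was seen on the request path */
	void (*observe)(struct napi_tracker *t, unsigned int id, unsigned long now);
	/* explicit registration / removal requested by the user */
	int  (*add)(struct napi_tracker *t, unsigned int id);
	int  (*del)(struct napi_tracker *t, unsigned int id);
	/* drop entries that have not been seen recently */
	void (*expire)(struct napi_tracker *t, unsigned long now);
};

static int find_id(const struct napi_tracker *t, unsigned int id)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->ids[i] == id)
			return (int)i;
	return -1;
}

static int insert_id(struct napi_tracker *t, unsigned int id, unsigned long now)
{
	int i = find_id(t, id);

	if (i >= 0) {                       /* already tracked: refresh */
		t->last_seen[i] = now;
		return 0;
	}
	if (t->count == MAX_NAPI_IDS)
		return -1;
	t->ids[t->count] = id;
	t->last_seen[t->count] = now;
	t->count++;
	return 0;
}

static int remove_id(struct napi_tracker *t, unsigned int id)
{
	int i = find_id(t, id);

	if (i < 0)
		return -1;
	t->count--;                         /* move the last entry into the hole */
	t->ids[i] = t->ids[t->count];
	t->last_seen[i] = t->last_seen[t->count];
	return 0;
}

/* Dynamic strategy: ids are learned and aged out automatically. */
static void dyn_observe(struct napi_tracker *t, unsigned int id, unsigned long now)
{
	(void)insert_id(t, id, now);
}
static int dyn_add(struct napi_tracker *t, unsigned int id) { (void)t; (void)id; return -1; }
static int dyn_del(struct napi_tracker *t, unsigned int id) { (void)t; (void)id; return -1; }
static void dyn_expire(struct napi_tracker *t, unsigned long now)
{
	size_t i = 0;

	while (i < t->count) {
		if (now - t->last_seen[i] > STALE_AFTER)
			remove_id(t, t->ids[i]);    /* re-check slot i afterwards */
		else
			i++;
	}
}

/* Manual strategy: no per-request work, no aging; the user owns the list. */
static void man_observe(struct napi_tracker *t, unsigned int id, unsigned long now)
{
	(void)t; (void)id; (void)now;
}
static int man_add(struct napi_tracker *t, unsigned int id) { return insert_id(t, id, 0); }
static int man_del(struct napi_tracker *t, unsigned int id) { return remove_id(t, id); }
static void man_expire(struct napi_tracker *t, unsigned long now) { (void)t; (void)now; }

static const struct napi_track_ops dynamic_ops = { dyn_observe, dyn_add, dyn_del, dyn_expire };
static const struct napi_track_ops manual_ops  = { man_observe, man_add, man_del, man_expire };

static void tracker_init(struct napi_tracker *t, const struct napi_track_ops *ops)
{
	assert(ops != NULL);
	t->ops = ops;
	t->count = 0;
}
```

With something like this, a ring set up with manual_ops would do
nothing in observe() and expire(), so the per-request and periodic
cost of the dynamic strategy would only be paid by users who select
dynamic_ops.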

I am definitely interested in running the profiler tools that you are
proposing... Most of my problems are resolved...

- I got rid of 99.9% of the NET_RX_SOFTIRQ
- I have significantly reduced the number of NET_TX_SOFTIRQ
  https://github.com/amzn/amzn-drivers/issues/316
- No more rcu context switches
- CPU2 is now nohz_full all the time
- The CPU1 local timer interrupt is raised once every 2-3 seconds from
  an unknown origin. Paul E. McKenney offered me his assistance on
  this issue:
  https://lore.kernel.org/rcu/[email protected]/t/#u

I am going to give perf record a second chance... but just keep in
mind that if it is not recording much, that does not mean that nothing
is happening. If perf relies on interrupts to operate properly, there
are close to 0 of them on my nohz_full CPU...

thx a lot for your help Pavel!

