From: Olivier Langlois <[email protected]>
To: Pavel Begunkov <[email protected]>, [email protected]
Subject: Re: io_uring NAPI busy poll RCU is causing 50 context switches/second to my sqpoll thread
Date: Sat, 03 Aug 2024 10:15:05 -0400
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On Fri, 2024-08-02 at 16:22 +0100, Pavel Begunkov wrote:
> >
> > I am definitely interested in running the profiler tools that you
> > are proposing... Most of my problems are resolved...
> >
> > - I got rid of 99.9% of the NET_RX_SOFTIRQ
> > - I have reduced significantly the number of NET_TX_SOFTIRQ
> >   https://github.com/amzn/amzn-drivers/issues/316
> > - No more rcu context switches
> > - CPU2 is now nohz_full all the time
> > - CPU1 local timer interrupt is raised once every 2-3 seconds for an
> >   unknown origin. Paul E. McKenney did offer me his assistance on
> >   this issue:
> >   https://lore.kernel.org/rcu/[email protected]/t/#u
>
> And I was just going to propose to ask Paul, but great to
> see you beat me to it
>
My investigation has progressed... my CPU1 interrupts are nvme block
device interrupts.
I feel that for questions about block device drivers, this time, I am
ringing at the experts' door!
What is the meaning of an nvme interrupt? I am assuming that it is
there to signal the completion of writing blocks to the device... I am
currently looking in the code to find the answer.
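In the meantime, here is my mental model of the completion path as a
toy sketch (illustrative pseudo-kernel C with made-up names, NOT the
actual drivers/nvme/host/pci.c code): one MSI-X vector per queue pair,
with the device posting a completion-queue entry for every finished
command (reads as well as writes) before firing the vector:

/* Toy sketch of my mental model, not the real driver code. */
#include <linux/interrupt.h>
#include <linux/types.h>

struct toy_cqe {
	u16 command_id;
	u16 status;		/* bit 0 is the phase tag */
};

struct toy_nvme_queue {
	struct toy_cqe *cqes;	/* completion ring (DMA memory) */
	u16 head, depth;
	u8 phase;		/* starts at 1 after queue creation */
};

static bool toy_cqe_pending(struct toy_nvme_queue *q)
{
	/* The device flips the phase tag each time it wraps the ring,
	 * so a CQE whose phase matches ours is a fresh completion. */
	return (q->cqes[q->head].status & 1) == q->phase;
}

static irqreturn_t toy_nvme_irq(int irq, void *data)
{
	struct toy_nvme_queue *q = data;
	bool found = false;

	while (toy_cqe_pending(q)) {
		/* look up the request by command_id and complete it,
		 * waking whichever task is waiting on the I/O ... */
		if (++q->head == q->depth) {
			q->head = 0;
			q->phase ^= 1;
		}
		found = true;
	}
	/* ... then write q->head to the CQ head doorbell register so
	 * the device knows how far we have consumed. */
	return found ? IRQ_HANDLED : IRQ_NONE;
}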
Next, it seems to me that there is an odd number of interrupt vectors
for the device:
            CPU0       CPU1       CPU2       CPU3
 63:         12          0          0          0  PCI-MSIX-0000:00:04.0   0-edge  nvme0q0
 64:          0      23336          0          0  PCI-MSIX-0000:00:04.0   1-edge  nvme0q1
 65:          0          0          0      33878  PCI-MSIX-0000:00:04.0   2-edge  nvme0q2
Why 3? Why not 4, one for each CPU? If there were 4, I would have
concluded that the driver creates a queue for each CPU...
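The only way I can make the arithmetic work is if queue 0 is not an
ordinary I/O queue. A toy sketch of that guess (hypothetical names,
nothing taken from the real driver):

/* Guess at how the vector count could end up at 3: queue 0 might be
 * a special queue, and the number of I/O queues might be capped by
 * the controller rather than by the CPU count. */
static unsigned int toy_nr_vectors(unsigned int nr_cpus,
				   unsigned int controller_io_queue_limit)
{
	unsigned int io_queues = nr_cpus < controller_io_queue_limit ?
				 nr_cpus : controller_io_queue_limit;

	return 1 + io_queues;	/* special q0 + I/O queues: 1 + 2 = 3 */
}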
How are the queues associated with a given request/task? The file I/O
is made by threads running on CPU3, so I find it surprising that
nvme0q1 is chosen...
One noteworthy detail is that the process main thread is on CPU1. In my
flawed mental model of one queue per CPU, there could be some sort of
magical association between a process file descriptor table and the
chosen block device queue, but that idea does not hold... What would
happen to processes running on CPU2?
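My best remaining guess, as a toy model (illustrative C with made-up
names, not the real block/blk-mq.c code): if queue 0 is special and
only 2 I/O queues exist for 4 CPUs, the block layer would have to
group CPUs onto queues, and the queue would be picked from the CPU
that submits the I/O, not from the process or its file descriptors:

/* Toy CPU-to-queue map: CPUs are spread evenly over the available
 * I/O queues, and each submission uses the queue mapped to the CPU
 * it runs on.  Under this model CPU0/CPU1 would share nvme0q1 and
 * CPU2/CPU3 would share nvme0q2, which would explain q1 interrupts
 * landing on CPU1 whenever anything (writeback, the main thread, ...)
 * submits I/O from CPU0 or CPU1. */
#include <stdio.h>

#define NR_CPUS 4
#define NR_IO_QUEUES 2

static int cpu_to_queue[NR_CPUS];

static void build_map(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_to_queue[cpu] = cpu * NR_IO_QUEUES / NR_CPUS;
}

static int submit_io(int submitting_cpu)
{
	/* chosen by the submitting CPU, not by the file descriptor */
	return cpu_to_queue[submitting_cpu];
}

int main(void)
{
	build_map();
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		printf("CPU%d -> nvme0q%d\n", cpu, submit_io(cpu) + 1);
	return 0;
}

I will keep digging to see whether the real mapping works anything
like this.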