* [axboe-block:io_uring-defer-tw.4] [io_uring] 61a5e20297: stress-ng.io-uring.ops_per_sec 41.9% regression
From: kernel test robot @ 2025-07-01 4:47 UTC
To: Jens Axboe; +Cc: oe-lkp, lkp, io-uring, oliver.sang
Hello,
kernel test robot noticed a 41.9% regression of stress-ng.io-uring.ops_per_sec on:
commit: 61a5e202971d4a242fc761728e89922edde02d38 ("io_uring: switch defer task_work to using a ring")
https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git io_uring-defer-tw.4
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: io-uring
	cpufreq_governor: performance
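
The same parameters reappear further down in the report as a slash-separated
job key: one line of field names followed by one line of values. The sketch
below pairs the two up so the configuration can be read field by field. The
two strings are copied from this report; the function name parse_job_key is
made up for illustration and is not part of lkp-tests.

```python
# Field names and values copied verbatim from the job-key lines of this report.
KEYS = ("compiler/cpufreq_governor/kconfig/nr_threads/rootfs/"
        "tbox_group/test/testcase/testtime:")
VALUES = ("gcc-12/performance/x86_64-rhel-9.4/100%/"
          "debian-12-x86_64-20240206.cgz/lkp-srf-2sp2/io-uring/stress-ng/60s")


def parse_job_key(keys: str, values: str) -> dict[str, str]:
    """Pair slash-separated field names with slash-separated values.

    Hypothetical helper: it assumes no individual value contains a '/',
    which holds for the job key in this report.
    """
    names = keys.rstrip(":").split("/")
    fields = values.split("/")
    if len(names) != len(fields):
        raise ValueError(f"{len(names)} names but {len(fields)} values")
    return dict(zip(names, fields))


job = parse_job_key(KEYS, VALUES)
for name, value in job.items():
    print(f"{name:>18}: {value}")
```

The tbox_group field (lkp-srf-2sp2) names the test machine group, which
matches the Sierra Forest system listed under "test machine" above.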
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202507010550.2d6f83ea-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250701/202507010550.2d6f83ea-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp2/io-uring/stress-ng/60s
commit:
  8559f3b41f ("io_uring: make task_work pending check dependent on ring type")
  61a5e20297 ("io_uring: switch defer task_work to using a ring")
8559f3b41fdcdd01 61a5e202971d4a242fc761728e8
---------------- ---------------------------
%stddev %change %stddev
\ | \
1022268 ± 2% -30.4% 711175 meminfo.Mapped
7.478e+09 +30.1% 9.727e+09 cpuidle..time
3.03e+08 -20.9% 2.398e+08 ± 3% cpuidle..usage
696425 ±171% +181.2% 1958387 ± 81% numa-meminfo.node0.Unevictable
940879 ± 10% -32.9% 631792 ± 14% numa-meminfo.node1.Mapped
43.50 ± 20% -73.9% 11.33 ± 53% perf-c2c.DRAM.local
32749 ± 10% -86.3% 4475 ± 29% perf-c2c.HITM.local
33251 ± 10% -85.0% 4989 ± 25% perf-c2c.HITM.total
14632245 ± 9% -38.5% 8999074 ± 7% numa-numastat.node0.local_node
14749610 ± 9% -38.6% 9056826 ± 6% numa-numastat.node0.numa_hit
21106190 ± 4% -37.5% 13198942 ± 4% numa-numastat.node1.local_node
21186924 ± 4% -37.0% 13339356 ± 4% numa-numastat.node1.numa_hit
43.02 ± 2% -12.5% 37.66 ± 2% vmstat.cpu.id
19.87 +121.8% 44.07 ± 2% vmstat.cpu.wa
73.14 +101.7% 147.54 ± 2% vmstat.procs.b
112.33 ± 2% -64.8% 39.60 ± 7% vmstat.procs.r
12695197 -38.2% 7849636 ± 4% vmstat.system.cs
5179340 ± 2% -24.4% 3915343 ± 4% vmstat.system.in
174059 ±171% +181.3% 489607 ± 81% numa-vmstat.node0.nr_unevictable
174060 ±171% +181.3% 489607 ± 81% numa-vmstat.node0.nr_zone_unevictable
14750003 ± 9% -38.6% 9057006 ± 6% numa-vmstat.node0.numa_hit
14632638 ± 9% -38.5% 8999253 ± 7% numa-vmstat.node0.numa_local
236391 ± 10% -33.3% 157713 ± 14% numa-vmstat.node1.nr_mapped
21186186 ± 4% -37.0% 13338387 ± 4% numa-vmstat.node1.numa_hit
21105453 ± 4% -37.5% 13197958 ± 4% numa-vmstat.node1.numa_local
41.57 -5.8 35.76 ± 2% mpstat.cpu.all.idle%
20.32 +25.1 45.43 ± 2% mpstat.cpu.all.iowait%
6.25 ± 4% -2.2 4.09 ± 6% mpstat.cpu.all.irq%
0.34 ± 4% -0.2 0.14 ± 6% mpstat.cpu.all.soft%
28.91 -15.5 13.40 ± 6% mpstat.cpu.all.sys%
2.62 -1.4 1.17 ± 6% mpstat.cpu.all.usr%
18.83 ± 5% -84.1% 3.00 mpstat.max_utilization.seconds
61.41 -30.1% 42.94 mpstat.max_utilization_pct
3.455e+08 -41.9% 2.006e+08 ± 4% stress-ng.io-uring.ops
5758736 -41.9% 3343243 ± 4% stress-ng.io-uring.ops_per_sec
63485668 -85.7% 9052788 ± 15% stress-ng.time.involuntary_context_switches
86971 -2.2% 85030 stress-ng.time.minor_page_faults
6021 -54.8% 2724 ± 6% stress-ng.time.percent_of_cpu_this_job_got
3383 -53.8% 1562 ± 6% stress-ng.time.system_time
248.17 -67.3% 81.18 ± 9% stress-ng.time.user_time
4.227e+08 -40.1% 2.531e+08 ± 4% stress-ng.time.voluntary_context_switches
2888857 ± 2% -8.1% 2654260 proc-vmstat.nr_active_anon
302955 -3.1% 293576 proc-vmstat.nr_anon_pages
3475920 ± 2% -6.5% 3250878 proc-vmstat.nr_file_pages
44207 -3.1% 42858 proc-vmstat.nr_kernel_stack
255933 ± 3% -30.6% 177546 proc-vmstat.nr_mapped
2586684 ± 3% -8.7% 2361525 proc-vmstat.nr_shmem
43152 -1.5% 42518 proc-vmstat.nr_slab_reclaimable
2888857 ± 2% -8.1% 2654260 proc-vmstat.nr_zone_active_anon
35939101 -37.7% 22399100 ± 3% proc-vmstat.numa_hit
35741003 -37.9% 22200912 ± 3% proc-vmstat.numa_local
585759 ± 5% -27.5% 424436 ± 8% proc-vmstat.numa_pte_updates
36196152 -37.5% 22624491 ± 3% proc-vmstat.pgalloc_normal
700860 ± 3% -7.0% 651538 ± 4% proc-vmstat.pgfault
32134448 -41.1% 18939637 ± 4% proc-vmstat.pgfree
16707904 -77.5% 3755057 ± 10% proc-vmstat.unevictable_pgs_culled
0.17 ± 4% +94.3% 0.32 ± 16% perf-stat.i.MPKI
2.698e+10 -40.1% 1.616e+10 ± 4% perf-stat.i.branch-instructions
0.92 -0.3 0.64 perf-stat.i.branch-miss-rate%
2.173e+08 -57.1% 93142321 ± 5% perf-stat.i.branch-misses
2.25 ± 4% +6.4 8.67 ± 17% perf-stat.i.cache-miss-rate%
1.262e+09 -68.8% 3.94e+08 ± 6% perf-stat.i.cache-references
13218006 -37.6% 8252620 ± 4% perf-stat.i.context-switches
3.40 -7.5% 3.15 ± 3% perf-stat.i.cpi
4.003e+11 -40.4% 2.384e+11 ± 5% perf-stat.i.cpu-cycles
5382764 -76.2% 1281759 ± 10% perf-stat.i.cpu-migrations
32980 ± 5% -25.9% 24437 ± 9% perf-stat.i.cycles-between-cache-misses
1.327e+11 -39.9% 7.973e+10 ± 4% perf-stat.i.instructions
0.33 +9.9% 0.36 ± 3% perf-stat.i.ipc
96.88 -48.8% 49.64 ± 4% perf-stat.i.metric.K/sec
8872 ± 4% -11.6% 7844 ± 4% perf-stat.i.minor-faults
8872 ± 4% -11.6% 7844 ± 4% perf-stat.i.page-faults
0.18 ± 3% +61.7% 0.29 ± 8% perf-stat.overall.MPKI
0.81 -0.2 0.58 perf-stat.overall.branch-miss-rate%
1.88 ± 3% +4.0 5.86 ± 9% perf-stat.overall.cache-miss-rate%
16903 ± 3% -38.3% 10426 ± 9% perf-stat.overall.cycles-between-cache-misses
2.655e+10 -40.1% 1.59e+10 ± 4% perf-stat.ps.branch-instructions
2.138e+08 -57.2% 91585587 ± 5% perf-stat.ps.branch-misses
1.241e+09 -68.8% 3.875e+08 ± 6% perf-stat.ps.cache-references
13003285 -37.6% 8120099 ± 4% perf-stat.ps.context-switches
3.938e+11 -40.5% 2.345e+11 ± 5% perf-stat.ps.cpu-cycles
5295095 -76.2% 1259803 ± 10% perf-stat.ps.cpu-migrations
1.306e+11 -39.9% 7.846e+10 ± 4% perf-stat.ps.instructions
8714 ± 4% -11.7% 7694 ± 4% perf-stat.ps.minor-faults
8714 ± 4% -11.7% 7694 ± 4% perf-stat.ps.page-faults
8.049e+12 -40.0% 4.829e+12 ± 4% perf-stat.total.instructions
879267 ± 3% -77.4% 198767 ± 46% sched_debug.cfs_rq:/.avg_vruntime.avg
2197261 ± 7% -80.3% 433455 ± 40% sched_debug.cfs_rq:/.avg_vruntime.max
702597 ± 3% -82.0% 126663 ± 48% sched_debug.cfs_rq:/.avg_vruntime.min
144651 ± 9% -75.7% 35081 ± 36% sched_debug.cfs_rq:/.avg_vruntime.stddev
0.38 ± 7% -79.6% 0.08 ± 20% sched_debug.cfs_rq:/.h_nr_queued.avg
2.92 ± 20% -65.7% 1.00 sched_debug.cfs_rq:/.h_nr_queued.max
0.61 ± 4% -57.2% 0.26 ± 9% sched_debug.cfs_rq:/.h_nr_queued.stddev
0.34 ± 6% -77.5% 0.08 ± 19% sched_debug.cfs_rq:/.h_nr_runnable.avg
2.92 ± 20% -65.7% 1.00 sched_debug.cfs_rq:/.h_nr_runnable.max
0.56 ± 5% -53.3% 0.26 ± 9% sched_debug.cfs_rq:/.h_nr_runnable.stddev
115895 ± 14% -93.3% 7740 ± 69% sched_debug.cfs_rq:/.left_deadline.avg
1148129 ± 31% -77.7% 255916 ± 52% sched_debug.cfs_rq:/.left_deadline.max
300169 ± 8% -87.0% 39025 ± 54% sched_debug.cfs_rq:/.left_deadline.stddev
115876 ± 14% -93.3% 7740 ± 69% sched_debug.cfs_rq:/.left_vruntime.avg
1147975 ± 31% -77.7% 255883 ± 52% sched_debug.cfs_rq:/.left_vruntime.max
300120 ± 8% -87.0% 39021 ± 54% sched_debug.cfs_rq:/.left_vruntime.stddev
2.08 ± 16% -100.0% 0.00 sched_debug.cfs_rq:/.load_avg.min
879267 ± 3% -77.4% 198767 ± 46% sched_debug.cfs_rq:/.min_vruntime.avg
2197261 ± 7% -80.3% 433455 ± 40% sched_debug.cfs_rq:/.min_vruntime.max
702597 ± 3% -82.0% 126663 ± 48% sched_debug.cfs_rq:/.min_vruntime.min
144651 ± 9% -75.7% 35081 ± 36% sched_debug.cfs_rq:/.min_vruntime.stddev
0.24 ± 5% -67.9% 0.08 ± 19% sched_debug.cfs_rq:/.nr_queued.avg
0.36 -27.4% 0.26 ± 9% sched_debug.cfs_rq:/.nr_queued.stddev
115876 ± 14% -93.3% 7740 ± 69% sched_debug.cfs_rq:/.right_vruntime.avg
1147975 ± 31% -77.7% 255883 ± 52% sched_debug.cfs_rq:/.right_vruntime.max
300120 ± 8% -87.0% 39021 ± 54% sched_debug.cfs_rq:/.right_vruntime.stddev
293.31 ± 2% -61.0% 114.35 ± 10% sched_debug.cfs_rq:/.runnable_avg.avg
114.75 ± 6% -100.0% 0.00 sched_debug.cfs_rq:/.runnable_avg.min
161.40 ± 3% +16.8% 188.44 ± 6% sched_debug.cfs_rq:/.runnable_avg.stddev
243.06 ± 2% -53.0% 114.20 ± 10% sched_debug.cfs_rq:/.util_avg.avg
111.42 ± 5% -100.0% 0.00 sched_debug.cfs_rq:/.util_avg.min
143.53 ± 4% +31.2% 188.36 ± 6% sched_debug.cfs_rq:/.util_avg.stddev
45.14 ± 5% -53.8% 20.87 ± 29% sched_debug.cfs_rq:/.util_est.avg
117.16 ± 9% -23.3% 89.81 ± 15% sched_debug.cfs_rq:/.util_est.stddev
460889 +78.9% 824600 ± 4% sched_debug.cpu.avg_idle.avg
545161 ± 4% +83.4% 1000000 sched_debug.cpu.avg_idle.max
7815 ± 7% -47.7% 4084 ± 14% sched_debug.cpu.avg_idle.min
96234 ± 8% +192.4% 281404 ± 13% sched_debug.cpu.avg_idle.stddev
754.64 ± 5% -19.2% 609.61 ± 9% sched_debug.cpu.clock_task.stddev
1016 ± 7% -74.3% 261.72 ± 25% sched_debug.cpu.curr->pid.avg
1648 -37.6% 1027 ± 14% sched_debug.cpu.curr->pid.stddev
0.00 ± 24% -27.7% 0.00 ± 10% sched_debug.cpu.next_balance.stddev
0.35 ± 10% -82.5% 0.06 ± 20% sched_debug.cpu.nr_running.avg
2.92 ± 20% -65.7% 1.00 sched_debug.cpu.nr_running.max
0.60 ± 6% -61.0% 0.23 ± 8% sched_debug.cpu.nr_running.stddev
2060126 -47.9% 1073009 ± 44% sched_debug.cpu.nr_switches.avg
2688437 -31.6% 1839609 ± 44% sched_debug.cpu.nr_switches.max
650892 ± 9% -43.0% 370926 ± 54% sched_debug.cpu.nr_switches.min
522908 ± 2% -49.9% 261974 ± 45% sched_debug.cpu.nr_switches.stddev
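
When reading the table above, note that the middle column mixes two kinds of
change. Most rows give a relative change in percent (with a % sign), while rows
whose metric is already a percentage, such as mpstat.cpu.all.iowait%, appear to
give the absolute difference in percentage points (no % sign). The sketch below
recomputes a few rows under that reading. All numbers are copied from the
table; the helper names are illustrative only.

```python
def relative_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, used for metrics that are already percentages."""
    return new - base


# (parent commit 8559f3b41f, tested commit 61a5e20297), copied from the table.
relative_rows = {
    "stress-ng.io-uring.ops_per_sec": (5758736, 3343243),
    "vmstat.system.cs": (12695197, 7849636),
    "perf-stat.i.cpu-migrations": (5382764, 1281759),
}
point_rows = {
    "mpstat.cpu.all.idle%": (41.57, 35.76),
    "mpstat.cpu.all.iowait%": (20.32, 45.43),
    "mpstat.cpu.all.sys%": (28.91, 13.40),
}

for name, (base, new) in relative_rows.items():
    print(f"{relative_change(base, new):+8.1f}%  {name}")
for name, (base, new) in point_rows.items():
    print(f"{point_change(base, new):+8.1f}   {name}")
```

Under this reading, the headline 41.9% drop in ops_per_sec coincides with
iowait rising by about 25 percentage points while system time falls by about
15 points, i.e. the tested commit leaves the workers waiting rather than
running.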
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki