public inbox for io-uring@vger.kernel.org
* [PATCH 00/15] tracepoint: Avoid double static_branch evaluation at guarded call sites
@ 2026-03-12 15:04 Vineeth Pillai (Google)
  2026-03-12 15:04 ` [PATCH 01/15] tracepoint: Add trace_invoke_##name() API Vineeth Pillai (Google)
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Vineeth Pillai (Google) @ 2026-03-12 15:04 UTC
  To: Steven Rostedt, Peter Zijlstra, Dmitry Ilvokhin
  Cc: Vineeth Pillai (Google), Masami Hiramatsu, Mathieu Desnoyers,
	Ingo Molnar, Jens Axboe, io-uring, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Marcelo Ricardo Leitner, Xin Long, Jon Maloy, Aaron Conole,
	Eelco Chaudron, Ilya Maximets, netdev, bpf, linux-sctp,
	tipc-discussion, dev, Oded Gabbay, Koby Elbaz, dri-devel,
	Rafael J. Wysocki, Viresh Kumar, Gautham R. Shenoy, Huang Rui,
	Mario Limonciello, Len Brown, Srinivas Pandruvada, linux-pm,
	MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Christian König,
	Sumit Semwal, linaro-mm-sig, Eddie James, Andrew Jeffery,
	Joel Stanley, linux-fsi, David Airlie, Simona Vetter,
	Alex Deucher, Danilo Krummrich, Matthew Brost, Philipp Stanner,
	Harry Wentland, Leo Li, amd-gfx, Jiri Kosina, Benjamin Tissoires,
	linux-input, Wolfram Sang, linux-i2c, Mark Brown,
	Michael Hennerich, Nuno Sá, linux-spi, James E.J. Bottomley,
	Martin K. Petersen, linux-scsi, Chris Mason, David Sterba,
	linux-btrfs, linux-trace-kernel, linux-kernel

When a caller already guards a tracepoint with an explicit enabled check:

  if (trace_foo_enabled() && cond)
      trace_foo(args);

trace_foo() internally re-evaluates the static_branch_unlikely() key.
Since static branches are patched binary instructions, the compiler
cannot fold the two evaluations, so every such site pays the cost twice.

This series introduces trace_invoke_##name() as a companion to
trace_##name().  It calls __do_trace_##name() directly, bypassing the
redundant static-branch re-check, while preserving all other correctness
properties of the normal path (RCU-watching assertion, might_fault() for
syscall tracepoints).  The internal __do_trace_##name() symbol is not
leaked to call sites; trace_invoke_##name() is the only new public API.

  if (trace_foo_enabled() && cond)
      trace_invoke_foo(args);   /* calls __do_trace_foo() directly */
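
For illustration only (not part of the series), the difference between
the two call-site patterns can be modelled in userspace C.  In this
sketch a plain bool stands in for the static key, foo_key_test() stands
in for the static_branch_unlikely() test, and do_trace_foo() stands in
for __do_trace_foo().  The counters and the guarded_with_*() helpers are
hypothetical names made up for the model; none of this is the kernel
implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: a plain bool stands in for the static key. */
static bool foo_key_enabled;
static unsigned int foo_key_checks;	/* times the key was tested */
static unsigned int foo_events;		/* times the probe body ran */

/* Stand-in for the static_branch_unlikely() test on the key. */
static bool foo_key_test(void)
{
	foo_key_checks++;
	return foo_key_enabled;
}

/* Stand-in for __do_trace_foo(): work done once tracing is known on. */
static void do_trace_foo(int arg)
{
	(void)arg;
	foo_events++;
}

static bool trace_foo_enabled(void)
{
	return foo_key_test();
}

/* Normal entry point: tests the key itself before doing the work. */
static void trace_foo(int arg)
{
	if (foo_key_test())
		do_trace_foo(arg);
}

/* Companion entry point: relies on the caller having tested the key. */
static void trace_invoke_foo(int arg)
{
	do_trace_foo(arg);
}

/* Guarded site using trace_foo(); returns the number of key tests. */
static unsigned int guarded_with_trace(int arg, bool cond)
{
	foo_key_checks = 0;
	if (trace_foo_enabled() && cond)
		trace_foo(arg);
	return foo_key_checks;
}

/* Guarded site using trace_invoke_foo(); returns the number of key tests. */
static unsigned int guarded_with_invoke(int arg, bool cond)
{
	foo_key_checks = 0;
	if (trace_foo_enabled() && cond)
		trace_invoke_foo(arg);
	return foo_key_checks;
}

/* Checks the counts the model is meant to demonstrate. */
void foo_model_check(void)
{
	foo_key_enabled = true;
	assert(guarded_with_trace(1, true) == 2);	/* key tested twice */
	assert(guarded_with_invoke(1, true) == 1);	/* key tested once */

	foo_key_enabled = false;
	assert(guarded_with_trace(1, true) == 1);	/* guard short-circuits */
	assert(guarded_with_invoke(1, true) == 1);
}
```

Calling foo_model_check() from a main() exercises both patterns.  The
model only counts key tests; it says nothing about the real cost of a
patched branch, which depends on the architecture.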

The first patch changes three locations in include/linux/tracepoint.h
(__DECLARE_TRACE, __DECLARE_TRACE_SYSCALL, and the !TRACEPOINTS_ENABLED
stub).  The remaining 14 patches mechanically convert all guarded call
sites found in the tree:
kernel/, io_uring/, net/, accel/habanalabs, cpufreq/, devfreq/,
dma-buf/, fsi/, drm/, HID, i2c/, spi/, scsi/ufs/, and btrfs/.
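
As a rough sketch of why the disabled stub is needed alongside the
enabled declarations, the following userspace model generates both
variants from macros.  MODEL_DECLARE_TRACE and MODEL_DECLARE_TRACE_STUB
are hypothetical stand-ins, not the macros in tracepoint.h; the
model_*() helpers, the single int argument, and the tracepoint names bar
and baz are assumptions made for the example.  In the real header the
variant is selected by the kernel configuration; here both are
instantiated side by side so that one file shows each.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of a declaration with tracepoints enabled (hypothetical macro). */
#define MODEL_DECLARE_TRACE(name)					\
	static bool model_key_##name;					\
	static int model_hits_##name;					\
	static int model_last_##name;					\
	static inline void model_do_trace_##name(int arg)		\
	{								\
		model_hits_##name++;					\
		model_last_##name = arg;				\
	}								\
	static inline bool trace_##name##_enabled(void)			\
	{								\
		return model_key_##name;				\
	}								\
	static inline void trace_##name(int arg)			\
	{								\
		if (model_key_##name)					\
			model_do_trace_##name(arg);			\
	}								\
	static inline void trace_invoke_##name(int arg)			\
	{								\
		model_do_trace_##name(arg);				\
	}

/* Model of the stub used when tracepoints are compiled out. */
#define MODEL_DECLARE_TRACE_STUB(name)					\
	static inline bool trace_##name##_enabled(void)			\
	{								\
		return false;						\
	}								\
	static inline void trace_##name(int arg)			\
	{								\
		(void)arg;						\
	}								\
	static inline void trace_invoke_##name(int arg)			\
	{								\
		(void)arg;						\
	}

MODEL_DECLARE_TRACE(bar)	/* modelled as compiled in */
MODEL_DECLARE_TRACE_STUB(baz)	/* modelled as compiled out */

/* The same guarded call-site shape builds against either variant. */
static inline int model_call_sites(int arg)
{
	if (trace_bar_enabled() && arg > 0)
		trace_invoke_bar(arg);
	if (trace_baz_enabled() && arg > 0)
		trace_invoke_baz(arg);	/* never reached: stub is always off */
	return model_hits_bar;
}

/* Checks that only the enabled, switched-on tracepoint records a hit. */
static inline void model_stub_check(void)
{
	model_key_bar = false;
	assert(model_call_sites(5) == 0);

	model_key_bar = true;
	assert(model_call_sites(5) == 1);
	assert(model_last_bar == 5);
}
```

Without a trace_invoke_##name() stub, a converted call site would fail
to build when tracepoints are compiled out, which is presumably why the
stub is one of the three locations touched.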

This series is motivated by Peter Zijlstra's observation in the
discussion around Dmitry Ilvokhin's locking tracepoint instrumentation
series.  He noted that compilers cannot optimize static branches, so
guarded call sites end up evaluating the static branch twice for no
reason.  Steven Rostedt suggested adding a proper API instead of
exposing internal implementation details like __do_trace_##name()
directly to call sites:

  https://lore.kernel.org/linux-trace-kernel/8298e098d3418cb446ef396f119edac58a3414e9.1772642407.git.d@ilvokhin.com

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>

Vineeth Pillai (Google) (15):
  tracepoint: Add trace_invoke_##name() API
  kernel: Use trace_invoke_##name() at guarded tracepoint call sites
  io_uring: Use trace_invoke_##name() at guarded tracepoint call sites
  net: Use trace_invoke_##name() at guarded tracepoint call sites
  accel/habanalabs: Use trace_invoke_##name() at guarded tracepoint call
    sites
  cpufreq: Use trace_invoke_##name() at guarded tracepoint call sites
  devfreq: Use trace_invoke_##name() at guarded tracepoint call sites
  dma-buf: Use trace_invoke_##name() at guarded tracepoint call sites
  fsi: Use trace_invoke_##name() at guarded tracepoint call sites
  drm: Use trace_invoke_##name() at guarded tracepoint call sites
  HID: Use trace_invoke_##name() at guarded tracepoint call sites
  i2c: Use trace_invoke_##name() at guarded tracepoint call sites
  spi: Use trace_invoke_##name() at guarded tracepoint call sites
  scsi: ufs: Use trace_invoke_##name() at guarded tracepoint call sites
  btrfs: Use trace_invoke_##name() at guarded tracepoint call sites

 drivers/accel/habanalabs/common/device.c          | 12 ++++++------
 drivers/accel/habanalabs/common/mmu/mmu.c         |  3 ++-
 drivers/accel/habanalabs/common/pci/pci.c         |  4 ++--
 drivers/cpufreq/amd-pstate.c                      | 10 +++++-----
 drivers/cpufreq/cpufreq.c                         |  2 +-
 drivers/cpufreq/intel_pstate.c                    |  2 +-
 drivers/devfreq/devfreq.c                         |  2 +-
 drivers/dma-buf/dma-fence.c                       |  4 ++--
 drivers/fsi/fsi-master-aspeed.c                   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c            |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  4 ++--
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 +-
 drivers/gpu/drm/scheduler/sched_entity.c          |  4 ++--
 drivers/hid/intel-ish-hid/ipc/pci-ish.c           |  2 +-
 drivers/i2c/i2c-core-slave.c                      |  2 +-
 drivers/spi/spi-axi-spi-engine.c                  |  4 ++--
 drivers/ufs/core/ufshcd.c                         | 12 ++++++------
 fs/btrfs/extent_map.c                             |  4 ++--
 fs/btrfs/raid56.c                                 |  4 ++--
 include/linux/tracepoint.h                        | 11 +++++++++++
 io_uring/io_uring.h                               |  2 +-
 kernel/irq_work.c                                 |  2 +-
 kernel/sched/ext.c                                |  2 +-
 kernel/smp.c                                      |  2 +-
 net/core/dev.c                                    |  2 +-
 net/core/xdp.c                                    |  2 +-
 net/openvswitch/actions.c                         |  2 +-
 net/openvswitch/datapath.c                        |  2 +-
 net/sctp/outqueue.c                               |  2 +-
 net/tipc/node.c                                   |  2 +-
 30 files changed, 62 insertions(+), 50 deletions(-)

-- 
2.53.0



Thread overview: 21+ messages
2026-03-12 15:04 [PATCH 00/15] tracepoint: Avoid double static_branch evaluation at guarded call sites Vineeth Pillai (Google)
2026-03-12 15:04 ` [PATCH 01/15] tracepoint: Add trace_invoke_##name() API Vineeth Pillai (Google)
2026-03-12 15:12   ` Steven Rostedt
2026-03-12 15:39     ` Vineeth Remanan Pillai
2026-03-12 15:53       ` Peter Zijlstra
2026-03-12 16:05         ` Vineeth Remanan Pillai
2026-03-14  0:24           ` Keith Busch
2026-03-12 15:04 ` [PATCH 03/15] io_uring: Use trace_invoke_##name() at guarded tracepoint call sites Vineeth Pillai (Google)
2026-03-12 15:24   ` Keith Busch
2026-03-12 15:38     ` Steven Rostedt
2026-03-12 15:12 ` [PATCH 00/15] tracepoint: Avoid double static_branch evaluation at guarded " Mathieu Desnoyers
2026-03-12 15:23   ` Steven Rostedt
2026-03-12 15:28     ` Mathieu Desnoyers
2026-03-12 15:40       ` Steven Rostedt
2026-03-12 15:49         ` Mathieu Desnoyers
2026-03-12 15:54           ` Peter Zijlstra
2026-03-12 15:57             ` Mathieu Desnoyers
2026-03-12 16:08           ` Vineeth Remanan Pillai
2026-03-12 16:54             ` Andrii Nakryiko
2026-03-12 17:02               ` Steven Rostedt
2026-03-13 14:02                 ` Vineeth Remanan Pillai
