public inbox for [email protected]
 help / color / mirror / Atom feed
From: Steven Rostedt <[email protected]>
To: Stefan Metzmacher <[email protected]>
Cc: Ingo Molnar <[email protected]>,
	[email protected],
	io-uring <[email protected]>,
	"[email protected]" <[email protected]>
Subject: Re: sched_waking vs. set_event_pid crash (Re: Tracing busy processes/threads freezes/stalls the whole machine)
Date: Fri, 23 Jul 2021 08:41:22 -0400	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On Fri, 23 Jul 2021 13:53:41 +0200
Stefan Metzmacher <[email protected]> wrote:

> Hi Steve,
> 
> >>> Assuming this does fix your issue, I sent out a real patch with the
> >>> explanation of what happened in the change log, so that you can see why
> >>> that change was your issue.    
> >>
> >> Yes, it does the trick, thanks very much!  
> > 
> > Can I get a "Tested-by" from you?  
> 
> Sure!

Thanks.

> 
> Can you check if the static_key_disable() and static_key_enable()
> calls are correctly ordered compared to rcu_assign_pointer()
> and explain why they are, as I not really understand that it's different
> from tracepoint_update_call vs. rcu_assign_pointer

The order doesn't even matter. I'm assuming you are talking about the
static_key_disable/enable with respect to enabling the tracepoint.

The reason it doesn't matter is because the rcu_assign_pointer is
updating an array of elements that hold both the function to call and
the data it needs. Inside the tracepoint loop where the callbacks are
called, in the iterator code (not the static call), you will see:

		it_func_ptr =						\
			rcu_dereference_raw((&__tracepoint_##_name)->funcs); \
		if (it_func_ptr) {					\
			do {						\
				it_func = READ_ONCE((it_func_ptr)->func); \
				__data = (it_func_ptr)->data;		\
				((void(*)(void *, proto))(it_func))(__data, args); \
			} while ((++it_func_ptr)->func);		\
		}

What that does is to get either the old array, or the new array and
places that array into it_func_ptr. Each element of this array contains
the callback (in .func) and the callback's data (in .data).

The enabling or disabling of the tracepoint doesn't really make a
difference with respect to the order of updating the funcs array.
That's because the users of this will either see the old array, the new
array, or NULL, in that it_func_ptr. This is how RCU works.

The bug we hit was because the static_call was updated separately from
the array. That makes it more imperative that you get the order
correct. As my email stated, with static_calls we have this:

		it_func_ptr =						\
			rcu_dereference_raw((&__tracepoint_##name)->funcs); \
		if (it_func_ptr) {					\
			__data = (it_func_ptr)->data;			\
			static_call(tp_func_##name)(__data, args);	\
		}

Where the issue is that the static_call which chooses which callback to
make, is updated asynchronously from the update of the array.

> 
> >> Now I can finally use:
> >>
> >> trace-cmd record -e all -P $(pidof io_uring-cp-forever)
> >>
> >> But that doesn't include the iou-wrk-* threads
> >> and the '-c' option seems to only work with forking.  
> > 
> > Have you tried it? It should work for threads as well. It hooks to the
> > sched_process_fork tracepoint, which should be triggered even when a new
> > thread is created.
> > 
> > Or do you mean that you want that process and all its threads too that are
> > already running? I could probably have it try to add it via the /proc file
> > system in that case.
> > 
> > Can you start the task via trace-cmd?
> > 
> >   trace-cmd record -e all -F -c io_uring-cp-forever ...  
> 
> I think that would work, but I typically want to analyze a process
> that is already running.
> 
> >> Is there a way to specify "trace *all* threads of the given pid"?
> >> (Note the threads are comming and going, so it's not possible to
> >> specifiy -P more than once)  
> > 
> > Right, although, you could append tasks manually to the set_event_pid file
> > from another terminal after starting trace-cmd. Once a pid is added to that
> > file, all children it makes will also be added. That could be a work around
> > until we have trace-cmd do it.  
> 
> Sure, but it will always be racy.

Not really. Matters what you mean by "racy". You wont be able to
"instantaneously" enable a process and all its threads, but you can
capture all of them after a given point. As you are attaching to a
process already running, you already missed events because you were not
yet tracing. But once you have a thread, you will always have it. 

> 
> With children, does new threads also count as children or only fork() children?

New threads. It's the kernel point of view of a task, which does not
differentiate processes from threads. Everything that can be scheduled
on a CPU is called a "task". How a task interacts with other tasks is
what defines it being a "thread" or a "process".

> 
> > Care to write a bugzilla report for this feature?
> > 
> >   https://bugzilla.kernel.org/buglist.cgi?component=Trace-cmd%2FKernelshark&list_id=1088173  
> 
> I may do later...
> 

Thanks,

-- Steve

      reply	other threads:[~2021-07-23 12:41 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-22 14:26 Tracing busy processes/threads freezes/stalls the whole machine Stefan Metzmacher
2021-04-22 14:55 ` Stefan Metzmacher
2021-05-04 12:26 ` Stefan Metzmacher
2021-05-04 13:24 ` Steven Rostedt
2021-05-04 13:28   ` Stefan Metzmacher
2021-05-04 13:32     ` Steven Rostedt
2021-05-04 13:35     ` Steven Rostedt
2021-05-05  9:50       ` Stefan Metzmacher
2021-05-31 15:39         ` Stefan Metzmacher
2021-07-19 17:07           ` Stefan Metzmacher
2021-07-22 22:43             ` sched_waking vs. set_event_pid crash (Re: Tracing busy processes/threads freezes/stalls the whole machine) Stefan Metzmacher
2021-07-23  1:41               ` Steven Rostedt
2021-07-23  2:51               ` Steven Rostedt
2021-07-23 10:35                 ` Stefan Metzmacher
2021-07-23 11:29                   ` Steven Rostedt
2021-07-23 11:53                     ` Stefan Metzmacher
2021-07-23 12:41                       ` Steven Rostedt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox