From: Mathieu Desnoyers <[email protected]>
To: rostedt <[email protected]>
Cc: linux-kernel <[email protected]>,
linux-trace-devel <[email protected]>,
Ingo Molnar <[email protected]>,
Andrew Morton <[email protected]>,
Stefan Metzmacher <[email protected]>,
io-uring <[email protected]>,
Peter Zijlstra <[email protected]>,
paulmck <[email protected]>
Subject: Re: [PATCH] tracepoints: Update static_call before tp_funcs when adding a tracepoint
Date: Mon, 26 Jul 2021 13:39:18 -0400 (EDT)
Message-ID: <682927571.6760.1627321158652.JavaMail.zimbra@efficios.com>
In-Reply-To: <[email protected]>
----- On Jul 26, 2021, at 12:56 PM, rostedt [email protected] wrote:
> On Mon, 26 Jul 2021 11:46:41 -0400 (EDT)
> Mathieu Desnoyers <[email protected]> wrote:
[...]
>
>>
>> My second conclusion is that it's odd that transition 1->0 leaves the
>> prior function call in place even after it's been removed. When we go
>> back to 0->1, that function call may still be called even though the
>> function is not there anymore. And there is no RCU synchronization on
>> these transitions, so those are all possible scenarios.
>
> How so? When doing this transition we have:
>
> tracepoint_update_call(tp, tp_funcs, false);
> rcu_assign_pointer(tp->funcs, tp_funcs);
> static_key_enable(&tp->key);
>
> Where that tracepoint_update_call() will reinstall the iterator, and
> that's a full memory barrier. It even sends IPIs to all other CPUs to
> make sure all CPUs are synchronized before continuing.
>
> By the time we get to static_key_enable(), there will not be any CPUs
> that see the old function. And the process of updating a static_key
> also does the same kind of synchronization.
Actually, my explanation was inaccurate. The issue is that the _new_ callback
may see the _old_ data.
Considering __DO_TRACE_CALL:

        do {                                                            \
                struct tracepoint_func *it_func_ptr;                    \
                void *__data;                                           \
                it_func_ptr =                                           \
                        rcu_dereference_raw((&__tracepoint_##name)->funcs); \
                if (it_func_ptr) {                                      \
                        __data = (it_func_ptr)->data;                   \

----> [ delayed here on one CPU (e.g. vcpu preempted by the host) ]

                        static_call(tp_func_##name)(__data, args);      \
                }                                                       \
        } while (0)

It has loaded the tp->funcs of the old callback (so it will try to use the old
data).
AFAIU, none of the synchronization mechanisms you refer to here (memory barriers,
IPIs, ...) changes the fact that this CPU may remain delayed across the entire
1->0->1 transition sequence and end up calling the new callback with the
old data, unless an explicit RCU sync is done.
>
>>
>> My third conclusion is that we'd need synchronize RCU whenever
>> tp_funcs[0].data changes for transitions 1->2, 2->1, and 1->2 because the
>> priorities don't guarantee that the first callback stays in the first
>> position, and we also need to rcu sync unconditionally on transition 1->0.
>> We currently only have sync RCU on transition from 2->1 when
>> tp_funcs[0].func changes, which is bogus in many ways.
>
> Going from 1 to 2, there's no issue. We switch to the iterator, which
> is the old method anyway. It looks directly at the array and matches
> the data with the func for each element of that array, and the data
> read initially (before calling the iterator) is ignored.
This relies on ordering guarantees between RCU assign/dereference and static_call
update/call. That may well hold, but I'm asking anyway.
Are we guaranteed the following ordering?

CPU A                             CPU B

                                  static_call_update()
y = rcu_dereference(x)            rcu_assign_pointer(x, ...)
do_static_call(y)

That load of "x" should never happen after the CPU fetches the new static call
instruction.
Also, I suspect that transition 2->1 needs an unconditional rcu-sync because you
may have a sequence of 3->2->1 (or 1->2->1) where the element 0 data is unchanged
between 2->1, but was changed from 3->2 (or from 1->2), which may be observed by the
static call.
Thanks,
Mathieu
>
>>
>> Basically, transitions from the iterator to a specific function should be
>> handled with care (making sure the tp_funcs array is updated and rcu-sync
>> is done), except in the specific case where the prior tp->funcs was NULL,
>> which skips the function call. And unless there is a rcu-sync between the
>> state transitions, we need to consider all prior states as additional
>> original state as well. Therefore, in a 1->0->1 transition sequence, it's
>> very much possible that the old function ends up observing the new
>> callback's data unless we add some rcu sync in between.
>
> I disagree with the last part, as I explained above.
>
> But I do agree that comparing data is probably the better check.
>
> -- Steve
>
>>
>> Thoughts ?
>>
>> Thanks,
>>
>> Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Thread overview: 8+ messages
2021-07-23 2:33 [PATCH] tracepoints: Update static_call before tp_funcs when adding a tracepoint Steven Rostedt
2021-07-26 15:46 ` Mathieu Desnoyers
2021-07-26 16:56 ` Steven Rostedt
2021-07-26 17:39 ` Mathieu Desnoyers [this message]
2021-07-26 18:49 ` Steven Rostedt
2021-07-26 19:12 ` Mathieu Desnoyers
2021-07-27 11:44 ` Peter Zijlstra
2021-07-27 13:46 ` Mathieu Desnoyers