public inbox for [email protected]
 help / color / mirror / Atom feed
From: Stefan Metzmacher <[email protected]>
To: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>,
	[email protected],
	io-uring <[email protected]>
Subject: Re: Tracing busy processes/threads freezes/stalls the whole machine
Date: Wed, 5 May 2021 11:50:01 +0200	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>


Am 04.05.21 um 15:35 schrieb Steven Rostedt:
> On Tue, 4 May 2021 15:28:12 +0200
> Stefan Metzmacher <[email protected]> wrote:
> 
>> Am 04.05.21 um 15:24 schrieb Steven Rostedt:
>>> On Thu, 22 Apr 2021 16:26:57 +0200
>>> Stefan Metzmacher <[email protected]> wrote:
>>>   
>>>> Hi Steven, hi Ingo,
>>>>  
>>>
>>> Sorry, somehow I missed this.
>>>   
>>>> # But
>>>> # trace-cmd record -e all -P 7  
> 
> Perhaps you could try something like this:
> 
>  # trace-cmd list -s |
>     while read e ; do
>       echo "testing $e";
>       trace-cmd record -P 7 -e $e sleep 1;
>     done
> 
> Which will enable each system at a time, and see if we can pin point what
> is causing the lock up. Narrow it down to a tracing system.

I wasn't able to reproduce it.

With a 5.12 kernel and ubuntu 20.04 amd64 userspace I was able
to produce the following strace output (in the 2nd try), like this:

# ssh [email protected] strace -f -ttT trace-cmd record -e all -P 101 2>&1 | tee out

101 is the pid of a kthread while my './io-uring_cp-forever link-cp.c file' runs
on '172.31.9.166'. See https://github.com/metze-samba/liburing/commits/io_uring-cp-forever
pid 1375 is the pid of 'iou-wrk-1373'.

Maybe it's related to writing the pid into /sys/kernel/tracing/set_event_pid
as it ends like this:
> 10:56:07.493869 stat("/sys/kernel/tracing/set_event_pid", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 <0.000222>
> 10:56:07.494500 openat(AT_FDCWD, "/sys/kernel/tracing/set_event_pid", O_RDONLY) = 3 <0.000134>
> 10:56:07.494934 read(3, "1375\n", 8192) = 5 <0.000433>
> 10:56:07.495708 close(3)                = 0 <0.000121>
...> [pid  1443] 10:56:10.045293 openat(AT_FDCWD, "/sys/kernel/tracing/set_ftrace_pid", O_WRONLY|O_TRUNC|O_CLOEXEC <unfinished ...>
> [pid  1444] 10:56:10.045650 <... read resumed>"sysfs /sys sysfs rw,nosuid,nodev"..., 1024) = 1024 <0.000948>
> [pid  1443] 10:56:10.045821 <... openat resumed>) = 5 <0.000359>
> [pid  1445] 10:56:10.046000 <... fstat resumed>{st_mode=S_IFREG|0444, st_size=0, ...}) = 0 <0.000873>
> [pid  1443] 10:56:10.046174 write(5, "101 ", 4 <unfinished ...>
> [pid  1444] 10:56:10.046674 read(5,  <unfinished ...>
> [pid  1443] 10:56:10.047007 <... write resumed>) = 4 <0.000475>
> [pid  1445] 10:56:10.047188 read(5,  <unfinished ...>
> [pid  1443] 10:56:10.047498 write(5, " ", 1 <unfinished ...>
> [pid  1444] 10:56:10.048343 <... read resumed>"_cls,net_prio cgroup rw,nosuid,n"..., 1024) = 1024 <0.001312>
> [pid  1445] 10:56:10.048578 <... read resumed>"sysfs /sys sysfs rw,nosuid,nodev"..., 1024) = 1024 <0.001224>
> [pid  1444] 10:56:10.048796 read(5,  <unfinished ...>
> [pid  1445] 10:56:10.049195 read(5,  <unfinished ...>
> [pid  1444] 10:56:10.049547 <... read resumed>"ges hugetlbfs rw,relatime,pagesi"..., 1024) = 690 <0.000403>
> [pid  1445] 10:56:10.049722 <... read resumed>"_cls,net_prio cgroup rw,nosuid,n"..., 1024) = 1024 <0.000339>
> [pid  1444] 10:56:10.049924 close(5 <unfinished ...>
> [pid  1445] 10:56:10.050224 read(5,  <unfinished ...>
> [pid  1444] 10:56:10.050550 <... close resumed>) = 0 <0.000323>
> [pid  1445] 10:56:10.050714 <... read resumed>"ges hugetlbfs rw,relatime,pagesi"..., 1024) = 690 <0.000327>
> [pid  1444] 10:56:10.050883 openat(AT_FDCWD, "trace.dat.cpu0", O_WRONLY|O_CREAT|O_TRUNC, 0644 <unfinished ...>
> [pid  1445] 10:56:10.051299 close(5)    = 0 <0.016204>
> [pid  1445] 10:56:10.067841 openat(AT_FDCWD, "trace.dat.cpu1", O_WRONLY|O_CREAT|O_TRUNC, 0644 <unfinished ...>
> [pid  1443] 10:56:10.100729 <... write resumed>) = 0 <0.052057>
> [pid  1443] 10:56:10.101022 openat(AT_FDCWD, "/sys/kernel/tracing/set_event_pid", O_WRONLY|O_TRUNC|O_CLOEXEC) = 6 <0.014801>
> [pid  1443] 10:56:10.116299 write(6, "101 ", 4

The full output can be found here:
https://www.samba.org/~metze/strace.5.12-trace-cmd-record-all-P-101-kthread-fail-01.txt

I redid it on ubuntu 21.04 with this kernel:
https://kernel.ubuntu.com/git/ubuntu/ubuntu-hirsute.git/tag/?h=Ubuntu-5.11.0-16.17
which is based on Linux 5.11.12

I tried it onces using of pid from a kthread and it didn't reproduce:
> 09:10:50.719423 openat(AT_FDCWD, "/sys/kernel/tracing/set_event_pid", O_RDONLY) = 3 <0.000084>
> 09:10:50.719605 read(3, "", 8192)       = 0 <0.000075>
> 09:10:50.719774 close(3)                = 0 <0.000080>
...
> [pid  1135] 09:10:53.468541 fcntl(6, F_GETPIPE_SZ) = 65536 <0.000009>
> [pid  1135] 09:10:53.468589 splice(5, NULL, 7, NULL, 4096, SPLICE_F_MOVE <unfinished ...>
> [pid  1134] 09:10:53.480165 <... write resumed>) = 0 <0.014106>
> [pid  1134] 09:10:53.480201 close(4)    = 0 <0.000011>
> [pid  1134] 09:10:53.480250 openat(AT_FDCWD, "/sys/kernel/tracing/set_event_pid", O_WRONLY|O_TRUNC|O_CLOEXEC) = 4 <0.000024>
> [pid  1134] 09:10:53.480311 write(4, "7 ", 2) = 2 <0.000194>
> [pid  1134] 09:10:53.480530 close(4)    = 0 <0.000011>
> [pid  1134] 09:10:53.480565 openat(AT_FDCWD, "/sys/kernel/tracing/tracing_enabled", O_WRONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000013>
> [pid  1134] 09:10:53.480605 write(3, "1", 1) = 1 <0.000008>
> [pid  1134] 09:10:53.480632 newfstatat(1, "", {st_mode=S_IFIFO|0600, st_size=0, ...}, AT_EMPTY_PATH) = 0 <0.000007>
> [pid  1134] 09:10:53.480673 pidfd_open(7, 0) = 4 <0.000010>
> [pid  1134] 09:10:53.480701 poll([{fd=4, events=POLLIN}], 1, -1

The full output can be found here:
https://www.samba.org/~metze/strace.5.11.12-trace-cmd-record-all-P-7-kthread-nofail-02.txt

And using the pid of io_wqe_worker-0 it happened again,
also writing the pid into /sys/kernel/tracing/set_event_pid:
(remember 7 was the pid of the kthread):
> 09:13:18.000315 openat(AT_FDCWD, "/sys/kernel/tracing/set_event_pid", O_RDONLY) = 3 <0.000074>
> 09:13:18.000471 read(3, "7\n", 8192)    = 2 <0.000158>
> 09:13:18.000703 close(3)                = 0 <0.000052>
...> [pid  1224] 09:13:20.671939 close(4)    = 0 <0.000084>
> [pid  1224] 09:13:20.672106 openat(AT_FDCWD, "trace.dat.cpu0", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 4 <0.000085>
> [pid  1224] 09:13:20.672292 openat(AT_FDCWD, "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw", O_RDONLY) = 5 <0.000062>
> [pid  1224] 09:13:20.672432 pipe([6, 7]) = 0 <0.000057>
> [pid  1224] 09:13:20.672595 fcntl(6, F_GETPIPE_SZ) = 65536 <0.000051>
> [pid  1224] 09:13:20.672728 splice(5, NULL, 7, NULL, 4096, SPLICE_F_MOVE <unfinished ...>
> [pid  1223] 09:13:20.688172 <... write resumed>) = 0 <0.020134>
> [pid  1223] 09:13:20.688215 close(4)    = 0 <0.000015>
> [pid  1223] 09:13:20.688276 openat(AT_FDCWD, "/sys/kernel/tracing/set_event_pid", O_WRONLY|O_TRUNC|O_CLOEXEC) = 4 <0.027854>
> [pid  1223] 09:13:20.716198 write(4, "1043 ", 5

The full output can be found here:
https://www.samba.org/~metze/strace.5.11.12-trace-cmd-record-all-P-1043-io_wqe_worker-fail-03.txt

I thought it happens always if there was already a value in /sys/kernel/tracing/set_event_pid...,
but the next time I started with pid 6 from a kthread and directly triggered it:

> 09:41:24.029860 openat(AT_FDCWD, "/sys/kernel/tracing/set_event_pid", O_RDONLY) = 3 <0.000048>
> 09:41:24.029982 read(3, "", 8192)       = 0 <0.000046>
> 09:41:24.030101 close(3)                = 0 <0.000045>
...
> [pid  1425] 09:41:26.212972 read(4, "sysfs /sys sysfs rw,nosuid,nodev"..., 1024) = 1024 <0.000077>
> [pid  1425] 09:41:26.213128 read(4, "group/rdma cgroup rw,nosuid,node"..., 1024) = 1024 <0.000075>
> [pid  1425] 09:41:26.213316 read(4, "ev/hugepages hugetlbfs rw,relati"..., 1024) = 1024 <0.000058>
> [pid  1425] 09:41:26.213459 close(4)    = 0 <0.000044>
> [pid  1425] 09:41:26.213584 openat(AT_FDCWD, "trace.dat.cpu0", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 4 <0.000229>
> [pid  1425] 09:41:26.213895 openat(AT_FDCWD, "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw", O_RDONLY) = 5 <0.000055>
> [pid  1425] 09:41:26.214030 pipe([6, 7]) = 0 <0.000066>
> [pid  1425] 09:41:26.214188 fcntl(6, F_GETPIPE_SZ) = 65536 <0.000065>
> [pid  1425] 09:41:26.214332 splice(5, NULL, 7, NULL, 4096, SPLICE_F_MOVE

It ended here, but I think the rest ( close(4); openat("/sys/kernel/tracing/set_event_pid"); write("6 ")
was just eaten by the delayed output of the ssh session.

The full output can be found here:
https://www.samba.org/~metze/strace.5.11.12-trace-cmd-record-all-P-6-kthread-fail-04.txt

Does these details help already?

Thanks!
metze

  reply	other threads:[~2021-05-05  9:50 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-22 14:26 Tracing busy processes/threads freezes/stalls the whole machine Stefan Metzmacher
2021-04-22 14:55 ` Stefan Metzmacher
2021-05-04 12:26 ` Stefan Metzmacher
2021-05-04 13:24 ` Steven Rostedt
2021-05-04 13:28   ` Stefan Metzmacher
2021-05-04 13:32     ` Steven Rostedt
2021-05-04 13:35     ` Steven Rostedt
2021-05-05  9:50       ` Stefan Metzmacher [this message]
2021-05-31 15:39         ` Stefan Metzmacher
2021-07-19 17:07           ` Stefan Metzmacher
2021-07-22 22:43             ` sched_waking vs. set_event_pid crash (Re: Tracing busy processes/threads freezes/stalls the whole machine) Stefan Metzmacher
2021-07-23  1:41               ` Steven Rostedt
2021-07-23  2:51               ` Steven Rostedt
2021-07-23 10:35                 ` Stefan Metzmacher
2021-07-23 11:29                   ` Steven Rostedt
2021-07-23 11:53                     ` Stefan Metzmacher
2021-07-23 12:41                       ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox