public inbox for [email protected]
 help / color / mirror / Atom feed
* Re: [PATCH v2 00/13] rename & split tests
       [not found] <[email protected]>
@ 2022-05-21 23:13 ` Dave Chinner
  2022-05-22  1:07   ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2022-05-21 23:13 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Zorro Lang, fstests, Eryu Guan, Amir Goldstein, Christoph Hellwig,
	Darrick J. Wong, io-uring

[cc io_uring]

On Thu, May 12, 2022 at 06:52:37PM +0200, Christian Brauner wrote:
> From: "Christian Brauner (Microsoft)" <[email protected]>
> 
> Hey everyone,
> 
> Please note that this patch series contains patches that will be
> rejected by the fstests mailing list because of the amount of changes
> they contain. So tools like b4 will not be able to find the whole patch
> series on a mailing list. In case it's helpful I've added the
> "fstests.vfstest.for-next" tag which can be pulled. Otherwise it's
> possible to simply use the patch series as it appears in your inbox.
> 
> All vfstests pass:

[...]

> #### xfs ####
> ubuntu@imp1-vm:~/src/git/xfstests$ sudo ./check -g idmapped
> FSTYP         -- xfs (debug)
> PLATFORM      -- Linux/x86_64 imp1-vm 5.18.0-rc4-fs-mnt-hold-writers-8a2e2350494f #107 SMP PREEMPT_DYNAMIC Mon May 9 12:12:34 UTC 2022
> MKFS_OPTIONS  -- -f /dev/sda4
> MOUNT_OPTIONS -- /dev/sda4 /mnt/scratch
> 
> generic/633 58s ...  58s
> generic/644 62s ...  60s
> generic/645 161s ...  161s
> generic/656 62s ...  63s
> xfs/152 133s ...  133s
> xfs/153 94s ...  92s
> Ran: generic/633 generic/644 generic/645 generic/656 xfs/152 xfs/153
> Passed all 6 tests

I'm not sure if it's this series that has introduced a test bug or
triggered a latent issue in the kernel, but I've started seeing
generic/633 throw audit subsystem warnings on a single test machine
as of late Friday:

[ 7285.015888] WARNING: CPU: 3 PID: 2147118 at kernel/auditsc.c:2035 __audit_syscall_entry+0x113/0x140
[ 7285.019973] Modules linked in:
[ 7285.021281] CPU: 3 PID: 2147118 Comm: vfstest Not tainted 5.18.0-rc7-dgc+ #1250
[ 7285.024341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 7285.027782] RIP: 0010:__audit_syscall_entry+0x113/0x140
[ 7285.029923] Code: 24 e8 c1 6b ff ff 48 8b 34 24 85 c0 48 8b 54 24 08 48 8b 4c 24 10 4c 8b 44 24 18 0f 84 72 ff ff ff 48 83 c4 20 5b 5d 41 5c c3 <0f> 0b 85 c0 75 14 48 83 c4 20 48 c7 c7 70 45 7f 82 5b 5d 41 5c e9
[ 7285.037563] RSP: 0018:ffffc900012f7ed0 EFLAGS: 00010202
[ 7285.039748] RAX: 0000000000000000 RBX: ffff8880aaf18800 RCX: 000000000000003c
[ 7285.043126] RDX: 00000000000000e7 RSI: 0000000000000000 RDI: ffff888102104f00
[ 7285.046120] RBP: 00000000000000e7 R08: fffffffffffffe2c R09: 0000000000000002
[ 7285.049108] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 7285.052058] R13: ffffc900012f7f58 R14: 00000000000000e7 R15: 0000000000000000
[ 7285.055030] FS:  00007f7906d6c740(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
[ 7285.058396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7285.060788] CR2: 00007f3ffa7e9bb8 CR3: 000000010bb00000 CR4: 00000000000006e0
[ 7285.063735] Call Trace:
[ 7285.064796]  <TASK>
[ 7285.065759]  syscall_trace_enter.constprop.0+0x122/0x1a0
[ 7285.067978]  do_syscall_64+0x16/0x80
[ 7285.069497]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 7285.071590] RIP: 0033:0x7f7906e35f49
[ 7285.073118] Code: 00 4c 8b 05 29 6f 10 00 be e7 00 00 00 ba 3c 00 00 00 eb 12 0f 1f 44 00 00 89 d0 0f 05 48 3d 00 f0 ff ff 77 1c f4 89 f0 0f 05 <48> 3d 00 f0 ff ff 76 e7 f7 d8 64 41 89 00 eb df 0f 1f 80 00 00 00
[ 7285.078021] RSP: 002b:00007ffeee52db88 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[ 7285.079995] RAX: ffffffffffffffda RBX: 00007f7906f39920 RCX: 00007f7906e35f49
[ 7285.081869] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[ 7285.083729] RBP: 0000000000000000 R08: ffffffffffffff88 R09: 0000000000000001
[ 7285.085594] R10: fffffffffffffe2c R11: 0000000000000246 R12: 00007f7906f39920
[ 7285.087457] R13: 0000000000000001 R14: 00007f7906f3ee28 R15: 0000000000000000
[ 7285.089320]  </TASK>
[ 7285.089949] ---[ end trace 0000000000000000 ]---
[ 7285.091182] audit_panic: 22 callbacks suppressed
[ 7285.091185] audit: unrecoverable error in audit_syscall_entry()

Adn faddr_to_line tells me it is this:

void __audit_syscall_entry(int major, unsigned long a1, unsigned long a2,
                           unsigned long a3, unsigned long a4)
{
        struct audit_context *context = audit_context();
        enum audit_state     state;

        if (!audit_enabled || !context)
                return;

>>>>>>  WARN_ON(context->context != AUDIT_CTX_UNUSED);  <<<<<<
        WARN_ON(context->name_count);
        if (context->context != AUDIT_CTX_UNUSED || context->name_count) {
                audit_panic("unrecoverable error in audit_syscall_entry()");
                return;
        }
.....

I have no clue how the audit subsystem works, I don't even know how
to enable it, and I've never seen audit messages on the console of
this test machine. I have other test machines that have audit
enabled, and they do not dump warnings like this - the only thing I
see in the logs for those machines is this:

 run xfstest generic/633
 process 'vfstest' launched '/dev/fd/4/file1' with NULL argv: empty string added
 XFS (pmem0): Unmounting Filesystem
 XFS (pmem0): Mounting V5 Filesystem
 XFS (pmem0): Ending clean mount
 run xfstest generic/634

That's waht I was seeing from this test machine earlier in the week,
too - I've been running 5.18-rc7 as the base kernel all week - so
I'm not sure .....

Oooooohhhh.

/* The per-task audit context. */
struct audit_context {
        int                 dummy;      /* must be the first element */
        enum {
                AUDIT_CTX_UNUSED,       /* audit_context is currently unused */
                AUDIT_CTX_SYSCALL,      /* in use by syscall */
                AUDIT_CTX_URING,        /* in use by io_uring */
        } context;
....

And that reminded me of something. I commented on #xfs on Friday
afternoon:

[20/5/22 15:04] <dchinner> so of the 3.5 hours run time on the machine that jsut completed, it looks like about a dozen tests are responsible for an hour of that runtime...
[20/5/22 15:05] <dchinner> but it was a clean run with no failures in 1055 tests run.
[20/5/22 15:06] <dchinner> But there's some WTFs like this in it:
[20/5/22 15:06] <dchinner> generic/678     [not run] kernel does not support IO_URING
[20/5/22 15:08] <dchinner> yet the same kernel on a different machine:
[20/5/22 15:08] <dchinner> generic/678 11s ...  3s
[20/5/22 15:08] <dchinner> and they have the same userspace, too....

Yeah, the machine that complained about "kernel does not support
IO_URING" is the one that is throwing these warnings now. It wasn't
that the kernel didn't support io-uring, it was that the machine was
missing the liburing-dev library. I installed it and rebuilt
fstests. These audit failures co-incide with the first test runs
with io-uring enabled. And vfstest uses io_uring if fstests enables
it.

Hence this now smells like a pre-existing issue -  either a test bug
or an io_uring task audit context leak. Anyone got any ideas?

Cheers,

Dave.
-- 
Dave Chinner
[email protected]


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 00/13] rename & split tests
  2022-05-21 23:13 ` [PATCH v2 00/13] rename & split tests Dave Chinner
@ 2022-05-22  1:07   ` Jens Axboe
  2022-05-22  2:19     ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2022-05-22  1:07 UTC (permalink / raw)
  To: Dave Chinner, Christian Brauner
  Cc: Zorro Lang, fstests, Eryu Guan, Amir Goldstein, Christoph Hellwig,
	Darrick J. Wong, io-uring

On 5/21/22 5:13 PM, Dave Chinner wrote:
> [cc io_uring]
> 
> On Thu, May 12, 2022 at 06:52:37PM +0200, Christian Brauner wrote:
>> From: "Christian Brauner (Microsoft)" <[email protected]>
>>
>> Hey everyone,
>>
>> Please note that this patch series contains patches that will be
>> rejected by the fstests mailing list because of the amount of changes
>> they contain. So tools like b4 will not be able to find the whole patch
>> series on a mailing list. In case it's helpful I've added the
>> "fstests.vfstest.for-next" tag which can be pulled. Otherwise it's
>> possible to simply use the patch series as it appears in your inbox.
>>
>> All vfstests pass:
> 
> [...]
> 
>> #### xfs ####
>> ubuntu@imp1-vm:~/src/git/xfstests$ sudo ./check -g idmapped
>> FSTYP         -- xfs (debug)
>> PLATFORM      -- Linux/x86_64 imp1-vm 5.18.0-rc4-fs-mnt-hold-writers-8a2e2350494f #107 SMP PREEMPT_DYNAMIC Mon May 9 12:12:34 UTC 2022
>> MKFS_OPTIONS  -- -f /dev/sda4
>> MOUNT_OPTIONS -- /dev/sda4 /mnt/scratch
>>
>> generic/633 58s ...  58s
>> generic/644 62s ...  60s
>> generic/645 161s ...  161s
>> generic/656 62s ...  63s
>> xfs/152 133s ...  133s
>> xfs/153 94s ...  92s
>> Ran: generic/633 generic/644 generic/645 generic/656 xfs/152 xfs/153
>> Passed all 6 tests
> 
> I'm not sure if it's this series that has introduced a test bug or
> triggered a latent issue in the kernel, but I've started seeing
> generic/633 throw audit subsystem warnings on a single test machine
> as of late Friday:
> 
> [ 7285.015888] WARNING: CPU: 3 PID: 2147118 at kernel/auditsc.c:2035 __audit_syscall_entry+0x113/0x140

Does your kernel have this commit?

commit 69e9cd66ae1392437234a63a3a1d60b6655f92ef
Author: Julian Orth <[email protected]>
Date:   Tue May 17 12:32:53 2022 +0200

    audit,io_uring,io-wq: call __audit_uring_exit for dummy contexts

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 00/13] rename & split tests
  2022-05-22  1:07   ` Jens Axboe
@ 2022-05-22  2:19     ` Jens Axboe
  2022-05-23  0:13       ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2022-05-22  2:19 UTC (permalink / raw)
  To: Dave Chinner, Christian Brauner
  Cc: Zorro Lang, fstests, Eryu Guan, Amir Goldstein, Christoph Hellwig,
	Darrick J. Wong, io-uring

On 5/21/22 7:07 PM, Jens Axboe wrote:
> On 5/21/22 5:13 PM, Dave Chinner wrote:
>> [cc io_uring]
>>
>> On Thu, May 12, 2022 at 06:52:37PM +0200, Christian Brauner wrote:
>>> From: "Christian Brauner (Microsoft)" <[email protected]>
>>>
>>> Hey everyone,
>>>
>>> Please note that this patch series contains patches that will be
>>> rejected by the fstests mailing list because of the amount of changes
>>> they contain. So tools like b4 will not be able to find the whole patch
>>> series on a mailing list. In case it's helpful I've added the
>>> "fstests.vfstest.for-next" tag which can be pulled. Otherwise it's
>>> possible to simply use the patch series as it appears in your inbox.
>>>
>>> All vfstests pass:
>>
>> [...]
>>
>>> #### xfs ####
>>> ubuntu@imp1-vm:~/src/git/xfstests$ sudo ./check -g idmapped
>>> FSTYP         -- xfs (debug)
>>> PLATFORM      -- Linux/x86_64 imp1-vm 5.18.0-rc4-fs-mnt-hold-writers-8a2e2350494f #107 SMP PREEMPT_DYNAMIC Mon May 9 12:12:34 UTC 2022
>>> MKFS_OPTIONS  -- -f /dev/sda4
>>> MOUNT_OPTIONS -- /dev/sda4 /mnt/scratch
>>>
>>> generic/633 58s ...  58s
>>> generic/644 62s ...  60s
>>> generic/645 161s ...  161s
>>> generic/656 62s ...  63s
>>> xfs/152 133s ...  133s
>>> xfs/153 94s ...  92s
>>> Ran: generic/633 generic/644 generic/645 generic/656 xfs/152 xfs/153
>>> Passed all 6 tests
>>
>> I'm not sure if it's this series that has introduced a test bug or
>> triggered a latent issue in the kernel, but I've started seeing
>> generic/633 throw audit subsystem warnings on a single test machine
>> as of late Friday:
>>
>> [ 7285.015888] WARNING: CPU: 3 PID: 2147118 at kernel/auditsc.c:2035 __audit_syscall_entry+0x113/0x140
> 
> Does your kernel have this commit?
> 
> commit 69e9cd66ae1392437234a63a3a1d60b6655f92ef
> Author: Julian Orth <[email protected]>
> Date:   Tue May 17 12:32:53 2022 +0200
> 
>     audit,io_uring,io-wq: call __audit_uring_exit for dummy contexts

I could not reproduce either with or without your patch when I finally
got that test going and figure out how to turn on audit and get it
enabled. I don't run with that.

But looking at your line numbers, I think you're missing the above
commit. The WARN_ON_ONCE() matches up with it NOT being applied, which
is most likely why it triggers for you. It's in Linus's tree, but not in
-rc7.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 00/13] rename & split tests
  2022-05-22  2:19     ` Jens Axboe
@ 2022-05-23  0:13       ` Dave Chinner
  2022-05-23  0:57         ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2022-05-23  0:13 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Christian Brauner, Zorro Lang, fstests, Eryu Guan, Amir Goldstein,
	Christoph Hellwig, Darrick J. Wong, io-uring

On Sat, May 21, 2022 at 08:19:51PM -0600, Jens Axboe wrote:
> On 5/21/22 7:07 PM, Jens Axboe wrote:
> > On 5/21/22 5:13 PM, Dave Chinner wrote:
> >> [cc io_uring]
> >>
> >> On Thu, May 12, 2022 at 06:52:37PM +0200, Christian Brauner wrote:
> >>> From: "Christian Brauner (Microsoft)" <[email protected]>
> >>>
> >>> Hey everyone,
> >>>
> >>> Please note that this patch series contains patches that will be
> >>> rejected by the fstests mailing list because of the amount of changes
> >>> they contain. So tools like b4 will not be able to find the whole patch
> >>> series on a mailing list. In case it's helpful I've added the
> >>> "fstests.vfstest.for-next" tag which can be pulled. Otherwise it's
> >>> possible to simply use the patch series as it appears in your inbox.
> >>>
> >>> All vfstests pass:
> >>
> >> [...]
> >>
> >>> #### xfs ####
> >>> ubuntu@imp1-vm:~/src/git/xfstests$ sudo ./check -g idmapped
> >>> FSTYP         -- xfs (debug)
> >>> PLATFORM      -- Linux/x86_64 imp1-vm 5.18.0-rc4-fs-mnt-hold-writers-8a2e2350494f #107 SMP PREEMPT_DYNAMIC Mon May 9 12:12:34 UTC 2022
> >>> MKFS_OPTIONS  -- -f /dev/sda4
> >>> MOUNT_OPTIONS -- /dev/sda4 /mnt/scratch
> >>>
> >>> generic/633 58s ...  58s
> >>> generic/644 62s ...  60s
> >>> generic/645 161s ...  161s
> >>> generic/656 62s ...  63s
> >>> xfs/152 133s ...  133s
> >>> xfs/153 94s ...  92s
> >>> Ran: generic/633 generic/644 generic/645 generic/656 xfs/152 xfs/153
> >>> Passed all 6 tests
> >>
> >> I'm not sure if it's this series that has introduced a test bug or
> >> triggered a latent issue in the kernel, but I've started seeing
> >> generic/633 throw audit subsystem warnings on a single test machine
> >> as of late Friday:
> >>
> >> [ 7285.015888] WARNING: CPU: 3 PID: 2147118 at kernel/auditsc.c:2035 __audit_syscall_entry+0x113/0x140
> > 
> > Does your kernel have this commit?
> > 
> > commit 69e9cd66ae1392437234a63a3a1d60b6655f92ef
> > Author: Julian Orth <[email protected]>
> > Date:   Tue May 17 12:32:53 2022 +0200
> > 
> >     audit,io_uring,io-wq: call __audit_uring_exit for dummy contexts

No, that wasn't in -rc7.

> I could not reproduce either with or without your patch when I finally
> got that test going and figure out how to turn on audit and get it
> enabled. I don't run with that.

Ok. Given that this has been broken for over a year and nobody
has noticed until late .18-rcX, it might be worth adding an audit
enabled VM to your io-uring test farm....

Cheers,

Dve.
-- 
Dave Chinner
[email protected]


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 00/13] rename & split tests
  2022-05-23  0:13       ` Dave Chinner
@ 2022-05-23  0:57         ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2022-05-23  0:57 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christian Brauner, Zorro Lang, fstests, Eryu Guan, Amir Goldstein,
	Christoph Hellwig, Darrick J. Wong, io-uring

On 5/22/22 6:13 PM, Dave Chinner wrote:
> On Sat, May 21, 2022 at 08:19:51PM -0600, Jens Axboe wrote:
>> On 5/21/22 7:07 PM, Jens Axboe wrote:
>>> On 5/21/22 5:13 PM, Dave Chinner wrote:
>>>> [cc io_uring]
>>>>
>>>> On Thu, May 12, 2022 at 06:52:37PM +0200, Christian Brauner wrote:
>>>>> From: "Christian Brauner (Microsoft)" <[email protected]>
>>>>>
>>>>> Hey everyone,
>>>>>
>>>>> Please note that this patch series contains patches that will be
>>>>> rejected by the fstests mailing list because of the amount of changes
>>>>> they contain. So tools like b4 will not be able to find the whole patch
>>>>> series on a mailing list. In case it's helpful I've added the
>>>>> "fstests.vfstest.for-next" tag which can be pulled. Otherwise it's
>>>>> possible to simply use the patch series as it appears in your inbox.
>>>>>
>>>>> All vfstests pass:
>>>>
>>>> [...]
>>>>
>>>>> #### xfs ####
>>>>> ubuntu@imp1-vm:~/src/git/xfstests$ sudo ./check -g idmapped
>>>>> FSTYP         -- xfs (debug)
>>>>> PLATFORM      -- Linux/x86_64 imp1-vm 5.18.0-rc4-fs-mnt-hold-writers-8a2e2350494f #107 SMP PREEMPT_DYNAMIC Mon May 9 12:12:34 UTC 2022
>>>>> MKFS_OPTIONS  -- -f /dev/sda4
>>>>> MOUNT_OPTIONS -- /dev/sda4 /mnt/scratch
>>>>>
>>>>> generic/633 58s ...  58s
>>>>> generic/644 62s ...  60s
>>>>> generic/645 161s ...  161s
>>>>> generic/656 62s ...  63s
>>>>> xfs/152 133s ...  133s
>>>>> xfs/153 94s ...  92s
>>>>> Ran: generic/633 generic/644 generic/645 generic/656 xfs/152 xfs/153
>>>>> Passed all 6 tests
>>>>
>>>> I'm not sure if it's this series that has introduced a test bug or
>>>> triggered a latent issue in the kernel, but I've started seeing
>>>> generic/633 throw audit subsystem warnings on a single test machine
>>>> as of late Friday:
>>>>
>>>> [ 7285.015888] WARNING: CPU: 3 PID: 2147118 at kernel/auditsc.c:2035 __audit_syscall_entry+0x113/0x140
>>>
>>> Does your kernel have this commit?
>>>
>>> commit 69e9cd66ae1392437234a63a3a1d60b6655f92ef
>>> Author: Julian Orth <[email protected]>
>>> Date:   Tue May 17 12:32:53 2022 +0200
>>>
>>>     audit,io_uring,io-wq: call __audit_uring_exit for dummy contexts
> 
> No, that wasn't in -rc7.
> 
>> I could not reproduce either with or without your patch when I finally
>> got that test going and figure out how to turn on audit and get it
>> enabled. I don't run with that.
> 
> Ok. Given that this has been broken for over a year and nobody
> has noticed until late .18-rcX, it might be worth adding an audit
> enabled VM to your io-uring test farm....

It was in the 5.16 release, so it's ~4 months ago. Don't disagree on the
testing, though I do think that's mostly on the audit side. I had no
hand in any of that code.

>From my experience trying to reproduce it yesterday, my test distros
don't even enable it and you have to both fiddle the config and add a
boot parameter to even turn it on. And then it still didn't trigger for
me.

I'll see if I can add something to the testing mix for this.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-05-23  0:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <[email protected]>
2022-05-21 23:13 ` [PATCH v2 00/13] rename & split tests Dave Chinner
2022-05-22  1:07   ` Jens Axboe
2022-05-22  2:19     ` Jens Axboe
2022-05-23  0:13       ` Dave Chinner
2022-05-23  0:57         ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox