public inbox for [email protected]
From: John David Anglin <[email protected]>
To: Helge Deller <[email protected]>, Jens Axboe <[email protected]>,
	[email protected], [email protected]
Subject: Re: io_uring failure on parisc with VIPT caches
Date: Thu, 16 Feb 2023 15:35:59 -0500	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 2023-02-16 3:24 a.m., Helge Deller wrote:
> On 2/16/23 03:50, Jens Axboe wrote:
>> On 2/15/23 7:40 PM, John David Anglin wrote:
>>> On 2023-02-15 6:02 p.m., Jens Axboe wrote:
>>>> This is not related to Helge's patch, 6.1-stable is just still missing:
>>>>
>>>> commit fcc926bb857949dbfa51a7d95f3f5ebc657f198c
>>>> Author: Jens Axboe<[email protected]>
>>>> Date:   Fri Jan 27 09:28:13 2023 -0700
>>>>
>>>>       io_uring: add a conditional reschedule to the IOPOLL cancelation loop
>>>>
>>>> and I'm guessing you're running without preempt.
>>> With 6.2.0-rc8+, I had a different crash running poll-race-mshot.t:
>>>
>>> Backtrace:
>>>
>>>
>>> Kernel Fault: Code=15 (Data TLB miss fault) at addr 0000000000000000
>>> CPU: 0 PID: 18265 Comm: poll-race-mshot Not tainted 6.2.0-rc8+ #1
>>> Hardware name: 9000/800/rp3440
>>>
>>>       YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
>>> PSW: 00010000001001001001000111110000 Not tainted
>>> r00-03  00000000102491f0 ffffffffffffffff 000000004020307c ffffffffffffffff
>>> r04-07  ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
>>> r08-11  ffffffffffffffff 000000000407ef28 000000000407f838 8400000000800000
>>> r12-15  0000000000000000 0000000040c424e0 0000000040c424e0 0000000040c424e0
>>> r16-19  000000000407fd68 0000000063f08648 0000000040c424e0 000000000a085000
>>> r20-23  00000000000d6b44 000000002faf0800 00000000000000ff 0000000000000002
>>> r24-27  000000000407fa30 000000000407fd68 0000000000000000 0000000040c1e4e0
>>> r28-31  400000000000de84 0000000000000000 0000000000000000 0000000000000002
>>> sr00-03  0000000004081000 0000000000000000 0000000000000000 0000000004081de0
>>> sr04-07  0000000004081000 0000000000000000 0000000000000000 00000000040815a8
>>>
>>> IASQ: 0000000004081000 0000000000000000 IAOQ: 0000000000000000 0000000004081590
>>>   IIR: 00000000    ISR: 0000000000000000  IOR: 0000000000000000
>>>   CPU:        0   CR30: 000000004daf5700 CR31: ffffffffffffefff
>>>   ORIG_R28: 0000000000000000
>>>   IAOQ[0]: 0x0
>>>   IAOQ[1]: linear_quiesce+0x0/0x18 [linear]
>>>   RP(r2): intr_check_sig+0x0/0x3c
>>> Backtrace:
>>>
>>> Kernel panic - not syncing: Kernel Fault
>>
>> This means very little to me, is it a NULL pointer deref? And where's
>> the backtrace?
>
> I see iopoll.t triggering the kernel to hang on 32-bit kernel.
> System gets unresponsive, but with sysrq-l I get:
>
> [  880.020641] sysrq: Show backtrace of all active CPUs
> [  880.024123] sysrq: CPU0:
> [  880.024123] CPU: 0 PID: 7549 Comm: kworker/u32:7 Not tainted 6.1.12-32bit+ #1595
> [  880.024123] Hardware name: 9000/785/C3700
> [  880.024123] Workqueue: events_unbound io_ring_exit_work
> [  880.024123]
> [  880.024123]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> [  880.024123] PSW: 00000000000011001111111100001111 Not tainted
> [  880.024123] r00-03  000cff0f 19610540 104f7b70 19610540
> [  880.024123] r04-07  1921a278 00000000 192c8400 1921b508
> [  880.024123] r08-11  00000003 0000002e 195fd050 00000004
> [  880.024123] r12-15  192c8710 10a77000 00000000 00002000
> [  880.024123] r16-19  1921a210 1240c000 1240c060 1924aff0
> [  880.024123] r20-23  00000002 00000000 104b4384 00000020
> [  880.024123] r24-27  00000003 19610548 1921a210 10aba968
> [  880.024123] r28-31  1094f5c0 0000000e 196105c0 104f7b70
> [  880.024123] sr00-03  00000000 00001695 00000000 00001695
> [  880.024123] sr04-07  00000000 00000000 00000000 00000000
> [  880.024123]
> [  880.024123] IASQ: 00000000 00000000 IAOQ: 104f7b6c 104b4384
> [  880.024123]  IIR: 081f0242    ISR: 00002000  IOR: 00000000
> [  880.024123]  CPU:        0   CR30: 195fd050 CR31: d237ffff
> [  880.024123]  ORIG_R28: 00000000
> [  880.024123]  IAOQ[0]: io_do_iopoll+0xb4/0x3a4
> [  880.024123]  IAOQ[1]: iocb_bio_iopoll+0x0/0x50
> [  880.024123]  RP(r2): io_do_iopoll+0xb8/0x3a4
> [  880.024123] Backtrace:
> [  880.024123]  [<1092a2b0>] io_uring_try_cancel_requests+0x184/0x3b0
> [  880.024123]  [<1092a57c>] io_ring_exit_work+0xa0/0x4c4
> [  880.024123]  [<101cb448>] process_one_work+0x1c4/0x3cc
> [  880.024123]  [<101cb7d8>] worker_thread+0x188/0x4b4
> [  880.024123]  [<101d5910>] kthread+0xec/0xf4
> [  880.024123]  [<1018801c>] ret_from_kernel_thread+0x1c/0x24
I had updated to 6.2.0-rc8+ to avoid this issue.

I agree there's not a lot of helpful info in the dump.  Somehow, the code has branched to
address 0 and attempted to execute the instruction there.  RP points at intr_check_sig, but
not at a valid return point for a call instruction.  In the dump above, SP is 0.  Maybe the
process's stack overflowed?

I have run the test multiple times by itself.  It consistently generates an HPMC check.  The PIM
dump provides no more info than the dump above (i.e., the kernel has tried to execute location 0).
SP did not appear to have been clobbered in the PIM dump I looked at.

Running the test under strace, the trace stops at different points:

io_uring_setup(64, {flags=0, sq_thread_cpu=0, sq_thread_idle=0, sq_entries=64, cq_entries=128, 
features=IORING_FEAT_SINGLE_MMAP|IORING_FEAT_NODROP|IORING_FEAT_SUBMIT_STABLE|IORING_FEAT_RW_CUR_POS|IORING_FEAT_CUR_PERSONALITY|IORING_FEAT_FAST_POLL|IORING_FEAT_POLL_32BITS|0x1f80, 
sq_off={head=0, tail=16, ring_mask=64, ring_entries=72, flags=84, dropped=80, array=2144}, cq_off={head=32, tail=48, ring_mask=68, 
ring_entries=76, overflow=92, cqes=96, flags=0x58 /* IORING_CQ_??? */}}) = 3

io_uring_enter(3, 64, 0, 0, NULL, 8)    = 64

io_uring_setup(64, {flags=0, sq_thread_cpu=0, sq_thread_idle=0, sq_entries=64, cq_entries=128, 
features=IORING_FEAT_SINGLE_MMAP|IORING_FEAT_NODROP|IORING_FEAT_SUBMIT_STABLE|IORING_FEAT_RW_CUR_POS|IORING_FEAT_CUR_PERSONALITY|IORING_FEAT_FAST_POLL|IORING_FEAT_POLL_32BITS|0x1f80, 
sq_off={head=0, tail=16, ring_mask=64, ring_entries=72, flags=84, dropped=80, array=2144}, cq_off={head=32, tail=48, ring_mask=68, 
ring_entries=76, overflow=92, cqes=96, flags=0x58 /* IORING_CQ_??? */}}) = 3
mmap2(NULL, 2400, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, 3, 0) = 0xf8cad000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, 3, 0x10000000

-- 
John David Anglin  [email protected]



Thread overview: 48+ messages
2023-02-12  9:47 io_uring failure on parisc (32-bit userspace and 64-bit kernel) Helge Deller
2023-02-12 13:16 ` Jens Axboe
2023-02-12 13:28   ` Helge Deller
2023-02-12 13:35     ` Jens Axboe
2023-02-12 14:00       ` Jens Axboe
2023-02-12 14:03       ` Helge Deller
2023-02-12 19:35         ` Helge Deller
2023-02-12 19:42           ` Jens Axboe
2023-02-12 20:01             ` Helge Deller
2023-02-12 21:48               ` Jens Axboe
2023-02-12 22:20                 ` Helge Deller
2023-02-12 22:31                   ` Helge Deller
2023-02-13 16:15                     ` Jens Axboe
2023-02-13 20:59                       ` Helge Deller
2023-02-13 21:05                         ` Jens Axboe
2023-02-13 22:05                           ` Helge Deller
2023-02-13 22:50                             ` John David Anglin
2023-02-14 23:09                               ` io_uring failure on parisc with VIPT caches Helge Deller
2023-02-14 23:29                                 ` Jens Axboe
2023-02-15  2:12                                   ` John David Anglin
2023-02-15 15:16                                     ` Jens Axboe
2023-02-15 15:52                                       ` Helge Deller
2023-02-15 15:56                                         ` Jens Axboe
2023-02-15 16:02                                           ` Helge Deller
2023-02-15 16:04                                             ` Jens Axboe
2023-02-15 21:40                                               ` Helge Deller
2023-02-15 23:04                                                 ` Jens Axboe
2023-02-15 16:38                                           ` John David Anglin
2023-02-15 17:01                                             ` Jens Axboe
2023-02-15 19:00                                               ` Jens Axboe
2023-02-15 19:16                                                 ` Jens Axboe
2023-02-15 20:27                                                   ` John David Anglin
2023-02-15 20:37                                                     ` Jens Axboe
2023-02-15 21:06                                                       ` John David Anglin
2023-02-15 21:38                                                         ` Jens Axboe
2023-02-15 21:39                                                         ` John David Anglin
2023-02-15 22:10                                                           ` John David Anglin
2023-02-15 23:02                                                             ` Jens Axboe
2023-02-15 23:43                                                               ` John David Anglin
2023-02-16  2:40                                                               ` John David Anglin
2023-02-16  2:50                                                                 ` Jens Axboe
2023-02-16  8:24                                                                   ` Helge Deller
2023-02-16 15:22                                                                     ` Jens Axboe
2023-02-16 20:35                                                                     ` John David Anglin [this message]
2023-02-15 23:03                                                           ` Jens Axboe
2023-02-15 19:20                                                 ` John David Anglin
2023-02-15 19:24                                                   ` Jens Axboe
2023-02-15 16:18                                         ` John David Anglin
