* io_uring possibly the culprit for qemu hang (linux-5.4.y)
@ 2020-09-30 16:26 Ju Hyung Park
2020-10-01 3:03 ` Jens Axboe
2020-10-01 8:59 ` Stefano Garzarella
0 siblings, 2 replies; 11+ messages in thread
From: Ju Hyung Park @ 2020-09-30 16:26 UTC (permalink / raw)
To: io-uring, Jens Axboe
Hi everyone.
I have recently switched to a setup running QEMU 5.0 (which supports
io_uring) for a Windows 10 guest on Linux v5.4.63.
QEMU exposes /dev/nvme0n1p3 to the guest via virtio-blk with
discard/unmap enabled.
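In case it helps with reproducing: a quick way to confirm that a
running QEMU actually set up io_uring instances is to look for the
anon inodes in its fd table (a sketch, assuming a single
qemu-system-x86_64 process):

    # each io_uring instance shows up as anon_inode:[io_uring]
    ls -l /proc/"$(pidof qemu-system-x86_64)"/fd | grep io_uring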
I've been having a weird issue where the system randomly hangs
whenever I start or shut down the guest. The host stays up for a
bit and then just hangs: no response over SSH, and even ping stops
working.
It's been hard to even get a log to debug the issue, but on the most
recent encounter I was able to capture a show-backtrace-all-active-cpus
sysrq dmesg, and it shows io_uring functions in the trace.
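For reference, this is roughly how such a trace can be captured (a
sketch; it assumes sysrq is enabled and a console that is still
responsive — once SSH is already dead, Alt+SysRq+L on an attached
keyboard does the same thing):

    # enable all sysrq functions
    echo 1 > /proc/sys/kernel/sysrq
    # 'l': show a backtrace for all active CPUs; output goes to dmesg
    echo l > /proc/sysrq-trigger
    dmesg | tail -n 120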
Since I've been encountering the issue ever since I switched to QEMU
5.0, I suspect io_uring may be the culprit.
While I'd love to try out the mainline kernel, that's not feasible at
the moment as I have to stay on linux-5.4.y. Backporting mainline's
io_uring also looks like a non-trivial job.
Any tips would be appreciated. I can build my own kernel and I'm
willing to try out (backported) patches.
Thanks.
[243683.539303] NMI backtrace for cpu 1
[243683.539303] CPU: 1 PID: 1527 Comm: qemu-system-x86 Tainted: P W O 5.4.63+ #1
[243683.539303] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2401 07/12/2019
[243683.539304] RIP: 0010:io_uring_flush+0x98/0x140
[243683.539304] Code: e4 74 70 48 8b 93 e8 02 00 00 48 8b 32 48 8b 4a 08 48 89 4e 08 48 89 31 48 89 12 48 89 52 08 48 8b 72 f8 81 4a a8 00 40 00 00 <48> 85 f6 74 15 4c 3b 62 c8 75 0f ba 01 00 00 00 bf 02 00 00 00 e8
[243683.539304] RSP: 0018:ffff8881f20c3e28 EFLAGS: 00000006
[243683.539305] RAX: ffff888419cd94e0 RBX: ffff88842ba49800 RCX: ffff888419cd94e0
[243683.539305] RDX: ffff888419cd94e0 RSI: ffff888419cd94d0 RDI: ffff88842ba49af8
[243683.539306] RBP: ffff88842ba49af8 R08: 0000000000000001 R09: ffff88840d17aaf8
[243683.539306] R10: 0000000000000001 R11: 00000000ffffffec R12: ffff88843c68c080
[243683.539306] R13: ffff88842ba49ae8 R14: 0000000000000001 R15: 0000000000000000
[243683.539307] FS:  0000000000000000(0000) GS:ffff88843ea80000(0000) knlGS:0000000000000000
[243683.539307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[243683.539307] CR2: 00007f3234b31f90 CR3: 0000000002608001 CR4: 00000000003726e0
[243683.539307] Call Trace:
[243683.539308]  ? filp_close+0x2a/0x60
[243683.539308]  ? put_files_struct.part.0+0x57/0xb0
[243683.539309]  ? do_exit+0x321/0xa70
[243683.539309]  ? do_group_exit+0x35/0x90
[243683.539309]  ? __x64_sys_exit_group+0xf/0x10
[243683.539309]  ? do_syscall_64+0x41/0x160
[243683.539309]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[243684.753272] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[243684.753278] rcu:     1-...0: (1 GPs behind) idle=a5e/1/0x4000000000000000 softirq=7893711/7893712 fqs=2955
[243684.753280]  (detected by 3, t=6002 jiffies, g=17109677, q=117817)
[243684.753282] Sending NMI from CPU 3 to CPUs 1:
[243684.754285] NMI backtrace for cpu 1
[243684.754285] CPU: 1 PID: 1527 Comm: qemu-system-x86 Tainted: P W O 5.4.63+ #1
[243684.754286] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2401 07/12/2019
[243684.754286] RIP: 0010:io_uring_flush+0x83/0x140
[243684.754287] Code: 89 ef e8 00 36 92 00 48 8b 83 e8 02 00 00 49 39 c5 74 52 4d 85 e4 74 70 48 8b 93 e8 02 00 00 48 8b 32 48 8b 4a 08 48 89 4e 08 <48> 89 31 48 89 12 48 89 52 08 48 8b 72 f8 81 4a a8 00 40 00 00 48
[243684.754287] RSP: 0018:ffff8881f20c3e28 EFLAGS: 00000002
[243684.754288] RAX: ffff888419cd94e0 RBX: ffff88842ba49800 RCX: ffff888419cd94e0
[243684.754288] RDX: ffff888419cd94e0 RSI: ffff888419cd94e0 RDI: ffff88842ba49af8
[243684.754289] RBP: ffff88842ba49af8 R08: 0000000000000001 R09: ffff88840d17aaf8
[243684.754289] R10: 0000000000000001 R11: 00000000ffffffec R12: ffff88843c68c080
[243684.754289] R13: ffff88842ba49ae8 R14: 0000000000000001 R15: 0000000000000000
[243684.754290] FS:  0000000000000000(0000) GS:ffff88843ea80000(0000) knlGS:0000000000000000
[243684.754290] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[243684.754291] CR2: 00007f3234b31f90 CR3: 0000000002608001 CR4: 00000000003726e0
[243684.754291] Call Trace:
[243684.754291]  ? filp_close+0x2a/0x60
[243684.754291]  ? put_files_struct.part.0+0x57/0xb0
[243684.754292]  ? do_exit+0x321/0xa70
[243684.754292]  ? do_group_exit+0x35/0x90
[243684.754292]  ? __x64_sys_exit_group+0xf/0x10
[243684.754293]  ? do_syscall_64+0x41/0x160
[243684.754293]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
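If it helps, the raw offsets above can be resolved to source lines
with the kernel's own helper script (a sketch; it assumes the matching
vmlinux with debug info is still around, and dmesg-hang.txt is a
hypothetical file holding the captured log):

    # run from the source tree the 5.4.63+ kernel was built in
    ./scripts/decode_stacktrace.sh vmlinux "$PWD" < dmesg-hang.txt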
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-09-30 16:26 io_uring possibly the culprit for qemu hang (linux-5.4.y) Ju Hyung Park
@ 2020-10-01 3:03 ` Jens Axboe
2020-10-01 8:59 ` Stefano Garzarella
1 sibling, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2020-10-01 3:03 UTC (permalink / raw)
To: Ju Hyung Park, io-uring
On 9/30/20 10:26 AM, Ju Hyung Park wrote:
> Hi everyone.
>
> I have recently switched to a setup running QEMU 5.0 (which supports
> io_uring) for a Windows 10 guest on Linux v5.4.63.
> [...]
> Any tips would be appreciated. I can build my own kernel and I'm
> willing to try out (backported) patches.
I'll see if I can reproduce this, thanks for the report!
--
Jens Axboe
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-09-30 16:26 io_uring possibly the culprit for qemu hang (linux-5.4.y) Ju Hyung Park
2020-10-01 3:03 ` Jens Axboe
@ 2020-10-01 8:59 ` Stefano Garzarella
2020-10-01 13:47 ` Jack Wang
2020-10-01 14:30 ` Ju Hyung Park
1 sibling, 2 replies; 11+ messages in thread
From: Stefano Garzarella @ 2020-10-01 8:59 UTC (permalink / raw)
To: Ju Hyung Park; +Cc: io-uring, Jens Axboe, qemu-devel
+Cc: [email protected]
Hi,
On Thu, Oct 01, 2020 at 01:26:51AM +0900, Ju Hyung Park wrote:
> Hi everyone.
>
> I have recently switched to a setup running QEMU 5.0 (which supports
> io_uring) for a Windows 10 guest on Linux v5.4.63.
> QEMU exposes /dev/nvme0n1p3 to the guest via virtio-blk with
> discard/unmap enabled.
Please, can you share the qemu command line that you are using?
This can be useful for the analysis.
Thanks,
Stefano
> [...]
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-10-01 8:59 ` Stefano Garzarella
@ 2020-10-01 13:47 ` Jack Wang
2020-10-01 14:30 ` Ju Hyung Park
1 sibling, 0 replies; 11+ messages in thread
From: Jack Wang @ 2020-10-01 13:47 UTC (permalink / raw)
To: Stefano Garzarella; +Cc: Ju Hyung Park, Jens Axboe, io-uring, qemu-devel
Stefano Garzarella <[email protected]> wrote on Thu, Oct 1, 2020 at 10:59 AM:
> +Cc: [email protected]
>
> Hi,
>
> On Thu, Oct 01, 2020 at 01:26:51AM +0900, Ju Hyung Park wrote:
> > [...]
>
> Please, can you share the qemu command line that you are using?
> This can be useful for the analysis.
>
> Thanks,
> Stefano
I got something similar:
[250145.410520] general protection fault: 0000 [#1] SMP
[250145.410868] CPU: 5 PID: 39269 Comm: qemu-5.0 Kdump: loaded Tainted: G O 5.4.61-pserver #5.4.61-1+develop20200831.1341+7430880~deb10
[250145.411468] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 3.3 02/21/2020
[250145.412051] RIP: 0010:io_cancel_async_work+0x48/0xa0
[250145.412386] Code: fb 48 8d bf f8 02 00 00 48 89 f5 e8 02 f1 69 00 48 8b 83 e8 02 00 00 49 39 c4 74 52 48 8b 83 e8 02 00 00 48 8b 08 48 8b 50 08 <48> 89 51 08 48 89 0a 48 8b 70 f8 48 89 00 48 89 40 08 81 48 a8 00
[250145.413239] RSP: 0018:ffffc2d34efb7cf8 EFLAGS: 00010083
[250145.413576] RAX: ffffec9879729a80 RBX: ffff9fa2740ac400 RCX: 06ffff8000000000
[250145.414153] RDX: ffff9fa2740ac6e8 RSI: ffffec9879729a40 RDI: ffff9fa2740ac6f8
[250145.414729] RBP: ffff9fa1cf86c080 R08: ffff9fa199ccecf8 R09: 8000002f724b7067
[250145.415307] R10: ffffc2d34efb7b8c R11: 0000000000000003 R12: ffff9fa2740ac6e8
[250145.415884] R13: 0000000000000000 R14: 000000000000000d R15: ffff9fa1af48e068
[250145.416461] FS:  0000000000000000(0000) GS:ffff9fa280940000(0000) knlGS:0000000000000000
[250145.417042] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[250145.417380] CR2: 00007f5f14188000 CR3: 0000002f0c408005 CR4: 00000000007626e0
[250145.417957] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[250145.418535] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[250145.419112] PKRU: 55555554
[250145.419438] Call Trace:
[250145.419768]  io_uring_flush+0x34/0x50
[250145.420101]  filp_close+0x31/0x60
[250145.420432]  put_files_struct+0x6c/0xc0
[250145.420767]  do_exit+0x347/0xa50
[250145.421097]  do_group_exit+0x3a/0x90
[250145.421429]  get_signal+0x125/0x7d0
[250145.421761]  do_signal+0x36/0x640
[250145.422090]  ? do_send_sig_info+0x5c/0x90
[250145.422423]  ? recalc_sigpending+0x17/0x50
[250145.422757]  exit_to_usermode_loop+0x61/0xd0
[250145.423090]  do_syscall_64+0xe6/0x120
[250145.423424]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[250145.423761] RIP: 0033:0x7f5f176997bb
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-10-01 8:59 ` Stefano Garzarella
2020-10-01 13:47 ` Jack Wang
@ 2020-10-01 14:30 ` Ju Hyung Park
2020-10-02 7:34 ` Stefano Garzarella
1 sibling, 1 reply; 11+ messages in thread
From: Ju Hyung Park @ 2020-10-01 14:30 UTC (permalink / raw)
To: Stefano Garzarella; +Cc: io-uring, Jens Axboe, qemu-devel
Hi Stefano,
On Thu, Oct 1, 2020 at 5:59 PM Stefano Garzarella <[email protected]> wrote:
> Please, can you share the qemu command line that you are using?
> This can be useful for the analysis.
Sure.
QEMU:
/usr/bin/qemu-system-x86_64
-name guest=win10,debug-threads=on -S
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-win10/master-key.aes
-blockdev {"driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}
-blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}
-blockdev {"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/win10_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}
-blockdev {"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}
-machine pc-q35-5.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,mem-merge=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format
-cpu Skylake-Client-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,pdpe1gb=on,ibpb=on,amd-ssbd=on,fma=off,avx=off,f16c=off,rdrand=off,bmi1=off,hle=off,avx2=off,bmi2=off,rtm=off,rdseed=off,adx=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vpindex,hv-runtime,hv-synic,hv-stimer,hv-reset
-m 8192 -mem-prealloc -mem-path /dev/hugepages/libvirt/qemu/1-win10
-overcommit mem-lock=off
-smp 4,sockets=1,dies=1,cores=2,threads=2
-uuid 7ccc3031-1dab-4267-b72a-d60065b5ff7f
-display none
-no-user-config -nodefaults
-chardev socket,id=charmonitor,fd=32,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control
-rtc base=localtime,driftfix=slew
-global kvm-pit.lost_tick_policy=delay
-no-hpet -no-shutdown
-global ICH9-LPC.disable_s3=1
-global ICH9-LPC.disable_s4=1
-boot menu=off,strict=on
-device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1
-device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1
-device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2
-device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3
-device pcie-pci-bridge,id=pci.5,bus=pci.2,addr=0x0
-device qemu-xhci,id=usb,bus=pci.1,addr=0x0
-blockdev {"driver":"host_device","filename":"/dev/disk/by-partuuid/05c3750b-060f-4703-95ea-6f5e546bf6e9","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}
-blockdev {"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","detect-zeroes":"unmap","cache":{"direct":false,"no-flush":true},"driver":"raw","file":"libvirt-1-storage"}
-device virtio-blk-pci,scsi=off,bus=pcie.0,addr=0xa,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on
-netdev tap,fd=34,id=hostnet0
-device e1000,netdev=hostnet0,id=net0,mac=52:54:00:c6:bb:bc,bus=pcie.0,addr=0x3
-device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x4
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0
-device vfio-pci,host=0000:00:02.0,id=hostdev0,bus=pcie.0,addr=0x2,rombar=0
-device virtio-balloon-pci,id=balloon0,bus=pcie.0,addr=0x8
-object rng-random,id=objrng0,filename=/dev/urandom
-device virtio-rng-pci,rng=objrng0,id=rng0,bus=pcie.0,addr=0x9
-msg timestamp=on
And I use libvirt 6.3.0 to manage the VM. Here's the XML for my VM.
<domain type="kvm">
  <name>win10</name>
  <uuid>7ccc3031-1dab-4267-b72a-d60065b5ff7f</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit="KiB">8388608</memory>
  <currentMemory unit="KiB">8388608</currentMemory>
  <memoryBacking>
    <hugepages/>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement="static">4</vcpu>
  <cputune>
    <vcpupin vcpu="0" cpuset="0"/>
    <vcpupin vcpu="1" cpuset="2"/>
    <vcpupin vcpu="2" cpuset="1"/>
    <vcpupin vcpu="3" cpuset="3"/>
  </cputune>
  <os>
    <type arch="x86_64" machine="pc-q35-5.0">hvm</type>
    <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram>
    <boot dev="hd"/>
    <bootmenu enable="no"/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state="on"/>
      <vapic state="on"/>
      <spinlocks state="on" retries="8191"/>
      <vpindex state="on"/>
      <runtime state="on"/>
      <synic state="on"/>
      <stimer state="on"/>
      <reset state="on"/>
    </hyperv>
    <vmport state="off"/>
  </features>
  <cpu mode="host-model" check="partial">
    <topology sockets="1" dies="1" cores="2" threads="2"/>
  </cpu>
  <clock offset="localtime">
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="pit" tickpolicy="delay"/>
    <timer name="hpet" present="no"/>
    <timer name="hypervclock" present="yes"/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled="no"/>
    <suspend-to-disk enabled="no"/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type="block" device="disk">
      <driver name="qemu" type="raw" cache="unsafe" discard="unmap" detect_zeroes="unmap"/>
      <source dev="/dev/disk/by-partuuid/05c3750b-060f-4703-95ea-6f5e546bf6e9"/>
      <target dev="vda" bus="virtio"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x0a" function="0x0"/>
    </disk>
    <controller type="pci" index="0" model="pcie-root"/>
    <controller type="pci" index="1" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="1" port="0x8"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="2" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="2" port="0x9"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x1"/>
    </controller>
    <controller type="pci" index="3" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="3" port="0xa"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x2"/>
    </controller>
    <controller type="pci" index="4" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="4" port="0xb"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x3"/>
    </controller>
    <controller type="pci" index="5" model="pcie-to-pci-bridge">
      <model name="pcie-pci-bridge"/>
      <address type="pci" domain="0x0000" bus="0x02" slot="0x00" function="0x0"/>
    </controller>
    <controller type="usb" index="0" model="qemu-xhci">
      <address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
    </controller>
    <controller type="sata" index="0">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1f" function="0x2"/>
    </controller>
    <interface type="network">
      <mac address="52:54:00:c6:bb:bc"/>
      <source network="default"/>
      <model type="e1000"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x03" function="0x0"/>
    </interface>
    <input type="mouse" bus="ps2"/>
    <input type="keyboard" bus="ps2"/>
    <sound model="ich9">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x04" function="0x0"/>
    </sound>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x00" slot="0x02" function="0x0"/>
      </source>
      <rom bar="off"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x0"/>
    </hostdev>
    <memballoon model="virtio">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x08" function="0x0"/>
    </memballoon>
    <rng model="virtio">
      <backend model="random">/dev/urandom</backend>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x09" function="0x0"/>
    </rng>
  </devices>
</domain>
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-10-01 14:30 ` Ju Hyung Park
@ 2020-10-02 7:34 ` Stefano Garzarella
2020-10-16 18:04 ` Ju Hyung Park
0 siblings, 1 reply; 11+ messages in thread
From: Stefano Garzarella @ 2020-10-02 7:34 UTC (permalink / raw)
To: Ju Hyung Park; +Cc: io-uring, Jens Axboe, qemu-devel
Hi Ju,
On Thu, Oct 01, 2020 at 11:30:14PM +0900, Ju Hyung Park wrote:
> Hi Stefano,
>
> On Thu, Oct 1, 2020 at 5:59 PM Stefano Garzarella <[email protected]> wrote:
> > Please, can you share the qemu command line that you are using?
> > This can be useful for the analysis.
>
> Sure.
Thanks for sharing.
The issue seems related to io_uring, and in particular to the new
io_uring fd monitoring implementation available since QEMU 5.0.
I'll try to reproduce it.
For now, as a workaround, you can rebuild qemu by disabling io-uring support:
../configure --disable-linux-io-uring ...
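A slightly fuller sketch of that rebuild, if useful (the target list
and build directory are illustrative; only --disable-linux-io-uring is
the actual workaround):

    # from a QEMU source tree
    mkdir -p build && cd build
    ../configure --target-list=x86_64-softmmu --disable-linux-io-uring
    make -j"$(nproc)"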
Thanks,
Stefano
> [...]
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-10-02 7:34 ` Stefano Garzarella
@ 2020-10-16 18:04 ` Ju Hyung Park
2020-10-16 18:07 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Ju Hyung Park @ 2020-10-16 18:04 UTC (permalink / raw)
To: Jens Axboe, Stefano Garzarella; +Cc: io-uring, qemu-devel
A small update:
As per Stefano's suggestion, disabling io_uring support in QEMU at
the configuration step fixed the problem and I'm no longer seeing
hangs.
Looks like it __is__ an io_uring issue :(
Btw, I used liburing fe50048 for linking QEMU.
Thanks.
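For anyone double-checking the same thing, a sketch (the binary path
and the checkout location are assumptions):

    # confirm the installed binary links liburing at all
    ldd /usr/bin/qemu-system-x86_64 | grep liburing
    # confirm which commit a local liburing checkout is at (fe50048 here)
    git -C liburing rev-parse --short HEAD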
On Fri, Oct 2, 2020 at 4:35 PM Stefano Garzarella <[email protected]> wrote:
>
> The issue seems related to io_uring, and in particular to the new
> io_uring fd monitoring implementation available since QEMU 5.0.
>
> For now, as a workaround, you can rebuild qemu by disabling io-uring support:
>
> ../configure --disable-linux-io-uring ...
>
> [...]
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-10-16 18:04 ` Ju Hyung Park
@ 2020-10-16 18:07 ` Jens Axboe
2020-10-17 14:29 ` Ju Hyung Park
0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2020-10-16 18:07 UTC (permalink / raw)
To: Ju Hyung Park, Stefano Garzarella; +Cc: io-uring, qemu-devel
On 10/16/20 12:04 PM, Ju Hyung Park wrote:
> A small update:
>
> As per Stefano's suggestion, disabling io_uring support in QEMU at
> the configuration step fixed the problem and I'm no longer seeing
> hangs.
>
> Looks like it __is__ an io_uring issue :(
Would be great if you could try 5.4.71 and see if that helps for your
issue.
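For anyone curious what went into that release for io_uring, a sketch
against a linux-stable checkout (the tag names assume the stable tree):

    git log --oneline v5.4.70..v5.4.71 -- fs/io_uring.c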
--
Jens Axboe
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-10-16 18:07 ` Jens Axboe
@ 2020-10-17 14:29 ` Ju Hyung Park
2020-10-17 15:02 ` Jens Axboe
2020-10-19 9:22 ` Pankaj Gupta
0 siblings, 2 replies; 11+ messages in thread
From: Ju Hyung Park @ 2020-10-17 14:29 UTC (permalink / raw)
To: Jens Axboe; +Cc: Stefano Garzarella, io-uring, qemu-devel
Hi Jens.
On Sat, Oct 17, 2020 at 3:07 AM Jens Axboe <[email protected]> wrote:
>
> Would be great if you could try 5.4.71 and see if that helps for your
> issue.
>
Oh wow, yeah it did fix the issue.
I'm able to reliably turn off and start the VM multiple times in a row.
Double checked by confirming QEMU is dynamically linked to liburing.so.1.
Looks like those 4 io_uring fixes helped.
Thanks!
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-10-17 14:29 ` Ju Hyung Park
@ 2020-10-17 15:02 ` Jens Axboe
2020-10-19 9:22 ` Pankaj Gupta
1 sibling, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2020-10-17 15:02 UTC (permalink / raw)
To: Ju Hyung Park; +Cc: Stefano Garzarella, io-uring, qemu-devel
On 10/17/20 8:29 AM, Ju Hyung Park wrote:
> Hi Jens.
>
> On Sat, Oct 17, 2020 at 3:07 AM Jens Axboe <[email protected]> wrote:
>>
>> Would be great if you could try 5.4.71 and see if that helps for your
>> issue.
>>
>
> Oh wow, yeah it did fix the issue.
>
> I'm able to reliably turn off and start the VM multiple times in a row.
> Double checked by confirming QEMU is dynamically linked to liburing.so.1.
>
> Looks like those 4 io_uring fixes helped.
Awesome, thanks for testing!
--
Jens Axboe
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
2020-10-17 14:29 ` Ju Hyung Park
2020-10-17 15:02 ` Jens Axboe
@ 2020-10-19 9:22 ` Pankaj Gupta
1 sibling, 0 replies; 11+ messages in thread
From: Pankaj Gupta @ 2020-10-19 9:22 UTC (permalink / raw)
To: Jack Wang
Cc: Jens Axboe, Qemu Developers, io-uring, Stefano Garzarella,
Ju Hyung Park
@Jack Wang,
Maybe the four io_uring patches in 5.4.71 fix the issue for you as well?
Thanks,
Pankaj
> Hi Jens.
>
> On Sat, Oct 17, 2020 at 3:07 AM Jens Axboe <[email protected]> wrote:
> >
> > Would be great if you could try 5.4.71 and see if that helps for your
> > issue.
> >
>
> Oh wow, yeah it did fix the issue.
>
> I'm able to reliably turn off and start the VM multiple times in a row.
> Double checked by confirming QEMU is dynamically linked to liburing.so.1.
>
> Looks like those 4 io_uring fixes helped.
>
> Thanks!
>