* io_uring possibly the culprit for qemu hang (linux-5.4.y)
@ 2020-09-30 16:26 Ju Hyung Park
  2020-10-01  3:03 ` Jens Axboe
  2020-10-01  8:59 ` Stefano Garzarella
  0 siblings, 2 replies; 11+ messages in thread
From: Ju Hyung Park @ 2020-09-30 16:26 UTC (permalink / raw)
To: io-uring, Jens Axboe

Hi everyone.

I have recently switched to a setup running QEMU 5.0 (which supports
io_uring) for a Windows 10 guest on Linux v5.4.63. QEMU exposes
/dev/nvme0n1p3 to the guest via virtio-blk with discard/unmap enabled.

I've been having a weird issue where the system randomly hangs whenever
I turn on or shut down the guest. The host stays up for a bit and then
just hangs: no response on SSH, etc. Even ping doesn't work.

It's been hard to even get a log to debug the issue, but I was able to
capture a show-backtrace-all-active-cpus sysrq dmesg on the most recent
encounter, and it shows some io_uring functions.

Since I've been hitting this ever since I switched to QEMU 5.0, I
suspect io_uring may be the culprit.

While I'd love to try out the mainline kernel, that's not feasible at
the moment as I have to stay on linux-5.4.y, and backporting mainline's
io_uring also seems to be a non-trivial job.

Any tips would be appreciated. I can build my own kernel and I'm
willing to try out (backported) patches.

Thanks.
[243683.539303] NMI backtrace for cpu 1
[243683.539303] CPU: 1 PID: 1527 Comm: qemu-system-x86 Tainted: P W O 5.4.63+ #1
[243683.539303] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2401 07/12/2019
[243683.539304] RIP: 0010:io_uring_flush+0x98/0x140
[243683.539304] Code: e4 74 70 48 8b 93 e8 02 00 00 48 8b 32 48 8b 4a 08 48 89 4e 08 48 89 31 48 89 12 48 89 52 08 48 8b 72 f8 81 4a a8 00 40 00 00 <48> 85 f6 74 15 4c 3b 62 c8 75 0f ba 01 00 00 00 bf 02 00 00 00 e8
[243683.539304] RSP: 0018:ffff8881f20c3e28 EFLAGS: 00000006
[243683.539305] RAX: ffff888419cd94e0 RBX: ffff88842ba49800 RCX: ffff888419cd94e0
[243683.539305] RDX: ffff888419cd94e0 RSI: ffff888419cd94d0 RDI: ffff88842ba49af8
[243683.539306] RBP: ffff88842ba49af8 R08: 0000000000000001 R09: ffff88840d17aaf8
[243683.539306] R10: 0000000000000001 R11: 00000000ffffffec R12: ffff88843c68c080
[243683.539306] R13: ffff88842ba49ae8 R14: 0000000000000001 R15: 0000000000000000
[243683.539307] FS:  0000000000000000(0000) GS:ffff88843ea80000(0000) knlGS:0000000000000000
[243683.539307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[243683.539307] CR2: 00007f3234b31f90 CR3: 0000000002608001 CR4: 00000000003726e0
[243683.539307] Call Trace:
[243683.539308]  ? filp_close+0x2a/0x60
[243683.539308]  ? put_files_struct.part.0+0x57/0xb0
[243683.539309]  ? do_exit+0x321/0xa70
[243683.539309]  ? do_group_exit+0x35/0x90
[243683.539309]  ? __x64_sys_exit_group+0xf/0x10
[243683.539309]  ? do_syscall_64+0x41/0x160
[243683.539309]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[243684.753272] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[243684.753278] rcu:     1-...0: (1 GPs behind) idle=a5e/1/0x4000000000000000 softirq=7893711/7893712 fqs=2955
[243684.753280]  (detected by 3, t=6002 jiffies, g=17109677, q=117817)
[243684.753282] Sending NMI from CPU 3 to CPUs 1:
[243684.754285] NMI backtrace for cpu 1
[243684.754285] CPU: 1 PID: 1527 Comm: qemu-system-x86 Tainted: P W O 5.4.63+ #1
[243684.754286] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2401 07/12/2019
[243684.754286] RIP: 0010:io_uring_flush+0x83/0x140
[243684.754287] Code: 89 ef e8 00 36 92 00 48 8b 83 e8 02 00 00 49 39 c5 74 52 4d 85 e4 74 70 48 8b 93 e8 02 00 00 48 8b 32 48 8b 4a 08 48 89 4e 08 <48> 89 31 48 89 12 48 89 52 08 48 8b 72 f8 81 4a a8 00 40 00 00 48
[243684.754287] RSP: 0018:ffff8881f20c3e28 EFLAGS: 00000002
[243684.754288] RAX: ffff888419cd94e0 RBX: ffff88842ba49800 RCX: ffff888419cd94e0
[243684.754288] RDX: ffff888419cd94e0 RSI: ffff888419cd94e0 RDI: ffff88842ba49af8
[243684.754289] RBP: ffff88842ba49af8 R08: 0000000000000001 R09: ffff88840d17aaf8
[243684.754289] R10: 0000000000000001 R11: 00000000ffffffec R12: ffff88843c68c080
[243684.754289] R13: ffff88842ba49ae8 R14: 0000000000000001 R15: 0000000000000000
[243684.754290] FS:  0000000000000000(0000) GS:ffff88843ea80000(0000) knlGS:0000000000000000
[243684.754290] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[243684.754291] CR2: 00007f3234b31f90 CR3: 0000000002608001 CR4: 00000000003726e0
[243684.754291] Call Trace:
[243684.754291]  ? filp_close+0x2a/0x60
[243684.754291]  ? put_files_struct.part.0+0x57/0xb0
[243684.754292]  ? do_exit+0x321/0xa70
[243684.754292]  ? do_group_exit+0x35/0x90
[243684.754292]  ? __x64_sys_exit_group+0xf/0x10
[243684.754293]  ? do_syscall_64+0x41/0x160
[243684.754293]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

^ permalink raw reply	[flat|nested] 11+ messages in thread
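For readers unfamiliar with how a dump like the one above is captured: the show-backtrace-all-active-cpus SysRq ('l') can be triggered from a shell via /proc/sysrq-trigger. This is a minimal sketch (not from the thread); it assumes a Linux host with CONFIG_MAGIC_SYSRQ and needs root to actually fire.

```shell
#!/bin/sh
# Trigger the 'show backtrace of all active CPUs' SysRq ('l').
# The resulting NMI backtraces appear in the kernel log (dmesg).
sysrq_backtrace() {
    if [ ! -w /proc/sysrq-trigger ]; then
        # Unprivileged (or container with masked /proc): just explain.
        echo "need root: /proc/sysrq-trigger not writable"
        return 0
    fi
    echo 1 > /proc/sys/kernel/sysrq   # enable all SysRq functions
    echo l > /proc/sysrq-trigger      # 'l' = backtrace all active CPUs
    echo "backtrace requested; check dmesg"
}

sysrq_backtrace
```

The same key can be sent from a console keyboard (Alt-SysRq-l), which is useful when SSH is already dead, as described in the report.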
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
  2020-09-30 16:26 io_uring possibly the culprit for qemu hang (linux-5.4.y) Ju Hyung Park
@ 2020-10-01  3:03 ` Jens Axboe
  2020-10-01  8:59 ` Stefano Garzarella
  1 sibling, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2020-10-01  3:03 UTC (permalink / raw)
To: Ju Hyung Park, io-uring

On 9/30/20 10:26 AM, Ju Hyung Park wrote:
> [...]
> Since I've been encountering the issue ever since I switched to QEMU
> 5.0, I suspect io_uring may be the culprit to the issue.
> [...]
> Any tips would be appreciated. I can build my own kernel and I'm
> willing to try out (backported) patches.

I'll see if I can reproduce this, thanks for the report!

-- 
Jens Axboe
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
  2020-09-30 16:26 io_uring possibly the culprit for qemu hang (linux-5.4.y) Ju Hyung Park
  2020-10-01  3:03 ` Jens Axboe
@ 2020-10-01  8:59 ` Stefano Garzarella
  2020-10-01 13:47   ` Jack Wang
  2020-10-01 14:30   ` Ju Hyung Park
  1 sibling, 2 replies; 11+ messages in thread
From: Stefano Garzarella @ 2020-10-01  8:59 UTC (permalink / raw)
To: Ju Hyung Park; +Cc: io-uring, Jens Axboe, qemu-devel

+Cc: [email protected]

Hi,

On Thu, Oct 01, 2020 at 01:26:51AM +0900, Ju Hyung Park wrote:
> Hi everyone.
>
> I have recently switched to a setup running QEMU 5.0(which supports
> io_uring) for a Windows 10 guest on Linux v5.4.63.
> The QEMU hosts /dev/nvme0n1p3 to the guest with virtio-blk with
> discard/unmap enabled.

Please, can you share the qemu command line that you are using?
This can be useful for the analysis.

Thanks,
Stefano

> [... rest of the report and backtraces snipped ...]
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
  2020-10-01  8:59 ` Stefano Garzarella
@ 2020-10-01 13:47   ` Jack Wang
  2020-10-01 14:30   ` Ju Hyung Park
  1 sibling, 0 replies; 11+ messages in thread
From: Jack Wang @ 2020-10-01 13:47 UTC (permalink / raw)
To: Stefano Garzarella; +Cc: Ju Hyung Park, Jens Axboe, io-uring, qemu-devel

Stefano Garzarella <[email protected]> wrote on Thu, Oct 1, 2020 at 10:59 AM:
>
> +Cc: [email protected]
>
> Hi,
>
> On Thu, Oct 01, 2020 at 01:26:51AM +0900, Ju Hyung Park wrote:
> > [... original report and backtraces snipped ...]
>
> Please, can you share the qemu command line that you are using?
> This can be useful for the analysis.
>
> Thanks,
> Stefano

I got something similar:

[250145.410520] general protection fault: 0000 [#1] SMP
[250145.410868] CPU: 5 PID: 39269 Comm: qemu-5.0 Kdump: loaded Tainted: G O 5.4.61-pserver #5.4.61-1+develop20200831.1341+7430880~deb10
[250145.411468] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 3.3 02/21/2020
[250145.412051] RIP: 0010:io_cancel_async_work+0x48/0xa0
[250145.412386] Code: fb 48 8d bf f8 02 00 00 48 89 f5 e8 02 f1 69 00 48 8b 83 e8 02 00 00 49 39 c4 74 52 48 8b 83 e8 02 00 00 48 8b 08 48 8b 50 08 <48> 89 51 08 48 89 0a 48 8b 70 f8 48 89 00 48 89 40 08 81 48 a8 00
[250145.413239] RSP: 0018:ffffc2d34efb7cf8 EFLAGS: 00010083
[250145.413576] RAX: ffffec9879729a80 RBX: ffff9fa2740ac400 RCX: 06ffff8000000000
[250145.414153] RDX: ffff9fa2740ac6e8 RSI: ffffec9879729a40 RDI: ffff9fa2740ac6f8
[250145.414729] RBP: ffff9fa1cf86c080 R08: ffff9fa199ccecf8 R09: 8000002f724b7067
[250145.415307] R10: ffffc2d34efb7b8c R11: 0000000000000003 R12: ffff9fa2740ac6e8
[250145.415884] R13: 0000000000000000 R14: 000000000000000d R15: ffff9fa1af48e068
[250145.416461] FS:  0000000000000000(0000) GS:ffff9fa280940000(0000) knlGS:0000000000000000
[250145.417042] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[250145.417380] CR2: 00007f5f14188000 CR3: 0000002f0c408005 CR4: 00000000007626e0
[250145.417957] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[250145.418535] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[250145.419112] PKRU: 55555554
[250145.419438] Call Trace:
[250145.419768]  io_uring_flush+0x34/0x50
[250145.420101]  filp_close+0x31/0x60
[250145.420432]  put_files_struct+0x6c/0xc0
[250145.420767]  do_exit+0x347/0xa50
[250145.421097]  do_group_exit+0x3a/0x90
[250145.421429]  get_signal+0x125/0x7d0
[250145.421761]  do_signal+0x36/0x640
[250145.422090]  ? do_send_sig_info+0x5c/0x90
[250145.422423]  ? recalc_sigpending+0x17/0x50
[250145.422757]  exit_to_usermode_loop+0x61/0xd0
[250145.423090]  do_syscall_64+0xe6/0x120
[250145.423424]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[250145.423761] RIP: 0033:0x7f5f176997bb
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
  2020-10-01  8:59 ` Stefano Garzarella
  2020-10-01 13:47   ` Jack Wang
@ 2020-10-01 14:30   ` Ju Hyung Park
  2020-10-02  7:34     ` Stefano Garzarella
  1 sibling, 1 reply; 11+ messages in thread
From: Ju Hyung Park @ 2020-10-01 14:30 UTC (permalink / raw)
To: Stefano Garzarella; +Cc: io-uring, Jens Axboe, qemu-devel

Hi Stefano,

On Thu, Oct 1, 2020 at 5:59 PM Stefano Garzarella <[email protected]> wrote:
> Please, can you share the qemu command line that you are using?
> This can be useful for the analysis.

Sure.

QEMU:
/usr/bin/qemu-system-x86_64 -name guest=win10,debug-threads=on -S
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-win10/master-key.aes
-blockdev {"driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}
-blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}
-blockdev {"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/win10_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}
-blockdev {"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}
-machine pc-q35-5.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,mem-merge=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format
-cpu Skylake-Client-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,pdpe1gb=on,ibpb=on,amd-ssbd=on,fma=off,avx=off,f16c=off,rdrand=off,bmi1=off,hle=off,avx2=off,bmi2=off,rtm=off,rdseed=off,adx=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vpindex,hv-runtime,hv-synic,hv-stimer,hv-reset
-m 8192 -mem-prealloc -mem-path /dev/hugepages/libvirt/qemu/1-win10
-overcommit mem-lock=off -smp 4,sockets=1,dies=1,cores=2,threads=2
-uuid 7ccc3031-1dab-4267-b72a-d60065b5ff7f -display none
-no-user-config -nodefaults
-chardev socket,id=charmonitor,fd=32,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control
-rtc base=localtime,driftfix=slew
-global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown
-global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1
-boot menu=off,strict=on
-device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1
-device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1
-device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2
-device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3
-device pcie-pci-bridge,id=pci.5,bus=pci.2,addr=0x0
-device qemu-xhci,id=usb,bus=pci.1,addr=0x0
-blockdev {"driver":"host_device","filename":"/dev/disk/by-partuuid/05c3750b-060f-4703-95ea-6f5e546bf6e9","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}
-blockdev {"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","detect-zeroes":"unmap","cache":{"direct":false,"no-flush":true},"driver":"raw","file":"libvirt-1-storage"}
-device virtio-blk-pci,scsi=off,bus=pcie.0,addr=0xa,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on
-netdev tap,fd=34,id=hostnet0
-device e1000,netdev=hostnet0,id=net0,mac=52:54:00:c6:bb:bc,bus=pcie.0,addr=0x3
-device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x4
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0
-device vfio-pci,host=0000:00:02.0,id=hostdev0,bus=pcie.0,addr=0x2,rombar=0
-device virtio-balloon-pci,id=balloon0,bus=pcie.0,addr=0x8
-object rng-random,id=objrng0,filename=/dev/urandom
-device virtio-rng-pci,rng=objrng0,id=rng0,bus=pcie.0,addr=0x9
-msg timestamp=on

And I use libvirt 6.3.0 to manage the VM. Here's the XML of my VM.

<domain type="kvm">
  <name>win10</name>
  <uuid>7ccc3031-1dab-4267-b72a-d60065b5ff7f</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit="KiB">8388608</memory>
  <currentMemory unit="KiB">8388608</currentMemory>
  <memoryBacking>
    <hugepages/>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement="static">4</vcpu>
  <cputune>
    <vcpupin vcpu="0" cpuset="0"/>
    <vcpupin vcpu="1" cpuset="2"/>
    <vcpupin vcpu="2" cpuset="1"/>
    <vcpupin vcpu="3" cpuset="3"/>
  </cputune>
  <os>
    <type arch="x86_64" machine="pc-q35-5.0">hvm</type>
    <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram>
    <boot dev="hd"/>
    <bootmenu enable="no"/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state="on"/>
      <vapic state="on"/>
      <spinlocks state="on" retries="8191"/>
      <vpindex state="on"/>
      <runtime state="on"/>
      <synic state="on"/>
      <stimer state="on"/>
      <reset state="on"/>
    </hyperv>
    <vmport state="off"/>
  </features>
  <cpu mode="host-model" check="partial">
    <topology sockets="1" dies="1" cores="2" threads="2"/>
  </cpu>
  <clock offset="localtime">
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="pit" tickpolicy="delay"/>
    <timer name="hpet" present="no"/>
    <timer name="hypervclock" present="yes"/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled="no"/>
    <suspend-to-disk enabled="no"/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type="block" device="disk">
      <driver name="qemu" type="raw" cache="unsafe" discard="unmap" detect_zeroes="unmap"/>
      <source dev="/dev/disk/by-partuuid/05c3750b-060f-4703-95ea-6f5e546bf6e9"/>
      <target dev="vda" bus="virtio"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x0a" function="0x0"/>
    </disk>
    <controller type="pci" index="0" model="pcie-root"/>
    <controller type="pci" index="1" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="1" port="0x8"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="2" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="2" port="0x9"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x1"/>
    </controller>
    <controller type="pci" index="3" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="3" port="0xa"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x2"/>
    </controller>
    <controller type="pci" index="4" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="4" port="0xb"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x3"/>
    </controller>
    <controller type="pci" index="5" model="pcie-to-pci-bridge">
      <model name="pcie-pci-bridge"/>
      <address type="pci" domain="0x0000" bus="0x02" slot="0x00" function="0x0"/>
    </controller>
    <controller type="usb" index="0" model="qemu-xhci">
      <address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
    </controller>
    <controller type="sata" index="0">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1f" function="0x2"/>
    </controller>
    <interface type="network">
      <mac address="52:54:00:c6:bb:bc"/>
      <source network="default"/>
      <model type="e1000"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x03" function="0x0"/>
    </interface>
    <input type="mouse" bus="ps2"/>
    <input type="keyboard" bus="ps2"/>
    <sound model="ich9">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x04" function="0x0"/>
    </sound>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x00" slot="0x02" function="0x0"/>
      </source>
      <rom bar="off"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x0"/>
    </hostdev>
    <memballoon model="virtio">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x08" function="0x0"/>
    </memballoon>
    <rng model="virtio">
      <backend model="random">/dev/urandom</backend>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x09" function="0x0"/>
    </rng>
  </devices>
</domain>
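An aside not from this thread (an assumption on my part): libvirt exposes the host AIO backend for a disk through the `io` attribute on the `<driver>` element, so a domain like the one above can be pinned to the thread-pool backend instead of letting QEMU pick. Note this controls disk AIO only; it may not affect QEMU's separate io_uring-based fd monitoring. A sketch of the relevant fragment:

```xml
<disk type="block" device="disk">
  <!-- io="threads" selects the thread-pool AIO backend; "native" and
       (with libvirt >= 6.3 / QEMU >= 5.0) "io_uring" are the other values -->
  <driver name="qemu" type="raw" cache="unsafe" discard="unmap"
          detect_zeroes="unmap" io="threads"/>
  <source dev="/dev/disk/by-partuuid/05c3750b-060f-4703-95ea-6f5e546bf6e9"/>
  <target dev="vda" bus="virtio"/>
</disk>
```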
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y)
  2020-10-01 14:30   ` Ju Hyung Park
@ 2020-10-02  7:34     ` Stefano Garzarella
  2020-10-16 18:04       ` Ju Hyung Park
  0 siblings, 1 reply; 11+ messages in thread
From: Stefano Garzarella @ 2020-10-02  7:34 UTC (permalink / raw)
To: Ju Hyung Park; +Cc: io-uring, Jens Axboe, qemu-devel

Hi Ju,

On Thu, Oct 01, 2020 at 11:30:14PM +0900, Ju Hyung Park wrote:
> Hi Stefano,
>
> On Thu, Oct 1, 2020 at 5:59 PM Stefano Garzarella <[email protected]> wrote:
> > Please, can you share the qemu command line that you are using?
> > This can be useful for the analysis.
>
> Sure.

Thanks for sharing.

The issue seems related to io_uring and the new io_uring fd monitoring
implementation available from QEMU 5.0.
I'll try to reproduce.

For now, as a workaround, you can rebuild qemu by disabling io-uring
support:

    ../configure --disable-linux-io-uring ...

Thanks,
Stefano

> [... quoted QEMU command line and domain XML snipped ...]
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y) 2020-10-02 7:34 ` Stefano Garzarella @ 2020-10-16 18:04 ` Ju Hyung Park 2020-10-16 18:07 ` Jens Axboe 0 siblings, 1 reply; 11+ messages in thread From: Ju Hyung Park @ 2020-10-16 18:04 UTC (permalink / raw) To: Jens Axboe, Stefano Garzarella; +Cc: io-uring, qemu-devel A small update: As per Stefano's suggestion, disabling io_uring support in QEMU at the configuration step did fix the problem and I'm no longer having hangs. Looks like it __is__ an io_uring issue :( Btw, I used liburing fe50048 for linking QEMU. Thanks. On Fri, Oct 2, 2020 at 4:35 PM Stefano Garzarella <[email protected]> wrote: > > Hi Ju, > > On Thu, Oct 01, 2020 at 11:30:14PM +0900, Ju Hyung Park wrote: > > Hi Stefano, > > > > On Thu, Oct 1, 2020 at 5:59 PM Stefano Garzarella <[email protected]> wrote: > > > Please, can you share the qemu command line that you are using? > > > This can be useful for the analysis. > > > > Sure. > > Thanks for sharing. > > The issue seems related to io_uring and the new io_uring fd monitoring > implementation available from QEMU 5.0. > > I'll try to reproduce. > > For now, as a workaround, you can rebuild QEMU by disabling io_uring support: > > ../configure --disable-linux-io-uring ... 
> > > Thanks, > Stefano > > > > > QEMU: > > /usr/bin/qemu-system-x86_64 -name guest=win10,debug-threads=on -S > > -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-win10/master-key.aes > > -blockdev {"driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"} > > -blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"} > > -blockdev {"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/win10_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"} > > -blockdev {"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"} > > -machine pc-q35-5.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,mem-merge=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format > > -cpu Skylake-Client-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,pdpe1gb=on,ibpb=on,amd-ssbd=on,fma=off,avx=off,f16c=off,rdrand=off,bmi1=off,hle=off,avx2=off,bmi2=off,rtm=off,rdseed=off,adx=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vpindex,hv-runtime,hv-synic,hv-stimer,hv-reset > > -m 8192 -mem-prealloc -mem-path /dev/hugepages/libvirt/qemu/1-win10 > > -overcommit mem-lock=off -smp 4,sockets=1,dies=1,cores=2,threads=2 > > -uuid 7ccc3031-1dab-4267-b72a-d60065b5ff7f -display none > > -no-user-config -nodefaults -chardev > > socket,id=charmonitor,fd=32,server,nowait -mon > > chardev=charmonitor,id=monitor,mode=control -rtc > > base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=delay > > -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global > > ICH9-LPC.disable_s4=1 -boot menu=off,strict=on -device > > pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 > > -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 
> > -device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 > > -device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 > > -device pcie-pci-bridge,id=pci.5,bus=pci.2,addr=0x0 -device > > qemu-xhci,id=usb,bus=pci.1,addr=0x0 -blockdev > > {"driver":"host_device","filename":"/dev/disk/by-partuuid/05c3750b-060f-4703-95ea-6f5e546bf6e9","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"} > > -blockdev {"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","detect-zeroes":"unmap","cache":{"direct":false,"no-flush":true},"driver":"raw","file":"libvirt-1-storage"} > > -device virtio-blk-pci,scsi=off,bus=pcie.0,addr=0xa,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on > > -netdev tap,fd=34,id=hostnet0 -device > > e1000,netdev=hostnet0,id=net0,mac=52:54:00:c6:bb:bc,bus=pcie.0,addr=0x3 > > -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x4 -device > > hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device > > vfio-pci,host=0000:00:02.0,id=hostdev0,bus=pcie.0,addr=0x2,rombar=0 > > -device virtio-balloon-pci,id=balloon0,bus=pcie.0,addr=0x8 -object > > rng-random,id=objrng0,filename=/dev/urandom -device > > virtio-rng-pci,rng=objrng0,id=rng0,bus=pcie.0,addr=0x9 -msg > > timestamp=on > > > > And I use libvirt 6.3.0 to manage the VM. Here's an xml of my VM. 
> > > > <domain type="kvm"> > > <name>win10</name> > > <uuid>7ccc3031-1dab-4267-b72a-d60065b5ff7f</uuid> > > <metadata> > > <libosinfo:libosinfo > > xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0"> > > <libosinfo:os id="http://microsoft.com/win/10"/> > > </libosinfo:libosinfo> > > </metadata> > > <memory unit="KiB">8388608</memory> > > <currentMemory unit="KiB">8388608</currentMemory> > > <memoryBacking> > > <hugepages/> > > <nosharepages/> > > </memoryBacking> > > <vcpu placement="static">4</vcpu> > > <cputune> > > <vcpupin vcpu="0" cpuset="0"/> > > <vcpupin vcpu="1" cpuset="2"/> > > <vcpupin vcpu="2" cpuset="1"/> > > <vcpupin vcpu="3" cpuset="3"/> > > </cputune> > > <os> > > <type arch="x86_64" machine="pc-q35-5.0">hvm</type> > > <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.fd</loader> > > <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram> > > <boot dev="hd"/> > > <bootmenu enable="no"/> > > </os> > > <features> > > <acpi/> > > <apic/> > > <hyperv> > > <relaxed state="on"/> > > <vapic state="on"/> > > <spinlocks state="on" retries="8191"/> > > <vpindex state="on"/> > > <runtime state="on"/> > > <synic state="on"/> > > <stimer state="on"/> > > <reset state="on"/> > > </hyperv> > > <vmport state="off"/> > > </features> > > <cpu mode="host-model" check="partial"> > > <topology sockets="1" dies="1" cores="2" threads="2"/> > > </cpu> > > <clock offset="localtime"> > > <timer name="rtc" tickpolicy="catchup"/> > > <timer name="pit" tickpolicy="delay"/> > > <timer name="hpet" present="no"/> > > <timer name="hypervclock" present="yes"/> > > </clock> > > <on_poweroff>destroy</on_poweroff> > > <on_reboot>restart</on_reboot> > > <on_crash>destroy</on_crash> > > <pm> > > <suspend-to-mem enabled="no"/> > > <suspend-to-disk enabled="no"/> > > </pm> > > <devices> > > <emulator>/usr/bin/qemu-system-x86_64</emulator> > > <disk type="block" device="disk"> > > <driver name="qemu" type="raw" cache="unsafe" discard="unmap" > > 
detect_zeroes="unmap"/> > > <source dev="/dev/disk/by-partuuid/05c3750b-060f-4703-95ea-6f5e546bf6e9"/> > > <target dev="vda" bus="virtio"/> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x0a" > > function="0x0"/> > > </disk> > > <controller type="pci" index="0" model="pcie-root"/> > > <controller type="pci" index="1" model="pcie-root-port"> > > <model name="pcie-root-port"/> > > <target chassis="1" port="0x8"/> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x01" > > function="0x0" multifunction="on"/> > > </controller> > > <controller type="pci" index="2" model="pcie-root-port"> > > <model name="pcie-root-port"/> > > <target chassis="2" port="0x9"/> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x01" > > function="0x1"/> > > </controller> > > <controller type="pci" index="3" model="pcie-root-port"> > > <model name="pcie-root-port"/> > > <target chassis="3" port="0xa"/> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x01" > > function="0x2"/> > > </controller> > > <controller type="pci" index="4" model="pcie-root-port"> > > <model name="pcie-root-port"/> > > <target chassis="4" port="0xb"/> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x01" > > function="0x3"/> > > </controller> > > <controller type="pci" index="5" model="pcie-to-pci-bridge"> > > <model name="pcie-pci-bridge"/> > > <address type="pci" domain="0x0000" bus="0x02" slot="0x00" > > function="0x0"/> > > </controller> > > <controller type="usb" index="0" model="qemu-xhci"> > > <address type="pci" domain="0x0000" bus="0x01" slot="0x00" > > function="0x0"/> > > </controller> > > <controller type="sata" index="0"> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x1f" > > function="0x2"/> > > </controller> > > <interface type="network"> > > <mac address="52:54:00:c6:bb:bc"/> > > <source network="default"/> > > <model type="e1000"/> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x03" > > function="0x0"/> > > </interface> > > <input 
type="mouse" bus="ps2"/> > > <input type="keyboard" bus="ps2"/> > > <sound model="ich9"> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x04" > > function="0x0"/> > > </sound> > > <hostdev mode="subsystem" type="pci" managed="yes"> > > <source> > > <address domain="0x0000" bus="0x00" slot="0x02" function="0x0"/> > > </source> > > <rom bar="off"/> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x02" > > function="0x0"/> > > </hostdev> > > <memballoon model="virtio"> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x08" > > function="0x0"/> > > </memballoon> > > <rng model="virtio"> > > <backend model="random">/dev/urandom</backend> > > <address type="pci" domain="0x0000" bus="0x00" slot="0x09" > > function="0x0"/> > > </rng> > > </devices> > > </domain> > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
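[Editorial note, not part of the thread: a quick way to check whether a given QEMU binary picked up io_uring support — or, after rebuilding with --disable-linux-io-uring, to confirm it no longer did — is to look for liburing in its dynamic linkage, as Ju does later in the thread. A minimal sketch; the sample ldd line below is hypothetical, for illustration only:]

```shell
# Sketch: detect liburing linkage in a QEMU binary.
# On a real host you would run:
#   ldd /usr/bin/qemu-system-x86_64 | grep liburing
# sample_ldd stands in for that output here (hypothetical value).
sample_ldd='liburing.so.1 => /usr/lib/x86_64-linux-gnu/liburing.so.1'
case "$sample_ldd" in
  *liburing.so*) echo "QEMU is linked against liburing (io_uring enabled)" ;;
  *)             echo "no liburing linkage found" ;;
esac
```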
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y) 2020-10-16 18:04 ` Ju Hyung Park @ 2020-10-16 18:07 ` Jens Axboe 2020-10-17 14:29 ` Ju Hyung Park 0 siblings, 1 reply; 11+ messages in thread From: Jens Axboe @ 2020-10-16 18:07 UTC (permalink / raw) To: Ju Hyung Park, Stefano Garzarella; +Cc: io-uring, qemu-devel On 10/16/20 12:04 PM, Ju Hyung Park wrote: > A small update: > > As per Stefano's suggestion, disabling io_uring support in QEMU at > the configuration step did fix the problem and I'm no longer having > hangs. > > Looks like it __is__ an io_uring issue :( Would be great if you could try 5.4.71 and see if that helps for your issue. -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
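[Editorial note, not part of the thread: whether a host carries the fixes comes down to whether its stable kernel is at least 5.4.71. A small sketch of comparing version strings with `sort -V`; on a live host you would feed it `uname -r` (stripped of local suffixes such as the "+" in 5.4.63+ above):]

```shell
# Sketch: check whether a kernel version string is at least 5.4.71,
# the stable release said in the thread to carry the io_uring fixes.
# ver_ge A B succeeds when A >= B in version ordering.
ver_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

for v in 5.4.63 5.4.71 5.9.0; do
  if ver_ge "$v" 5.4.71; then
    echo "$v: has the fixes"
  else
    echo "$v: predates 5.4.71"
  fi
done
# -> 5.4.63: predates 5.4.71
#    5.4.71: has the fixes
#    5.9.0: has the fixes
```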
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y) 2020-10-16 18:07 ` Jens Axboe @ 2020-10-17 14:29 ` Ju Hyung Park 2020-10-17 15:02 ` Jens Axboe 2020-10-19 9:22 ` Pankaj Gupta 0 siblings, 2 replies; 11+ messages in thread From: Ju Hyung Park @ 2020-10-17 14:29 UTC (permalink / raw) To: Jens Axboe; +Cc: Stefano Garzarella, io-uring, qemu-devel Hi Jens. On Sat, Oct 17, 2020 at 3:07 AM Jens Axboe <[email protected]> wrote: > > Would be great if you could try 5.4.71 and see if that helps for your > issue. > Oh wow, yeah it did fix the issue. I'm able to reliably turn off and start the VM multiple times in a row. Double checked by confirming QEMU is dynamically linked to liburing.so.1. Looks like those 4 io_uring fixes helped. Thanks! ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y) 2020-10-17 14:29 ` Ju Hyung Park @ 2020-10-17 15:02 ` Jens Axboe 2020-10-19 9:22 ` Pankaj Gupta 1 sibling, 0 replies; 11+ messages in thread From: Jens Axboe @ 2020-10-17 15:02 UTC (permalink / raw) To: Ju Hyung Park; +Cc: Stefano Garzarella, io-uring, qemu-devel On 10/17/20 8:29 AM, Ju Hyung Park wrote: > Hi Jens. > > On Sat, Oct 17, 2020 at 3:07 AM Jens Axboe <[email protected]> wrote: >> >> Would be great if you could try 5.4.71 and see if that helps for your >> issue. >> > > Oh wow, yeah it did fix the issue. > > I'm able to reliably turn off and start the VM multiple times in a row. > Double checked by confirming QEMU is dynamically linked to liburing.so.1. > > Looks like those 4 io_uring fixes helped. Awesome, thanks for testing! -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: io_uring possibly the culprit for qemu hang (linux-5.4.y) 2020-10-17 14:29 ` Ju Hyung Park 2020-10-17 15:02 ` Jens Axboe @ 2020-10-19 9:22 ` Pankaj Gupta 1 sibling, 0 replies; 11+ messages in thread From: Pankaj Gupta @ 2020-10-19 9:22 UTC (permalink / raw) To: Jack Wang Cc: Jens Axboe, Qemu Developers, io-uring, Stefano Garzarella, Ju Hyung Park @Jack Wang, Maybe the four io_uring patches in 5.4.71 fix the issue for you as well? Thanks, Pankaj > Hi Jens. > > On Sat, Oct 17, 2020 at 3:07 AM Jens Axboe <[email protected]> wrote: > > > > Would be great if you could try 5.4.71 and see if that helps for your > > issue. > > > > Oh wow, yeah it did fix the issue. > > I'm able to reliably turn off and start the VM multiple times in a row. > Double checked by confirming QEMU is dynamically linked to liburing.so.1. > > Looks like those 4 io_uring fixes helped. > > Thanks! > ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads: [~2020-10-19 9:23 UTC | newest]

Thread overview: 11+ messages (links below jump to the message on this page):
2020-09-30 16:26 io_uring possibly the culprit for qemu hang (linux-5.4.y) Ju Hyung Park
2020-10-01  3:03 ` Jens Axboe
2020-10-01  8:59 ` Stefano Garzarella
2020-10-01 13:47 ` Jack Wang
2020-10-01 14:30 ` Ju Hyung Park
2020-10-02  7:34 ` Stefano Garzarella
2020-10-16 18:04 ` Ju Hyung Park
2020-10-16 18:07 ` Jens Axboe
2020-10-17 14:29 ` Ju Hyung Park
2020-10-17 15:02 ` Jens Axboe
2020-10-19  9:22 ` Pankaj Gupta