* Stack trace with Samba VFS io_uring and large transfers
@ 2021-07-24 15:46 Forza
2021-07-24 17:04 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Forza @ 2021-07-24 15:46 UTC (permalink / raw)
To: io-uring
Hi!
I am getting a stack trace and broken/stalled transfers with Samba when I
have "vfs objects = io_uring" in smb.conf.
Using Samba 4.13.9, the problem occurs on at least kernels 5.12.17-5.12.19
and 5.13.4. I haven't tried earlier kernels.
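For reference, a minimal share definition with the module enabled looks
roughly like the sketch below; the share name and path are placeholders
rather than the exact production config, and the raised log level matches
the debugging setting mentioned further down:

[global]
        # raised for debugging only (see the log excerpts below)
        log level = 3

[media]
        # placeholder path; the real share lives elsewhere
        path = /srv/media
        read only = no
        # the io_uring VFS module that triggers the problem
        vfs objects = io_uring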
[52928.008736] ------------[ cut here ]------------
[52928.008740] WARNING: CPU: 3 PID: 2173 at fs/io_uring.c:8749
io_ring_exit_work+0xbd/0x5f0
[52928.008753] Modules linked in: nf_conntrack_netlink vhost_net vhost
vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_state ip6t_REJECT
nf_reject_ipv6 ip6table_filter ip6table_raw ip6table_mangle ip6table_nat
xt_multiport xt_limit ipt_REJECT nf_reject_ipv4 xt_NFLOG nfnetlink_log
xt_conntrack xt_physdev iptable_filter iptable_mangle xt_nat iptable_nat
xt_CT iptable_raw ip_set_bitmap_port ip_set_hash_ip ip_set_hash_net
xt_esp ipt_ah nft_xfrm nf_tables nf_nat_pptp nf_conntrack_pptp nf_nat
nf_conntrack_sip nf_conntrack_irc nf_conntrack_ftp nf_conntrack_h323
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_bridge
nf_conntrack nf_defrag_ipv6 ip6_tables ip_tables xt_recent xt_set ip_set
nfnetlink nf_defrag_ipv4 nf_socket_ipv4 binfmt_misc amdgpu kvm_amd
drm_ttm_helper kvm ttm gpu_sched k10temp uas irqbypass 8250 8250_base
serial_mctrl_gpio video serial_core backlight pinctrl_amd
[52928.008821] CPU: 3 PID: 2173 Comm: kworker/u32:4 Not tainted
5.13.4-gentoo-e350 #1
[52928.008826] Hardware name: Gigabyte Technology Co., Ltd. B450M
DS3H/B450M DS3H-CF, BIOS F61c 05/10/2021
[52928.008829] Workqueue: events_unbound io_ring_exit_work
[52928.008835] RIP: 0010:io_ring_exit_work+0xbd/0x5f0
[52928.008841] Code: 00 00 00 4c 89 fa 48 c7 c6 b0 7e 30 83 e8 5b 7f 00
00 4c 89 f7 e8 43 95 ff ff 48 8b 05 9c 75 0f 01 49 39 c5 0f 89 7b ff ff
ff <0f> 0b e9 74 ff ff ff 48 c7 c2 c8 b0 af 84 48 c7 c6 be 99 1c 84 48
[52928.008845] RSP: 0018:ffffa52294007df0 EFLAGS: 00010293
[52928.008849] RAX: 0000000103237dd4 RBX: ffff89359ef92600 RCX:
0000000000000008
[52928.008852] RDX: ffff89359ef92080 RSI: 0000000000000000 RDI:
ffff89359ef924c0
[52928.008855] RBP: ffffa52294007e90 R08: 0000000000000001 R09:
0000000000000000
[52928.008857] R10: 0000000000000000 R11: 0000000000000003 R12:
ffff89359ef925d0
[52928.008858] R13: 0000000103237dc0 R14: 0000000000000000 R15:
ffff89359ef92000
[52928.008861] FS: 0000000000000000(0000) GS:ffff8937a6ec0000(0000)
knlGS:0000000000000000
[52928.008864] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[52928.008867] CR2: 00007f13f7904001 CR3: 00000001b62be000 CR4:
00000000003506e0
[52928.008871] Call Trace:
[52928.008876] ? __switch_to+0x23b/0x420
[52928.008882] process_one_work+0x1af/0x300
[52928.008889] worker_thread+0x48/0x3c0
[52928.008894] ? process_one_work+0x300/0x300
[52928.008899] kthread+0x122/0x140
[52928.008903] ? set_kthread_struct+0x30/0x30
[52928.008908] ret_from_fork+0x22/0x30
[52928.008914] ---[ end trace 6a275af934ed94fd ]---
I set "log level = 3" in smb.conf and captured the results:
Windows client: https://paste.tnonline.net/files/1U7Do7BKokPa_smbd-log.txt
Linux client: https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
[2021/07/24 16:46:49.793735, 3]
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
smbd_smb2_read: fnum 4007995332, file
media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1282408448
read=1048576
[2021/07/24 16:46:49.803693, 3]
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
smbd_smb2_read: fnum 4007995332, file
media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1283457024
read=1048576
[2021/07/24 16:46:49.811478, 3]
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
smbd_smb2_read: fnum 4007995332, file
media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1284505600
read=1048576
... at this point it just stops adding to the log file, and the client
eventually times out waiting for more data. There is no apparent error
message in any of the Samba log files.
The stack trace "events_unbound io_ring_exit_work" only seems to occur
once and then doesn't come back in subsequent testing, even if I restart
Samba itself. It comes back after a reboot.
This happens every time I have the io_uring vfs module enabled and never
otherwise. It seems to be only large (several GB) file transfers that
are affected.
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-24 15:46 Stack trace with Samba VFS io_uring and large transfers Forza
@ 2021-07-24 17:04 ` Jens Axboe
2021-07-24 18:23 ` Forza
0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-07-24 17:04 UTC (permalink / raw)
To: Forza, io-uring
On 7/24/21 9:46 AM, Forza wrote:
> Hi!
>
> I am getting a stack trace and broken/stalled transfers with Samba when I
> have "vfs objects = io_uring" in smb.conf.
>
> Using Samba-4.13.9, the problem occurs in at least kernel-5.12.17-19 and
> 5.13.4. I haven't tried earlier kernels.
>
>
> [52928.008736] ------------[ cut here ]------------
> [52928.008740] WARNING: CPU: 3 PID: 2173 at fs/io_uring.c:8749
> io_ring_exit_work+0xbd/0x5f0
> [52928.008753] Modules linked in: nf_conntrack_netlink vhost_net vhost
> vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_state ip6t_REJECT
> nf_reject_ipv6 ip6table_filter ip6table_raw ip6table_mangle ip6table_nat
> xt_multiport xt_limit ipt_REJECT nf_reject_ipv4 xt_NFLOG nfnetlink_log
> xt_conntrack xt_physdev iptable_filter iptable_mangle xt_nat iptable_nat
> xt_CT iptable_raw ip_set_bitmap_port ip_set_hash_ip ip_set_hash_net
> xt_esp ipt_ah nft_xfrm nf_tables nf_nat_pptp nf_conntrack_pptp nf_nat
> nf_conntrack_sip nf_conntrack_irc nf_conntrack_ftp nf_conntrack_h323
> nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_bridge
> nf_conntrack nf_defrag_ipv6 ip6_tables ip_tables xt_recent xt_set ip_set
> nfnetlink nf_defrag_ipv4 nf_socket_ipv4 binfmt_misc amdgpu kvm_amd
> drm_ttm_helper kvm ttm gpu_sched k10temp uas irqbypass 8250 8250_base
> serial_mctrl_gpio video serial_core backlight pinctrl_amd
> [52928.008821] CPU: 3 PID: 2173 Comm: kworker/u32:4 Not tainted
> 5.13.4-gentoo-e350 #1
> [52928.008826] Hardware name: Gigabyte Technology Co., Ltd. B450M
> DS3H/B450M DS3H-CF, BIOS F61c 05/10/2021
> [52928.008829] Workqueue: events_unbound io_ring_exit_work
> [52928.008835] RIP: 0010:io_ring_exit_work+0xbd/0x5f0
> [52928.008841] Code: 00 00 00 4c 89 fa 48 c7 c6 b0 7e 30 83 e8 5b 7f 00
> 00 4c 89 f7 e8 43 95 ff ff 48 8b 05 9c 75 0f 01 49 39 c5 0f 89 7b ff ff
> ff <0f> 0b e9 74 ff ff ff 48 c7 c2 c8 b0 af 84 48 c7 c6 be 99 1c 84 48
> [52928.008845] RSP: 0018:ffffa52294007df0 EFLAGS: 00010293
> [52928.008849] RAX: 0000000103237dd4 RBX: ffff89359ef92600 RCX:
> 0000000000000008
> [52928.008852] RDX: ffff89359ef92080 RSI: 0000000000000000 RDI:
> ffff89359ef924c0
> [52928.008855] RBP: ffffa52294007e90 R08: 0000000000000001 R09:
> 0000000000000000
> [52928.008857] R10: 0000000000000000 R11: 0000000000000003 R12:
> ffff89359ef925d0
> [52928.008858] R13: 0000000103237dc0 R14: 0000000000000000 R15:
> ffff89359ef92000
> [52928.008861] FS: 0000000000000000(0000) GS:ffff8937a6ec0000(0000)
> knlGS:0000000000000000
> [52928.008864] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [52928.008867] CR2: 00007f13f7904001 CR3: 00000001b62be000 CR4:
> 00000000003506e0
> [52928.008871] Call Trace:
> [52928.008876] ? __switch_to+0x23b/0x420
> [52928.008882] process_one_work+0x1af/0x300
> [52928.008889] worker_thread+0x48/0x3c0
> [52928.008894] ? process_one_work+0x300/0x300
> [52928.008899] kthread+0x122/0x140
> [52928.008903] ? set_kthread_struct+0x30/0x30
> [52928.008908] ret_from_fork+0x22/0x30
> [52928.008914] ---[ end trace 6a275af934ed94fd ]---
>
>
> I set "log level = 3" in smb.conf and captured the results:
> Windows client: https://paste.tnonline.net/files/1U7Do7BKokPa_smbd-log.txt
> Linux client: https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>
>
> [2021/07/24 16:46:49.793735, 3]
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
> smbd_smb2_read: fnum 4007995332, file
> media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1282408448
> read=1048576
> [2021/07/24 16:46:49.803693, 3]
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
> smbd_smb2_read: fnum 4007995332, file
> media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1283457024
> read=1048576
> [2021/07/24 16:46:49.811478, 3]
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
> smbd_smb2_read: fnum 4007995332, file
> media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1284505600
> read=1048576
>
> ... here it just stops adding to the log file and the client eventually
> times out waiting for more data. No apparent error message in any of the
> samba log files.
>
> The stack trace "events_unbound io_ring_exit_work" only seems to occur
> once and then doesn't come back in subsequent testing, even if I restart
> Samba itself. It comes back after a reboot.
>
> This happens every time I have the io_uring vfs module enabled and never
> otherwise. It seems to be only large (several GB) file transfers that
> are affected.
I'll see if I can reproduce this. I'm assuming samba is using buffered
IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
possible to reproduce without samba with a windows client, as I don't
have any of those. If synthetic reproducing fails, I can try samba
with a Linux client.
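A minimal liburing sketch of such a synthetic reproducer might look like
the following; it issues buffered, sequential 1MB reads over a large file,
with the file path and queue depth as placeholders. It only approximates
the access pattern seen in the smbd logs and may or may not be enough to
trigger the warning:

/* Sketch of a synthetic reproducer: buffered, sequential 1MB reads of a
 * large file via io_uring. File path and queue depth are placeholders;
 * this only approximates the smbd access pattern. */
#include <fcntl.h>
#include <stdlib.h>
#include <liburing.h>

#define CHUNK	(1024 * 1024)	/* 1MB reads, as seen in the smbd logs */
#define QD	8		/* arbitrary queue depth for the test */

int main(int argc, char *argv[])
{
	struct io_uring ring;
	char *buf[QD];
	off_t off = 0;
	int fd, i;

	/* plain open, no O_DIRECT, so the reads go through the page cache */
	fd = open(argc > 1 ? argv[1] : "testfile.img", O_RDONLY);
	if (fd < 0 || io_uring_queue_init(QD, &ring, 0) < 0)
		return 1;
	for (i = 0; i < QD; i++)
		buf[i] = malloc(CHUNK);

	for (;;) {
		struct io_uring_cqe *cqe;
		int inflight;

		/* queue a batch of sequential 1MB reads */
		for (i = 0; i < QD; i++, off += CHUNK) {
			struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

			io_uring_prep_read(sqe, fd, buf[i], CHUNK, off);
		}
		io_uring_submit(&ring);

		/* reap the batch; stop at EOF or on error */
		for (inflight = QD; inflight; inflight--) {
			if (io_uring_wait_cqe(&ring, &cqe))
				return 1;
			if (cqe->res <= 0) {
				io_uring_cqe_seen(&ring, cqe);
				return cqe->res < 0;
			}
			io_uring_cqe_seen(&ring, cqe);
		}
	}
}

Run against a multi-GB file on the affected filesystem, this would be the
rough local equivalent of the stalled SMB transfer described above.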
--
Jens Axboe
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-24 17:04 ` Jens Axboe
@ 2021-07-24 18:23 ` Forza
2021-07-24 19:44 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Forza @ 2021-07-24 18:23 UTC (permalink / raw)
To: Jens Axboe, io-uring
Hi!
On 2021-07-24 19:04, Jens Axboe wrote:
> I'll see if I can reproduce this. I'm assuming samba is using buffered
> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
> possible to reproduce without samba with a windows client, as I don't
> have any of those. If synthetic reproducing fails, I can try samba
> with a Linux client.
I attached the logs from both a Windows 10 client and a Linux client
(kernel 5.11.0).
https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
smbd_smb2_read: fnum 2641229669, file
media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
offset=736100352 read=4194304
[2021/07/24 17:26:09.120779, 3]
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
smbd_smb2_read: fnum 2641229669, file
media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
offset=740294656 read=4194304
[2021/07/24 17:26:09.226593, 3]
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
smbd_smb2_read: fnum 2641229669, file
media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
offset=748683264 read=4194304
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-24 18:23 ` Forza
@ 2021-07-24 19:44 ` Jens Axboe
2021-07-24 19:51 ` Forza
0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-07-24 19:44 UTC (permalink / raw)
To: Forza, io-uring
On 7/24/21 12:23 PM, Forza wrote:
> Hi!
>
> On 2021-07-24 19:04, Jens Axboe wrote:
>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>> possible to reproduce without samba with a windows client, as I don't
>> have any of those. If synthetic reproducing fails, I can try samba
>> with a Linux client.
>
> I attached the logs from both a Windows 10 client and a Linux client
> (kernel 5.11.0).
>
> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>
> smbd_smb2_read: fnum 2641229669, file
> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
> offset=736100352 read=4194304
> [2021/07/24 17:26:09.120779, 3]
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
> smbd_smb2_read: fnum 2641229669, file
> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
> offset=740294656 read=4194304
> [2021/07/24 17:26:09.226593, 3]
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
> smbd_smb2_read: fnum 2641229669, file
> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
> offset=748683264 read=4194304
Thanks, this is useful. Before I try and reproduce it, what is the
filesystem that is hosting the samba mount?
--
Jens Axboe
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-24 19:44 ` Jens Axboe
@ 2021-07-24 19:51 ` Forza
2021-07-26 16:07 ` Jens Axboe
2021-07-28 0:16 ` Stefan Metzmacher
0 siblings, 2 replies; 12+ messages in thread
From: Forza @ 2021-07-24 19:51 UTC (permalink / raw)
To: Jens Axboe, io-uring
On 2021-07-24 21:44, Jens Axboe wrote:
> On 7/24/21 12:23 PM, Forza wrote:
>> Hi!
>>
>> On 2021-07-24 19:04, Jens Axboe wrote:
>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>> possible to reproduce without samba with a windows client, as I don't
>>> have any of those. If synthetic reproducing fails, I can try samba
>>> with a Linux client.
>>
>> I attached the logs from both a Windows 10 client and a Linux client
>> (kernel 5.11.0).
>>
>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>
>> smbd_smb2_read: fnum 2641229669, file
>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>> offset=736100352 read=4194304
>> [2021/07/24 17:26:09.120779, 3]
>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>> smbd_smb2_read: fnum 2641229669, file
>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>> offset=740294656 read=4194304
>> [2021/07/24 17:26:09.226593, 3]
>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>> smbd_smb2_read: fnum 2641229669, file
>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>> offset=748683264 read=4194304
>
> Thanks, this is useful. Before I try and reproduce it, what is the
> filesystem that is hosting the samba mount?
>
I am using Btrfs.
My testing was done by exporting the share with
vfs objects = io_uring
vfs objects = btrfs, io_uring
Same results in both cases. Exporting with "vfs objects = btrfs" (no
io_uring) works as expected.
Thanks.
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-24 19:51 ` Forza
@ 2021-07-26 16:07 ` Jens Axboe
2021-07-26 16:35 ` Jens Axboe
2021-07-28 0:16 ` Stefan Metzmacher
1 sibling, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-07-26 16:07 UTC (permalink / raw)
To: Forza, io-uring
On 7/24/21 1:51 PM, Forza wrote:
>
>
> On 2021-07-24 21:44, Jens Axboe wrote:
>> On 7/24/21 12:23 PM, Forza wrote:
>>> Hi!
>>>
>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>> possible to reproduce without samba with a windows client, as I don't
>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>> with a Linux client.
>>>
>>> I attached the logs from both a Windows 10 client and a Linux client
>>> (kernel 5.11.0).
>>>
>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>
>>> smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=736100352 read=4194304
>>> [2021/07/24 17:26:09.120779, 3]
>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>> smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=740294656 read=4194304
>>> [2021/07/24 17:26:09.226593, 3]
>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>> smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=748683264 read=4194304
>>
>> Thanks, this is useful. Before I try and reproduce it, what is the
>> filesystem that is hosting the samba mount?
>>
>
> I am using Btrfs.
>
> My testing was done by exporting the share with
>
> vfs objects = io_uring
> vfs objects = btrfs, io_uring
>
> Same results in both cases. Exporting with "vfs objects = btrfs" (no
> io_uring) works as expected.
Seems to be specific to btrfs, I can reproduce it here. I'll dive in
and see what I can find.
--
Jens Axboe
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-26 16:07 ` Jens Axboe
@ 2021-07-26 16:35 ` Jens Axboe
2021-07-28 10:02 ` Forza
0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-07-26 16:35 UTC (permalink / raw)
To: Forza, io-uring
On 7/26/21 10:07 AM, Jens Axboe wrote:
> On 7/24/21 1:51 PM, Forza wrote:
>>
>>
>> On 2021-07-24 21:44, Jens Axboe wrote:
>>> On 7/24/21 12:23 PM, Forza wrote:
>>>> Hi!
>>>>
>>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>>> possible to reproduce without samba with a windows client, as I don't
>>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>>> with a Linux client.
>>>>
>>>> I attached the logs from both a Windows 10 client and a Linux client
>>>> (kernel 5.11.0).
>>>>
>>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>>
>>>> smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=736100352 read=4194304
>>>> [2021/07/24 17:26:09.120779, 3]
>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>> smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=740294656 read=4194304
>>>> [2021/07/24 17:26:09.226593, 3]
>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>> smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=748683264 read=4194304
>>>
>>> Thanks, this is useful. Before I try and reproduce it, what is the
>>> filesystem that is hosting the samba mount?
>>>
>>
>> I am using Btrfs.
>>
>> My testing was done by exporting the share with
>>
>> vfs objects = io_uring
>> vfs objects = btrfs, io_uring
>>
>> Same results in both cases. Exporting with "vfs objects = btrfs" (no
>> io_uring) works as expected.
>
> Seems to be specific to btrfs, I can reproduce it here. I'll dive in
> and see what I can find.
This looks like a race in dealing with the task_work running. At least
this on top of current -git closes the gap for me and I can't reproduce
it anymore.
diff --git a/fs/io_uring.c b/fs/io_uring.c
index c4d2b320cdd4..998a01cbc00f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1959,9 +1959,13 @@ static void tctx_task_work(struct callback_head *cb)
 			node = next;
 		}
 		if (wq_list_empty(&tctx->task_list)) {
+			spin_lock_irq(&tctx->task_lock);
 			clear_bit(0, &tctx->task_state);
-			if (wq_list_empty(&tctx->task_list))
+			if (wq_list_empty(&tctx->task_list)) {
+				spin_unlock_irq(&tctx->task_lock);
 				break;
+			}
+			spin_unlock_irq(&tctx->task_lock);
 			/* another tctx_task_work() is enqueued, yield */
 			if (test_and_set_bit(0, &tctx->task_state))
 				break;
--
Jens Axboe
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-24 19:51 ` Forza
2021-07-26 16:07 ` Jens Axboe
@ 2021-07-28 0:16 ` Stefan Metzmacher
2021-07-28 2:34 ` Jens Axboe
2021-07-28 9:39 ` Forza
1 sibling, 2 replies; 12+ messages in thread
From: Stefan Metzmacher @ 2021-07-28 0:16 UTC (permalink / raw)
To: Forza, Jens Axboe, io-uring
Am 24.07.21 um 21:51 schrieb Forza:
>
>
> On 2021-07-24 21:44, Jens Axboe wrote:
>> On 7/24/21 12:23 PM, Forza wrote:
>>> Hi!
>>>
>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>> possible to reproduce without samba with a windows client, as I don't
>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>> with a Linux client.
>>>
>>> I attached the logs from both a Windows 10 client and a Linux client
>>> (kernel 5.11.0).
>>>
>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>
>>> smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=736100352 read=4194304
>>> [2021/07/24 17:26:09.120779, 3]
>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>> smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=740294656 read=4194304
>>> [2021/07/24 17:26:09.226593, 3]
>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>> smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=748683264 read=4194304
>>
>> Thanks, this is useful. Before I try and reproduce it, what is the
>> filesystem that is hosting the samba mount?
>>
>
> I am using Btrfs.
>
> My testing was done by exporting the share with
>
> vfs objects = io_uring
> vfs objects = btrfs, io_uring
>
> Same results in both cases. Exporting with "vfs objects = btrfs" (no io_uring) works as expected.
I don't think it makes a difference for the current problem, but I guess you want the following order instead:
vfs objects = io_uring, btrfs
metze
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-28 0:16 ` Stefan Metzmacher
@ 2021-07-28 2:34 ` Jens Axboe
2021-07-28 9:39 ` Forza
1 sibling, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2021-07-28 2:34 UTC (permalink / raw)
To: Stefan Metzmacher, Forza, io-uring
On 7/27/21 6:16 PM, Stefan Metzmacher wrote:
>
> Am 24.07.21 um 21:51 schrieb Forza:
>>
>>
>> On 2021-07-24 21:44, Jens Axboe wrote:
>>> On 7/24/21 12:23 PM, Forza wrote:
>>>> Hi!
>>>>
>>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>>> possible to reproduce without samba with a windows client, as I don't
>>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>>> with a Linux client.
>>>>
>>>> I attached the logs from both a Windows 10 client and a Linux client
>>>> (kernel 5.11.0).
>>>>
>>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>>
>>>> smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=736100352 read=4194304
>>>> [2021/07/24 17:26:09.120779, 3]
>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>> smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=740294656 read=4194304
>>>> [2021/07/24 17:26:09.226593, 3]
>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>> smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=748683264 read=4194304
>>>
>>> Thanks, this is useful. Before I try and reproduce it, what is the
>>> filesystem that is hosting the samba mount?
>>>
>>
>> I am using Btrfs.
>>
>> My testing was done by exporting the share with
>>
>> vfs objects = io_uring
>> vfs objects = btrfs, io_uring
>>
>> Same results in both cases. Exporting with "vfs objects = btrfs" (no io_uring) works as expected.
>
> I don't think it makes a difference for the current problem, but I guess you want the following order instead:
>
> vfs objects = io_uring, btrfs
Hopefully the fix I posted takes care of the other part :-)
--
Jens Axboe
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-28 0:16 ` Stefan Metzmacher
2021-07-28 2:34 ` Jens Axboe
@ 2021-07-28 9:39 ` Forza
1 sibling, 0 replies; 12+ messages in thread
From: Forza @ 2021-07-28 9:39 UTC (permalink / raw)
To: Stefan Metzmacher, Jens Axboe, io-uring
On 2021-07-28 02:16, Stefan Metzmacher wrote:
>> I am using Btrfs.
>>
>> My testing was done by exporting the share with
>>
>> vfs objects = io_uring
>> vfs objects = btrfs, io_uring
>>
>> Same results in both cases. Exporting with "vfs objects = btrfs" (no io_uring) works as expected.
> I don't think it makes a difference for the current problem, but I guess you want the following order instead:
>
> vfs objects = io_uring, btrfs
>
> metze
I haven't tested that combination. However, the docs mention that
io_uring must be the last VFS object loaded:
https://www.samba.org/samba/docs/4.12/man-html/vfs_io_uring.8.html
"The io_uring VFS module enables asynchronous pread, pwrite and fsync
using the io_uring infrastructure of Linux (>= 5.1). This provides much
less overhead compared to the usage of the pthreadpool for async io.
This module SHOULD be listed last in any module stack as it requires
real kernel file descriptors."
The manpage for vfs_btrfs mentions that btrfs is stackable, and in its
example btrfs is loaded before vfs_shadow_copy.
https://www.samba.org/samba/docs/4.12/man-html/vfs_btrfs.8.html
What would be the implication of having io_uring before the btrfs module?
Regards
Forza
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-26 16:35 ` Jens Axboe
@ 2021-07-28 10:02 ` Forza
2021-07-28 14:35 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Forza @ 2021-07-28 10:02 UTC (permalink / raw)
To: Jens Axboe, io-uring
On 2021-07-26 18:35, Jens Axboe wrote:
> On 7/26/21 10:07 AM, Jens Axboe wrote:
>> On 7/24/21 1:51 PM, Forza wrote:
>>>
>>>
>>> On 2021-07-24 21:44, Jens Axboe wrote:
>>>> On 7/24/21 12:23 PM, Forza wrote:
>>>>> Hi!
>>>>>
>>>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>>>> possible to reproduce without samba with a windows client, as I don't
>>>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>>>> with a Linux client.
>>>>>
>>>>> I attached the logs from both a Windows 10 client and a Linux client
>>>>> (kernel 5.11.0).
>>>>>
>>>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>>>
>>>>> smbd_smb2_read: fnum 2641229669, file
>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>> offset=736100352 read=4194304
>>>>> [2021/07/24 17:26:09.120779, 3]
>>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>> smbd_smb2_read: fnum 2641229669, file
>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>> offset=740294656 read=4194304
>>>>> [2021/07/24 17:26:09.226593, 3]
>>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>> smbd_smb2_read: fnum 2641229669, file
>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>> offset=748683264 read=4194304
>>>>
>>>> Thanks, this is useful. Before I try and reproduce it, what is the
>>>> filesystem that is hosting the samba mount?
>>>>
>>>
>>> I am using Btrfs.
>>>
>>> My testing was done by exporting the share with
>>>
>>> vfs objects = io_uring
>>> vfs objects = btrfs, io_uring
>>>
>>> Same results in both cases. Exporting with "vfs objects = btrfs" (no
>>> io_uring) works as expected.
>>
>> Seems to be specific to btrfs, I can reproduce it here. I'll dive in
>> and see what I can find.
>
> This looks like a race in dealing with the task_work running. At least
> this on top of current -git closes the gap for me and I can't reproduce
> it anymore.
>
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index c4d2b320cdd4..998a01cbc00f 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1959,9 +1959,13 @@ static void tctx_task_work(struct callback_head *cb)
>  			node = next;
>  		}
>  		if (wq_list_empty(&tctx->task_list)) {
> +			spin_lock_irq(&tctx->task_lock);
>  			clear_bit(0, &tctx->task_state);
> -			if (wq_list_empty(&tctx->task_list))
> +			if (wq_list_empty(&tctx->task_list)) {
> +				spin_unlock_irq(&tctx->task_lock);
>  				break;
> +			}
> +			spin_unlock_irq(&tctx->task_lock);
>  			/* another tctx_task_work() is enqueued, yield */
>  			if (test_and_set_bit(0, &tctx->task_state))
>  				break;
>
Thanks! Is there a way to get this on current stable such as 5.13.5?
* Re: Stack trace with Samba VFS io_uring and large transfers
2021-07-28 10:02 ` Forza
@ 2021-07-28 14:35 ` Jens Axboe
0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2021-07-28 14:35 UTC (permalink / raw)
To: Forza, io-uring
On 7/28/21 4:02 AM, Forza wrote:
>
>
> On 2021-07-26 18:35, Jens Axboe wrote:
>> On 7/26/21 10:07 AM, Jens Axboe wrote:
>>> On 7/24/21 1:51 PM, Forza wrote:
>>>>
>>>>
>>>> On 2021-07-24 21:44, Jens Axboe wrote:
>>>>> On 7/24/21 12:23 PM, Forza wrote:
>>>>>> Hi!
>>>>>>
>>>>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>>>>> possible to reproduce without samba with a windows client, as I don't
>>>>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>>>>> with a Linux client.
>>>>>>
>>>>>> I attached the logs from both a Windows 10 client and a Linux client
>>>>>> (kernel 5.11.0).
>>>>>>
>>>>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>>>>
>>>>>> smbd_smb2_read: fnum 2641229669, file
>>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>>> offset=736100352 read=4194304
>>>>>> [2021/07/24 17:26:09.120779, 3]
>>>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>>> smbd_smb2_read: fnum 2641229669, file
>>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>>> offset=740294656 read=4194304
>>>>>> [2021/07/24 17:26:09.226593, 3]
>>>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>>> smbd_smb2_read: fnum 2641229669, file
>>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>>> offset=748683264 read=4194304
>>>>>
>>>>> Thanks, this is useful. Before I try and reproduce it, what is the
>>>>> filesystem that is hosting the samba mount?
>>>>>
>>>>
>>>> I am using Btrfs.
>>>>
>>>> My testing was done by exporting the share with
>>>>
>>>> vfs objects = io_uring
>>>> vfs objects = btrfs, io_uring
>>>>
>>>> Same results in both cases. Exporting with "vfs objects = btrfs" (no
>>>> io_uring) works as expected.
>>>
>>> Seems to be specific to btrfs, I can reproduce it here. I'll dive in
>>> and see what I can find.
>>
>> This looks like a race in dealing with the task_work running. At least
>> this on top of current -git closes the gap for me and I can't reproduce
>> it anymore.
>>
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index c4d2b320cdd4..998a01cbc00f 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1959,9 +1959,13 @@ static void tctx_task_work(struct callback_head *cb)
>>  			node = next;
>>  		}
>>  		if (wq_list_empty(&tctx->task_list)) {
>> +			spin_lock_irq(&tctx->task_lock);
>>  			clear_bit(0, &tctx->task_state);
>> -			if (wq_list_empty(&tctx->task_list))
>> +			if (wq_list_empty(&tctx->task_list)) {
>> +				spin_unlock_irq(&tctx->task_lock);
>>  				break;
>> +			}
>> +			spin_unlock_irq(&tctx->task_lock);
>>  			/* another tctx_task_work() is enqueued, yield */
>>  			if (test_and_set_bit(0, &tctx->task_state))
>>  				break;
>>
>
> Thanks! Is there a way to get this on current stable such as 5.13.5?
Yes, we'll get it to stable once it's upstream. For your particular
testing, I'm not sure there's an easy way... You'd have to apply it to
5.13.5 and compile it on your own, I'm afraid.
If all else fails, hopefully it'll be in 5.13.6 and you can re-test
with that :-)
--
Jens Axboe