public inbox for [email protected]
* Stack trace with Samba VFS io_uring and large transfers
@ 2021-07-24 15:46 Forza
  2021-07-24 17:04 ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Forza @ 2021-07-24 15:46 UTC (permalink / raw)
  To: io-uring

Hi!

I am getting a stack trace and broken/stalled transfers with Samba when I
have "vfs objects = io_uring" in smb.conf.

Using Samba 4.13.9, the problem occurs on at least kernels 5.12.17-5.12.19
and 5.13.4. I haven't tried earlier kernels.
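
For reference, the share is exported with roughly the following in
smb.conf (share name and path are placeholders here; the "vfs objects"
line is the relevant part):

   [media]
       path = /media
       vfs objects = io_uring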


[52928.008736] ------------[ cut here ]------------
[52928.008740] WARNING: CPU: 3 PID: 2173 at fs/io_uring.c:8749 
io_ring_exit_work+0xbd/0x5f0
[52928.008753] Modules linked in: nf_conntrack_netlink vhost_net vhost 
vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_state ip6t_REJECT 
nf_reject_ipv6 ip6table_filter ip6table_raw ip6table_mangle ip6table_nat 
xt_multiport xt_limit ipt_REJECT nf_reject_ipv4 xt_NFLOG nfnetlink_log 
xt_conntrack xt_physdev iptable_filter iptable_mangle xt_nat iptable_nat 
xt_CT iptable_raw ip_set_bitmap_port ip_set_hash_ip ip_set_hash_net 
xt_esp ipt_ah nft_xfrm nf_tables nf_nat_pptp nf_conntrack_pptp nf_nat 
nf_conntrack_sip nf_conntrack_irc nf_conntrack_ftp nf_conntrack_h323 
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_bridge 
nf_conntrack nf_defrag_ipv6 ip6_tables ip_tables xt_recent xt_set ip_set 
nfnetlink nf_defrag_ipv4 nf_socket_ipv4 binfmt_misc amdgpu kvm_amd 
drm_ttm_helper kvm ttm gpu_sched k10temp uas irqbypass 8250 8250_base 
serial_mctrl_gpio video serial_core backlight pinctrl_amd
[52928.008821] CPU: 3 PID: 2173 Comm: kworker/u32:4 Not tainted 
5.13.4-gentoo-e350 #1
[52928.008826] Hardware name: Gigabyte Technology Co., Ltd. B450M 
DS3H/B450M DS3H-CF, BIOS F61c 05/10/2021
[52928.008829] Workqueue: events_unbound io_ring_exit_work
[52928.008835] RIP: 0010:io_ring_exit_work+0xbd/0x5f0
[52928.008841] Code: 00 00 00 4c 89 fa 48 c7 c6 b0 7e 30 83 e8 5b 7f 00 
00 4c 89 f7 e8 43 95 ff ff 48 8b 05 9c 75 0f 01 49 39 c5 0f 89 7b ff ff 
ff <0f> 0b e9 74 ff ff ff 48 c7 c2 c8 b0 af 84 48 c7 c6 be 99 1c 84 48
[52928.008845] RSP: 0018:ffffa52294007df0 EFLAGS: 00010293
[52928.008849] RAX: 0000000103237dd4 RBX: ffff89359ef92600 RCX: 
0000000000000008
[52928.008852] RDX: ffff89359ef92080 RSI: 0000000000000000 RDI: 
ffff89359ef924c0
[52928.008855] RBP: ffffa52294007e90 R08: 0000000000000001 R09: 
0000000000000000
[52928.008857] R10: 0000000000000000 R11: 0000000000000003 R12: 
ffff89359ef925d0
[52928.008858] R13: 0000000103237dc0 R14: 0000000000000000 R15: 
ffff89359ef92000
[52928.008861] FS:  0000000000000000(0000) GS:ffff8937a6ec0000(0000) 
knlGS:0000000000000000
[52928.008864] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[52928.008867] CR2: 00007f13f7904001 CR3: 00000001b62be000 CR4: 
00000000003506e0
[52928.008871] Call Trace:
[52928.008876]  ? __switch_to+0x23b/0x420
[52928.008882]  process_one_work+0x1af/0x300
[52928.008889]  worker_thread+0x48/0x3c0
[52928.008894]  ? process_one_work+0x300/0x300
[52928.008899]  kthread+0x122/0x140
[52928.008903]  ? set_kthread_struct+0x30/0x30
[52928.008908]  ret_from_fork+0x22/0x30
[52928.008914] ---[ end trace 6a275af934ed94fd ]---


I set "log level = 3" in smb.conf and captured the results:
Windows client: https://paste.tnonline.net/files/1U7Do7BKokPa_smbd-log.txt
Linux client: https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt


[2021/07/24 16:46:49.793735,  3] 
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
   smbd_smb2_read: fnum 4007995332, file 
media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1282408448 
read=1048576
[2021/07/24 16:46:49.803693,  3] 
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
   smbd_smb2_read: fnum 4007995332, file 
media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1283457024 
read=1048576
[2021/07/24 16:46:49.811478,  3] 
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
   smbd_smb2_read: fnum 4007995332, file 
media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1284505600 
read=1048576

... here it just stops adding to the log file and the client eventually 
times out waiting for more data. No apparent error message in any of the 
samba log files.

The stack trace "events_unbound io_ring_exit_work" only seems to occur 
once and then doesn't come back in subsequent testing, even if I restart 
Samba itself. It comes back after a reboot.

This happens every time I have the io_uring vfs module enabled and never 
otherwise. It seems to be only large (several GB) file transfers that 
are affected.





* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-24 15:46 Stack trace with Samba VFS io_uring and large transfers Forza
@ 2021-07-24 17:04 ` Jens Axboe
  2021-07-24 18:23   ` Forza
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-07-24 17:04 UTC (permalink / raw)
  To: Forza, io-uring

On 7/24/21 9:46 AM, Forza wrote:
> Hi!
> 
> I am getting a stack trace and broken/stalled transfers with Samba when I
> have "vfs objects = io_uring" in smb.conf.
> 
> Using Samba 4.13.9, the problem occurs on at least kernels 5.12.17-5.12.19
> and 5.13.4. I haven't tried earlier kernels.
> 
> 
> [52928.008736] ------------[ cut here ]------------
> [52928.008740] WARNING: CPU: 3 PID: 2173 at fs/io_uring.c:8749 
> io_ring_exit_work+0xbd/0x5f0
> [52928.008753] Modules linked in: nf_conntrack_netlink vhost_net vhost 
> vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_state ip6t_REJECT 
> nf_reject_ipv6 ip6table_filter ip6table_raw ip6table_mangle ip6table_nat 
> xt_multiport xt_limit ipt_REJECT nf_reject_ipv4 xt_NFLOG nfnetlink_log 
> xt_conntrack xt_physdev iptable_filter iptable_mangle xt_nat iptable_nat 
> xt_CT iptable_raw ip_set_bitmap_port ip_set_hash_ip ip_set_hash_net 
> xt_esp ipt_ah nft_xfrm nf_tables nf_nat_pptp nf_conntrack_pptp nf_nat 
> nf_conntrack_sip nf_conntrack_irc nf_conntrack_ftp nf_conntrack_h323 
> nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_bridge 
> nf_conntrack nf_defrag_ipv6 ip6_tables ip_tables xt_recent xt_set ip_set 
> nfnetlink nf_defrag_ipv4 nf_socket_ipv4 binfmt_misc amdgpu kvm_amd 
> drm_ttm_helper kvm ttm gpu_sched k10temp uas irqbypass 8250 8250_base 
> serial_mctrl_gpio video serial_core backlight pinctrl_amd
> [52928.008821] CPU: 3 PID: 2173 Comm: kworker/u32:4 Not tainted 
> 5.13.4-gentoo-e350 #1
> [52928.008826] Hardware name: Gigabyte Technology Co., Ltd. B450M 
> DS3H/B450M DS3H-CF, BIOS F61c 05/10/2021
> [52928.008829] Workqueue: events_unbound io_ring_exit_work
> [52928.008835] RIP: 0010:io_ring_exit_work+0xbd/0x5f0
> [52928.008841] Code: 00 00 00 4c 89 fa 48 c7 c6 b0 7e 30 83 e8 5b 7f 00 
> 00 4c 89 f7 e8 43 95 ff ff 48 8b 05 9c 75 0f 01 49 39 c5 0f 89 7b ff ff 
> ff <0f> 0b e9 74 ff ff ff 48 c7 c2 c8 b0 af 84 48 c7 c6 be 99 1c 84 48
> [52928.008845] RSP: 0018:ffffa52294007df0 EFLAGS: 00010293
> [52928.008849] RAX: 0000000103237dd4 RBX: ffff89359ef92600 RCX: 
> 0000000000000008
> [52928.008852] RDX: ffff89359ef92080 RSI: 0000000000000000 RDI: 
> ffff89359ef924c0
> [52928.008855] RBP: ffffa52294007e90 R08: 0000000000000001 R09: 
> 0000000000000000
> [52928.008857] R10: 0000000000000000 R11: 0000000000000003 R12: 
> ffff89359ef925d0
> [52928.008858] R13: 0000000103237dc0 R14: 0000000000000000 R15: 
> ffff89359ef92000
> [52928.008861] FS:  0000000000000000(0000) GS:ffff8937a6ec0000(0000) 
> knlGS:0000000000000000
> [52928.008864] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [52928.008867] CR2: 00007f13f7904001 CR3: 00000001b62be000 CR4: 
> 00000000003506e0
> [52928.008871] Call Trace:
> [52928.008876]  ? __switch_to+0x23b/0x420
> [52928.008882]  process_one_work+0x1af/0x300
> [52928.008889]  worker_thread+0x48/0x3c0
> [52928.008894]  ? process_one_work+0x300/0x300
> [52928.008899]  kthread+0x122/0x140
> [52928.008903]  ? set_kthread_struct+0x30/0x30
> [52928.008908]  ret_from_fork+0x22/0x30
> [52928.008914] ---[ end trace 6a275af934ed94fd ]---
> 
> 
> I set "log level = 3" in smb.conf and captured the results:
> Windows client: https://paste.tnonline.net/files/1U7Do7BKokPa_smbd-log.txt
> Linux client: https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
> 
> 
> [2021/07/24 16:46:49.793735,  3] 
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>    smbd_smb2_read: fnum 4007995332, file 
> media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1282408448 
> read=1048576
> [2021/07/24 16:46:49.803693,  3] 
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>    smbd_smb2_read: fnum 4007995332, file 
> media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1283457024 
> read=1048576
> [2021/07/24 16:46:49.811478,  3] 
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>    smbd_smb2_read: fnum 4007995332, file 
> media/vm/libvirt/images/NomadBSD.img, length=1048576 offset=1284505600 
> read=1048576
> 
> ... here it just stops adding to the log file and the client eventually 
> times out waiting for more data. No apparent error message in any of the 
> samba log files.
> 
> The "events_unbound io_ring_exit_work" warning only seems to occur once
> per boot: it doesn't come back in subsequent testing, even if I restart
> Samba itself, until I reboot.
> 
> This happens every time I have the io_uring vfs module enabled and never 
> otherwise. It seems to be only large (several GB) file transfers that 
> are affected.

I'll see if I can reproduce this. I'm assuming samba is using buffered
IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
possible to reproduce without samba and a windows client, as I don't
have any of those. If synthetic reproducing fails, I can try samba
with a Linux client.
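
If anyone else wants to poke at this in the meantime, the synthetic load
I have in mind is roughly the liburing loop below -- untested as pasted
and with most error handling trimmed: 1MB buffered reads across a large
file, then ring teardown (which is what ends up running
io_ring_exit_work()):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <liburing.h>

#define CHUNK	(1024 * 1024)	/* 1MB, matching the read size in the smbd log */

int main(int argc, char **argv)
{
	struct io_uring_cqe *cqe;
	struct io_uring_sqe *sqe;
	struct io_uring ring;
	off_t off = 0;
	void *buf;
	int fd, ret;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <big file>\n", argv[0]);
		return 1;
	}
	fd = open(argv[1], O_RDONLY);	/* buffered IO, no O_DIRECT */
	if (fd < 0 || io_uring_queue_init(8, &ring, 0) < 0)
		return 1;
	buf = malloc(CHUNK);

	for (;;) {
		/* one 1MB read per submit, advancing through the file */
		sqe = io_uring_get_sqe(&ring);
		io_uring_prep_read(sqe, fd, buf, CHUNK, off);
		io_uring_submit(&ring);
		if (io_uring_wait_cqe(&ring, &cqe) < 0)
			break;
		ret = cqe->res;
		io_uring_cqe_seen(&ring, cqe);
		if (ret <= 0)
			break;		/* EOF or read error */
		off += ret;
	}

	/* dropping the ring is what triggers io_ring_exit_work() */
	io_uring_queue_exit(&ring);
	close(fd);
	free(buf);
	return 0;
}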

-- 
Jens Axboe



* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-24 17:04 ` Jens Axboe
@ 2021-07-24 18:23   ` Forza
  2021-07-24 19:44     ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Forza @ 2021-07-24 18:23 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Hi!

On 2021-07-24 19:04, Jens Axboe wrote:
> I'll see if I can reproduce this. I'm assuming samba is using buffered
> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
> possible to reproduce without samba and a windows client, as I don't
> have any of those. If synthetic reproducing fails, I can try samba
> with a Linux client.

I attached the logs from both a Windows 10 client and a Linux client 
(kernel 5.11.0).

https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt

   smbd_smb2_read: fnum 2641229669, file 
media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304 
offset=736100352 read=4194304
[2021/07/24 17:26:09.120779,  3] 
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
   smbd_smb2_read: fnum 2641229669, file 
media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304 
offset=740294656 read=4194304
[2021/07/24 17:26:09.226593,  3] 
../../source3/smbd/smb2_read.c:415(smb2_read_complete)
   smbd_smb2_read: fnum 2641229669, file 
media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304 
offset=748683264 read=4194304


* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-24 18:23   ` Forza
@ 2021-07-24 19:44     ` Jens Axboe
  2021-07-24 19:51       ` Forza
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-07-24 19:44 UTC (permalink / raw)
  To: Forza, io-uring

On 7/24/21 12:23 PM, Forza wrote:
> Hi!
> 
> On 2021-07-24 19:04, Jens Axboe wrote:
>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>> possible to reproduce without samba and a windows client, as I don't
>> have any of those. If synthetic reproducing fails, I can try samba
>> with a Linux client.
> 
> I attached the logs from both a Windows 10 client and a Linux client 
> (kernel 5.11.0).
> 
> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
> 
>    smbd_smb2_read: fnum 2641229669, file 
> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304 
> offset=736100352 read=4194304
> [2021/07/24 17:26:09.120779,  3] 
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>    smbd_smb2_read: fnum 2641229669, file 
> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304 
> offset=740294656 read=4194304
> [2021/07/24 17:26:09.226593,  3] 
> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>    smbd_smb2_read: fnum 2641229669, file 
> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304 
> offset=748683264 read=4194304

Thanks, this is useful. Before I try and reproduce it, what is the
filesystem that is hosting the samba mount?

-- 
Jens Axboe



* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-24 19:44     ` Jens Axboe
@ 2021-07-24 19:51       ` Forza
  2021-07-26 16:07         ` Jens Axboe
  2021-07-28  0:16         ` Stefan Metzmacher
  0 siblings, 2 replies; 12+ messages in thread
From: Forza @ 2021-07-24 19:51 UTC (permalink / raw)
  To: Jens Axboe, io-uring



On 2021-07-24 21:44, Jens Axboe wrote:
> On 7/24/21 12:23 PM, Forza wrote:
>> Hi!
>>
>> On 2021-07-24 19:04, Jens Axboe wrote:
>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>> possible to reproduce without samba and a windows client, as I don't
>>> have any of those. If synthetic reproducing fails, I can try samba
>>> with a Linux client.
>>
>> I attached the logs from both a Windows 10 client and a Linux client
>> (kernel 5.11.0).
>>
>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>
>>     smbd_smb2_read: fnum 2641229669, file
>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>> offset=736100352 read=4194304
>> [2021/07/24 17:26:09.120779,  3]
>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>     smbd_smb2_read: fnum 2641229669, file
>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>> offset=740294656 read=4194304
>> [2021/07/24 17:26:09.226593,  3]
>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>     smbd_smb2_read: fnum 2641229669, file
>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>> offset=748683264 read=4194304
> 
> Thanks, this is useful. Before I try and reproduce it, what is the
> filesystem that is hosting the samba mount?
> 

I am using Btrfs.

My testing was done by exporting the share with

   vfs objects = io_uring
   vfs objects = btrfs, io_uring

Same results in both cases. Exporting with "vfs objects = btrfs" (no 
io_uring) works as expected.

Thanks.




* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-24 19:51       ` Forza
@ 2021-07-26 16:07         ` Jens Axboe
  2021-07-26 16:35           ` Jens Axboe
  2021-07-28  0:16         ` Stefan Metzmacher
  1 sibling, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-07-26 16:07 UTC (permalink / raw)
  To: Forza, io-uring

On 7/24/21 1:51 PM, Forza wrote:
> 
> 
> On 2021-07-24 21:44, Jens Axboe wrote:
>> On 7/24/21 12:23 PM, Forza wrote:
>>> Hi!
>>>
>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>> possible to reproduce without samba and a windows client, as I don't
>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>> with a Linux client.
>>>
>>> I attached the logs from both a Windows 10 client and a Linux client
>>> (kernel 5.11.0).
>>>
>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>
>>>     smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=736100352 read=4194304
>>> [2021/07/24 17:26:09.120779,  3]
>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>     smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=740294656 read=4194304
>>> [2021/07/24 17:26:09.226593,  3]
>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>     smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=748683264 read=4194304
>>
>> Thanks, this is useful. Before I try and reproduce it, what is the
>> filesystem that is hosting the samba mount?
>>
> 
> I am using Btrfs.
> 
> My testing was done by exporting the share with
> 
>    vfs objects = io_uring
>    vfs objects = btrfs, io_uring
> 
> Same results in both cases. Exporting with "vfs objects = btrfs" (no 
> io_uring) works as expected.

Seems to be specific to btrfs, I can reproduce it here. I'll dive in
and see what I can find.

-- 
Jens Axboe



* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-26 16:07         ` Jens Axboe
@ 2021-07-26 16:35           ` Jens Axboe
  2021-07-28 10:02             ` Forza
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-07-26 16:35 UTC (permalink / raw)
  To: Forza, io-uring

On 7/26/21 10:07 AM, Jens Axboe wrote:
> On 7/24/21 1:51 PM, Forza wrote:
>>
>>
>> On 2021-07-24 21:44, Jens Axboe wrote:
>>> On 7/24/21 12:23 PM, Forza wrote:
>>>> Hi!
>>>>
>>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>>> possible to reproduce without samba and a windows client, as I don't
>>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>>> with a Linux client.
>>>>
>>>> I attached the logs from both a Windows 10 client and a Linux client
>>>> (kernel 5.11.0).
>>>>
>>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>>
>>>>     smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=736100352 read=4194304
>>>> [2021/07/24 17:26:09.120779,  3]
>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>     smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=740294656 read=4194304
>>>> [2021/07/24 17:26:09.226593,  3]
>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>     smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=748683264 read=4194304
>>>
>>> Thanks, this is useful. Before I try and reproduce it, what is the
>>> filesystem that is hosting the samba mount?
>>>
>>
>> I am using Btrfs.
>>
>> My testing was done by exporting the share with
>>
>>    vfs objects = io_uring
>>    vfs objects = btrfs, io_uring
>>
>> Same results in both cases. Exporting with "vfs objects = btrfs" (no 
>> io_uring) works as expected.
> 
> Seems to be specific to btrfs, I can reproduce it here. I'll dive in
> and see what I can find.

This looks like a race in dealing with the task_work running. At least
this on top of current -git closes the gap for me and I can't reproduce
it anymore.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index c4d2b320cdd4..998a01cbc00f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1959,9 +1959,13 @@ static void tctx_task_work(struct callback_head *cb)
 			node = next;
 		}
 		if (wq_list_empty(&tctx->task_list)) {
+			spin_lock_irq(&tctx->task_lock);
 			clear_bit(0, &tctx->task_state);
-			if (wq_list_empty(&tctx->task_list))
+			if (wq_list_empty(&tctx->task_list)) {
+				spin_unlock_irq(&tctx->task_lock);
 				break;
+			}
+			spin_unlock_irq(&tctx->task_lock);
 			/* another tctx_task_work() is enqueued, yield */
 			if (test_and_set_bit(0, &tctx->task_state))
 				break;
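
The window this closes, as far as I can tell, is against the (locked)
list insert in io_req_task_work_add(). Roughly this interleaving (my
reading of the code, not verified with tracing):

  tctx_task_work()                     io_req_task_work_add()
  ----------------                     ----------------------
  wq_list_empty() -> true
                                       lock; add req to task_list; unlock
                                       test_and_set_bit() -> bit was still
                                         set, so no new task_work scheduled
  clear_bit(0, &tctx->task_state)
  wq_list_empty() -> unordered read,
    can still see "empty" -> break

which strands the queued request with nobody left to run it. Doing the
clear_bit() and the recheck under task_lock pairs them against the locked
insert, so one side or the other is guaranteed to notice.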

-- 
Jens Axboe



* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-24 19:51       ` Forza
  2021-07-26 16:07         ` Jens Axboe
@ 2021-07-28  0:16         ` Stefan Metzmacher
  2021-07-28  2:34           ` Jens Axboe
  2021-07-28  9:39           ` Forza
  1 sibling, 2 replies; 12+ messages in thread
From: Stefan Metzmacher @ 2021-07-28  0:16 UTC (permalink / raw)
  To: Forza, Jens Axboe, io-uring


On 2021-07-24 21:51, Forza wrote:
> 
> 
> On 2021-07-24 21:44, Jens Axboe wrote:
>> On 7/24/21 12:23 PM, Forza wrote:
>>> Hi!
>>>
>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>> possible to reproduce without samba and a windows client, as I don't
>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>> with a Linux client.
>>>
>>> I attached the logs from both a Windows 10 client and a Linux client
>>> (kernel 5.11.0).
>>>
>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>
>>>     smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=736100352 read=4194304
>>> [2021/07/24 17:26:09.120779,  3]
>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>     smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=740294656 read=4194304
>>> [2021/07/24 17:26:09.226593,  3]
>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>     smbd_smb2_read: fnum 2641229669, file
>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>> offset=748683264 read=4194304
>>
>> Thanks, this is useful. Before I try and reproduce it, what is the
>> filesystem that is hosting the samba mount?
>>
> 
> I am using Btrfs.
> 
> My testing was done by exporting the share with
> 
>   vfs objects = io_uring
>   vfs objects = btrfs, io_uring
> 
> Same results in both cases. Exporting with "vfs objects = btrfs" (no io_uring) works as expected.

I don't think it makes a difference for the current problem, but I guess you want the following order instead:

vfs objects = io_uring, btrfs

metze


* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-28  0:16         ` Stefan Metzmacher
@ 2021-07-28  2:34           ` Jens Axboe
  2021-07-28  9:39           ` Forza
  1 sibling, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2021-07-28  2:34 UTC (permalink / raw)
  To: Stefan Metzmacher, Forza, io-uring

On 7/27/21 6:16 PM, Stefan Metzmacher wrote:
> 
> On 2021-07-24 21:51, Forza wrote:
>>
>>
>> On 2021-07-24 21:44, Jens Axboe wrote:
>>> On 7/24/21 12:23 PM, Forza wrote:
>>>> Hi!
>>>>
>>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>>> possible to reproduce without samba and a windows client, as I don't
>>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>>> with a Linux client.
>>>>
>>>> I attached the logs from both a Windows 10 client and a Linux client
>>>> (kernel 5.11.0).
>>>>
>>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>>
>>>>     smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=736100352 read=4194304
>>>> [2021/07/24 17:26:09.120779,  3]
>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>     smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=740294656 read=4194304
>>>> [2021/07/24 17:26:09.226593,  3]
>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>     smbd_smb2_read: fnum 2641229669, file
>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>> offset=748683264 read=4194304
>>>
>>> Thanks, this is useful. Before I try and reproduce it, what is the
>>> filesystem that is hosting the samba mount?
>>>
>>
>> I am using Btrfs.
>>
>> My testing was done by exporting the share with
>>
>>   vfs objects = io_uring
>>   vfs objects = btrfs, io_uring
>>
>> Same results in both cases. Exporting with "vfs objects = btrfs" (no io_uring) works as expected.
> 
> I don't think it makes a difference for the current problem, but I guess you want the following order instead:
> 
> vfs objects = io_uring, btrfs

Hopefully the fix I posted takes care of the other part :-)

-- 
Jens Axboe



* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-28  0:16         ` Stefan Metzmacher
  2021-07-28  2:34           ` Jens Axboe
@ 2021-07-28  9:39           ` Forza
  1 sibling, 0 replies; 12+ messages in thread
From: Forza @ 2021-07-28  9:39 UTC (permalink / raw)
  To: Stefan Metzmacher, Jens Axboe, io-uring



On 2021-07-28 02:16, Stefan Metzmacher wrote:
>> I am using Btrfs.
>>
>> My testing was done by exporting the share with
>>
>>    vfs objects = io_uring
>>    vfs objects = btrfs, io_uring
>>
>> Same results in both cases. Exporting with "vfs objects = btrfs" (no io_uring) works as expected.
> I don't think it makes a difference for the current problem, but I guess you want the following order instead:
> 
> vfs objects = io_uring, btrfs
> 
> metze

I haven't tested that combination. However, the docs mention that
io_uring must be the last VFS object loaded.

https://www.samba.org/samba/docs/4.12/man-html/vfs_io_uring.8.html

"The io_uring VFS module enables asynchronous pread, pwrite and fsync 
using the io_uring infrastructure of Linux (>= 5.1). This provides much 
less overhead compared to the usage of the pthreadpool for async io.

This module SHOULD be listed last in any module stack as it requires 
real kernel file descriptors."

The manpage for vfs_btrfs mentions that btrfs is stackable, and in its
example btrfs is loaded before vfs_shadow_copy.

https://www.samba.org/samba/docs/4.12/man-html/vfs_btrfs.8.html

What would be the implication of having io_uring before the btrfs module?

Regards
Forza


* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-26 16:35           ` Jens Axboe
@ 2021-07-28 10:02             ` Forza
  2021-07-28 14:35               ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Forza @ 2021-07-28 10:02 UTC (permalink / raw)
  To: Jens Axboe, io-uring



On 2021-07-26 18:35, Jens Axboe wrote:
> On 7/26/21 10:07 AM, Jens Axboe wrote:
>> On 7/24/21 1:51 PM, Forza wrote:
>>>
>>>
>>> On 2021-07-24 21:44, Jens Axboe wrote:
>>>> On 7/24/21 12:23 PM, Forza wrote:
>>>>> Hi!
>>>>>
>>>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>>>> possible to reproduce without samba and a windows client, as I don't
>>>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>>>> with a Linux client.
>>>>>
>>>>> I attached the logs from both a Windows 10 client and a Linux client
>>>>> (kernel 5.11.0).
>>>>>
>>>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>>>
>>>>>      smbd_smb2_read: fnum 2641229669, file
>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>> offset=736100352 read=4194304
>>>>> [2021/07/24 17:26:09.120779,  3]
>>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>>      smbd_smb2_read: fnum 2641229669, file
>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>> offset=740294656 read=4194304
>>>>> [2021/07/24 17:26:09.226593,  3]
>>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>>      smbd_smb2_read: fnum 2641229669, file
>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>> offset=748683264 read=4194304
>>>>
>>>> Thanks, this is useful. Before I try and reproduce it, what is the
>>>> filesystem that is hosting the samba mount?
>>>>
>>>
>>> I am using Btrfs.
>>>
>>> My testing was done by exporting the share with
>>>
>>>     vfs objects = io_uring
>>>     vfs objects = btrfs, io_uring
>>>
>>> Same results in both cases. Exporting with "vfs objects = btrfs" (no
>>> io_uring) works as expected.
>>
>> Seems to be specific to btrfs, I can reproduce it here. I'll dive in
>> and see what I can find.
> 
> This looks like a race in dealing with the task_work running. At least
> this on top of current -git closes the gap for me and I can't reproduce
> it anymore.
> 
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index c4d2b320cdd4..998a01cbc00f 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1959,9 +1959,13 @@ static void tctx_task_work(struct callback_head *cb)
>   			node = next;
>   		}
>   		if (wq_list_empty(&tctx->task_list)) {
> +			spin_lock_irq(&tctx->task_lock);
>   			clear_bit(0, &tctx->task_state);
> -			if (wq_list_empty(&tctx->task_list))
> +			if (wq_list_empty(&tctx->task_list)) {
> +				spin_unlock_irq(&tctx->task_lock);
>   				break;
> +			}
> +			spin_unlock_irq(&tctx->task_lock);
>   			/* another tctx_task_work() is enqueued, yield */
>   			if (test_and_set_bit(0, &tctx->task_state))
>   				break;
> 

Thanks! Is there a way to get this on current stable such as 5.13.5?



* Re: Stack trace with Samba VFS io_uring and large transfers
  2021-07-28 10:02             ` Forza
@ 2021-07-28 14:35               ` Jens Axboe
  0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2021-07-28 14:35 UTC (permalink / raw)
  To: Forza, io-uring

On 7/28/21 4:02 AM, Forza wrote:
> 
> 
> On 2021-07-26 18:35, Jens Axboe wrote:
>> On 7/26/21 10:07 AM, Jens Axboe wrote:
>>> On 7/24/21 1:51 PM, Forza wrote:
>>>>
>>>>
>>>> On 2021-07-24 21:44, Jens Axboe wrote:
>>>>> On 7/24/21 12:23 PM, Forza wrote:
>>>>>> Hi!
>>>>>>
>>>>>> On 2021-07-24 19:04, Jens Axboe wrote:
>>>>>>> I'll see if I can reproduce this. I'm assuming samba is using buffered
>>>>>>> IO, and it looks like it's reading in chunks of 1MB. Hopefully it's
>>>>>>> possible to reproduce without samba and a windows client, as I don't
>>>>>>> have any of those. If synthetic reproducing fails, I can try samba
>>>>>>> with a Linux client.
>>>>>>
>>>>>> I attached the logs from both a Windows 10 client and a Linux client
>>>>>> (kernel 5.11.0).
>>>>>>
>>>>>> https://paste.tnonline.net/files/r4yebSzlGEVD_linux-client.txt
>>>>>>
>>>>>>      smbd_smb2_read: fnum 2641229669, file
>>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>>> offset=736100352 read=4194304
>>>>>> [2021/07/24 17:26:09.120779,  3]
>>>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>>>      smbd_smb2_read: fnum 2641229669, file
>>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>>> offset=740294656 read=4194304
>>>>>> [2021/07/24 17:26:09.226593,  3]
>>>>>> ../../source3/smbd/smb2_read.c:415(smb2_read_complete)
>>>>>>      smbd_smb2_read: fnum 2641229669, file
>>>>>> media/vm/libvirt/images/Mint_Cinnamon.img, length=4194304
>>>>>> offset=748683264 read=4194304
>>>>>
>>>>> Thanks, this is useful. Before I try and reproduce it, what is the
>>>>> filesystem that is hosting the samba mount?
>>>>>
>>>>
>>>> I am using Btrfs.
>>>>
>>>> My testing was done by exporting the share with
>>>>
>>>>     vfs objects = io_uring
>>>>     vfs objects = btrfs, io_uring
>>>>
>>>> Same results in both cases. Exporting with "vfs objects = btrfs" (no
>>>> io_uring) works as expected.
>>>
>>> Seems to be specific to btrfs, I can reproduce it here. I'll dive in
>>> and see what I can find.
>>
>> This looks like a race in dealing with the task_work running. At least
>> this on top of current -git closes the gap for me and I can't reproduce
>> it anymore.
>>
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index c4d2b320cdd4..998a01cbc00f 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1959,9 +1959,13 @@ static void tctx_task_work(struct callback_head *cb)
>>   			node = next;
>>   		}
>>   		if (wq_list_empty(&tctx->task_list)) {
>> +			spin_lock_irq(&tctx->task_lock);
>>   			clear_bit(0, &tctx->task_state);
>> -			if (wq_list_empty(&tctx->task_list))
>> +			if (wq_list_empty(&tctx->task_list)) {
>> +				spin_unlock_irq(&tctx->task_lock);
>>   				break;
>> +			}
>> +			spin_unlock_irq(&tctx->task_lock);
>>   			/* another tctx_task_work() is enqueued, yield */
>>   			if (test_and_set_bit(0, &tctx->task_state))
>>   				break;
>>
> 
> Thanks! Is there a way to get this on current stable such as 5.13.5?

Yes, we'll get it to stable once it's upstream. For your particular
testing, not sure there's an easy way... You'd have to apply it to
5.13.5 and compile it on your own, I'm afraid.
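
Roughly, assuming the diff applies cleanly to 5.13.5's fs/io_uring.c (I
haven't checked that; the patch file name below is made up):

  cd linux-5.13.5
  patch -p1 < tctx-task-work.patch
  make olddefconfig
  make -j"$(nproc)" && sudo make modules_install install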

If all else fails, hopefully it'll be in 5.13.6 and you can re-test
with that :-)

-- 
Jens Axboe

