* Re: Linux 5.18-rc4 [not found] <CAHk-=whmtHMzjaVUF9bS+7vE_rrRctcCTvsAeB8fuLYcyYLN-g@mail.gmail.com> @ 2022-04-27 17:59 ` Ammar Faizi 2022-04-27 18:31 ` Linus Torvalds 0 siblings, 1 reply; 11+ messages in thread From: Ammar Faizi @ 2022-04-27 17:59 UTC (permalink / raw) To: Linus Torvalds, Linux Kernel Mailing List Cc: Al Viro, Eric Biederman, Kees Cook, linux-fsdevel, linux-mm, linux-kernel, gwml On 4/25/22 5:22 AM, Linus Torvalds wrote: > Fairly slow and calm week - which makes me just suspect that the other > shoe will drop at some point. > > But maybe things are just going really well this release. It's bound > to happen _occasionally_, after all. + fs/exec.c maintainers. Testing Linux 5.18-rc4 on my laptop, it has been running for 2 days. Got the following lockdep splat this night. I don't have the reproducer. If you need more information, feel free to let me know. [78140.503644] ====================================================== [78140.503646] WARNING: possible circular locking dependency detected [78140.503648] 5.18.0-rc4-superb-owl-00006-gd615b5416f8a #12 Tainted: G W [78140.503650] ------------------------------------------------------ [78140.503651] preconv/111629 is trying to acquire lock: [78140.503653] ffff88834d633248 (&ctx->lock){+.+.}-{2:2}, at: update_file_ctx+0x19/0xe0 [78140.503663] but task is already holding lock: [78140.503664] ffff888103d80458 (&newf->file_lock){+.+.}-{2:2}, at: iterate_fd+0x34/0x150 [78140.503669] which lock already depends on the new lock. [78140.503671] the existing dependency chain (in reverse order) is: [78140.503672] -> #4 (&newf->file_lock){+.+.}-{2:2}: [78140.503675] _raw_spin_lock+0x2f/0x40 [78140.503679] seq_show+0x72/0x280 [78140.503681] seq_read_iter+0x125/0x3c0 [78140.503684] seq_read+0xd0/0xe0 [78140.503686] vfs_read+0xf5/0x2f0 [78140.503688] ksys_read+0x58/0xb0 [78140.503690] do_syscall_64+0x3d/0x90 [78140.503693] entry_SYSCALL_64_after_hwframe+0x44/0xae [78140.503695] -> #3 (&p->alloc_lock){+.+.}-{2:2}: [78140.503699] _raw_spin_lock+0x2f/0x40 [78140.503700] newseg+0x25b/0x360 [78140.503703] ipcget+0x3fb/0x480 [78140.503705] __x64_sys_shmget+0x48/0x50 [78140.503708] do_syscall_64+0x3d/0x90 [78140.503710] entry_SYSCALL_64_after_hwframe+0x44/0xae [78140.503713] -> #2 (&new->lock){+.+.}-{2:2}: [78140.503716] _raw_spin_lock+0x2f/0x40 [78140.503718] ipc_addid+0xb3/0x700 [78140.503720] newseg+0x238/0x360 [78140.503722] ipcget+0x3fb/0x480 [78140.503724] __x64_sys_shmget+0x48/0x50 [78140.503727] do_syscall_64+0x3d/0x90 [78140.503729] entry_SYSCALL_64_after_hwframe+0x44/0xae [78140.503731] -> #1 (lock#3){+.+.}-{2:2}: [78140.503735] local_lock_acquire+0x1d/0x70 [78140.503738] __radix_tree_preload+0x38/0x150 [78140.503740] idr_preload+0xa/0x40 [78140.503743] aa_alloc_secid+0x15/0xb0 [78140.503745] aa_label_alloc+0x6c/0x1b0 [78140.503747] aa_label_merge+0x52/0x430 [78140.503750] update_file_ctx+0x3f/0xe0 [78140.503752] aa_file_perm+0x56e/0x5c0 [78140.503754] common_file_perm+0x70/0xd0 [78140.503756] security_mmap_file+0x4b/0xd0 [78140.503759] vm_mmap_pgoff+0x50/0x150 [78140.503761] elf_map+0x9f/0x120 [78140.503763] load_elf_binary+0x521/0xc80 [78140.503767] bprm_execve+0x39f/0x660 [78140.503769] do_execveat_common+0x1d0/0x220 [78140.503771] __x64_sys_execveat+0x3d/0x50 [78140.503773] do_syscall_64+0x3d/0x90 [78140.503775] entry_SYSCALL_64_after_hwframe+0x44/0xae [78140.503777] -> #0 (&ctx->lock){+.+.}-{2:2}: [78140.503780] __lock_acquire+0x1573/0x2ce0 [78140.503783] lock_acquire+0xbd/0x190 [78140.503785] _raw_spin_lock+0x2f/0x40 
[78140.503787] update_file_ctx+0x19/0xe0 [78140.503788] aa_file_perm+0x56e/0x5c0 [78140.503790] match_file+0x78/0x90 [78140.503792] iterate_fd+0xae/0x150 [78140.503794] aa_inherit_files+0xbe/0x170 [78140.503796] apparmor_bprm_committing_creds+0x50/0x80 [78140.503798] security_bprm_committing_creds+0x1d/0x30 [78140.503800] begin_new_exec+0x3c5/0x450 [78140.503802] load_elf_binary+0x269/0xc80 [78140.503804] bprm_execve+0x39f/0x660 [78140.503806] do_execveat_common+0x1d0/0x220 [78140.503808] __x64_sys_execve+0x36/0x40 [78140.503809] do_syscall_64+0x3d/0x90 [78140.503812] entry_SYSCALL_64_after_hwframe+0x44/0xae [78140.503815] other info that might help us debug this: [78140.503816] Chain exists of: &ctx->lock --> &p->alloc_lock --> &newf->file_lock [78140.503820] Possible unsafe locking scenario: [78140.503821] CPU0 CPU1 [78140.503823] ---- ---- [78140.503824] lock(&newf->file_lock); [78140.503826] lock(&p->alloc_lock); [78140.503828] lock(&newf->file_lock); [78140.503830] lock(&ctx->lock); [78140.503832] *** DEADLOCK *** [78140.503833] 3 locks held by preconv/111629: [78140.503835] #0: ffff888111b62550 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x39/0x660 [78140.503840] #1: ffff888111b625e8 (&sig->exec_update_lock){++++}-{3:3}, at: exec_mmap+0x4e/0x250 [78140.503844] #2: ffff888103d80458 (&newf->file_lock){+.+.}-{2:2}, at: iterate_fd+0x34/0x150 [78140.503849] stack backtrace: [78140.503851] CPU: 3 PID: 111629 Comm: preconv Tainted: G W 5.18.0-rc4-superb-owl-00006-gd615b5416f8a #12 6fd282a37da6f0e0172ecfa29689f3d250476a2b [78140.503855] Hardware name: HP HP Laptop 14s-dq2xxx/87FD, BIOS F.15 09/15/2021 [78140.503856] Call Trace: [78140.503858] <TASK> [78140.503860] dump_stack_lvl+0x5a/0x74 [78140.503863] check_noncircular+0xd3/0xe0 [78140.503866] ? register_lock_class+0x35/0x2a0 [78140.503870] __lock_acquire+0x1573/0x2ce0 [78140.503872] ? prepend_path+0x375/0x410 [78140.503876] ? d_absolute_path+0x48/0x80 [78140.503879] ? aa_path_name+0x132/0x470 [78140.503883] ? lock_is_held_type+0xd0/0x130 [78140.503886] lock_acquire+0xbd/0x190 [78140.503888] ? update_file_ctx+0x19/0xe0 [78140.503892] _raw_spin_lock+0x2f/0x40 [78140.503894] ? update_file_ctx+0x19/0xe0 [78140.503896] update_file_ctx+0x19/0xe0 [78140.503899] aa_file_perm+0x56e/0x5c0 [78140.503904] ? aa_inherit_files+0x170/0x170 [78140.503906] match_file+0x78/0x90 [78140.503909] iterate_fd+0xae/0x150 [78140.503912] aa_inherit_files+0xbe/0x170 [78140.503915] apparmor_bprm_committing_creds+0x50/0x80 [78140.503918] security_bprm_committing_creds+0x1d/0x30 [78140.503921] begin_new_exec+0x3c5/0x450 [78140.503924] load_elf_binary+0x269/0xc80 [78140.503928] ? lock_release+0x1ee/0x260 [78140.503930] ? bprm_execve+0x399/0x660 [78140.503933] bprm_execve+0x39f/0x660 [78140.503936] do_execveat_common+0x1d0/0x220 [78140.503940] __x64_sys_execve+0x36/0x40 [78140.503942] do_syscall_64+0x3d/0x90 [78140.503946] entry_SYSCALL_64_after_hwframe+0x44/0xae [78140.503948] RIP: 0033:0x7f700a8ea33b [78140.503954] Code: Unable to access opcode bytes at RIP 0x7f700a8ea311. 
[78140.503955] RSP: 002b:00007fff315e7db8 EFLAGS: 00000246 ORIG_RAX: 000000000000003b [78140.503958] RAX: ffffffffffffffda RBX: 00007fff315e7dc0 RCX: 00007f700a8ea33b [78140.503960] RDX: 000056419e9ea7e0 RSI: 000056419e9e9160 RDI: 00007fff315e7dc0 [78140.503962] RBP: 00007fff315e7f60 R08: 0000000000000008 R09: 0000000000000000 [78140.503964] R10: 0000000000000001 R11: 0000000000000246 R12: 000056419e9ea760 [78140.503965] R13: 000056419e9e9160 R14: 00007fff315e9eb4 R15: 00007fff315e9ebc [78140.503971] </TASK> -- Ammar Faizi ^ permalink raw reply [flat|nested] 11+ messages in thread
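The "Possible unsafe locking scenario" table in the splat above is lockdep's shorthand for a lock-ordering inversion: one path ends up taking file_lock and then (through several intermediate locks) ctx->lock, while another path takes them in the opposite order. A minimal userspace analogue, reduced to two locks with made-up stand-in names and not kernel code (saved as, say, abba.c and built with cc -pthread abba.c; it is expected to hang):

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Stand-ins for &newf->file_lock and &ctx->lock in the report. */
    static pthread_mutex_t file_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t ctx_lock  = PTHREAD_MUTEX_INITIALIZER;

    static void *cpu0(void *arg)
    {
            (void)arg;
            pthread_mutex_lock(&file_lock);   /* like iterate_fd() */
            sleep(1);                         /* widen the race window */
            pthread_mutex_lock(&ctx_lock);    /* like update_file_ctx() */
            puts("cpu0 took both locks");
            pthread_mutex_unlock(&ctx_lock);
            pthread_mutex_unlock(&file_lock);
            return NULL;
    }

    static void *cpu1(void *arg)
    {
            (void)arg;
            /* Opposite order, like the shmget -> /proc chain in the
             * report compressed into a single step. */
            pthread_mutex_lock(&ctx_lock);
            sleep(1);
            pthread_mutex_lock(&file_lock);
            puts("cpu1 took both locks");
            pthread_mutex_unlock(&file_lock);
            pthread_mutex_unlock(&ctx_lock);
            return NULL;
    }

    int main(void)
    {
            pthread_t t0, t1;

            pthread_create(&t0, NULL, cpu0, NULL);
            pthread_create(&t1, NULL, cpu1, NULL);
            pthread_join(t0, NULL);           /* never returns: deadlock */
            pthread_join(t1, NULL);
            return 0;
    }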
* Re: Linux 5.18-rc4 2022-04-27 17:59 ` Linux 5.18-rc4 Ammar Faizi @ 2022-04-27 18:31 ` Linus Torvalds 2022-06-06 15:19 ` Eric W. Biederman 0 siblings, 1 reply; 11+ messages in thread From: Linus Torvalds @ 2022-04-27 18:31 UTC (permalink / raw) To: Ammar Faizi, John Johansen, James Morris, LSM List Cc: Linux Kernel Mailing List, Al Viro, Eric Biederman, Kees Cook, linux-fsdevel, Linux-MM, gwml This looks like it might be AppArmor-related. Adding AppArmor and security module people to the participants. Sorry for top-posting and quoting the whole thing, but this is really just bringing in more people to the discussion. So on the exec path we have apparmor_bprm_committing_creds() -> aa_inherit_files() -> iterate_fd (takes files->file_lock) -> aa_file_perm -> update_file_ctx (takes aa_file_ctx->lock) which gives us that file_lock -> ctx lock order. All within AppArmor. And then we apparently _also_ have the reverse ctx lock -> file_lock order by way of 'alloc_lock', which is the 'task_lock()' thing That one is a horror to decode and I didn't, but seems to go through ipcget -> newseg.. Anybody? Linus On Wed, Apr 27, 2022 at 11:00 AM Ammar Faizi <[email protected]> wrote: > > On 4/25/22 5:22 AM, Linus Torvalds wrote: > > Fairly slow and calm week - which makes me just suspect that the other > > shoe will drop at some point. > > > > But maybe things are just going really well this release. It's bound > > to happen _occasionally_, after all. > > + fs/exec.c maintainers. > > Testing Linux 5.18-rc4 on my laptop, it has been running for 2 days. Got > the following lockdep splat this night. I don't have the reproducer. If > you need more information, feel free to let me know. > > [78140.503644] ====================================================== > [78140.503646] WARNING: possible circular locking dependency detected > [78140.503648] 5.18.0-rc4-superb-owl-00006-gd615b5416f8a #12 Tainted: G W > [78140.503650] ------------------------------------------------------ > [78140.503651] preconv/111629 is trying to acquire lock: > [78140.503653] ffff88834d633248 (&ctx->lock){+.+.}-{2:2}, at: update_file_ctx+0x19/0xe0 > [78140.503663] > but task is already holding lock: > [78140.503664] ffff888103d80458 (&newf->file_lock){+.+.}-{2:2}, at: iterate_fd+0x34/0x150 > [78140.503669] > which lock already depends on the new lock. 
> > [78140.503671] > the existing dependency chain (in reverse order) is: > [78140.503672] > -> #4 (&newf->file_lock){+.+.}-{2:2}: > [78140.503675] _raw_spin_lock+0x2f/0x40 > [78140.503679] seq_show+0x72/0x280 > [78140.503681] seq_read_iter+0x125/0x3c0 > [78140.503684] seq_read+0xd0/0xe0 > [78140.503686] vfs_read+0xf5/0x2f0 > [78140.503688] ksys_read+0x58/0xb0 > [78140.503690] do_syscall_64+0x3d/0x90 > [78140.503693] entry_SYSCALL_64_after_hwframe+0x44/0xae > [78140.503695] > -> #3 (&p->alloc_lock){+.+.}-{2:2}: > [78140.503699] _raw_spin_lock+0x2f/0x40 > [78140.503700] newseg+0x25b/0x360 > [78140.503703] ipcget+0x3fb/0x480 > [78140.503705] __x64_sys_shmget+0x48/0x50 > [78140.503708] do_syscall_64+0x3d/0x90 > [78140.503710] entry_SYSCALL_64_after_hwframe+0x44/0xae > [78140.503713] > -> #2 (&new->lock){+.+.}-{2:2}: > [78140.503716] _raw_spin_lock+0x2f/0x40 > [78140.503718] ipc_addid+0xb3/0x700 > [78140.503720] newseg+0x238/0x360 > [78140.503722] ipcget+0x3fb/0x480 > [78140.503724] __x64_sys_shmget+0x48/0x50 > [78140.503727] do_syscall_64+0x3d/0x90 > [78140.503729] entry_SYSCALL_64_after_hwframe+0x44/0xae > [78140.503731] > -> #1 (lock#3){+.+.}-{2:2}: > [78140.503735] local_lock_acquire+0x1d/0x70 > [78140.503738] __radix_tree_preload+0x38/0x150 > [78140.503740] idr_preload+0xa/0x40 > [78140.503743] aa_alloc_secid+0x15/0xb0 > [78140.503745] aa_label_alloc+0x6c/0x1b0 > [78140.503747] aa_label_merge+0x52/0x430 > [78140.503750] update_file_ctx+0x3f/0xe0 > [78140.503752] aa_file_perm+0x56e/0x5c0 > [78140.503754] common_file_perm+0x70/0xd0 > [78140.503756] security_mmap_file+0x4b/0xd0 > [78140.503759] vm_mmap_pgoff+0x50/0x150 > [78140.503761] elf_map+0x9f/0x120 > [78140.503763] load_elf_binary+0x521/0xc80 > [78140.503767] bprm_execve+0x39f/0x660 > [78140.503769] do_execveat_common+0x1d0/0x220 > [78140.503771] __x64_sys_execveat+0x3d/0x50 > [78140.503773] do_syscall_64+0x3d/0x90 > [78140.503775] entry_SYSCALL_64_after_hwframe+0x44/0xae > [78140.503777] > -> #0 (&ctx->lock){+.+.}-{2:2}: > [78140.503780] __lock_acquire+0x1573/0x2ce0 > [78140.503783] lock_acquire+0xbd/0x190 > [78140.503785] _raw_spin_lock+0x2f/0x40 > [78140.503787] update_file_ctx+0x19/0xe0 > [78140.503788] aa_file_perm+0x56e/0x5c0 > [78140.503790] match_file+0x78/0x90 > [78140.503792] iterate_fd+0xae/0x150 > [78140.503794] aa_inherit_files+0xbe/0x170 > [78140.503796] apparmor_bprm_committing_creds+0x50/0x80 > [78140.503798] security_bprm_committing_creds+0x1d/0x30 > [78140.503800] begin_new_exec+0x3c5/0x450 > [78140.503802] load_elf_binary+0x269/0xc80 > [78140.503804] bprm_execve+0x39f/0x660 > [78140.503806] do_execveat_common+0x1d0/0x220 > [78140.503808] __x64_sys_execve+0x36/0x40 > [78140.503809] do_syscall_64+0x3d/0x90 > [78140.503812] entry_SYSCALL_64_after_hwframe+0x44/0xae > [78140.503815] > other info that might help us debug this: > > [78140.503816] Chain exists of: > &ctx->lock --> &p->alloc_lock --> &newf->file_lock > > [78140.503820] Possible unsafe locking scenario: > > [78140.503821] CPU0 CPU1 > [78140.503823] ---- ---- > [78140.503824] lock(&newf->file_lock); > [78140.503826] lock(&p->alloc_lock); > [78140.503828] lock(&newf->file_lock); > [78140.503830] lock(&ctx->lock); > [78140.503832] > *** DEADLOCK *** > > [78140.503833] 3 locks held by preconv/111629: > [78140.503835] #0: ffff888111b62550 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x39/0x660 > [78140.503840] #1: ffff888111b625e8 (&sig->exec_update_lock){++++}-{3:3}, at: exec_mmap+0x4e/0x250 > [78140.503844] #2: ffff888103d80458 
(&newf->file_lock){+.+.}-{2:2}, at: iterate_fd+0x34/0x150 > [78140.503849] > stack backtrace: > [78140.503851] CPU: 3 PID: 111629 Comm: preconv Tainted: G W 5.18.0-rc4-superb-owl-00006-gd615b5416f8a #12 6fd282a37da6f0e0172ecfa29689f3d250476a2b > [78140.503855] Hardware name: HP HP Laptop 14s-dq2xxx/87FD, BIOS F.15 09/15/2021 > [78140.503856] Call Trace: > [78140.503858] <TASK> > [78140.503860] dump_stack_lvl+0x5a/0x74 > [78140.503863] check_noncircular+0xd3/0xe0 > [78140.503866] ? register_lock_class+0x35/0x2a0 > [78140.503870] __lock_acquire+0x1573/0x2ce0 > [78140.503872] ? prepend_path+0x375/0x410 > [78140.503876] ? d_absolute_path+0x48/0x80 > [78140.503879] ? aa_path_name+0x132/0x470 > [78140.503883] ? lock_is_held_type+0xd0/0x130 > [78140.503886] lock_acquire+0xbd/0x190 > [78140.503888] ? update_file_ctx+0x19/0xe0 > [78140.503892] _raw_spin_lock+0x2f/0x40 > [78140.503894] ? update_file_ctx+0x19/0xe0 > [78140.503896] update_file_ctx+0x19/0xe0 > [78140.503899] aa_file_perm+0x56e/0x5c0 > [78140.503904] ? aa_inherit_files+0x170/0x170 > [78140.503906] match_file+0x78/0x90 > [78140.503909] iterate_fd+0xae/0x150 > [78140.503912] aa_inherit_files+0xbe/0x170 > [78140.503915] apparmor_bprm_committing_creds+0x50/0x80 > [78140.503918] security_bprm_committing_creds+0x1d/0x30 > [78140.503921] begin_new_exec+0x3c5/0x450 > [78140.503924] load_elf_binary+0x269/0xc80 > [78140.503928] ? lock_release+0x1ee/0x260 > [78140.503930] ? bprm_execve+0x399/0x660 > [78140.503933] bprm_execve+0x39f/0x660 > [78140.503936] do_execveat_common+0x1d0/0x220 > [78140.503940] __x64_sys_execve+0x36/0x40 > [78140.503942] do_syscall_64+0x3d/0x90 > [78140.503946] entry_SYSCALL_64_after_hwframe+0x44/0xae > [78140.503948] RIP: 0033:0x7f700a8ea33b > [78140.503954] Code: Unable to access opcode bytes at RIP 0x7f700a8ea311. > [78140.503955] RSP: 002b:00007fff315e7db8 EFLAGS: 00000246 ORIG_RAX: 000000000000003b > [78140.503958] RAX: ffffffffffffffda RBX: 00007fff315e7dc0 RCX: 00007f700a8ea33b > [78140.503960] RDX: 000056419e9ea7e0 RSI: 000056419e9e9160 RDI: 00007fff315e7dc0 > [78140.503962] RBP: 00007fff315e7f60 R08: 0000000000000008 R09: 0000000000000000 > [78140.503964] R10: 0000000000000001 R11: 0000000000000246 R12: 000056419e9ea760 > [78140.503965] R13: 000056419e9e9160 R14: 00007fff315e9eb4 R15: 00007fff315e9ebc > [78140.503971] </TASK> > > -- > Ammar Faizi ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Linux 5.18-rc4 2022-04-27 18:31 ` Linus Torvalds @ 2022-06-06 15:19 ` Eric W. Biederman 2022-06-06 18:28 ` Linus Torvalds 0 siblings, 1 reply; 11+ messages in thread From: Eric W. Biederman @ 2022-06-06 15:19 UTC (permalink / raw) To: Linus Torvalds Cc: Ammar Faizi, John Johansen, James Morris, LSM List, Linux Kernel Mailing List, Al Viro, Kees Cook, linux-fsdevel Linus Torvalds <[email protected]> writes: > This looks like it might be AppArmor-related. > > Adding AppArmor and security module people to the participants. > > Sorry for top-posting and quoting the whole thing, but this is really > just bringing in more people to the discussion. > > So on the exec path we have > > apparmor_bprm_committing_creds() -> > aa_inherit_files() -> > iterate_fd (takes files->file_lock) -> > aa_file_perm -> > update_file_ctx (takes aa_file_ctx->lock) > > which gives us that file_lock -> ctx lock order. All within AppArmor. > > And then we apparently _also_ have the reverse ctx lock -> file_lock > order by way of 'alloc_lock', which is the 'task_lock()' thing > > That one is a horror to decode and I didn't, but seems to go through > ipcget -> newseg.. > > Anybody? Has anyone looked into this lock ordering issues? Eric > > On Wed, Apr 27, 2022 at 11:00 AM Ammar Faizi <[email protected]> wrote: >> >> On 4/25/22 5:22 AM, Linus Torvalds wrote: >> > Fairly slow and calm week - which makes me just suspect that the other >> > shoe will drop at some point. >> > >> > But maybe things are just going really well this release. It's bound >> > to happen _occasionally_, after all. >> >> + fs/exec.c maintainers. >> >> Testing Linux 5.18-rc4 on my laptop, it has been running for 2 days. Got >> the following lockdep splat this night. I don't have the reproducer. If >> you need more information, feel free to let me know. >> >> [78140.503644] ====================================================== >> [78140.503646] WARNING: possible circular locking dependency detected >> [78140.503648] 5.18.0-rc4-superb-owl-00006-gd615b5416f8a #12 Tainted: G W >> [78140.503650] ------------------------------------------------------ >> [78140.503651] preconv/111629 is trying to acquire lock: >> [78140.503653] ffff88834d633248 (&ctx->lock){+.+.}-{2:2}, at: update_file_ctx+0x19/0xe0 >> [78140.503663] >> but task is already holding lock: >> [78140.503664] ffff888103d80458 (&newf->file_lock){+.+.}-{2:2}, at: iterate_fd+0x34/0x150 >> [78140.503669] >> which lock already depends on the new lock. 
>> >> [78140.503671] >> the existing dependency chain (in reverse order) is: >> [78140.503672] >> -> #4 (&newf->file_lock){+.+.}-{2:2}: >> [78140.503675] _raw_spin_lock+0x2f/0x40 >> [78140.503679] seq_show+0x72/0x280 >> [78140.503681] seq_read_iter+0x125/0x3c0 >> [78140.503684] seq_read+0xd0/0xe0 >> [78140.503686] vfs_read+0xf5/0x2f0 >> [78140.503688] ksys_read+0x58/0xb0 >> [78140.503690] do_syscall_64+0x3d/0x90 >> [78140.503693] entry_SYSCALL_64_after_hwframe+0x44/0xae >> [78140.503695] >> -> #3 (&p->alloc_lock){+.+.}-{2:2}: >> [78140.503699] _raw_spin_lock+0x2f/0x40 >> [78140.503700] newseg+0x25b/0x360 >> [78140.503703] ipcget+0x3fb/0x480 >> [78140.503705] __x64_sys_shmget+0x48/0x50 >> [78140.503708] do_syscall_64+0x3d/0x90 >> [78140.503710] entry_SYSCALL_64_after_hwframe+0x44/0xae >> [78140.503713] >> -> #2 (&new->lock){+.+.}-{2:2}: >> [78140.503716] _raw_spin_lock+0x2f/0x40 >> [78140.503718] ipc_addid+0xb3/0x700 >> [78140.503720] newseg+0x238/0x360 >> [78140.503722] ipcget+0x3fb/0x480 >> [78140.503724] __x64_sys_shmget+0x48/0x50 >> [78140.503727] do_syscall_64+0x3d/0x90 >> [78140.503729] entry_SYSCALL_64_after_hwframe+0x44/0xae >> [78140.503731] >> -> #1 (lock#3){+.+.}-{2:2}: >> [78140.503735] local_lock_acquire+0x1d/0x70 >> [78140.503738] __radix_tree_preload+0x38/0x150 >> [78140.503740] idr_preload+0xa/0x40 >> [78140.503743] aa_alloc_secid+0x15/0xb0 >> [78140.503745] aa_label_alloc+0x6c/0x1b0 >> [78140.503747] aa_label_merge+0x52/0x430 >> [78140.503750] update_file_ctx+0x3f/0xe0 >> [78140.503752] aa_file_perm+0x56e/0x5c0 >> [78140.503754] common_file_perm+0x70/0xd0 >> [78140.503756] security_mmap_file+0x4b/0xd0 >> [78140.503759] vm_mmap_pgoff+0x50/0x150 >> [78140.503761] elf_map+0x9f/0x120 >> [78140.503763] load_elf_binary+0x521/0xc80 >> [78140.503767] bprm_execve+0x39f/0x660 >> [78140.503769] do_execveat_common+0x1d0/0x220 >> [78140.503771] __x64_sys_execveat+0x3d/0x50 >> [78140.503773] do_syscall_64+0x3d/0x90 >> [78140.503775] entry_SYSCALL_64_after_hwframe+0x44/0xae >> [78140.503777] >> -> #0 (&ctx->lock){+.+.}-{2:2}: >> [78140.503780] __lock_acquire+0x1573/0x2ce0 >> [78140.503783] lock_acquire+0xbd/0x190 >> [78140.503785] _raw_spin_lock+0x2f/0x40 >> [78140.503787] update_file_ctx+0x19/0xe0 >> [78140.503788] aa_file_perm+0x56e/0x5c0 >> [78140.503790] match_file+0x78/0x90 >> [78140.503792] iterate_fd+0xae/0x150 >> [78140.503794] aa_inherit_files+0xbe/0x170 >> [78140.503796] apparmor_bprm_committing_creds+0x50/0x80 >> [78140.503798] security_bprm_committing_creds+0x1d/0x30 >> [78140.503800] begin_new_exec+0x3c5/0x450 >> [78140.503802] load_elf_binary+0x269/0xc80 >> [78140.503804] bprm_execve+0x39f/0x660 >> [78140.503806] do_execveat_common+0x1d0/0x220 >> [78140.503808] __x64_sys_execve+0x36/0x40 >> [78140.503809] do_syscall_64+0x3d/0x90 >> [78140.503812] entry_SYSCALL_64_after_hwframe+0x44/0xae >> [78140.503815] >> other info that might help us debug this: >> >> [78140.503816] Chain exists of: >> &ctx->lock --> &p->alloc_lock --> &newf->file_lock >> >> [78140.503820] Possible unsafe locking scenario: >> >> [78140.503821] CPU0 CPU1 >> [78140.503823] ---- ---- >> [78140.503824] lock(&newf->file_lock); >> [78140.503826] lock(&p->alloc_lock); >> [78140.503828] lock(&newf->file_lock); >> [78140.503830] lock(&ctx->lock); >> [78140.503832] >> *** DEADLOCK *** >> >> [78140.503833] 3 locks held by preconv/111629: >> [78140.503835] #0: ffff888111b62550 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x39/0x660 >> [78140.503840] #1: ffff888111b625e8 (&sig->exec_update_lock){++++}-{3:3}, 
at: exec_mmap+0x4e/0x250 >> [78140.503844] #2: ffff888103d80458 (&newf->file_lock){+.+.}-{2:2}, at: iterate_fd+0x34/0x150 >> [78140.503849] >> stack backtrace: >> [78140.503851] CPU: 3 PID: 111629 Comm: preconv Tainted: G W 5.18.0-rc4-superb-owl-00006-gd615b5416f8a #12 6fd282a37da6f0e0172ecfa29689f3d250476a2b >> [78140.503855] Hardware name: HP HP Laptop 14s-dq2xxx/87FD, BIOS F.15 09/15/2021 >> [78140.503856] Call Trace: >> [78140.503858] <TASK> >> [78140.503860] dump_stack_lvl+0x5a/0x74 >> [78140.503863] check_noncircular+0xd3/0xe0 >> [78140.503866] ? register_lock_class+0x35/0x2a0 >> [78140.503870] __lock_acquire+0x1573/0x2ce0 >> [78140.503872] ? prepend_path+0x375/0x410 >> [78140.503876] ? d_absolute_path+0x48/0x80 >> [78140.503879] ? aa_path_name+0x132/0x470 >> [78140.503883] ? lock_is_held_type+0xd0/0x130 >> [78140.503886] lock_acquire+0xbd/0x190 >> [78140.503888] ? update_file_ctx+0x19/0xe0 >> [78140.503892] _raw_spin_lock+0x2f/0x40 >> [78140.503894] ? update_file_ctx+0x19/0xe0 >> [78140.503896] update_file_ctx+0x19/0xe0 >> [78140.503899] aa_file_perm+0x56e/0x5c0 >> [78140.503904] ? aa_inherit_files+0x170/0x170 >> [78140.503906] match_file+0x78/0x90 >> [78140.503909] iterate_fd+0xae/0x150 >> [78140.503912] aa_inherit_files+0xbe/0x170 >> [78140.503915] apparmor_bprm_committing_creds+0x50/0x80 >> [78140.503918] security_bprm_committing_creds+0x1d/0x30 >> [78140.503921] begin_new_exec+0x3c5/0x450 >> [78140.503924] load_elf_binary+0x269/0xc80 >> [78140.503928] ? lock_release+0x1ee/0x260 >> [78140.503930] ? bprm_execve+0x399/0x660 >> [78140.503933] bprm_execve+0x39f/0x660 >> [78140.503936] do_execveat_common+0x1d0/0x220 >> [78140.503940] __x64_sys_execve+0x36/0x40 >> [78140.503942] do_syscall_64+0x3d/0x90 >> [78140.503946] entry_SYSCALL_64_after_hwframe+0x44/0xae >> [78140.503948] RIP: 0033:0x7f700a8ea33b >> [78140.503954] Code: Unable to access opcode bytes at RIP 0x7f700a8ea311. >> [78140.503955] RSP: 002b:00007fff315e7db8 EFLAGS: 00000246 ORIG_RAX: 000000000000003b >> [78140.503958] RAX: ffffffffffffffda RBX: 00007fff315e7dc0 RCX: 00007f700a8ea33b >> [78140.503960] RDX: 000056419e9ea7e0 RSI: 000056419e9e9160 RDI: 00007fff315e7dc0 >> [78140.503962] RBP: 00007fff315e7f60 R08: 0000000000000008 R09: 0000000000000000 >> [78140.503964] R10: 0000000000000001 R11: 0000000000000246 R12: 000056419e9ea760 >> [78140.503965] R13: 000056419e9e9160 R14: 00007fff315e9eb4 R15: 00007fff315e9ebc >> [78140.503971] </TASK> >> >> -- >> Ammar Faizi ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Linux 5.18-rc4 2022-06-06 15:19 ` Eric W. Biederman @ 2022-06-06 18:28 ` Linus Torvalds 2022-06-06 19:19 ` John Johansen 0 siblings, 1 reply; 11+ messages in thread From: Linus Torvalds @ 2022-06-06 18:28 UTC (permalink / raw) To: Eric W. Biederman Cc: Ammar Faizi, John Johansen, James Morris, LSM List, Linux Kernel Mailing List, Al Viro, Kees Cook, <[email protected]>, Linux-MM, gwml On Mon, Jun 6, 2022 at 8:19 AM Eric W. Biederman <[email protected]> wrote: > Has anyone looked into this lock ordering issues? The deadlock is > >> [78140.503821] CPU0 CPU1 > >> [78140.503823] ---- ---- > >> [78140.503824] lock(&newf->file_lock); > >> [78140.503826] lock(&p->alloc_lock); > >> [78140.503828] lock(&newf->file_lock); > >> [78140.503830] lock(&ctx->lock); and the alloc_lock -> file_lock on CPU1 is trivial - it's seq_show() in fs/proc/fd.c: task_lock(task); files = task->files; if (files) { unsigned int fd = proc_fd(m->private); spin_lock(&files->file_lock); and that looks all normal. But the other chains look painful. I do see the IPC code doing ugly things, in particular I detest this code: task_lock(current); list_add(&shp->shm_clist, ¤t->sysvshm.shm_clist); task_unlock(current); where it is using the task lock to protect the shm_clist list. Nasty. And it's doing that inside the shm_ids.rwsem lock _and_ inside the shp->shm_perm.lock. So the IPC code has newseg() doing shmget -> ipcget(): down_write(ids->rwsem) -> newseg(): ipc_addid gets perm->lock task_lock(current) so you have ids->rwsem -> perm->lock -> alloc_lock there. So now we have that ids->rwsem -> ipcperm->lock -> alloc_lock -> file_lock when you put those sequences together. But I didn't figure out what the security subsystem angle is and how that then apparently mixes things up with execve. Yes, newseg() is doing that error = security_shm_alloc(&shp->shm_perm); while holding rwsem, but I can't see how that matters. From the lockdep output, rwsem doesn't actually seem to be part of the whole sequence. It *looks* like we have apparmour ctx->lock --> radix_tree_preloads.lock --> ipcperm->lock and apparently that's called under the file_lock somewhere, completing the circle. I guess the execve component is that begin_new_exec -> security_bprm_committing_creds -> apparmor_bprm_committing_creds -> aa_inherit_files -> iterate_fd -> *takes file_lock* match_file -> aa_file_perm -> update_file_ctx *takes ctx->lock* so that's how you get file_lock -> ctx->lock. So you have: SHMGET: ipcperm->lock -> alloc_lock /proc: alloc_lock -> file_lock apparmor_bprm_committing_creds: file_lock -> ctx->lock and then all you need is ctx->lock -> ipcperm->lock but I didn't find that part. I suspect that part is that both Apparmor and IPC use the idr local lock. Linus ^ permalink raw reply [flat|nested] 11+ messages in thread
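For reference when reading the chains above: "alloc_lock" is the spinlock behind task_lock(), and the three orderings listed come out roughly as below (a simplified sketch based on include/linux/sched/task.h and the call sites quoted above, not the exact source):

    /* task_lock() is just the task's alloc_lock: */
    static inline void task_lock(struct task_struct *p)
    {
            spin_lock(&p->alloc_lock);
    }

    /*
     * shmget():  newseg() is entered with shp->shm_perm.lock held via
     *            ipc_addid(), then does task_lock(current) for the
     *            shm_clist update:           ipcperm->lock -> alloc_lock
     *
     * /proc fd:  seq_show() does task_lock(task) and then
     *            spin_lock(&files->file_lock):  alloc_lock -> file_lock
     *
     * execve():  aa_inherit_files() -> iterate_fd() holds
     *            files->file_lock while update_file_ctx() takes
     *            ctx->lock:                     file_lock -> ctx->lock
     */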
* Re: Linux 5.18-rc4 2022-06-06 18:28 ` Linus Torvalds @ 2022-06-06 19:19 ` John Johansen 2022-06-06 19:47 ` Linus Torvalds 2022-06-06 20:23 ` Matthew Wilcox 0 siblings, 2 replies; 11+ messages in thread From: John Johansen @ 2022-06-06 19:19 UTC (permalink / raw) To: Linus Torvalds, Eric W. Biederman Cc: Ammar Faizi, James Morris, LSM List, Linux Kernel Mailing List, Al Viro, Kees Cook, linux-fsdevel, Linux-MM, gwml On 6/6/22 11:28, Linus Torvalds wrote: > On Mon, Jun 6, 2022 at 8:19 AM Eric W. Biederman <[email protected]> wrote: >> Has anyone looked into this lock ordering issues? > > The deadlock is > >>>> [78140.503821] CPU0 CPU1 >>>> [78140.503823] ---- ---- >>>> [78140.503824] lock(&newf->file_lock); >>>> [78140.503826] lock(&p->alloc_lock); >>>> [78140.503828] lock(&newf->file_lock); >>>> [78140.503830] lock(&ctx->lock); > > and the alloc_lock -> file_lock on CPU1 is trivial - it's seq_show() > in fs/proc/fd.c: > > task_lock(task); > files = task->files; > if (files) { > unsigned int fd = proc_fd(m->private); > > spin_lock(&files->file_lock); > > and that looks all normal. > > But the other chains look painful. > > I do see the IPC code doing ugly things, in particular I detest this code: > > task_lock(current); > list_add(&shp->shm_clist, ¤t->sysvshm.shm_clist); > task_unlock(current); > > where it is using the task lock to protect the shm_clist list. Nasty. > > And it's doing that inside the shm_ids.rwsem lock _and_ inside the > shp->shm_perm.lock. > > So the IPC code has newseg() doing > > shmget -> > ipcget(): > down_write(ids->rwsem) -> > newseg(): > ipc_addid gets perm->lock > task_lock(current) > > so you have > > ids->rwsem -> perm->lock -> alloc_lock > > there. > > So now we have that > > ids->rwsem -> ipcperm->lock -> alloc_lock -> file_lock > > when you put those sequences together. > > But I didn't figure out what the security subsystem angle is and how > that then apparently mixes things up with execve. > > Yes, newseg() is doing that > > error = security_shm_alloc(&shp->shm_perm); > > while holding rwsem, but I can't see how that matters. From the > lockdep output, rwsem doesn't actually seem to be part of the whole > sequence. > > It *looks* like we have > > apparmour ctx->lock --> > radix_tree_preloads.lock --> > ipcperm->lock > > and apparently that's called under the file_lock somewhere, completing > the circle. > > I guess the execve component is that > > begin_new_exec -> > security_bprm_committing_creds -> > apparmor_bprm_committing_creds -> > aa_inherit_files -> > iterate_fd -> *takes file_lock* > match_file -> > aa_file_perm -> > update_file_ctx *takes ctx->lock* > > so that's how you get file_lock -> ctx->lock. > yes > So you have: > > SHMGET: > ipcperm->lock -> alloc_lock > /proc: > alloc_lock -> file_lock > apparmor_bprm_committing_creds: > file_lock -> ctx->lock > > and then all you need is ctx->lock -> ipcperm->lock but I didn't find that part. > yeah that is the part I got stuck on, before being pulled away from this > I suspect that part is that both Apparmor and IPC use the idr local lock. > bingo, apparmor moved its secids allocation from a custom radix tree to idr in 99cc45e48678 apparmor: Use an IDR to allocate apparmor secids and ipc is using the idr for its id allocation as well I can easily lift the secid() allocation out of the ctx->lock but that would still leave it happening under the file_lock and not fix the problem. 
I think the quick solution would be for apparmor to stop using idr, reverting back at least temporarily to the custom radix tree. ^ permalink raw reply [flat|nested] 11+ messages in thread
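To make the suspected shared edge concrete: both allocators run inside an idr_preload() section, and that preload path takes a per-CPU local lock (the "lock#3" in the splat). Very roughly, pieced together from ipc/util.c and the aa_alloc_secid() code visible in the patch later in this thread, with error handling and most detail omitted, so treat it as a sketch rather than the exact code:

    /* IPC side -- ipc_addid(), trimmed: the new ipc_perm lock is taken
     * inside the preload section, i.e. under the per-CPU preload lock:
     */
    idr_preload(GFP_KERNEL);            /* local lock ("lock#3")      */
    spin_lock(&new->lock);              /*   -> ipcperm->lock         */
    /* ... idr_alloc() ... */
    idr_preload_end();

    /* AppArmor side -- update_file_ctx() holds ctx->lock while
     * aa_alloc_secid() starts its own preload section:
     */
    spin_lock(&ctx->lock);              /* ctx->lock                  */
    /* aa_label_merge() -> aa_label_alloc() -> aa_alloc_secid(): */
    idr_preload(gfp);                   /*   -> local lock ("lock#3") */
    /* ... idr_alloc() ... idr_preload_end() ... */
    spin_unlock(&ctx->lock);

    /*
     * Combined with the shmget / procfs / exec orderings above, lockdep
     * sees the cycle:
     *   ctx->lock -> lock#3 -> ipcperm->lock -> alloc_lock
     *             -> file_lock -> ctx->lock
     */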
* Re: Linux 5.18-rc4 2022-06-06 19:19 ` John Johansen @ 2022-06-06 19:47 ` Linus Torvalds 2022-06-06 20:23 ` Matthew Wilcox 1 sibling, 0 replies; 11+ messages in thread From: Linus Torvalds @ 2022-06-06 19:47 UTC (permalink / raw) To: John Johansen, Thomas Gleixner Cc: Eric W. Biederman, Ammar Faizi, James Morris, LSM List, Linux Kernel Mailing List, Al Viro, Kees Cook, linux-fsdevel, Linux-MM, gwml On Mon, Jun 6, 2022 at 12:19 PM John Johansen <[email protected]> wrote: > > > I suspect that part is that both Apparmor and IPC use the idr local lock. > > bingo, > > apparmor moved its secids allocation from a custom radix tree to idr in > > 99cc45e48678 apparmor: Use an IDR to allocate apparmor secids > > and ipc is using the idr for its id allocation as well The thing is, I'm not entirely convinced you can deadlock on a local lock. A local lock is per-cpu, so one CPU holding that lock won't actually block another CPU holding it. Even on RT, I think. I *think* local locks are useful for lockdep not because of any lock chains they introduce, but because of how lockdep catches irq mis-use (where they *can* deadlock). But I may be entirely wrong, and maybe that lock chain through the local lock actually does matter. Let's bring in people who actually know what they are doing, rather than my wild speculation. Thomas? Linus ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Linux 5.18-rc4 2022-06-06 19:19 ` John Johansen 2022-06-06 19:47 ` Linus Torvalds @ 2022-06-06 20:23 ` Matthew Wilcox 2022-06-06 21:00 ` John Johansen 1 sibling, 1 reply; 11+ messages in thread From: Matthew Wilcox @ 2022-06-06 20:23 UTC (permalink / raw) To: John Johansen Cc: Linus Torvalds, Eric W. Biederman, Ammar Faizi, James Morris, LSM List, Linux Kernel Mailing List, Al Viro, Kees Cook, linux-fsdevel, Linux-MM, gwml On Mon, Jun 06, 2022 at 12:19:36PM -0700, John Johansen wrote: > > I suspect that part is that both Apparmor and IPC use the idr local lock. > > > bingo, > > apparmor moved its secids allocation from a custom radix tree to idr in > > 99cc45e48678 apparmor: Use an IDR to allocate apparmor secids > > and ipc is using the idr for its id allocation as well > > I can easily lift the secid() allocation out of the ctx->lock but that > would still leave it happening under the file_lock and not fix the problem. > I think the quick solution would be for apparmor to stop using idr, reverting > back at least temporarily to the custom radix tree. How about moving forward to the XArray that doesn't use that horrid prealloc gunk? Compile tested only. diff --git a/security/apparmor/include/secid.h b/security/apparmor/include/secid.h index 48ff1ddecad5..278dff5ecd1f 100644 --- a/security/apparmor/include/secid.h +++ b/security/apparmor/include/secid.h @@ -31,6 +31,4 @@ int aa_alloc_secid(struct aa_label *label, gfp_t gfp); void aa_free_secid(u32 secid); void aa_secid_update(u32 secid, struct aa_label *label); -void aa_secids_init(void); - #endif /* __AA_SECID_H */ diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c index 900bc540656a..9dfb4e4631da 100644 --- a/security/apparmor/lsm.c +++ b/security/apparmor/lsm.c @@ -1857,8 +1857,6 @@ static int __init apparmor_init(void) { int error; - aa_secids_init(); - error = aa_setup_dfa_engine(); if (error) { AA_ERROR("Unable to setup dfa engine\n"); diff --git a/security/apparmor/secid.c b/security/apparmor/secid.c index ce545f99259e..3b08942db1f6 100644 --- a/security/apparmor/secid.c +++ b/security/apparmor/secid.c @@ -13,9 +13,9 @@ #include <linux/errno.h> #include <linux/err.h> #include <linux/gfp.h> -#include <linux/idr.h> #include <linux/slab.h> #include <linux/spinlock.h> +#include <linux/xarray.h> #include "include/cred.h" #include "include/lib.h" @@ -29,8 +29,7 @@ */ #define AA_FIRST_SECID 2 -static DEFINE_IDR(aa_secids); -static DEFINE_SPINLOCK(secid_lock); +static DEFINE_XARRAY_FLAGS(aa_secids, XA_FLAGS_LOCK_IRQ | XA_FLAGS_TRACK_FREE); /* * TODO: allow policy to reserve a secid range? 
@@ -47,9 +46,9 @@ void aa_secid_update(u32 secid, struct aa_label *label) { unsigned long flags; - spin_lock_irqsave(&secid_lock, flags); - idr_replace(&aa_secids, label, secid); - spin_unlock_irqrestore(&secid_lock, flags); + xa_lock_irqsave(&aa_secids, flags); + __xa_store(&aa_secids, secid, label, 0); + xa_unlock_irqrestore(&aa_secids, flags); } /** @@ -58,13 +57,7 @@ void aa_secid_update(u32 secid, struct aa_label *label) */ struct aa_label *aa_secid_to_label(u32 secid) { - struct aa_label *label; - - rcu_read_lock(); - label = idr_find(&aa_secids, secid); - rcu_read_unlock(); - - return label; + return xa_load(&aa_secids, secid); } int apparmor_secid_to_secctx(u32 secid, char **secdata, u32 *seclen) @@ -126,19 +119,16 @@ int aa_alloc_secid(struct aa_label *label, gfp_t gfp) unsigned long flags; int ret; - idr_preload(gfp); - spin_lock_irqsave(&secid_lock, flags); - ret = idr_alloc(&aa_secids, label, AA_FIRST_SECID, 0, GFP_ATOMIC); - spin_unlock_irqrestore(&secid_lock, flags); - idr_preload_end(); + xa_lock_irqsave(&aa_secids, flags); + ret = __xa_alloc(&aa_secids, &label->secid, label, + XA_LIMIT(AA_FIRST_SECID, INT_MAX), gfp); + xa_unlock_irqrestore(&aa_secids, flags); if (ret < 0) { label->secid = AA_SECID_INVALID; return ret; } - AA_BUG(ret == AA_SECID_INVALID); - label->secid = ret; return 0; } @@ -150,12 +140,7 @@ void aa_free_secid(u32 secid) { unsigned long flags; - spin_lock_irqsave(&secid_lock, flags); - idr_remove(&aa_secids, secid); - spin_unlock_irqrestore(&secid_lock, flags); -} - -void aa_secids_init(void) -{ - idr_init_base(&aa_secids, AA_FIRST_SECID); + xa_lock_irqsave(&aa_secids, flags); + __xa_erase(&aa_secids, secid); + xa_unlock_irqrestore(&aa_secids, flags); } ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: Linux 5.18-rc4 2022-06-06 20:23 ` Matthew Wilcox @ 2022-06-06 21:00 ` John Johansen 2022-06-13 22:48 ` Matthew Wilcox 0 siblings, 1 reply; 11+ messages in thread From: John Johansen @ 2022-06-06 21:00 UTC (permalink / raw) To: Matthew Wilcox Cc: Linus Torvalds, Eric W. Biederman, Ammar Faizi, James Morris, LSM List, Linux Kernel Mailing List, Al Viro, Kees Cook, linux-fsdevel, Linux-MM, gwml On 6/6/22 13:23, Matthew Wilcox wrote: > On Mon, Jun 06, 2022 at 12:19:36PM -0700, John Johansen wrote: >>> I suspect that part is that both Apparmor and IPC use the idr local lock. >>> >> bingo, >> >> apparmor moved its secids allocation from a custom radix tree to idr in >> >> 99cc45e48678 apparmor: Use an IDR to allocate apparmor secids >> >> and ipc is using the idr for its id allocation as well >> >> I can easily lift the secid() allocation out of the ctx->lock but that >> would still leave it happening under the file_lock and not fix the problem. >> I think the quick solution would be for apparmor to stop using idr, reverting >> back at least temporarily to the custom radix tree. > > How about moving forward to the XArray that doesn't use that horrid > prealloc gunk? Compile tested only. > I'm not very familiar with XArray but it does seem like a good fit. We do try to keep the secid allocation dense, ideally no holes. Wrt the current locking issue I want to hear what Thomas has to say. Regardless I am looking into whether we should just switch to XArrays going forward. > > diff --git a/security/apparmor/include/secid.h b/security/apparmor/include/secid.h > index 48ff1ddecad5..278dff5ecd1f 100644 > --- a/security/apparmor/include/secid.h > +++ b/security/apparmor/include/secid.h > @@ -31,6 +31,4 @@ int aa_alloc_secid(struct aa_label *label, gfp_t gfp); > void aa_free_secid(u32 secid); > void aa_secid_update(u32 secid, struct aa_label *label); > > -void aa_secids_init(void); > - > #endif /* __AA_SECID_H */ > diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c > index 900bc540656a..9dfb4e4631da 100644 > --- a/security/apparmor/lsm.c > +++ b/security/apparmor/lsm.c > @@ -1857,8 +1857,6 @@ static int __init apparmor_init(void) > { > int error; > > - aa_secids_init(); > - > error = aa_setup_dfa_engine(); > if (error) { > AA_ERROR("Unable to setup dfa engine\n"); > diff --git a/security/apparmor/secid.c b/security/apparmor/secid.c > index ce545f99259e..3b08942db1f6 100644 > --- a/security/apparmor/secid.c > +++ b/security/apparmor/secid.c > @@ -13,9 +13,9 @@ > #include <linux/errno.h> > #include <linux/err.h> > #include <linux/gfp.h> > -#include <linux/idr.h> > #include <linux/slab.h> > #include <linux/spinlock.h> > +#include <linux/xarray.h> > > #include "include/cred.h" > #include "include/lib.h" > @@ -29,8 +29,7 @@ > */ > #define AA_FIRST_SECID 2 > > -static DEFINE_IDR(aa_secids); > -static DEFINE_SPINLOCK(secid_lock); > +static DEFINE_XARRAY_FLAGS(aa_secids, XA_FLAGS_LOCK_IRQ | XA_FLAGS_TRACK_FREE); > > /* > * TODO: allow policy to reserve a secid range? 
> @@ -47,9 +46,9 @@ void aa_secid_update(u32 secid, struct aa_label *label) > { > unsigned long flags; > > - spin_lock_irqsave(&secid_lock, flags); > - idr_replace(&aa_secids, label, secid); > - spin_unlock_irqrestore(&secid_lock, flags); > + xa_lock_irqsave(&aa_secids, flags); > + __xa_store(&aa_secids, secid, label, 0); > + xa_unlock_irqrestore(&aa_secids, flags); > } > > /** > @@ -58,13 +57,7 @@ void aa_secid_update(u32 secid, struct aa_label *label) > */ > struct aa_label *aa_secid_to_label(u32 secid) > { > - struct aa_label *label; > - > - rcu_read_lock(); > - label = idr_find(&aa_secids, secid); > - rcu_read_unlock(); > - > - return label; > + return xa_load(&aa_secids, secid); > } > > int apparmor_secid_to_secctx(u32 secid, char **secdata, u32 *seclen) > @@ -126,19 +119,16 @@ int aa_alloc_secid(struct aa_label *label, gfp_t gfp) > unsigned long flags; > int ret; > > - idr_preload(gfp); > - spin_lock_irqsave(&secid_lock, flags); > - ret = idr_alloc(&aa_secids, label, AA_FIRST_SECID, 0, GFP_ATOMIC); > - spin_unlock_irqrestore(&secid_lock, flags); > - idr_preload_end(); > + xa_lock_irqsave(&aa_secids, flags); > + ret = __xa_alloc(&aa_secids, &label->secid, label, > + XA_LIMIT(AA_FIRST_SECID, INT_MAX), gfp); > + xa_unlock_irqrestore(&aa_secids, flags); > > if (ret < 0) { > label->secid = AA_SECID_INVALID; > return ret; > } > > - AA_BUG(ret == AA_SECID_INVALID); > - label->secid = ret; > return 0; > } > > @@ -150,12 +140,7 @@ void aa_free_secid(u32 secid) > { > unsigned long flags; > > - spin_lock_irqsave(&secid_lock, flags); > - idr_remove(&aa_secids, secid); > - spin_unlock_irqrestore(&secid_lock, flags); > -} > - > -void aa_secids_init(void) > -{ > - idr_init_base(&aa_secids, AA_FIRST_SECID); > + xa_lock_irqsave(&aa_secids, flags); > + __xa_erase(&aa_secids, secid); > + xa_unlock_irqrestore(&aa_secids, flags); > } ^ permalink raw reply [flat|nested] 11+ messages in thread
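On the density concern: per the XArray documentation, xa_alloc()/__xa_alloc() store the entry at the lowest free index inside the given limit, so freed ids are reused and the id space should stay about as dense as with the IDR (xa_alloc_cyclic() is the variant that keeps advancing instead of reusing). A minimal, illustrative usage sketch, not taken from the patch:

    static DEFINE_XARRAY_ALLOC(ids);     /* an allocating XArray */

    static int grab_id(void *entry, u32 *id_out)
    {
            /* Stores entry at the lowest free index in [0, UINT_MAX];
             * the chosen index is returned through *id_out. */
            return xa_alloc(&ids, id_out, entry, xa_limit_32b, GFP_KERNEL);
    }

    static void drop_id(u32 id)
    {
            xa_erase(&ids, id);          /* index becomes free for reuse */
    }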
* Re: Linux 5.18-rc4 2022-06-06 21:00 ` John Johansen @ 2022-06-13 22:48 ` Matthew Wilcox 2022-06-21 20:27 ` John Johansen 0 siblings, 1 reply; 11+ messages in thread From: Matthew Wilcox @ 2022-06-13 22:48 UTC (permalink / raw) To: John Johansen Cc: Linus Torvalds, Eric W. Biederman, Ammar Faizi, James Morris, LSM List, Linux Kernel Mailing List, Al Viro, Kees Cook, linux-fsdevel, Linux-MM, gwml On Mon, Jun 06, 2022 at 02:00:33PM -0700, John Johansen wrote: > On 6/6/22 13:23, Matthew Wilcox wrote: > > On Mon, Jun 06, 2022 at 12:19:36PM -0700, John Johansen wrote: > >>> I suspect that part is that both Apparmor and IPC use the idr local lock. > >>> > >> bingo, > >> > >> apparmor moved its secids allocation from a custom radix tree to idr in > >> > >> 99cc45e48678 apparmor: Use an IDR to allocate apparmor secids > >> > >> and ipc is using the idr for its id allocation as well > >> > >> I can easily lift the secid() allocation out of the ctx->lock but that > >> would still leave it happening under the file_lock and not fix the problem. > >> I think the quick solution would be for apparmor to stop using idr, reverting > >> back at least temporarily to the custom radix tree. > > > > How about moving forward to the XArray that doesn't use that horrid > > prealloc gunk? Compile tested only. > > > > I'm not very familiar with XArray but it does seem like a good fit. We do try > to keep the secid allocation dense, ideally no holes. Wrt the current locking > issue I want to hear what Thomas has to say. Regardless I am looking into > whether we should just switch to XArrays going forward. Nothing from Thomas ... shall we just go with this? Do you want a commit message, etc for the patch? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Linux 5.18-rc4 2022-06-13 22:48 ` Matthew Wilcox @ 2022-06-21 20:27 ` John Johansen 2022-07-13 9:37 ` Ammar Faizi 0 siblings, 1 reply; 11+ messages in thread From: John Johansen @ 2022-06-21 20:27 UTC (permalink / raw) To: Matthew Wilcox Cc: Linus Torvalds, Eric W. Biederman, Ammar Faizi, James Morris, LSM List, Linux Kernel Mailing List, Al Viro, Kees Cook, linux-fsdevel, Linux-MM, gwml On 6/13/22 15:48, Matthew Wilcox wrote: > On Mon, Jun 06, 2022 at 02:00:33PM -0700, John Johansen wrote: >> On 6/6/22 13:23, Matthew Wilcox wrote: >>> On Mon, Jun 06, 2022 at 12:19:36PM -0700, John Johansen wrote: >>>>> I suspect that part is that both Apparmor and IPC use the idr local lock. >>>>> >>>> bingo, >>>> >>>> apparmor moved its secids allocation from a custom radix tree to idr in >>>> >>>> 99cc45e48678 apparmor: Use an IDR to allocate apparmor secids >>>> >>>> and ipc is using the idr for its id allocation as well >>>> >>>> I can easily lift the secid() allocation out of the ctx->lock but that >>>> would still leave it happening under the file_lock and not fix the problem. >>>> I think the quick solution would be for apparmor to stop using idr, reverting >>>> back at least temporarily to the custom radix tree. >>> >>> How about moving forward to the XArray that doesn't use that horrid >>> prealloc gunk? Compile tested only. >>> >> >> I'm not very familiar with XArray but it does seem like a good fit. We do try >> to keep the secid allocation dense, ideally no holes. Wrt the current locking >> issue I want to hear what Thomas has to say. Regardless I am looking into >> whether we should just switch to XArrays going forward. > > Nothing from Thomas ... shall we just go with this? Do you want a > commit message, etc for the patch? Hey Matthew, I have done testing with this and the patch looks good. We will certainly go this route going forward so a commit message, would be good. As for fixing the issue in current kernels. I am not opposed to pulling this back as fixes for 99cc45e48678 apparmor: Use an IDR to allocate apparmor secids but I would like some other peoples opinions on doing that, because we don't understand why we are getting the current lock splat, and without understanding it is a fix by avoiding the issue rather than being sure the actual issue is fixed. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Linux 5.18-rc4 2022-06-21 20:27 ` John Johansen @ 2022-07-13 9:37 ` Ammar Faizi 0 siblings, 0 replies; 11+ messages in thread From: Ammar Faizi @ 2022-07-13 9:37 UTC (permalink / raw) To: John Johansen, Matthew Wilcox Cc: Linus Torvalds, Eric W. Biederman, Ammar Faizi, Al Viro, James Morris, LSM List, Linux Kernel Mailing List, Kees Cook, linux-fsdevel, Linux-MM, GNU/Weeb Mailing List On 6/22/22 3:27 AM, John Johansen wrote: > 99cc45e48678 apparmor: Use an IDR to allocate apparmor secids > > but I would like some other peoples opinions on doing that, because we > don't understand why we are getting the current lock splat, and without > understanding it is a fix by avoiding the issue rather than being sure > the actual issue is fixed. Update from me: I can't reproduce the splat in 5.19.0-rc4. Not sure if it's already fixed though, since it happened randomly and I didn't realize what action provoked the splat. I am running Ubuntu 22.04 for my daily desktop activity. I'll keep my lockdep on and will send you update if it's still triggering. -- Ammar Faizi ^ permalink raw reply [flat|nested] 11+ messages in thread