bisecting cause commit starting from 7be97138e7276c71cc9ad1752dcb502d28f4400d
building syzkaller on a34e2c332411388ed2b3f6f1a3acdc062feceb79
testing commit 7be97138e7276c71cc9ad1752dcb502d28f4400d with gcc (GCC) 8.1.0
kernel signature: 86b8a2c5ce210b0bb8ac6e9b02e81f0cbb644ce0247aa56b9e4471a579e5c013
run #0: crashed: possible deadlock in free_ioctx_users
run #1: crashed: possible deadlock in free_ioctx_users
run #2: crashed: possible deadlock in free_ioctx_users
run #3: crashed: possible deadlock in free_ioctx_users
run #4: crashed: possible deadlock in free_ioctx_users
run #5: crashed: possible deadlock in io_submit_one
run #6: crashed: possible deadlock in free_ioctx_users
run #7: crashed: possible deadlock in free_ioctx_users
run #8: crashed: possible deadlock in free_ioctx_users
run #9: crashed: possible deadlock in free_ioctx_users
testing release v5.6
testing commit 7111951b8d4973bda27ff663f2cf18b663d15b48 with gcc (GCC) 8.1.0
kernel signature: 163c41d55fc582fc53d7155877a8bde32a95bd3662ba4a2323225cf1fee827e8
all runs: OK
# git bisect start 7be97138e7276c71cc9ad1752dcb502d28f4400d 7111951b8d4973bda27ff663f2cf18b663d15b48
Bisecting: 4567 revisions left to test after this (roughly 12 steps)
[56a451b780676bc1cdac011735fe2869fa2e9abf] Merge tag 'ntb-5.7' of git://github.com/jonmason/ntb
testing commit 56a451b780676bc1cdac011735fe2869fa2e9abf with gcc (GCC) 8.1.0
kernel signature: 26be96156dc5e9b9a1137012f59cf03d471ce1fd749422b21ef704d52853e339
all runs: OK
# git bisect good 56a451b780676bc1cdac011735fe2869fa2e9abf
Bisecting: 2270 revisions left to test after this (roughly 11 steps)
[ed52f2c608c9451fa2bad298b2ab927416105d65] Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
testing commit ed52f2c608c9451fa2bad298b2ab927416105d65 with gcc (GCC) 8.1.0
kernel signature: d57b2204ad00cc72492b3647ec8c62de8e9ab89bd6a5dc5030f6ee67f1b34a9a
all runs: OK
# git bisect good ed52f2c608c9451fa2bad298b2ab927416105d65
Bisecting: 1125 revisions left to test after this (roughly 10 steps)
[69ddce0970d9d1de63bed9c24eefa0814db29a5a] Merge tag 'amd-drm-next-5.7-2020-03-10' of git://people.freedesktop.org/~agd5f/linux into drm-next
testing commit 69ddce0970d9d1de63bed9c24eefa0814db29a5a with gcc (GCC) 8.1.0
kernel signature: db52e885103086ca55a9cced96d6207b66f85db3eb5a3c957b06cde2274ee34c
all runs: OK
# git bisect good 69ddce0970d9d1de63bed9c24eefa0814db29a5a
Bisecting: 456 revisions left to test after this (roughly 9 steps)
[f365ab31efacb70bed1e821f7435626e0b2528a6] Merge tag 'drm-next-2020-04-01' of git://anongit.freedesktop.org/drm/drm
testing commit f365ab31efacb70bed1e821f7435626e0b2528a6 with gcc (GCC) 8.1.0
kernel signature: 1248319c2b8b1368f820edbd470ab63127da4c01516a2fd6959d2a4d31ba4fe3
all runs: OK
# git bisect good f365ab31efacb70bed1e821f7435626e0b2528a6
Bisecting: 212 revisions left to test after this (roughly 8 steps)
[919dce24701f7b34681a6a1d3ef95c9f6c4fb1cc] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
testing commit 919dce24701f7b34681a6a1d3ef95c9f6c4fb1cc with gcc (GCC) 8.1.0
kernel signature: f3b9e1748999958437d8accc1c609d7f2304ae54de104686c2724e389bbb5a49
all runs: OK
# git bisect good 919dce24701f7b34681a6a1d3ef95c9f6c4fb1cc
Bisecting: 106 revisions left to test after this (roughly 7 steps)
[d59f44d3e723c4f7143d910dfa333f2bdb587bbc] xfs: directory bestfree check should release buffers
testing commit d59f44d3e723c4f7143d910dfa333f2bdb587bbc with gcc (GCC) 8.1.0
kernel signature: 1a02ab1bb53a6b7ad24fad24dcb831c1911ce9350f41d63ec02bc313e6c046be
all runs: OK
# git bisect good d59f44d3e723c4f7143d910dfa333f2bdb587bbc
Bisecting: 53 revisions left to test after this (roughly 6 steps)
[7ef482fa65513b18eed05a5c5f00413aad137810] helper for mount rootwards traversal
testing commit 7ef482fa65513b18eed05a5c5f00413aad137810 with gcc (GCC) 8.1.0
kernel signature: 5767db242e731c25a4d3cf1945aec8ca8347024e60b49ff6c9a33ad670101aab
all runs: OK
# git bisect good 7ef482fa65513b18eed05a5c5f00413aad137810
Bisecting: 26 revisions left to test after this (roughly 5 steps)
[4b871ce26ab2c758ea90b8dd659e4267aae9e943] Merged 'Infrastructure to allow fixing exec deadlocks' from Bernd Edlinger
testing commit 4b871ce26ab2c758ea90b8dd659e4267aae9e943 with gcc (GCC) 8.1.0
kernel signature: 4c71b6b6182fac0c4d165c4348c4a91fae14fe9eaff5991a01b6bfe7527154e1
all runs: crashed: possible deadlock in free_ioctx_users
# git bisect bad 4b871ce26ab2c758ea90b8dd659e4267aae9e943
Bisecting: 13 revisions left to test after this (roughly 4 steps)
[2ca7be7d55ad84fb0f7c2a23fb700a28fd76b19a] exec: Only compute current once in flush_old_exec
testing commit 2ca7be7d55ad84fb0f7c2a23fb700a28fd76b19a with gcc (GCC) 8.1.0
kernel signature: 6f4a13de62dbbd80d9f19aee68cbf23fc3624cc1f9ac4ac50e647cd6ab3eb4fd
all runs: crashed: possible deadlock in free_ioctx_users
# git bisect bad 2ca7be7d55ad84fb0f7c2a23fb700a28fd76b19a
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[7bc3e6e55acf065500a24621f3b313e7e5998acf] proc: Use a list of inodes to flush from proc
testing commit 7bc3e6e55acf065500a24621f3b313e7e5998acf with gcc (GCC) 8.1.0
kernel signature: e8b334daef74d7f912ddb378084f13bafd80a1cfa2f71e0e93df63720685ecaf
all runs: crashed: possible deadlock in free_ioctx_users
# git bisect bad 7bc3e6e55acf065500a24621f3b313e7e5998acf
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[080f6276fccfb3923691e57c1b44a627eabd1a25] proc: In proc_prune_siblings_dcache cache an aquired super block
testing commit 080f6276fccfb3923691e57c1b44a627eabd1a25 with gcc (GCC) 8.1.0
kernel signature: 409dec36528cf0038926cc157b3d5942c6544fb05c08beb730e0b81c9b13ee2b
all runs: OK
# git bisect good 080f6276fccfb3923691e57c1b44a627eabd1a25
Bisecting: 0 revisions left to test after this (roughly 1 step)
[71448011ea2a1cd36d8f5cbdab0ed716c454d565] proc: Clear the pieces of proc_inode that proc_evict_inode cares about
testing commit 71448011ea2a1cd36d8f5cbdab0ed716c454d565 with gcc (GCC) 8.1.0
kernel signature: bc2dc1b034ea090d03c518d77847b0d50b2d71597e7daae8145e95c1114f9248
all runs: OK
# git bisect good 71448011ea2a1cd36d8f5cbdab0ed716c454d565
7bc3e6e55acf065500a24621f3b313e7e5998acf is the first bad commit
commit 7bc3e6e55acf065500a24621f3b313e7e5998acf
Author: Eric W. Biederman
Date:   Wed Feb 19 18:22:26 2020 -0600

    proc: Use a list of inodes to flush from proc

    Rework the flushing of proc to use a list of directory inodes that
    need to be flushed.

    The list is kept on struct pid not on struct task_struct, as there is
    a fixed connection between proc inodes and pids but at least for the
    case of de_thread the pid of a task_struct changes.

    This removes the dependency on proc_mnt which allows for different
    mounts of proc having different mount options even in the same pid
    namespace, and this allows for the removal of proc_mnt, which will
    trivially allow the first mount of proc to honor its mount options.

    This flushing remains an optimization.
    The functions pid_delete_dentry and pid_revalidate ensure that
    ordinary dcache management will not attempt to use dentries past the
    point their respective task has died.  When unused the shrinker will
    eventually be able to remove these dentries.

    There is a case in de_thread where proc_flush_pid can be called early
    for a given pid, which winds up being safe (if suboptimal) as this is
    just an optimization.

    Only pid directories are put on the list as the other per pid files
    are children of those directories and d_invalidate on the directory
    will get them as well.

    So that the pid can be used during flushing, its reference count is
    taken in release_task and dropped in proc_flush_pid.  Further, the
    call of proc_flush_pid is moved after the tasklist_lock is released
    in release_task so that it is certain that the pid has already been
    unhashed when the flushing takes place.  This removes a small race
    where a dentry could be recreated.

    As struct pid is supposed to be small and I need a per pid lock, I
    reuse the only lock that currently exists in struct pid:
    the wait_pidfd.lock.

    The net result is that this adds all of this functionality with just
    a little extra list management overhead and a single extra pointer in
    struct pid.

    v2: Initialize pid->inodes.  I somehow failed to get that
    initialization into the initial version of the patch.  A boot failure
    was reported by "kernel test robot ", and failure to initialize that
    pid->inodes matches all of the reported symptoms.

    Signed-off-by: Eric W. Biederman

 fs/proc/base.c          | 111 ++++++++++++++++--------------------------------
 fs/proc/inode.c         |   2 +-
 fs/proc/internal.h      |   1 +
 include/linux/pid.h     |   1 +
 include/linux/proc_fs.h |   4 +-
 kernel/exit.c           |   4 +-
 kernel/pid.c            |   1 +
 7 files changed, 45 insertions(+), 79 deletions(-)

culprit signature: e8b334daef74d7f912ddb378084f13bafd80a1cfa2f71e0e93df63720685ecaf
parent signature: bc2dc1b034ea090d03c518d77847b0d50b2d71597e7daae8145e95c1114f9248
revisions tested: 14, total time: 3h23m36.868635482s (build: 1h28m45.16539389s, test: 1h53m46.997124196s)
first bad commit: 7bc3e6e55acf065500a24621f3b313e7e5998acf proc: Use a list of inodes to flush from proc
cc: ["adobriyan@gmail.com" "akpm@linux-foundation.org" "allison@lohutok.net" "areber@redhat.com" "aubrey.li@linux.intel.com" "avagin@gmail.com" "christian@brauner.io" "cyphar@cyphar.com" "ebiederm@xmission.com" "gregkh@linuxfoundation.org" "guro@fb.com" "joel@joelfernandes.org" "keescook@chromium.org" "linmiaohe@huawei.com" "linux-fsdevel@vger.kernel.org" "linux-kernel@vger.kernel.org" "mhocko@suse.com" "mingo@kernel.org" "oleg@redhat.com" "peterz@infradead.org" "sargun@sargun.me" "tglx@linutronix.de" "viro@zeniv.linux.org.uk"]
crash: possible deadlock in free_ioctx_users
========================================================
WARNING: possible irq lock inversion dependency detected
5.6.0-rc2-syzkaller #0 Not tainted
--------------------------------------------------------
ksoftirqd/1/16 just changed the state of lock:
ffff88809e097b58 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline]
ffff88809e097b58 (&(&ctx->ctx_lock)->rlock){..-.}, at: free_ioctx_users+0x2b/0x3e0 fs/aio.c:618
but this lock took another, SOFTIRQ-unsafe lock in the past:
 (&pid->wait_pidfd){+.+.}

and interrupts could create inverse lock ordering between them.
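For orientation before the full lockdep dump that follows: the culprit commit hangs a list of proc directory inodes off struct pid and protects it with the pre-existing wait_pidfd.lock. Below is a rough, self-contained userspace sketch of that shape, not kernel code; register_proc_inode(), flush_pid() and wait_pidfd_lock are illustrative stand-ins for proc_pid_make_inode(), proc_flush_pid() and wait_pidfd.lock, and a pthread mutex stands in for the spinlock.

/* Simplified userspace sketch of the data-structure change described by the
 * culprit commit.  All names and types here are stand-ins; the real
 * definitions live in include/linux/pid.h and fs/proc/base.c. */
#include <pthread.h>
#include <stdio.h>

struct proc_inode;

struct pid {
	pthread_mutex_t wait_pidfd_lock;   /* stand-in for the reused wait_pidfd.lock (a spinlock_t) */
	struct proc_inode *inodes;         /* new in the commit: proc dir inodes to flush */
};

struct proc_inode {
	struct pid *owner;
	struct proc_inode *next;           /* stand-in for the new sibling_inodes link */
};

/* Analogue of proc_pid_make_inode(): register the inode on the pid's list.
 * In the kernel this is a plain spin_lock(), i.e. taken with interrupts and
 * softirqs enabled -- the SOFTIRQ-unsafe acquisition the report points at. */
static void register_proc_inode(struct pid *pid, struct proc_inode *ei)
{
	pthread_mutex_lock(&pid->wait_pidfd_lock);
	ei->owner = pid;
	ei->next = pid->inodes;
	pid->inodes = ei;
	pthread_mutex_unlock(&pid->wait_pidfd_lock);
}

/* Analogue of proc_flush_pid(): called once the task is unhashed; walks the
 * list and (in the kernel) d_invalidate()s each directory dentry. */
static void flush_pid(struct pid *pid)
{
	pthread_mutex_lock(&pid->wait_pidfd_lock);
	for (struct proc_inode *ei = pid->inodes; ei; ei = ei->next)
		printf("would d_invalidate() the dentry of inode %p\n", (void *)ei);
	pid->inodes = NULL;
	pthread_mutex_unlock(&pid->wait_pidfd_lock);
}

int main(void)
{
	struct pid pid = { .wait_pidfd_lock = PTHREAD_MUTEX_INITIALIZER };
	struct proc_inode a = {0}, b = {0};

	register_proc_inode(&pid, &a);
	register_proc_inode(&pid, &b);
	flush_pid(&pid);
	return 0;
}

The detail that matters for the report below is not the list itself but the lock class: by reusing wait_pidfd.lock, proc now takes a lock with interrupts enabled that other paths reach from interrupt-sensitive contexts.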
other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&pid->wait_pidfd);
                               local_irq_disable();
                               lock(&(&ctx->ctx_lock)->rlock);
                               lock(&pid->wait_pidfd);
  <Interrupt>
    lock(&(&ctx->ctx_lock)->rlock);

 *** DEADLOCK ***

2 locks held by ksoftirqd/1/16:
 #0: ffffffff88ba5540 (rcu_callback){....}, at: rcu_do_batch kernel/rcu/tree.c:2176 [inline]
 #0: ffffffff88ba5540 (rcu_callback){....}, at: rcu_core+0x505/0x1290 kernel/rcu/tree.c:2410
 #1: ffffffff88ba5600 (rcu_read_lock){....}, at: percpu_ref_call_confirm_rcu lib/percpu-refcount.c:126 [inline]
 #1: ffffffff88ba5600 (rcu_read_lock){....}, at: percpu_ref_switch_to_atomic_rcu+0x1c7/0x450 lib/percpu-refcount.c:165

the shortest dependencies between 2nd lock and 1st lock:
 -> (&pid->wait_pidfd){+.+.} {
    HARDIRQ-ON-W at:
                      lock_acquire+0x19b/0x420 kernel/locking/lockdep.c:4484
                      __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                      _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
                      spin_lock include/linux/spinlock.h:338 [inline]
                      proc_pid_make_inode+0x1c0/0x390 fs/proc/base.c:1880
                      proc_pid_instantiate+0x40/0x140 fs/proc/base.c:3285
                      proc_pid_lookup+0x134/0x240 fs/proc/base.c:3320
                      proc_root_lookup+0x16/0x40 fs/proc/root.c:243
                      __lookup_slow+0x204/0x3d0 fs/namei.c:1757
                      lookup_slow fs/namei.c:1774 [inline]
                      walk_component+0x684/0xef0 fs/namei.c:1915
                      link_path_walk.part.40+0x3c4/0xdc0 fs/namei.c:2238
                      link_path_walk fs/namei.c:2172 [inline]
                      path_openat+0x194/0x2aa0 fs/namei.c:3606
                      do_filp_open+0x171/0x240 fs/namei.c:3637
                      do_sys_openat2+0x2b9/0x480 fs/open.c:1149
                      do_sys_open+0x85/0xd0 fs/open.c:1165
                      do_syscall_64+0xc6/0x5e0 arch/x86/entry/common.c:294
                      entry_SYSCALL_64_after_hwframe+0x49/0xbe
    SOFTIRQ-ON-W at:
                      lock_acquire+0x19b/0x420 kernel/locking/lockdep.c:4484
                      __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                      _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
                      spin_lock include/linux/spinlock.h:338 [inline]
                      proc_pid_make_inode+0x1c0/0x390 fs/proc/base.c:1880
                      proc_pid_instantiate+0x40/0x140 fs/proc/base.c:3285
                      proc_pid_lookup+0x134/0x240 fs/proc/base.c:3320
                      proc_root_lookup+0x16/0x40 fs/proc/root.c:243
                      __lookup_slow+0x204/0x3d0 fs/namei.c:1757
                      lookup_slow fs/namei.c:1774 [inline]
                      walk_component+0x684/0xef0 fs/namei.c:1915
                      link_path_walk.part.40+0x3c4/0xdc0 fs/namei.c:2238
                      link_path_walk fs/namei.c:2172 [inline]
                      path_openat+0x194/0x2aa0 fs/namei.c:3606
                      do_filp_open+0x171/0x240 fs/namei.c:3637
                      do_sys_openat2+0x2b9/0x480 fs/open.c:1149
                      do_sys_open+0x85/0xd0 fs/open.c:1165
                      do_syscall_64+0xc6/0x5e0 arch/x86/entry/common.c:294
                      entry_SYSCALL_64_after_hwframe+0x49/0xbe
    INITIAL USE at:
                      lock_acquire+0x19b/0x420 kernel/locking/lockdep.c:4484
                      __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
                      _raw_spin_lock_irqsave+0x95/0xc0 kernel/locking/spinlock.c:159
                      __wake_up_common_lock+0xa8/0x120 kernel/sched/wait.c:122
                      do_notify_pidfd kernel/signal.c:1895 [inline]
                      do_notify_parent+0x14c/0xbb0 kernel/signal.c:1922
                      exit_notify kernel/exit.c:668 [inline]
                      do_exit+0x206f/0x2a10 kernel/exit.c:824
                      call_usermodehelper_exec_async+0x47f/0x680 kernel/umh.c:125
                      ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
  }
  ... key at: [] __key.53243+0x0/0x40
  ... acquired at:
   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
   _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
   spin_lock include/linux/spinlock.h:338 [inline]
   aio_poll fs/aio.c:1767 [inline]
   __io_submit_one fs/aio.c:1841 [inline]
   io_submit_one+0x97f/0x2b00 fs/aio.c:1878
   __do_sys_io_submit fs/aio.c:1937 [inline]
   __se_sys_io_submit fs/aio.c:1907 [inline]
   __x64_sys_io_submit+0x166/0x3d0 fs/aio.c:1907
   do_syscall_64+0xc6/0x5e0 arch/x86/entry/common.c:294
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

 -> (&(&ctx->ctx_lock)->rlock){..-.} {
    IN-SOFTIRQ-W at:
                      lock_acquire+0x19b/0x420 kernel/locking/lockdep.c:4484
                      __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
                      _raw_spin_lock_irq+0x5e/0x80 kernel/locking/spinlock.c:167
                      spin_lock_irq include/linux/spinlock.h:363 [inline]
                      free_ioctx_users+0x2b/0x3e0 fs/aio.c:618
                      percpu_ref_put_many include/linux/percpu-refcount.h:309 [inline]
                      percpu_ref_put include/linux/percpu-refcount.h:325 [inline]
                      percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline]
                      percpu_ref_switch_to_atomic_rcu+0x387/0x450 lib/percpu-refcount.c:165
                      rcu_do_batch kernel/rcu/tree.c:2186 [inline]
                      rcu_core+0x584/0x1290 kernel/rcu/tree.c:2410
                      __do_softirq+0x26e/0x9b2 kernel/softirq.c:292
                      run_ksoftirqd+0x8f/0x100 kernel/softirq.c:603
                      smpboot_thread_fn+0x511/0x850 kernel/smpboot.c:165
                      kthread+0x31d/0x3e0 kernel/kthread.c:255
                      ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
    INITIAL USE at:
                      lock_acquire+0x19b/0x420 kernel/locking/lockdep.c:4484
                      __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
                      _raw_spin_lock_irq+0x5e/0x80 kernel/locking/spinlock.c:167
                      spin_lock_irq include/linux/spinlock.h:363 [inline]
                      aio_poll fs/aio.c:1765 [inline]
                      __io_submit_one fs/aio.c:1841 [inline]
                      io_submit_one+0x93f/0x2b00 fs/aio.c:1878
                      __do_sys_io_submit fs/aio.c:1937 [inline]
                      __se_sys_io_submit fs/aio.c:1907 [inline]
                      __x64_sys_io_submit+0x166/0x3d0 fs/aio.c:1907
                      do_syscall_64+0xc6/0x5e0 arch/x86/entry/common.c:294
                      entry_SYSCALL_64_after_hwframe+0x49/0xbe
  }
  ... key at: [] __key.55053+0x0/0x40
  ... acquired at:
   mark_lock_irq kernel/locking/lockdep.c:3316 [inline]
   mark_lock+0x501/0x11a0 kernel/locking/lockdep.c:3665
   mark_usage kernel/locking/lockdep.c:3565 [inline]
   __lock_acquire+0x13ff/0x4370 kernel/locking/lockdep.c:3908
   lock_acquire+0x19b/0x420 kernel/locking/lockdep.c:4484
   __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
   _raw_spin_lock_irq+0x5e/0x80 kernel/locking/spinlock.c:167
   spin_lock_irq include/linux/spinlock.h:363 [inline]
   free_ioctx_users+0x2b/0x3e0 fs/aio.c:618
   percpu_ref_put_many include/linux/percpu-refcount.h:309 [inline]
   percpu_ref_put include/linux/percpu-refcount.h:325 [inline]
   percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline]
   percpu_ref_switch_to_atomic_rcu+0x387/0x450 lib/percpu-refcount.c:165
   rcu_do_batch kernel/rcu/tree.c:2186 [inline]
   rcu_core+0x584/0x1290 kernel/rcu/tree.c:2410
   __do_softirq+0x26e/0x9b2 kernel/softirq.c:292
   run_ksoftirqd+0x8f/0x100 kernel/softirq.c:603
   smpboot_thread_fn+0x511/0x850 kernel/smpboot.c:165
   kthread+0x31d/0x3e0 kernel/kthread.c:255
   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

stack backtrace:
CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x128/0x182 lib/dump_stack.c:118
 print_irq_inversion_bug kernel/locking/lockdep.c:3179 [inline]
 check_usage_forwards.cold.61+0x20/0x29 kernel/locking/lockdep.c:3203
 mark_lock_irq kernel/locking/lockdep.c:3316 [inline]
 mark_lock+0x501/0x11a0 kernel/locking/lockdep.c:3665
 mark_usage kernel/locking/lockdep.c:3565 [inline]
 __lock_acquire+0x13ff/0x4370 kernel/locking/lockdep.c:3908
 lock_acquire+0x19b/0x420 kernel/locking/lockdep.c:4484
 __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
 _raw_spin_lock_irq+0x5e/0x80 kernel/locking/spinlock.c:167
 spin_lock_irq include/linux/spinlock.h:363 [inline]
 free_ioctx_users+0x2b/0x3e0 fs/aio.c:618
 percpu_ref_put_many include/linux/percpu-refcount.h:309 [inline]
 percpu_ref_put include/linux/percpu-refcount.h:325 [inline]
 percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline]
 percpu_ref_switch_to_atomic_rcu+0x387/0x450 lib/percpu-refcount.c:165
 rcu_do_batch kernel/rcu/tree.c:2186 [inline]
 rcu_core+0x584/0x1290 kernel/rcu/tree.c:2410
 __do_softirq+0x26e/0x9b2 kernel/softirq.c:292
 run_ksoftirqd+0x8f/0x100 kernel/softirq.c:603
 smpboot_thread_fn+0x511/0x850 kernel/smpboot.c:165
 kthread+0x31d/0x3e0 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
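Read end to end, the report boils down to an AB-BA ordering conflict with an interrupt twist: the culprit commit takes &pid->wait_pidfd with a plain spin_lock() in proc_pid_make_inode(), the aio_poll() path takes the same waitqueue lock while already holding &ctx->ctx_lock with interrupts disabled, and free_ioctx_users() takes &ctx->ctx_lock from softirq context. So a CPU holding wait_pidfd with interrupts enabled can be interrupted by a softirq that needs ctx_lock, while another CPU holds ctx_lock and spins on wait_pidfd. The sketch below is a minimal userspace analogue of the ordering conflict only (the softirq dimension cannot be reproduced outside the kernel); the lock and function names are stand-ins, not kernel APIs.

/* Userspace analogue of the AB-BA conflict: "wait_pidfd" plays
 * &pid->wait_pidfd, "ctx_lock" plays &ctx->ctx_lock.  Illustration only. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t wait_pidfd = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t ctx_lock   = PTHREAD_MUTEX_INITIALIZER;

/* aio_poll() on a pidfd: ctx_lock first, then (via the waitqueue head)
 * wait_pidfd -- ordering ctx_lock -> wait_pidfd. */
static void *aio_poll_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&ctx_lock);
	usleep(1000);                       /* widen the race window */
	pthread_mutex_lock(&wait_pidfd);
	puts("aio_poll path: took ctx_lock then wait_pidfd");
	pthread_mutex_unlock(&wait_pidfd);
	pthread_mutex_unlock(&ctx_lock);
	return NULL;
}

/* proc_pid_make_inode(): wait_pidfd first; in the kernel the reverse
 * acquisition of ctx_lock happens in a softirq that interrupts this CPU,
 * here a second lock call on the same thread stands in for it. */
static void *proc_and_softirq_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&wait_pidfd);
	usleep(1000);
	pthread_mutex_lock(&ctx_lock);      /* the "softirq" wanting ctx_lock */
	puts("proc path: took wait_pidfd then ctx_lock");
	pthread_mutex_unlock(&ctx_lock);
	pthread_mutex_unlock(&wait_pidfd);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, aio_poll_path, NULL);
	pthread_create(&b, NULL, proc_and_softirq_path, NULL);
	pthread_join(a, NULL);              /* with unlucky timing, neither thread ever finishes */
	pthread_join(b, NULL);
	return 0;
}

As far as I can tell, the upstream follow-up was to give struct pid its own dedicated lock instead of reusing wait_pidfd.lock, which removes the waitqueue lock from the proc path entirely; verify against the kernel history before relying on that.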