bisecting cause commit starting from afe6fe7036c6efdcb46cabc64bec9b6e4a005210 building syzkaller on 18215b8d64c9bf653619b9eb9907445e225adf11 testing commit afe6fe7036c6efdcb46cabc64bec9b6e4a005210 with gcc (GCC) 8.1.0 all runs: crashed: BUG: MAX_STACK_TRACE_ENTRIES too low! testing release v5.0 testing commit 1c163f4c7b3f621efff9b28a47abb36f7378d783 with gcc (GCC) 8.1.0 all runs: OK # git bisect start afe6fe7036c6efdcb46cabc64bec9b6e4a005210 v5.0 Bisecting: 2359 revisions left to test after this (roughly 11 steps) [8c4238df4d0cc3420c5ee14b54d200d74267cfe5] Merge branch 'mlxsw-minimal-Add-ethtool-and-resource-query-support' testing commit 8c4238df4d0cc3420c5ee14b54d200d74267cfe5 with gcc (GCC) 8.1.0 all runs: OK # git bisect good 8c4238df4d0cc3420c5ee14b54d200d74267cfe5 Bisecting: 1178 revisions left to test after this (roughly 10 steps) [3478588b5136966c80c571cf0006f08e9e5b8f04] Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip testing commit 3478588b5136966c80c571cf0006f08e9e5b8f04 with gcc (GCC) 8.1.0 all runs: crashed: BUG: MAX_STACK_TRACE_ENTRIES too low! # git bisect bad 3478588b5136966c80c571cf0006f08e9e5b8f04 Bisecting: 580 revisions left to test after this (roughly 9 steps) [6456300356433873309a1cae6aa05e77d6b59153] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next testing commit 6456300356433873309a1cae6aa05e77d6b59153 with gcc (GCC) 8.1.0 all runs: OK # git bisect good 6456300356433873309a1cae6aa05e77d6b59153 Bisecting: 310 revisions left to test after this (roughly 8 steps) [8feed3efa8022107bcb3432ac3ec9917e078ae70] Merge branch 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux testing commit 8feed3efa8022107bcb3432ac3ec9917e078ae70 with gcc (GCC) 8.1.0 all runs: OK # git bisect good 8feed3efa8022107bcb3432ac3ec9917e078ae70 Bisecting: 137 revisions left to test after this (roughly 7 steps) [b1b988a6a035212f5ea205155c49ce449beedee8] Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip testing commit b1b988a6a035212f5ea205155c49ce449beedee8 with gcc (GCC) 8.1.0 all runs: OK # git bisect good b1b988a6a035212f5ea205155c49ce449beedee8 Bisecting: 64 revisions left to test after this (roughly 6 steps) [e7ffb4eb9a6d89678e7f62461737899f88dab64e] Merge branches 'doc.2019.01.26a', 'fixes.2019.01.26a', 'sil.2019.01.26a', 'spdx.2019.02.09a', 'srcu.2019.01.26a' and 'torture.2019.01.26a' into HEAD testing commit e7ffb4eb9a6d89678e7f62461737899f88dab64e with gcc (GCC) 8.1.0 all runs: OK # git bisect good e7ffb4eb9a6d89678e7f62461737899f88dab64e Bisecting: 32 revisions left to test after this (roughly 5 steps) [15ea86b58c71d05e0921bebcf707aa30e43e9e25] locking/lockdep: Fix reported required memory size (2/2) testing commit 15ea86b58c71d05e0921bebcf707aa30e43e9e25 with gcc (GCC) 8.1.0 all runs: OK # git bisect good 15ea86b58c71d05e0921bebcf707aa30e43e9e25 Bisecting: 16 revisions left to test after this (roughly 4 steps) [669de8bda87b92ab9a2fc663b3f5743c2ad1ae9f] kernel/workqueue: Use dynamic lockdep keys for workqueues testing commit 669de8bda87b92ab9a2fc663b3f5743c2ad1ae9f with gcc (GCC) 8.1.0 all runs: crashed: WARNING: locking bug in flush_workqueue # git bisect bad 669de8bda87b92ab9a2fc663b3f5743c2ad1ae9f Bisecting: 7 revisions left to test after this (roughly 3 steps) [a0b0fd53e1e67639b303b15939b9c653dbe7a8c4] locking/lockdep: Free lock classes that are no longer in use testing commit a0b0fd53e1e67639b303b15939b9c653dbe7a8c4 with gcc (GCC) 8.1.0 all runs: OK # git bisect good a0b0fd53e1e67639b303b15939b9c653dbe7a8c4 Bisecting: 3 revisions left to test after this (roughly 2 steps) [de4643a77356a77bce73f64275b125b4b71a69cf] locking/lockdep: Reuse lock chains that have been freed testing commit de4643a77356a77bce73f64275b125b4b71a69cf with gcc (GCC) 8.1.0 all runs: OK # git bisect good de4643a77356a77bce73f64275b125b4b71a69cf Bisecting: 1 revision left to test after this (roughly 1 step) [4bf508621855613ca2ac782f70c3171e0e8bb011] locking/lockdep: Verify whether lock objects are small enough to be used as class keys testing commit 4bf508621855613ca2ac782f70c3171e0e8bb011 with gcc (GCC) 8.1.0 all runs: OK # git bisect good 4bf508621855613ca2ac782f70c3171e0e8bb011 Bisecting: 0 revisions left to test after this (roughly 0 steps) [108c14858b9ea224686e476c8f5ec345a0df9e27] locking/lockdep: Add support for dynamic keys testing commit 108c14858b9ea224686e476c8f5ec345a0df9e27 with gcc (GCC) 8.1.0 all runs: OK # git bisect good 108c14858b9ea224686e476c8f5ec345a0df9e27 669de8bda87b92ab9a2fc663b3f5743c2ad1ae9f is the first bad commit commit 669de8bda87b92ab9a2fc663b3f5743c2ad1ae9f Author: Bart Van Assche Date: Thu Feb 14 15:00:54 2019 -0800 kernel/workqueue: Use dynamic lockdep keys for workqueues The following commit: 87915adc3f0a ("workqueue: re-add lockdep dependencies for flushing") improved deadlock checking in the workqueue implementation. Unfortunately that patch also introduced a few false positive lockdep complaints. This patch suppresses these false positives by allocating the workqueue mutex lockdep key dynamically. An example of a false positive lockdep complaint suppressed by this patch can be found below. The root cause of the lockdep complaint shown below is that the direct I/O code can call alloc_workqueue() from inside a work item created by another alloc_workqueue() call and that both workqueues share the same lockdep key. This patch avoids that that lockdep complaint is triggered by allocating the work queue lockdep keys dynamically. In other words, this patch guarantees that a unique lockdep key is associated with each work queue mutex. ====================================================== WARNING: possible circular locking dependency detected 4.19.0-dbg+ #1 Not tainted fio/4129 is trying to acquire lock: 00000000a01cfe1a ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: flush_workqueue+0xd0/0x970 but task is already holding lock: 00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&sb->s_type->i_mutex_key#14){+.+.}: down_write+0x3d/0x80 __generic_file_fsync+0x77/0xf0 ext4_sync_file+0x3c9/0x780 vfs_fsync_range+0x66/0x100 dio_complete+0x2f5/0x360 dio_aio_complete_work+0x1c/0x20 process_one_work+0x481/0x9f0 worker_thread+0x63/0x5a0 kthread+0x1cf/0x1f0 ret_from_fork+0x24/0x30 -> #1 ((work_completion)(&dio->complete_work)){+.+.}: process_one_work+0x447/0x9f0 worker_thread+0x63/0x5a0 kthread+0x1cf/0x1f0 ret_from_fork+0x24/0x30 -> #0 ((wq_completion)"dio/%s"sb->s_id){+.+.}: lock_acquire+0xc5/0x200 flush_workqueue+0xf3/0x970 drain_workqueue+0xec/0x220 destroy_workqueue+0x23/0x350 sb_init_dio_done_wq+0x6a/0x80 do_blockdev_direct_IO+0x1f33/0x4be0 __blockdev_direct_IO+0x79/0x86 ext4_direct_IO+0x5df/0xbb0 generic_file_direct_write+0x119/0x220 __generic_file_write_iter+0x131/0x2d0 ext4_file_write_iter+0x3fa/0x710 aio_write+0x235/0x330 io_submit_one+0x510/0xeb0 __x64_sys_io_submit+0x122/0x340 do_syscall_64+0x71/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: (wq_completion)"dio/%s"sb->s_id --> (work_completion)(&dio->complete_work) --> &sb->s_type->i_mutex_key#14 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#14); lock((work_completion)(&dio->complete_work)); lock(&sb->s_type->i_mutex_key#14); lock((wq_completion)"dio/%s"sb->s_id); *** DEADLOCK *** 1 lock held by fio/4129: #0: 00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710 stack backtrace: CPU: 3 PID: 4129 Comm: fio Not tainted 4.19.0-dbg+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace: dump_stack+0x86/0xc5 print_circular_bug.isra.32+0x20a/0x218 __lock_acquire+0x1c68/0x1cf0 lock_acquire+0xc5/0x200 flush_workqueue+0xf3/0x970 drain_workqueue+0xec/0x220 destroy_workqueue+0x23/0x350 sb_init_dio_done_wq+0x6a/0x80 do_blockdev_direct_IO+0x1f33/0x4be0 __blockdev_direct_IO+0x79/0x86 ext4_direct_IO+0x5df/0xbb0 generic_file_direct_write+0x119/0x220 __generic_file_write_iter+0x131/0x2d0 ext4_file_write_iter+0x3fa/0x710 aio_write+0x235/0x330 io_submit_one+0x510/0xeb0 __x64_sys_io_submit+0x122/0x340 do_syscall_64+0x71/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Bart Van Assche Signed-off-by: Peter Zijlstra (Intel) Cc: Andrew Morton Cc: Johannes Berg Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Tejun Heo Cc: Thomas Gleixner Cc: Waiman Long Cc: Will Deacon Link: https://lkml.kernel.org/r/20190214230058.196511-20-bvanassche@acm.org [ Reworked the changelog a bit. ] Signed-off-by: Ingo Molnar :040000 040000 530930fe3168a62e87f99f11636b73bda9de0499 591356bacb6c4a56275e8c7348f6b072ebdbbafc M include :040000 040000 9b254911561c5975fd2ec055dcd930371e868e45 c68d1c4a4d8242842fbb9a4216de34d52e0dde80 M kernel revisions tested: 14, total time: 3h29m24.902969714s (build: 1h23m9.636205629s, test: 2h3m22.356232811s) first bad commit: 669de8bda87b92ab9a2fc663b3f5743c2ad1ae9f kernel/workqueue: Use dynamic lockdep keys for workqueues cc: ["akpm@linux-foundation.org" "bvanassche@acm.org" "johannes.berg@intel.com" "longman@redhat.com" "mingo@kernel.org" "paulmck@linux.vnet.ibm.com" "peterz@infradead.org" "tglx@linutronix.de" "tj@kernel.org" "torvalds@linux-foundation.org" "will.deacon@arm.com"] crash: WARNING: locking bug in flush_workqueue IPv6: ADDRCONF(NETDEV_CHANGE): veth1_to_hsr: link becomes ready IPv6: ADDRCONF(NETDEV_CHANGE): hsr_slave_1: link becomes ready 8021q: adding VLAN 0 to HW filter on device batadv0 IPv6: ADDRCONF(NETDEV_CHANGE): hsr0: link becomes ready 8021q: adding VLAN 0 to HW filter on device batadv0 WARNING: CPU: 1 PID: 7649 at kernel/locking/lockdep.c:747 arch_local_save_flags arch/x86/include/asm/paravirt.h:761 [inline] WARNING: CPU: 1 PID: 7649 at kernel/locking/lockdep.c:747 look_up_lock_class kernel/locking/lockdep.c:738 [inline] WARNING: CPU: 1 PID: 7649 at kernel/locking/lockdep.c:747 register_lock_class+0x10de/0x2220 kernel/locking/lockdep.c:1060 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 7649 Comm: syz-executor.4 Not tainted 5.0.0-rc8+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x16e/0x1f6 lib/dump_stack.c:113 panic+0x2ad/0x632 kernel/panic.c:214 __warn.cold.8+0x20/0x54 kernel/panic.c:571 report_bug+0x254/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271 do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973 RIP: 0010:look_up_lock_class kernel/locking/lockdep.c:747 [inline] RIP: 0010:register_lock_class+0x10de/0x2220 kernel/locking/lockdep.c:1060 Code: 4c 89 ea 4d 8b 7e c0 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 38 10 00 00 4d 89 7d 00 e9 a2 f2 ff ff 0f 0b <0f> 0b e9 90 f1 ff ff 4c 89 f2 4c 8b 95 60 ff ff ff 48 8b 8d 58 ff RSP: 0018:ffff88801f3d78f8 EFLAGS: 00010002 RAX: 0000000000000004 RBX: ffffffff8a110780 RCX: 1ffff11003e7af29 RDX: ffff88807dc69780 RSI: 1ffffffff14b2b34 RDI: ffff88807be4b408 RBP: ffff88801f3d79d0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000003 R13: ffff88807be4b3f0 R14: 0000000000000000 R15: ffff888076e84f40 __lock_acquire+0xff/0x3a90 kernel/locking/lockdep.c:3550 lock_acquire+0x173/0x3d0 kernel/locking/lockdep.c:4178 flush_workqueue+0x115/0x1530 kernel/workqueue.c:2680 drain_workqueue+0x1ae/0x470 kernel/workqueue.c:2845 destroy_workqueue+0x21/0x700 kernel/workqueue.c:4218 ucma_close+0x276/0x310 drivers/infiniband/core/ucma.c:1779 __fput+0x2cf/0x8b0 fs/file_table.c:278 ____fput+0x15/0x20 fs/file_table.c:309 task_work_run+0x14d/0x1c0 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_usermode_loop+0x271/0x2c0 arch/x86/entry/common.c:166 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline] syscall_return_slowpath arch/x86/entry/common.c:268 [inline] do_syscall_64+0x536/0x600 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4112a1 Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 74 1a 00 00 c3 48 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01 RSP: 002b:00007ffef5f67250 EFLAGS: 00000293 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00000000004112a1 RDX: 0000000000000000 RSI: 00000000007200b8 RDI: 0000000000000003 RBP: 0000000000000001 R08: 000000000001c6a9 R09: 000000000001c6a9 R10: 00007ffef5f67180 R11: 0000000000000293 R12: 0000000000000000 R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000004 Kernel Offset: disabled Rebooting in 86400 seconds..