ci starts bisection 2024-04-23 17:22:02.06732298 +0000 UTC m=+81742.352678330
bisecting cause commit starting from 4d2008430ce87061c9cefd4f83daf2d5bb323a96
building syzkaller on 21339d7b9986698282dce93709157dc36907fbf8
ensuring issue is reproducible on original commit 4d2008430ce87061c9cefd4f83daf2d5bb323a96
testing commit 4d2008430ce87061c9cefd4f83daf2d5bb323a96 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 931c001d051533c52ec26bde5afd98317124d0daae37ca6a7e79df139e3a2384
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
check whether we can drop unnecessary instrumentation
disabling configs for [BUG KASAN ATOMIC_SLEEP HANG LEAK UBSAN], they are not needed
testing commit 4d2008430ce87061c9cefd4f83daf2d5bb323a96 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: fed69c82c68b0235bee801cd42c9b631192c19fbd1dcee0cdc3a042b463ba8de
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
the bug reproduces without the instrumentation
disabling configs for [UBSAN BUG KASAN ATOMIC_SLEEP HANG LEAK], they are not needed
kconfig minimization: base=3976 full=8005 leaves diff=2012
split chunks (needed=false): <2012>
split chunk #0 of len 2012 into 5 parts
testing without sub-chunk 1/5
disabling configs for [HANG LEAK UBSAN BUG KASAN ATOMIC_SLEEP], they are not needed
testing commit 4d2008430ce87061c9cefd4f83daf2d5bb323a96 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: d5063cb038b6e1debcb075ffd10d6097357f63ba9e38838d28bbfc843bf57189
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
the chunk can be dropped
testing without sub-chunk 2/5
disabling configs for [ATOMIC_SLEEP HANG LEAK UBSAN BUG KASAN], they are not needed
testing commit 4d2008430ce87061c9cefd4f83daf2d5bb323a96 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 8ab0c9a87ae9c5ae67ac67d4deef008a2849f9e9e978a6cb3560897a53e2eb65
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
the chunk can be dropped
testing without sub-chunk 3/5
disabling configs for [BUG KASAN ATOMIC_SLEEP HANG LEAK UBSAN], they are not needed
testing commit 4d2008430ce87061c9cefd4f83daf2d5bb323a96 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 1e77d39bc6e53edc91364e8cde5f0b6b499e15724d2fbaef9fbc6686cc5b81a5
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
the chunk can be dropped
testing without sub-chunk 4/5
disabling configs for [ATOMIC_SLEEP HANG LEAK UBSAN BUG KASAN], they are not needed
testing commit 4d2008430ce87061c9cefd4f83daf2d5bb323a96 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 383141676e62068f430b02a71f625a57be37f9f13e1ce4cd1c997a41860f001e
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
the chunk can be dropped
testing without sub-chunk 5/5
disabling configs for [ATOMIC_SLEEP HANG LEAK UBSAN BUG KASAN], they are not needed
testing commit 4d2008430ce87061c9cefd4f83daf2d5bb323a96 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 754426fbbb5f3fd5a1a0ce8a74f7a98dd87c4944bf940c35fa8a4716cf7837ff
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
the chunk can be dropped
disabling configs for [ATOMIC_SLEEP HANG LEAK UBSAN BUG KASAN], they are not needed
picked [v6.8 v6.7 v6.6 v6.4 v6.2 v6.0 v5.18 v5.16 v5.13 v5.10 v5.7 v5.4 v5.1 v4.19] out of 31 release tags
testing release v6.8
testing commit e8f897f4afef0031fe618a8e94127a0934896aba gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 195dd70b41939f5fcec82a336cdda0adb26b20d28db4e7d047377e211d8227c7
all runs: OK
false negative chance: 0.000
# git bisect start 4d2008430ce87061c9cefd4f83daf2d5bb323a96 e8f897f4afef0031fe618a8e94127a0934896aba
Bisecting: 6731 revisions left to test after this (roughly 13 steps)
[480e035fc4c714fb5536e64ab9db04fedc89e910] Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
testing commit 480e035fc4c714fb5536e64ab9db04fedc89e910 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 0da60916b4353452846a23491895b4d3c0d3beaeee7d6bb310087622638e9f50
all runs: OK
false negative chance: 0.000
# git bisect good 480e035fc4c714fb5536e64ab9db04fedc89e910
Bisecting: 3374 revisions left to test after this (roughly 12 steps)
[9843231c97267d72be38a0409f5097987bc2cfa4] x86/boot/64: Move 5-level paging global variable assignments back
testing commit 9843231c97267d72be38a0409f5097987bc2cfa4 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 86c97e0f789807f7217cdef97c97b7c56461b76d522c1009f7ee4cfe7eba1ea9
all runs: OK
false negative chance: 0.000
# git bisect good 9843231c97267d72be38a0409f5097987bc2cfa4
Bisecting: 1685 revisions left to test after this (roughly 11 steps)
[c150b809f7de2afdd3fb5a9adff2a9a68d7331ce] Merge tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
testing commit c150b809f7de2afdd3fb5a9adff2a9a68d7331ce gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: b8c9eac8e9e1cf125b67946d2697a7f905cd8cc92f8848d85b4b97466a3251fa
all runs: OK
false negative chance: 0.000
# git bisect good c150b809f7de2afdd3fb5a9adff2a9a68d7331ce
Bisecting: 844 revisions left to test after this (roughly 10 steps)
[c7830236d58e9e982f3e180f054cfbc14788beca] Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
testing commit c7830236d58e9e982f3e180f054cfbc14788beca gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 8771d5cfd81e141adf9b9795f8f9d8516e98d08b1273a3613a0c60f7bcb7b473
all runs: OK
false negative chance: 0.000
# git bisect good c7830236d58e9e982f3e180f054cfbc14788beca
Bisecting: 423 revisions left to test after this (roughly 9 steps)
[c7c4e1304c2ef69fc2b75b39e681d2c0cb9f1d55] Merge tag 'iommu-fixes-v6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
testing commit c7c4e1304c2ef69fc2b75b39e681d2c0cb9f1d55 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 3fa8d27f683ad6f2f6959ab16c8629c8e04fa7719820388ca01248c6d549e82e
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
# git bisect bad c7c4e1304c2ef69fc2b75b39e681d2c0cb9f1d55
Bisecting: 211 revisions left to test after this (roughly 8 steps)
[e1dc191dbf3f35cf07790b52110267bef55515a2] Merge tag 'bcachefs-2024-04-10' of https://evilpiepirate.org/git/bcachefs
testing commit e1dc191dbf3f35cf07790b52110267bef55515a2 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: afc84f7c00eb4954a85238eee6662450395afe843b698555a24d53716f4f3cd1
all runs: OK
false negative chance: 0.000
# git bisect good e1dc191dbf3f35cf07790b52110267bef55515a2
Bisecting: 104 revisions left to test after this (roughly 7 steps)
[586b5dfb51b962c1b6c06495715e4c4f76a7fc5a] Merge tag 'cxl-fixes-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
testing commit 586b5dfb51b962c1b6c06495715e4c4f76a7fc5a gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 33d373d7edd0cc42d294a5c0a94bffb7ab3208cc236441d2cc164ab4b37a7c46
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
# git bisect bad 586b5dfb51b962c1b6c06495715e4c4f76a7fc5a
Bisecting: 55 revisions left to test after this (roughly 6 steps)
[47d8ac011fe1c9251070e1bd64cb10b48193ec51] af_unix: Fix garbage collector racing against connect()
testing commit 47d8ac011fe1c9251070e1bd64cb10b48193ec51 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 2fd2a80e608c33c31ba1e3612c87140b0028662ed7e74900448cfe196ecc6d31
all runs: crashed: possible deadlock in __unix_gc
representative crash: possible deadlock in __unix_gc, types: [LOCKDEP]
# git bisect bad 47d8ac011fe1c9251070e1bd64cb10b48193ec51
Bisecting: 25 revisions left to test after this (roughly 5 steps)
[6309863b31dd80317cd7d6824820b44e254e2a9c] net: add copy_safe_from_sockptr() helper
testing commit 6309863b31dd80317cd7d6824820b44e254e2a9c gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 54ed69128b56551ab2f8b883ec36414c8ff1afb7794e54c5eb0c30d629b80157
all runs: OK
false negative chance: 0.000
# git bisect good 6309863b31dd80317cd7d6824820b44e254e2a9c
Bisecting: 12 revisions left to test after this (roughly 4 steps)
[7c6782ad4911cbee874e85630226ed389ff2e453] net/mlx5: Properly link new fs rules into the tree
testing commit 7c6782ad4911cbee874e85630226ed389ff2e453 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 4cd990e32a671cd4d28b45f9012bc6be50e949edd3e4801631fddbec18e675dd
all runs: OK
false negative chance: 0.000
# git bisect good 7c6782ad4911cbee874e85630226ed389ff2e453
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[49e6c9387051716169ff6a6c5ddd4d9f358db2e9] net/mlx5e: RSS, Block XOR hash with over 128 channels
testing commit 49e6c9387051716169ff6a6c5ddd4d9f358db2e9 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 876f97e622d3322f09b2a986f7350b4f1192c60659c5388a5220e343e1b1a3d4
all runs: OK
false negative chance: 0.000
# git bisect good 49e6c9387051716169ff6a6c5ddd4d9f358db2e9
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[fe87922cee6161f066f4b9dd542033e048eeedaf] net/mlx5: fix possible stack overflows
testing commit fe87922cee6161f066f4b9dd542033e048eeedaf gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: c7906792d5d543e92bfaebeea875b0dda077dbf61dbfa2c012d66b374ec162b9
all runs: OK
false negative chance: 0.000
# git bisect good fe87922cee6161f066f4b9dd542033e048eeedaf
Bisecting: 1 revision left to test after this (roughly 1 step)
[d51dc8dd6ab6f93a894ff8b38d3b8d02c98eb9fb] Revert "s390/ism: fix receive message buffer allocation"
testing commit d51dc8dd6ab6f93a894ff8b38d3b8d02c98eb9fb gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 2fdc35c32307406773dda234073a6691020c05fbaae8a148a9ca51abaf627111
all runs: OK
false negative chance: 0.000
# git bisect good d51dc8dd6ab6f93a894ff8b38d3b8d02c98eb9fb
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[17c560113231ddc20088553c7b499b289b664311] net: dsa: mt7530: trap link-local frames regardless of ST Port State
testing commit 17c560113231ddc20088553c7b499b289b664311 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: c9c91192e8ff9664c743e504f0ff29a99d061f90321c0eb30df2248637543135
all runs: OK
false negative chance: 0.000
# git bisect good 17c560113231ddc20088553c7b499b289b664311
47d8ac011fe1c9251070e1bd64cb10b48193ec51 is the first bad commit
commit 47d8ac011fe1c9251070e1bd64cb10b48193ec51
Author: Michal Luczaj
Date:   Tue Apr 9 22:09:39 2024 +0200

    af_unix: Fix garbage collector racing against connect()

    Garbage collector does not take into account the risk of embryo getting
    enqueued during the garbage collection.
    If such embryo has a peer that carries SCM_RIGHTS, two consecutive passes
    of scan_children() may see a different set of children. Leading to an
    incorrectly elevated inflight count, and then a dangling pointer within
    the gc_inflight_list.

    sockets are AF_UNIX/SOCK_STREAM
    S is an unconnected socket
    L is a listening in-flight socket bound to addr, not in fdtable
    V's fd will be passed via sendmsg(), gets inflight count bumped

    connect(S, addr)            sendmsg(S, [V]); close(V)       __unix_gc()
    ----------------            -------------------------       -----------

    NS = unix_create1()
    skb1 = sock_wmalloc(NS)
    L = unix_find_other(addr)
    unix_state_lock(L)
    unix_peer(S) = NS
                                // V count=1 inflight=0

                                NS = unix_peer(S)
                                skb2 = sock_alloc()
                                skb_queue_tail(NS, skb2[V])

                                // V became in-flight
                                // V count=2 inflight=1

                                close(V)

                                // V count=1 inflight=1
                                // GC candidate condition met

                                                                for u in gc_inflight_list:
                                                                  if (total_refs == inflight_refs)
                                                                    add u to gc_candidates

                                                                // gc_candidates={L, V}

                                                                for u in gc_candidates:
                                                                  scan_children(u, dec_inflight)

                                                                // embryo (skb1) was not
                                                                // reachable from L yet, so V's
                                                                // inflight remains unchanged
    __skb_queue_tail(L, skb1)
    unix_state_unlock(L)
                                                                for u in gc_candidates:
                                                                  if (u.inflight)
                                                                    scan_children(u, inc_inflight_move_tail)

                                                                // V count=1 inflight=2 (!)

    If there is a GC-candidate listening socket, lock/unlock its state. This
    makes GC wait until the end of any ongoing connect() to that socket. After
    flipping the lock, a possibly SCM-laden embryo is already enqueued. And if
    there is another embryo coming, it can not possibly carry SCM_RIGHTS. At
    this point, unix_inflight() can not happen because unix_gc_lock is already
    taken. Inflight graph remains unaffected.
    Fixes: 1fd05ba5a2f2 ("[AF_UNIX]: Rewrite garbage collector, fixes race.")
    Signed-off-by: Michal Luczaj
    Reviewed-by: Kuniyuki Iwashima
    Link: https://lore.kernel.org/r/20240409201047.1032217-1-mhal@rbox.co
    Signed-off-by: Paolo Abeni

 net/unix/garbage.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

accumulated error probability: 0.00
culprit signature: 2fd2a80e608c33c31ba1e3612c87140b0028662ed7e74900448cfe196ecc6d31
parent signature: c9c91192e8ff9664c743e504f0ff29a99d061f90321c0eb30df2248637543135
revisions tested: 22, total time: 8h38m38.605497544s (build: 4h0m53.885971895s, test: 4h15m55.738719551s)
first bad commit: 47d8ac011fe1c9251070e1bd64cb10b48193ec51 af_unix: Fix garbage collector racing against connect()
recipients (to): ["kuniyu@amazon.com" "mhal@rbox.co" "pabeni@redhat.com"]
recipients (cc): []
crash: possible deadlock in __unix_gc
======================================================
WARNING: possible circular locking dependency detected
6.9.0-rc2-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u8:0/10 is trying to acquire lock:
ffff88810136f6e0 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffff88810136f6e0 (&u->lock){+.+.}-{2:2}, at: __unix_gc+0x158/0x470 net/unix/garbage.c:302

but task is already holding lock:
ffffffff82872678 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffffffff82872678 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0xb2/0x470 net/unix/garbage.c:261

which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:

-> #1 (unix_gc_lock){+.+.}-{2:2}:
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       unix_notinflight+0x53/0xd0 net/unix/garbage.c:140
       unix_detach_fds net/unix/af_unix.c:1819 [inline]
       unix_destruct_scm+0x7c/0xe0 net/unix/af_unix.c:1876
       skb_release_head_state+0x3e/0x90 net/core/skbuff.c:1188
       skb_release_all net/core/skbuff.c:1200 [inline]
       __kfree_skb+0xd/0xa0 net/core/skbuff.c:1216
       kfree_skb include/linux/skbuff.h:1262 [inline]
       manage_oob net/unix/af_unix.c:2670 [inline]
       unix_stream_read_generic+0x224/0x930 net/unix/af_unix.c:2746
       unix_stream_splice_read+0x73/0xa0 net/unix/af_unix.c:2981
       do_splice_read fs/splice.c:985 [inline]
       splice_file_to_pipe+0x110/0x220 fs/splice.c:1295
       do_splice+0x73d/0x7b0 fs/splice.c:1379
       __do_splice fs/splice.c:1436 [inline]
       __do_sys_splice fs/splice.c:1652 [inline]
       __se_sys_splice+0x18f/0x240 fs/splice.c:1634
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xa8/0x190 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x72/0x7a

-> #0 (&u->lock){+.+.}-{2:2}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain kernel/locking/lockdep.c:3869 [inline]
       __lock_acquire+0x11fe/0x2490 kernel/locking/lockdep.c:5137
       lock_acquire+0xeb/0x270 kernel/locking/lockdep.c:5754
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       __unix_gc+0x158/0x470 net/unix/garbage.c:302
       process_one_work kernel/workqueue.c:3254 [inline]
       process_scheduled_works+0x2a3/0x5b0 kernel/workqueue.c:3335
       worker_thread+0x23e/0x300 kernel/workqueue.c:3416
       kthread+0xea/0x100 kernel/kthread.c:388
       ret_from_fork+0x32/0x40 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(unix_gc_lock);
                               lock(&u->lock);
                               lock(unix_gc_lock);
  lock(&u->lock);

 *** DEADLOCK ***

3 locks held by kworker/u8:0/10:
 #0: ffff88810007c948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline]
 #0: ffff88810007c948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x23e/0x5b0 kernel/workqueue.c:3335
 #1: ffffc9000005be58 (unix_gc_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline]
 #1: ffffc9000005be58 (unix_gc_work){+.+.}-{0:0}, at: process_scheduled_works+0x25e/0x5b0 kernel/workqueue.c:3335
 #2: ffffffff82872678 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
 #2: ffffffff82872678 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0xb2/0x470 net/unix/garbage.c:261

stack backtrace:
CPU: 1 PID: 10 Comm: kworker/u8:0 Not tainted 6.9.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: events_unbound __unix_gc
Call Trace:
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xa3/0x100 lib/dump_stack.c:114
 check_noncircular+0x119/0x140 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain kernel/locking/lockdep.c:3869 [inline]
 __lock_acquire+0x11fe/0x2490 kernel/locking/lockdep.c:5137
 lock_acquire+0xeb/0x270 kernel/locking/lockdep.c:5754
 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
 _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
 spin_lock include/linux/spinlock.h:351 [inline]
 __unix_gc+0x158/0x470 net/unix/garbage.c:302
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0x2a3/0x5b0 kernel/workqueue.c:3335
 worker_thread+0x23e/0x300 kernel/workqueue.c:3416
 kthread+0xea/0x100 kernel/kthread.c:388
 ret_from_fork+0x32/0x40 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
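
The report above describes a lock order inversion: one path takes &u->lock and then unix_gc_lock, while __unix_gc takes unix_gc_lock and then &u->lock. The userspace sketch below illustrates the general idea behind such a check: record "B was taken while holding A" edges and reject any new edge that would close a cycle. It is a simplified illustration under stated assumptions, not lockdep's actual implementation; the enum lock_class values and the functions reachable() and record_acquire() are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { UNIX_GC_LOCK, U_LOCK, NR_CLASSES };

/* dep[a][b] is true once some path has taken b while holding a. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is `to` reachable from `from` via recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to)
{
    bool seen[NR_CLASSES] = { false };
    enum lock_class stack[NR_CLASSES];
    size_t top = 0;

    stack[top++] = from;
    seen[from] = true;
    while (top > 0) {
        enum lock_class cur = stack[--top];

        if (cur == to)
            return true;
        for (int next = 0; next < NR_CLASSES; next++) {
            if (dep[cur][next] && !seen[next]) {
                /* Each class is pushed at most once, so this holds. */
                assert(top < NR_CLASSES);
                seen[next] = true;
                stack[top++] = (enum lock_class)next;
            }
        }
    }
    return false;
}

/*
 * Record "acquire `next` while holding `held`".
 * Returns false, and records nothing, if the new edge would close a cycle,
 * i.e. if `held` is already reachable from `next`.
 */
static bool record_acquire(enum lock_class held, enum lock_class next)
{
    if (reachable(next, held))
        return false;
    dep[held][next] = true;
    return true;
}
```

Calling record_acquire(U_LOCK, UNIX_GC_LOCK) first (the order seen in chain #1) and then record_acquire(UNIX_GC_LOCK, U_LOCK) (the order seen in chain #0) makes the second call return false, which corresponds to the two-CPU scenario printed in the report.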