syzbot


possible deadlock in try_to_wake_up (5)

Status: upstream: reported on 2024/05/30 18:36
Subsystems: mm
Reported-by: syzbot+4970d08867f5a5b0bb78@syzkaller.appspotmail.com
First crash: 30d, last: 3d06h
Discussions (1)
Title | Replies (including bot) | Last reply
[syzbot] [mm?] possible deadlock in try_to_wake_up (5) | 0 (1) | 2024/05/30 18:36
Similar bugs (7)
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
linux-6.1 possible deadlock in try_to_wake_up (2) origin:upstream C 13 7d22h 47d 0/3 upstream: reported C repro on 2024/05/09 05:15
linux-6.1 possible deadlock in try_to_wake_up C done 1 90d 90d 3/3 fixed on 2024/04/29 07:11
linux-5.15 possible deadlock in try_to_wake_up origin:upstream C 7 10d 81d 0/3 upstream: reported C repro on 2024/04/05 17:03
upstream possible deadlock in try_to_wake_up (2) mm 1 646d 642d 0/27 auto-obsoleted due to no activity on 2023/01/16 12:10
upstream possible deadlock in try_to_wake_up (4) bpf net C error 19 36d 99d 26/27 fixed on 2024/05/22 23:36
upstream possible deadlock in try_to_wake_up (3) net 103 252d 261d 0/27 auto-obsoleted due to no activity on 2023/11/27 02:05
upstream possible deadlock in try_to_wake_up mm 39 1989d 2021d 0/27 auto-closed as invalid on 2019/07/13 09:55

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc2-syzkaller-00007-gf06ce441457d #0 Not tainted
------------------------------------------------------
syz-executor.1/7437 is trying to acquire lock:
ffff88801bba8a18 (&p->pi_lock){-.-.}-{2:2}, at: class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:553 [inline]
ffff88801bba8a18 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0x9a/0x13e0 kernel/sched/core.c:4262

but task is already holding lock:
ffff88807ffdbce0 (&pgdat->kswapd_wait){....}-{2:2}, at: __wake_up_common_lock kernel/sched/wait.c:105 [inline]
ffff88807ffdbce0 (&pgdat->kswapd_wait){....}-{2:2}, at: __wake_up+0x1c/0x60 kernel/sched/wait.c:127

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (&pgdat->kswapd_wait){....}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3a/0x60 kernel/locking/spinlock.c:162
       __wake_up_common_lock kernel/sched/wait.c:105 [inline]
       __wake_up+0x1c/0x60 kernel/sched/wait.c:127
       wakeup_kswapd+0x45e/0x640 mm/vmscan.c:7240
       rmqueue mm/page_alloc.c:3004 [inline]
       get_page_from_freelist+0x9b0/0x2df0 mm/page_alloc.c:3399
       __alloc_pages_noprof+0x22b/0x2460 mm/page_alloc.c:4660
       __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
       alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
       __kmalloc_large_node+0x7f/0x1a0 mm/slub.c:4066
       __do_kmalloc_node mm/slub.c:4109 [inline]
       __kmalloc_node_noprof.cold+0x5/0x5f mm/slub.c:4128
       kmalloc_node_noprof include/linux/slab.h:681 [inline]
       bpf_map_kmalloc_node+0x98/0x4a0 kernel/bpf/syscall.c:422
       lpm_trie_node_alloc kernel/bpf/lpm_trie.c:299 [inline]
       trie_update_elem+0x1ef/0xe00 kernel/bpf/lpm_trie.c:342
       bpf_map_update_value+0x2c1/0x6c0 kernel/bpf/syscall.c:203
       map_update_elem+0x623/0x910 kernel/bpf/syscall.c:1654
       __sys_bpf+0x90c/0x49a0 kernel/bpf/syscall.c:5675
       __do_sys_bpf kernel/bpf/syscall.c:5794 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:5792 [inline]
       __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5792
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #2 (&trie->lock){..-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3a/0x60 kernel/locking/spinlock.c:162
       trie_delete_elem+0xb0/0x820 kernel/bpf/lpm_trie.c:462
       ___bpf_prog_run+0x3e51/0xabd0 kernel/bpf/core.c:2012
       __bpf_prog_run32+0xc1/0x100 kernel/bpf/core.c:2253
       bpf_dispatcher_nop_func include/linux/bpf.h:1243 [inline]
       __bpf_prog_run include/linux/filter.h:691 [inline]
       bpf_prog_run include/linux/filter.h:698 [inline]
       __bpf_trace_run kernel/trace/bpf_trace.c:2403 [inline]
       bpf_trace_run4+0x245/0x5a0 kernel/trace/bpf_trace.c:2446
       __bpf_trace_sched_switch+0x13e/0x190 include/trace/events/sched.h:222
       __traceiter_sched_switch+0x6c/0xc0 include/trace/events/sched.h:222
       trace_sched_switch include/trace/events/sched.h:222 [inline]
       __schedule+0x252c/0x5d00 kernel/sched/core.c:6742
       __schedule_loop kernel/sched/core.c:6822 [inline]
       schedule+0xe7/0x350 kernel/sched/core.c:6837
       futex_wait_queue+0xfc/0x1f0 kernel/futex/waitwake.c:370
       __futex_wait+0x291/0x3c0 kernel/futex/waitwake.c:669
       futex_wait+0xe9/0x380 kernel/futex/waitwake.c:697
       do_futex+0x22b/0x350 kernel/futex/syscalls.c:102
       __do_sys_futex kernel/futex/syscalls.c:179 [inline]
       __se_sys_futex kernel/futex/syscalls.c:160 [inline]
       __x64_sys_futex+0x1e1/0x4c0 kernel/futex/syscalls.c:160
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #1 (&rq->__lock){-.-.}-{2:2}:
       _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
       raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:559
       raw_spin_rq_lock kernel/sched/sched.h:1406 [inline]
       rq_lock kernel/sched/sched.h:1702 [inline]
       task_fork_fair+0x73/0x250 kernel/sched/fair.c:12710
       sched_cgroup_fork+0x3cf/0x510 kernel/sched/core.c:4844
       copy_process+0x439b/0x8f10 kernel/fork.c:2499
       kernel_clone+0xfd/0x980 kernel/fork.c:2797
       user_mode_thread+0xb4/0xf0 kernel/fork.c:2875
       rest_init+0x23/0x2b0 init/main.c:712
       start_kernel+0x3df/0x4c0 init/main.c:1103
       x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:507
       x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:488
       common_startup_64+0x13e/0x148

-> #0 (&p->pi_lock){-.-.}-{2:2}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain kernel/locking/lockdep.c:3869 [inline]
       __lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137
       lock_acquire kernel/locking/lockdep.c:5754 [inline]
       lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3a/0x60 kernel/locking/spinlock.c:162
       class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:553 [inline]
       try_to_wake_up+0x9a/0x13e0 kernel/sched/core.c:4262
       autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:384
       __wake_up_common+0x131/0x1e0 kernel/sched/wait.c:89
       __wake_up_common_lock kernel/sched/wait.c:106 [inline]
       __wake_up+0x31/0x60 kernel/sched/wait.c:127
       wakeup_kswapd+0x45e/0x640 mm/vmscan.c:7240
       rmqueue mm/page_alloc.c:3004 [inline]
       get_page_from_freelist+0x9b0/0x2df0 mm/page_alloc.c:3399
       __alloc_pages_noprof+0x22b/0x2460 mm/page_alloc.c:4660
       __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
       alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
       __kmalloc_large_node+0x7f/0x1a0 mm/slub.c:4066
       __do_kmalloc_node mm/slub.c:4109 [inline]
       __kmalloc_node_noprof.cold+0x5/0x5f mm/slub.c:4128
       kmalloc_node_noprof include/linux/slab.h:681 [inline]
       bpf_map_kmalloc_node+0x98/0x4a0 kernel/bpf/syscall.c:422
       lpm_trie_node_alloc kernel/bpf/lpm_trie.c:299 [inline]
       trie_update_elem+0x1ef/0xe00 kernel/bpf/lpm_trie.c:342
       bpf_map_update_value+0x2c1/0x6c0 kernel/bpf/syscall.c:203
       map_update_elem+0x623/0x910 kernel/bpf/syscall.c:1654
       __sys_bpf+0x90c/0x49a0 kernel/bpf/syscall.c:5675
       __do_sys_bpf kernel/bpf/syscall.c:5794 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:5792 [inline]
       __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5792
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
  &p->pi_lock --> &trie->lock --> &pgdat->kswapd_wait

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&pgdat->kswapd_wait);
                               lock(&trie->lock);
                               lock(&pgdat->kswapd_wait);
  lock(&p->pi_lock);

 *** DEADLOCK ***

3 locks held by syz-executor.1/7437:
 #0: ffffffff8dbb18e0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline]
 #0: ffffffff8dbb18e0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:781 [inline]
 #0: ffffffff8dbb18e0 (rcu_read_lock){....}-{1:2}, at: bpf_map_update_value+0x24b/0x6c0 kernel/bpf/syscall.c:202
 #1: ffff88802a3e91f8 (&trie->lock){..-.}-{2:2}, at: trie_update_elem+0xc8/0xe00 kernel/bpf/lpm_trie.c:333
 #2: ffff88807ffdbce0 (&pgdat->kswapd_wait){....}-{2:2}, at: __wake_up_common_lock kernel/sched/wait.c:105 [inline]
 #2: ffff88807ffdbce0 (&pgdat->kswapd_wait){....}-{2:2}, at: __wake_up+0x1c/0x60 kernel/sched/wait.c:127

stack backtrace:
CPU: 1 PID: 7437 Comm: syz-executor.1 Not tainted 6.10.0-rc2-syzkaller-00007-gf06ce441457d #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114
 check_noncircular+0x31a/0x400 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain kernel/locking/lockdep.c:3869 [inline]
 __lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137
 lock_acquire kernel/locking/lockdep.c:5754 [inline]
 lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x3a/0x60 kernel/locking/spinlock.c:162
 class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:553 [inline]
 try_to_wake_up+0x9a/0x13e0 kernel/sched/core.c:4262
 autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:384
 __wake_up_common+0x131/0x1e0 kernel/sched/wait.c:89
 __wake_up_common_lock kernel/sched/wait.c:106 [inline]
 __wake_up+0x31/0x60 kernel/sched/wait.c:127
 wakeup_kswapd+0x45e/0x640 mm/vmscan.c:7240
 rmqueue mm/page_alloc.c:3004 [inline]
 get_page_from_freelist+0x9b0/0x2df0 mm/page_alloc.c:3399
 __alloc_pages_noprof+0x22b/0x2460 mm/page_alloc.c:4660
 __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
 alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
 __kmalloc_large_node+0x7f/0x1a0 mm/slub.c:4066
 __do_kmalloc_node mm/slub.c:4109 [inline]
 __kmalloc_node_noprof.cold+0x5/0x5f mm/slub.c:4128
 kmalloc_node_noprof include/linux/slab.h:681 [inline]
 bpf_map_kmalloc_node+0x98/0x4a0 kernel/bpf/syscall.c:422
 lpm_trie_node_alloc kernel/bpf/lpm_trie.c:299 [inline]
 trie_update_elem+0x1ef/0xe00 kernel/bpf/lpm_trie.c:342
 bpf_map_update_value+0x2c1/0x6c0 kernel/bpf/syscall.c:203
 map_update_elem+0x623/0x910 kernel/bpf/syscall.c:1654
 __sys_bpf+0x90c/0x49a0 kernel/bpf/syscall.c:5675
 __do_sys_bpf kernel/bpf/syscall.c:5794 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5792 [inline]
 __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5792
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f731167cee9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f73123b80c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00007f73117abf80 RCX: 00007f731167cee9
RDX: 0000000000000020 RSI: 0000000020000000 RDI: 0000000000000002
RBP: 00007f73116c949e R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f73117abf80 R15: 00007ffeff22f7a8
 </TASK>
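
The dependency chain in the report above can be read as a directed graph of lock classes, where an edge A --> B means "B was taken while A was held". The sketch below is a simplified model of that reasoning, not lockdep's actual implementation: the edge list is transcribed by hand from entries #1 to #3 of the chain, and the helper names (`find_path`, `closes_cycle`) are made up for illustration. It checks whether taking `&p->pi_lock` while holding `&pgdat->kswapd_wait` would close a cycle.

```python
# Existing lock-order edges, transcribed from chain entries #1..#3 of the report.
# An edge A -> B means "B was acquired while A was held".
EDGES = {
    "&p->pi_lock": ["&rq->__lock"],          # #1: fork path, rq lock under pi_lock
    "&rq->__lock": ["&trie->lock"],          # #2: BPF prog on sched_switch, under rq lock
    "&trie->lock": ["&pgdat->kswapd_wait"],  # #3: kmalloc under trie lock wakes kswapd
    "&pgdat->kswapd_wait": [],
}


def find_path(edges, src, dst):
    """Return a list of lock names leading from src to dst, or None (iterative DFS)."""
    stack = [(src, [src])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in edges.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


def closes_cycle(edges, held, wanted):
    """Taking `wanted` while holding `held` adds the edge held -> wanted.

    That edge closes a cycle exactly when `held` is already reachable
    from `wanted`; return that existing path, or None if it is safe.
    """
    return find_path(edges, wanted, held)


if __name__ == "__main__":
    # Entry #0: try_to_wake_up() wants pi_lock while kswapd_wait is held.
    path = closes_cycle(EDGES, held="&pgdat->kswapd_wait", wanted="&p->pi_lock")
    if path is not None:
        print("cycle: " + " --> ".join(path + [path[0]]))
```

In this model the new edge `&pgdat->kswapd_wait --> &p->pi_lock` closes the loop through `&rq->__lock` and `&trie->lock`. The report's "Chain exists of" line shows the same loop in abbreviated form.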

Crashes (4):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2024/06/03 23:40 | upstream | f06ce441457d | c2e07261 | .config | console log | report | | | info | [disk image (non-bootable)] [vmlinux] [kernel image] | ci-qemu-upstream | possible deadlock in try_to_wake_up
2024/06/22 20:02 | bpf | 36534d3c5453 | edc5149a | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-bpf-kasan-gce | possible deadlock in try_to_wake_up
2024/05/27 03:01 | bpf | 95348e463eab | a10a183e | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-bpf-kasan-gce | possible deadlock in try_to_wake_up
2024/05/26 18:33 | bpf-next | e245ef8a0b06 | a10a183e | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-bpf-next-kasan-gce | possible deadlock in try_to_wake_up
* Struck through repros no longer work on HEAD.