syzbot


possible deadlock in rcu_report_exp_cpu_mult

Status: upstream: reported C repro on 2024/03/16 22:50
Reported-by: syzbot+3b001e9ea0e979613227@syzkaller.appspotmail.com
First crash: 47d, last: 23d
Bug presence (1)
Date        Name            Commit        Repro  Result
2024/04/28  upstream (ToT)  245c8e81741b  C      Didn't crash
Similar bugs (1)
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
upstream possible deadlock in rcu_report_exp_cpu_mult net bpf C done 30 8d06h 46d 25/26 upstream: reported C repro on 2024/03/18 10:07

Sample crash report:
=====================================================
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
6.1.84-syzkaller #0 Not tainted
-----------------------------------------------------
kworker/0:3/3954 [HC0[0]:SC0[2]:HE0:SE0] is trying to acquire:
ffff88804b81da40 (&stab->lock){+...}-{2:2}, at: __sock_map_delete net/core/sock_map.c:416 [inline]
ffff88804b81da40 (&stab->lock){+...}-{2:2}, at: sock_map_delete_elem+0x97/0x130 net/core/sock_map.c:448

and this task is already holding:
ffffffff8d12f818 (rcu_node_0){-.-.}-{2:2}, at: sync_rcu_exp_done_unlocked+0xe/0x140 kernel/rcu/tree_exp.h:168
which would create a new lock dependency:
 (rcu_node_0){-.-.}-{2:2} -> (&stab->lock){+...}-{2:2}

but this new dependency connects a HARDIRQ-irq-safe lock:
 (rcu_node_0){-.-.}-{2:2}

... which became HARDIRQ-irq-safe at:
  lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
  _raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
  rcu_report_exp_cpu_mult+0x27/0x2e0 kernel/rcu/tree_exp.h:238
  __flush_smp_call_function_queue+0x60c/0xd00 kernel/smp.c:676
  __sysvec_call_function_single+0xbb/0x360 arch/x86/kernel/smp.c:267
  sysvec_call_function_single+0x89/0xb0 arch/x86/kernel/smp.c:262
  asm_sysvec_call_function_single+0x16/0x20 arch/x86/include/asm/idtentry.h:661
  clear_page_erms+0x7/0x10 arch/x86/lib/clear_page_64.S:49
  clear_page arch/x86/include/asm/page_64.h:57 [inline]
  clear_highpage include/linux/highmem.h:242 [inline]
  clear_highpage_kasan_tagged include/linux/highmem.h:252 [inline]
  kernel_init_pages mm/page_alloc.c:1377 [inline]
  post_alloc_hook+0x145/0x1b0 mm/page_alloc.c:2508
  prep_new_page mm/page_alloc.c:2520 [inline]
  get_page_from_freelist+0x31a1/0x3320 mm/page_alloc.c:4279
  __alloc_pages+0x28d/0x770 mm/page_alloc.c:5547
  __alloc_pages_node include/linux/gfp.h:237 [inline]
  alloc_pages_node include/linux/gfp.h:260 [inline]
  alloc_pages_exact_nid+0x115/0x1b9 mm/page_alloc.c:5847
  alloc_page_ext+0x1f/0x48 mm/page_ext.c:294
  init_section_page_ext+0x101/0x15e mm/page_ext.c:317
  page_ext_init+0x5b8/0x782 mm/page_ext.c:511
  kernel_init_freeable+0x450/0x60f init/main.c:1623
  kernel_init+0x19/0x290 init/main.c:1513
  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307

to a HARDIRQ-irq-unsafe lock:
 (&stab->lock){+...}-{2:2}

... which became HARDIRQ-irq-unsafe at:
...
  lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
  __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
  _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
  __sock_map_delete net/core/sock_map.c:416 [inline]
  sock_map_delete_elem+0x97/0x130 net/core/sock_map.c:448
  0xffffffffa0001fe2
  bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
  __bpf_prog_run include/linux/filter.h:603 [inline]
  bpf_prog_run include/linux/filter.h:610 [inline]
  __bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
  bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
  trace_contention_end+0x12f/0x170 include/trace/events/lock.h:122
  __mutex_lock_common kernel/locking/mutex.c:612 [inline]
  __mutex_lock+0x2ed/0xd80 kernel/locking/mutex.c:747
  futex_cleanup_begin kernel/futex/core.c:1076 [inline]
  futex_exit_release+0x30/0x1e0 kernel/futex/core.c:1128
  exit_mm_release+0x16/0x30 kernel/fork.c:1505
  exit_mm+0xa9/0x300 kernel/exit.c:535
  do_exit+0x9f6/0x26a0 kernel/exit.c:856
  do_group_exit+0x202/0x2b0 kernel/exit.c:1019
  __do_sys_exit_group kernel/exit.c:1030 [inline]
  __se_sys_exit_group kernel/exit.c:1028 [inline]
  __x64_sys_exit_group+0x3b/0x40 kernel/exit.c:1028
  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
  do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&stab->lock);
                               local_irq_disable();
                               lock(rcu_node_0);
                               lock(&stab->lock);
  <Interrupt>
    lock(rcu_node_0);

 *** DEADLOCK ***

4 locks held by kworker/0:3/3954:
 #0: ffff888012472138 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x7a9/0x11d0 kernel/workqueue.c:2267
 #1: ffffc90005757d20 ((work_completion)(&rew->rew_work)){+.+.}-{0:0}, at: process_one_work+0x7a9/0x11d0 kernel/workqueue.c:2267
 #2: ffffffff8d12f818 (rcu_node_0){-.-.}-{2:2}, at: sync_rcu_exp_done_unlocked+0xe/0x140 kernel/rcu/tree_exp.h:168
 #3: ffffffff8d12a980 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:350 [inline]
 #3: ffffffff8d12a980 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:791 [inline]
 #3: ffffffff8d12a980 (rcu_read_lock){....}-{1:2}, at: __bpf_trace_run kernel/trace/bpf_trace.c:2272 [inline]
 #3: ffffffff8d12a980 (rcu_read_lock){....}-{1:2}, at: bpf_trace_run2+0x110/0x410 kernel/trace/bpf_trace.c:2312

the dependencies between HARDIRQ-irq-safe lock and the holding lock:
-> (rcu_node_0){-.-.}-{2:2} {
   IN-HARDIRQ-W at:
                    lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
                    __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
                    _raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
                    rcu_report_exp_cpu_mult+0x27/0x2e0 kernel/rcu/tree_exp.h:238
                    __flush_smp_call_function_queue+0x60c/0xd00 kernel/smp.c:676
                    __sysvec_call_function_single+0xbb/0x360 arch/x86/kernel/smp.c:267
                    sysvec_call_function_single+0x89/0xb0 arch/x86/kernel/smp.c:262
                    asm_sysvec_call_function_single+0x16/0x20 arch/x86/include/asm/idtentry.h:661
                    clear_page_erms+0x7/0x10 arch/x86/lib/clear_page_64.S:49
                    clear_page arch/x86/include/asm/page_64.h:57 [inline]
                    clear_highpage include/linux/highmem.h:242 [inline]
                    clear_highpage_kasan_tagged include/linux/highmem.h:252 [inline]
                    kernel_init_pages mm/page_alloc.c:1377 [inline]
                    post_alloc_hook+0x145/0x1b0 mm/page_alloc.c:2508
                    prep_new_page mm/page_alloc.c:2520 [inline]
                    get_page_from_freelist+0x31a1/0x3320 mm/page_alloc.c:4279
                    __alloc_pages+0x28d/0x770 mm/page_alloc.c:5547
                    __alloc_pages_node include/linux/gfp.h:237 [inline]
                    alloc_pages_node include/linux/gfp.h:260 [inline]
                    alloc_pages_exact_nid+0x115/0x1b9 mm/page_alloc.c:5847
                    alloc_page_ext+0x1f/0x48 mm/page_ext.c:294
                    init_section_page_ext+0x101/0x15e mm/page_ext.c:317
                    page_ext_init+0x5b8/0x782 mm/page_ext.c:511
                    kernel_init_freeable+0x450/0x60f init/main.c:1623
                    kernel_init+0x19/0x290 init/main.c:1513
                    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307
   IN-SOFTIRQ-W at:
                    lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
                    __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
                    _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
                    rcu_accelerate_cbs_unlocked+0x8a/0x230 kernel/rcu/tree.c:1184
                    rcu_core+0x5a0/0x17e0 kernel/rcu/tree.c:2547
                    __do_softirq+0x2e9/0xa4c kernel/softirq.c:571
                    invoke_softirq kernel/softirq.c:445 [inline]
                    __irq_exit_rcu+0x155/0x240 kernel/softirq.c:650
                    irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
                    sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1106
                    asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:653
                    arch_test_and_set_bit arch/x86/include/asm/bitops.h:138 [inline]
                    test_and_set_bit include/asm-generic/bitops/instrumented-atomic.h:72 [inline]
                    queue_work_on+0x1f4/0x250 kernel/workqueue.c:1547
                    smp_call_on_cpu+0x2d6/0x3b0 kernel/smp.c:1263
                    softlockup_start_all kernel/watchdog.c:530 [inline]
                    __lockup_detector_reconfigure+0x258/0x370 kernel/watchdog.c:556
                    lockup_detector_setup+0x9e/0xb2 kernel/watchdog.c:590
                    lockup_detector_init+0x78/0xb5 kernel/watchdog.c:873
                    kernel_init_freeable+0x407/0x60f init/main.c:1614
                    kernel_init+0x19/0x290 init/main.c:1513
                    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307
   INITIAL USE at:
                   lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
                   __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
                   _raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
                   rcutree_prepare_cpu+0x6d/0x520 kernel/rcu/tree.c:4173
                   rcu_init+0xb4/0x200 kernel/rcu/tree.c:4857
                   start_kernel+0x20d/0x53f init/main.c:1032
                   secondary_startup_64_no_verify+0xcf/0xdb
 }
 ... key      at: [<ffffffff91cd4d60>] rcu_init_one.rcu_node_class+0x0/0x20

the dependencies between the lock to be acquired
 and HARDIRQ-irq-unsafe lock:
-> (&stab->lock){+...}-{2:2} {
   HARDIRQ-ON-W at:
                    lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
                    __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
                    _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
                    __sock_map_delete net/core/sock_map.c:416 [inline]
                    sock_map_delete_elem+0x97/0x130 net/core/sock_map.c:448
                    0xffffffffa0001fe2
                    bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
                    __bpf_prog_run include/linux/filter.h:603 [inline]
                    bpf_prog_run include/linux/filter.h:610 [inline]
                    __bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
                    bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
                    trace_contention_end+0x12f/0x170 include/trace/events/lock.h:122
                    __mutex_lock_common kernel/locking/mutex.c:612 [inline]
                    __mutex_lock+0x2ed/0xd80 kernel/locking/mutex.c:747
                    futex_cleanup_begin kernel/futex/core.c:1076 [inline]
                    futex_exit_release+0x30/0x1e0 kernel/futex/core.c:1128
                    exit_mm_release+0x16/0x30 kernel/fork.c:1505
                    exit_mm+0xa9/0x300 kernel/exit.c:535
                    do_exit+0x9f6/0x26a0 kernel/exit.c:856
                    do_group_exit+0x202/0x2b0 kernel/exit.c:1019
                    __do_sys_exit_group kernel/exit.c:1030 [inline]
                    __se_sys_exit_group kernel/exit.c:1028 [inline]
                    __x64_sys_exit_group+0x3b/0x40 kernel/exit.c:1028
                    do_syscall_x64 arch/x86/entry/common.c:51 [inline]
                    do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
                    entry_SYSCALL_64_after_hwframe+0x63/0xcd
   INITIAL USE at:
                   lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
                   __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
                   _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
                   __sock_map_delete net/core/sock_map.c:416 [inline]
                   sock_map_delete_elem+0x97/0x130 net/core/sock_map.c:448
                   0xffffffffa0001fe2
                   bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
                   __bpf_prog_run include/linux/filter.h:603 [inline]
                   bpf_prog_run include/linux/filter.h:610 [inline]
                   __bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
                   bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
                   trace_contention_end+0x12f/0x170 include/trace/events/lock.h:122
                   __mutex_lock_common kernel/locking/mutex.c:612 [inline]
                   __mutex_lock+0x2ed/0xd80 kernel/locking/mutex.c:747
                   futex_cleanup_begin kernel/futex/core.c:1076 [inline]
                   futex_exit_release+0x30/0x1e0 kernel/futex/core.c:1128
                   exit_mm_release+0x16/0x30 kernel/fork.c:1505
                   exit_mm+0xa9/0x300 kernel/exit.c:535
                   do_exit+0x9f6/0x26a0 kernel/exit.c:856
                   do_group_exit+0x202/0x2b0 kernel/exit.c:1019
                   __do_sys_exit_group kernel/exit.c:1030 [inline]
                   __se_sys_exit_group kernel/exit.c:1028 [inline]
                   __x64_sys_exit_group+0x3b/0x40 kernel/exit.c:1028
                   do_syscall_x64 arch/x86/entry/common.c:51 [inline]
                   do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
                   entry_SYSCALL_64_after_hwframe+0x63/0xcd
 }
 ... key      at: [<ffffffff920b1320>] sock_map_alloc.__key+0x0/0x20
 ... acquired at:
   lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
   __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
   _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
   __sock_map_delete net/core/sock_map.c:416 [inline]
   sock_map_delete_elem+0x97/0x130 net/core/sock_map.c:448
   bpf_prog_2c29ac5cdc6b1842+0x3a/0x3e
   bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
   __bpf_prog_run include/linux/filter.h:603 [inline]
   bpf_prog_run include/linux/filter.h:610 [inline]
   __bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
   bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
   trace_contention_end+0x14c/0x190 include/trace/events/lock.h:122
   __pv_queued_spin_lock_slowpath+0x935/0xc50 kernel/locking/qspinlock.c:560
   pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
   queued_spin_lock_slowpath+0x42/0x50 arch/x86/include/asm/qspinlock.h:51
   queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
   do_raw_spin_lock+0x269/0x370 kernel/locking/spinlock_debug.c:115
   __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:111 [inline]
   _raw_spin_lock_irqsave+0xdd/0x120 kernel/locking/spinlock.c:162
   sync_rcu_exp_done_unlocked+0xe/0x140 kernel/rcu/tree_exp.h:168
   synchronize_rcu_expedited_wait_once kernel/rcu/tree_exp.h:580 [inline]
   synchronize_rcu_expedited_wait kernel/rcu/tree_exp.h:631 [inline]
   rcu_exp_wait_wake kernel/rcu/tree_exp.h:699 [inline]
   rcu_exp_sel_wait_wake+0x6b0/0x1d50 kernel/rcu/tree_exp.h:733
   process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292
   worker_thread+0xa47/0x1200 kernel/workqueue.c:2439
   kthread+0x28d/0x320 kernel/kthread.c:376
   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307


stack backtrace:
CPU: 0 PID: 3954 Comm: kworker/0:3 Not tainted 6.1.84-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: rcu_gp wait_rcu_exp_gp
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
 print_bad_irq_dependency kernel/locking/lockdep.c:2604 [inline]
 check_irq_usage kernel/locking/lockdep.c:2843 [inline]
 check_prev_add kernel/locking/lockdep.c:3094 [inline]
 check_prevs_add kernel/locking/lockdep.c:3209 [inline]
 validate_chain+0x4d16/0x5950 kernel/locking/lockdep.c:3825
 __lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5049
 lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
 _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
 __sock_map_delete net/core/sock_map.c:416 [inline]
 sock_map_delete_elem+0x97/0x130 net/core/sock_map.c:448
 bpf_prog_2c29ac5cdc6b1842+0x3a/0x3e
 bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
 __bpf_prog_run include/linux/filter.h:603 [inline]
 bpf_prog_run include/linux/filter.h:610 [inline]
 __bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
 bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
 trace_contention_end+0x14c/0x190 include/trace/events/lock.h:122
 __pv_queued_spin_lock_slowpath+0x935/0xc50 kernel/locking/qspinlock.c:560
 pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
 queued_spin_lock_slowpath+0x42/0x50 arch/x86/include/asm/qspinlock.h:51
 queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
 do_raw_spin_lock+0x269/0x370 kernel/locking/spinlock_debug.c:115
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:111 [inline]
 _raw_spin_lock_irqsave+0xdd/0x120 kernel/locking/spinlock.c:162
 sync_rcu_exp_done_unlocked+0xe/0x140 kernel/rcu/tree_exp.h:168
 synchronize_rcu_expedited_wait_once kernel/rcu/tree_exp.h:580 [inline]
 synchronize_rcu_expedited_wait kernel/rcu/tree_exp.h:631 [inline]
 rcu_exp_wait_wake kernel/rcu/tree_exp.h:699 [inline]
 rcu_exp_sel_wait_wake+0x6b0/0x1d50 kernel/rcu/tree_exp.h:733
 process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292
 worker_thread+0xa47/0x1200 kernel/workqueue.c:2439
 kthread+0x28d/0x320 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307
 </TASK>
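
The "Possible interrupt unsafe locking scenario" in the report can be replayed in a small userspace model. The Python sketch below is not lockdep's implementation. It is a simplified illustration that assumes only two usage bits per lock class (taken inside a hardirq handler, taken with hardirqs enabled) and a forward dependency graph. The names LockModel, acquire, violations and replay_report are hypothetical and exist only for this example.

```python
class LockModel:
    """Toy model of the HARDIRQ-safe -> HARDIRQ-unsafe check (illustrative only)."""

    def __init__(self):
        self.hardirq_safe = set()    # classes ever taken inside a hardirq handler
        self.hardirq_unsafe = set()  # classes ever taken with hardirqs enabled
        self.deps = {}               # held class -> classes acquired while holding it

    def acquire(self, lock, held=(), in_hardirq=False, irqs_enabled=True):
        """Record one acquisition and the dependencies it creates."""
        if in_hardirq:
            self.hardirq_safe.add(lock)
        elif irqs_enabled:
            self.hardirq_unsafe.add(lock)
        for h in held:
            self.deps.setdefault(h, set()).add(lock)

    def reachable(self, start):
        """Return every class that can be acquired, directly or not, after start."""
        seen, stack = set(), [start]
        while stack:
            cur = stack.pop()
            for nxt in self.deps.get(cur, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def violations(self):
        """List (safe, unsafe) pairs where a hardirq-safe class leads to an unsafe one."""
        found = []
        for safe in sorted(self.hardirq_safe):
            for later in sorted(self.reachable(safe)):
                if later in self.hardirq_unsafe:
                    found.append((safe, later))
        return found


def replay_report():
    """Replay the three acquisitions that the report above describes."""
    model = LockModel()
    # rcu_report_exp_cpu_mult() takes rcu_node_0 from an IPI handler (hardirq).
    model.acquire("rcu_node_0", in_hardirq=True)
    # The BPF program takes &stab->lock via spin_lock_bh() with hardirqs enabled.
    model.acquire("&stab->lock", irqs_enabled=True)
    # The kworker runs the same BPF program while rcu_node_0 is recorded as held.
    model.acquire("&stab->lock", held=["rcu_node_0"], irqs_enabled=False)
    return model.violations()


if __name__ == "__main__":
    print(replay_report())
```

Running the sketch prints [('rcu_node_0', '&stab->lock')], the same dependency the report flags: a lock class used in hardirq context is followed by one that has been taken with hardirqs enabled.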

Crashes (7):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets Manager Title
2024/04/10 14:48 linux-6.1.y 347385861c50 4320ec32 .config console log report syz C [disk image] [vmlinux] [kernel image] ci2-linux-6-1-kasan-perf possible deadlock in rcu_report_exp_cpu_mult
2024/04/10 02:43 linux-6.1.y 347385861c50 171ec371 .config console log report syz C [disk image] [vmlinux] [kernel image] ci2-linux-6-1-kasan-perf possible deadlock in rcu_report_exp_cpu_mult
2024/04/09 10:26 linux-6.1.y 347385861c50 f3234354 .config console log report syz C [disk image] [vmlinux] [kernel image] ci2-linux-6-1-kasan-perf possible deadlock in rcu_report_exp_cpu_mult
2024/04/08 21:14 linux-6.1.y 347385861c50 53df08b6 .config console log report syz C [disk image] [vmlinux] [kernel image] ci2-linux-6-1-kasan-perf possible deadlock in rcu_report_exp_cpu_mult
2024/04/08 11:07 linux-6.1.y 347385861c50 ca620dd8 .config console log report syz C [disk image] [vmlinux] [kernel image] ci2-linux-6-1-kasan-perf possible deadlock in rcu_report_exp_cpu_mult
2024/03/16 22:50 linux-6.1.y d7543167affd d615901c .config console log report syz C [disk image] [vmlinux] [kernel image] ci2-linux-6-1-kasan-perf possible deadlock in rcu_report_exp_cpu_mult
2024/04/10 13:19 linux-6.1.y 347385861c50 4320ec32 .config console log report info [disk image] [vmlinux] [kernel image] ci2-linux-6-1-kasan-perf possible deadlock in rcu_report_exp_cpu_mult
* Struck through repros no longer work on HEAD.