syzbot


possible deadlock in htab_map_delete_elem

Status: upstream: reported C repro on 2024/11/25 00:38
Bug presence: origin:lts-only
Reported-by: syzbot+4c3d16f74f429ff25a21@syzkaller.appspotmail.com
First crash: 26d, last: 26d
Fix commit to backport (bisect log):
tree: upstream
commit 9f2c6e96c65e6fa1aebef546be0c30a5895fcb37
Author: Alexei Starovoitov <ast@kernel.org>
Date: Fri Sep 2 21:10:58 2022 +0000

  bpf: Optimize rcu_barrier usage between hash map and bpf_mem_alloc.

  
Bug presence (2)
Date        Name                Commit        Repro  Result
2024/11/25  linux-5.15.y (ToT)  0a51d2d4527b  C      [report] possible deadlock in htab_map_delete_elem
2024/11/25  upstream (ToT)      9f16d5e6f220  C      Didn't crash
Similar bugs (1)
Kernel    Title                                          Repro  Cause bisect  Fix bisect  Count  Last  Reported  Patched  Status
upstream  possible deadlock in htab_map_delete_elem bpf  C                                3      20d   100d      0/28     upstream: reported C repro on 2024/09/11 15:38
Fix bisection attempts (1)
Created Duration User Patch Repo Result
2024/12/06 18:27 8h04m fix candidate upstream OK (1) job log

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
5.15.173-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor203/4169 is trying to acquire lock:
ffff888022f56020 (&htab->lockdep_key#2){....}-{2:2}, at: htab_lock_bucket kernel/bpf/hashtab.c:183 [inline]
ffff888022f56020 (&htab->lockdep_key#2){....}-{2:2}, at: htab_map_delete_elem+0x1bd/0x560 kernel/bpf/hashtab.c:1361

but task is already holding lock:
ffff888022f560a0 (&htab->lockdep_key#3){....}-{2:2}, at: htab_lock_bucket kernel/bpf/hashtab.c:183 [inline]
ffff888022f560a0 (&htab->lockdep_key#3){....}-{2:2}, at: htab_map_update_elem+0x245/0x9c0 kernel/bpf/hashtab.c:1082

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&htab->lockdep_key#3){....}-{2:2}:
       lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5623
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
       htab_lock_bucket kernel/bpf/hashtab.c:183 [inline]
       htab_map_delete_elem+0x1bd/0x560 kernel/bpf/hashtab.c:1361
       bpf_prog_2c29ac5cdc6b1842+0x3a/0xd34
       bpf_dispatcher_nop_func include/linux/bpf.h:790 [inline]
       __bpf_prog_run include/linux/filter.h:628 [inline]
       bpf_prog_run include/linux/filter.h:635 [inline]
       __bpf_trace_run kernel/trace/bpf_trace.c:1878 [inline]
       bpf_trace_run4+0x1ea/0x390 kernel/trace/bpf_trace.c:1917
       __bpf_trace_mm_page_alloc+0xba/0xe0 include/trace/events/kmem.h:201
       __traceiter_mm_page_alloc+0x35/0x50 include/trace/events/kmem.h:201
       trace_mm_page_alloc include/trace/events/kmem.h:201 [inline]
       __alloc_pages+0x6e0/0x700 mm/page_alloc.c:5486
       __alloc_pages_node include/linux/gfp.h:570 [inline]
       alloc_pages_node include/linux/gfp.h:584 [inline]
       kmalloc_large_node+0x7c/0x180 mm/slub.c:4421
       __kmalloc_node+0x22d/0x390 mm/slub.c:4437
       kmalloc_node include/linux/slab.h:614 [inline]
       bpf_map_kmalloc_node+0xdb/0x160 kernel/bpf/syscall.c:430
       alloc_htab_elem+0x28b/0x920 kernel/bpf/hashtab.c:973
       htab_map_update_elem+0x3cb/0x9c0 kernel/bpf/hashtab.c:1106
       bpf_map_update_value+0x5d7/0x6c0 kernel/bpf/syscall.c:221
       map_update_elem+0x6a0/0x7c0 kernel/bpf/syscall.c:1185
       __sys_bpf+0x2fd/0x670 kernel/bpf/syscall.c:4639
       __do_sys_bpf kernel/bpf/syscall.c:4755 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:4753 [inline]
       __x64_sys_bpf+0x78/0x90 kernel/bpf/syscall.c:4753
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x66/0xd0

-> #0 (&htab->lockdep_key#2){....}-{2:2}:
       check_prev_add kernel/locking/lockdep.c:3053 [inline]
       check_prevs_add kernel/locking/lockdep.c:3172 [inline]
       validate_chain+0x1649/0x5930 kernel/locking/lockdep.c:3788
       __lock_acquire+0x1295/0x1ff0 kernel/locking/lockdep.c:5012
       lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5623
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
       htab_lock_bucket kernel/bpf/hashtab.c:183 [inline]
       htab_map_delete_elem+0x1bd/0x560 kernel/bpf/hashtab.c:1361
       bpf_prog_2c29ac5cdc6b1842+0x3a/0x6b8
       bpf_dispatcher_nop_func include/linux/bpf.h:790 [inline]
       __bpf_prog_run include/linux/filter.h:628 [inline]
       bpf_prog_run include/linux/filter.h:635 [inline]
       __bpf_trace_run kernel/trace/bpf_trace.c:1878 [inline]
       bpf_trace_run4+0x1ea/0x390 kernel/trace/bpf_trace.c:1917
       __bpf_trace_mm_page_alloc+0xba/0xe0 include/trace/events/kmem.h:201
       __traceiter_mm_page_alloc+0x35/0x50 include/trace/events/kmem.h:201
       trace_mm_page_alloc include/trace/events/kmem.h:201 [inline]
       __alloc_pages+0x6e0/0x700 mm/page_alloc.c:5486
       __alloc_pages_node include/linux/gfp.h:570 [inline]
       alloc_pages_node include/linux/gfp.h:584 [inline]
       kmalloc_large_node+0x7c/0x180 mm/slub.c:4421
       __kmalloc_node+0x22d/0x390 mm/slub.c:4437
       kmalloc_node include/linux/slab.h:614 [inline]
       bpf_map_kmalloc_node+0xdb/0x160 kernel/bpf/syscall.c:430
       alloc_htab_elem+0x28b/0x920 kernel/bpf/hashtab.c:973
       htab_map_update_elem+0x3cb/0x9c0 kernel/bpf/hashtab.c:1106
       bpf_map_update_value+0x5d7/0x6c0 kernel/bpf/syscall.c:221
       map_update_elem+0x6a0/0x7c0 kernel/bpf/syscall.c:1185
       __sys_bpf+0x2fd/0x670 kernel/bpf/syscall.c:4639
       __do_sys_bpf kernel/bpf/syscall.c:4755 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:4753 [inline]
       __x64_sys_bpf+0x78/0x90 kernel/bpf/syscall.c:4753
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x66/0xd0

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&htab->lockdep_key#3);
                               lock(&htab->lockdep_key#2);
                               lock(&htab->lockdep_key#3);
  lock(&htab->lockdep_key#2);

 *** DEADLOCK ***

3 locks held by syz-executor203/4169:
 #0: ffffffff8c91fc60 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x5/0x30 include/linux/rcupdate.h:311
 #1: ffff888022f560a0 (&htab->lockdep_key#3){....}-{2:2}, at: htab_lock_bucket kernel/bpf/hashtab.c:183 [inline]
 #1: ffff888022f560a0 (&htab->lockdep_key#3){....}-{2:2}, at: htab_map_update_elem+0x245/0x9c0 kernel/bpf/hashtab.c:1082
 #2: ffffffff8c91fc60 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x5/0x30 include/linux/rcupdate.h:311

stack backtrace:
CPU: 0 PID: 4169 Comm: syz-executor203 Not tainted 5.15.173-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1e3/0x2d0 lib/dump_stack.c:106
 check_noncircular+0x2f8/0x3b0 kernel/locking/lockdep.c:2133
 check_prev_add kernel/locking/lockdep.c:3053 [inline]
 check_prevs_add kernel/locking/lockdep.c:3172 [inline]
 validate_chain+0x1649/0x5930 kernel/locking/lockdep.c:3788
 __lock_acquire+0x1295/0x1ff0 kernel/locking/lockdep.c:5012
 lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5623
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
 htab_lock_bucket kernel/bpf/hashtab.c:183 [inline]
 htab_map_delete_elem+0x1bd/0x560 kernel/bpf/hashtab.c:1361
 bpf_prog_2c29ac5cdc6b1842+0x3a/0x6b8
 bpf_dispatcher_nop_func include/linux/bpf.h:790 [inline]
 __bpf_prog_run include/linux/filter.h:628 [inline]
 bpf_prog_run include/linux/filter.h:635 [inline]
 __bpf_trace_run kernel/trace/bpf_trace.c:1878 [inline]
 bpf_trace_run4+0x1ea/0x390 kernel/trace/bpf_trace.c:1917
 __bpf_trace_mm_page_alloc+0xba/0xe0 include/trace/events/kmem.h:201
 __traceiter_mm_page_alloc+0x35/0x50 include/trace/events/kmem.h:201
 trace_mm_page_alloc include/trace/events/kmem.h:201 [inline]
 __alloc_pages+0x6e0/0x700 mm/page_alloc.c:5486
 __alloc_pages_node include/linux/gfp.h:570 [inline]
 alloc_pages_node include/linux/gfp.h:584 [inline]
 kmalloc_large_node+0x7c/0x180 mm/slub.c:4421
 __kmalloc_node+0x22d/0x390 mm/slub.c:4437
 kmalloc_node include/linux/slab.h:614 [inline]
 bpf_map_kmalloc_node+0xdb/0x160 kernel/bpf/syscall.c:430
 alloc_htab_elem+0x28b/0x920 kernel/bpf/hashtab.c:973
 htab_map_update_elem+0x3cb/0x9c0 kernel/bpf/hashtab.c:1106
 bpf_map_update_value+0x5d7/0x6c0 kernel/bpf/syscall.c:221
 map_update_elem+0x6a0/0x7c0 kernel/bpf/syscall.c:1185
 __sys_bpf+0x2fd/0x670 kernel/bpf/syscall.c:4639
 __do_sys_bpf kernel/bpf/syscall.c:4755 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:4753 [inline]
 __x64_sys_bpf+0x78/0x90 kernel/bpf/syscall.c:4753
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x66/0xd0
RIP: 0033:0x7ff2c8e36029
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd165a5538 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff2c8e36029
RDX: 0000000000000020 RSI: 0000000020000280 RDI: 0000000000000002
RBP: 0000000000000000 R08: 00000000000000a0 R09: 00000000000000a0
R10: 00000000000000a0 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
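
The report above comes down to a lock-order inversion: an earlier chain recorded &htab->lockdep_key#3 being acquired while &htab->lockdep_key#2 was held, and the current task, already holding key#3 in htab_map_update_elem, now tries to take key#2 in htab_map_delete_elem from a BPF program attached to the mm_page_alloc tracepoint. The following is a minimal user-space sketch of that kind of check, not lockdep's actual implementation. The names lock_graph, record_acquire, and replay_report_scenario are hypothetical, and lock classes are modelled as small integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* dep[a][b] is true when lock class b was acquired while a was held. */
struct lock_graph {
    bool dep[MAX_CLASSES][MAX_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: is there a recorded path from 'from' to 'to'? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_CLASSES; next++) {
        if (g->dep[from][next] && !seen[next] &&
            reachable(g, next, to, seen))
            return true;
    }
    return false;
}

/*
 * Record "acquired was taken while held was held".
 * Returns false, recording nothing, when 'acquired' already reaches
 * 'held' in the graph, because the new edge would close a cycle.
 */
static bool record_acquire(struct lock_graph *g, int held, int acquired)
{
    bool seen[MAX_CLASSES] = { false };

    assert(held >= 0 && held < MAX_CLASSES);
    assert(acquired >= 0 && acquired < MAX_CLASSES);

    if (reachable(g, acquired, held, seen))
        return false;
    g->dep[held][acquired] = true;
    return true;
}

/* Hypothetical stand-ins for the two lockdep keys named in the report. */
enum { KEY2 = 2, KEY3 = 3 };

/* Replays the two orderings from the "Possible unsafe locking scenario". */
static bool replay_report_scenario(void)
{
    struct lock_graph g;

    graph_init(&g);
    /* Existing dependency: key#3 acquired while key#2 held. */
    bool existing_ok = record_acquire(&g, KEY2, KEY3);
    /* Current task: key#2 acquired while key#3 held, closing the cycle. */
    bool inverted_ok = record_acquire(&g, KEY3, KEY2);

    return existing_ok && !inverted_ok;
}
```

Calling replay_report_scenario() from a small main() returns true: the first ordering is accepted and the inverted one is rejected. The sketch only models ordering between lock classes. It does not model IRQ state, per-CPU context, or the nesting through the tracepoint that the stack trace shows.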

Crashes (2):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets Manager Title
2024/11/25 01:21 linux-5.15.y 0a51d2d4527b 68da6d95 .config console log report syz / log C [disk image] [vmlinux] [kernel image] ci2-linux-5-15-kasan possible deadlock in htab_map_delete_elem
2024/11/25 00:37 linux-5.15.y 0a51d2d4527b 68da6d95 .config console log report info [disk image] [vmlinux] [kernel image] ci2-linux-5-15-kasan possible deadlock in htab_map_delete_elem