syzbot


possible deadlock in htab_lru_map_delete_node

Status: fixed on 2020/04/15 17:19
Subsystems: bpf
Reported-by: syzbot+a38ff3d9356388f2fb83@syzkaller.appspotmail.com
Fix commit: b9aff38de2cb bpf: Fix a potential deadlock with bpf_map_do_batch
First crash: 1724d, last: 1718d
Cause bisection: introduced by:
commit 057996380a42bb64ccc04383cfa9c0ace4ea11f0
Author: Yonghong Song <yhs@fb.com>
Date: Wed Jan 15 18:43:04 2020 +0000

  bpf: Add batch ops to all htab bpf map

Crash: possible deadlock in htab_lru_map_delete_node (log)
Repro: C, syz; kernel .config available
  
Discussions (5)
Title | Replies (including bot) | Last reply
[PATCH bpf v4] bpf: fix a potential deadlock with bpf_map_do_batch | 2 (2) | 2020/02/20 00:07
[PATCH bpf v3] bpf: fix a potential deadlock with bpf_map_do_batch | 3 (3) | 2020/02/19 23:13
[PATCH bpf v2] bpf: fix a potential deadlock with bpf_map_do_batch | 3 (3) | 2020/02/19 18:15
[PATCH bpf] bpf: fix a potential deadlock with bpf_map_do_batch | 4 (4) | 2020/02/19 16:15
possible deadlock in htab_lru_map_delete_node | 0 (1) | 2020/02/17 04:27

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
5.6.0-rc2-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor253/9751 is trying to acquire lock:
ffff888098b1eae0 (&htab->buckets[i].lock){....}, at: htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593

but task is already holding lock:
ffff8880973ee218 (&l->lock){....}, at: bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:325 [inline]
ffff8880973ee218 (&l->lock){....}, at: bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline]
ffff8880973ee218 (&l->lock){....}, at: bpf_lru_pop_free+0x67f/0x1670 kernel/bpf/bpf_lru_list.c:499

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&l->lock){....}:
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
       bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:325 [inline]
       bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline]
       bpf_lru_pop_free+0x67f/0x1670 kernel/bpf/bpf_lru_list.c:499
       prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132
       __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069
       bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585
       bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181
       generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319
       bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348
       __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460
       __se_sys_bpf kernel/bpf/syscall.c:3355 [inline]
       __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #1 (&loc_l->lock){....}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159
       bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline]
       bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555
       __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374
       htab_lru_map_lookup_and_delete_batch+0x34/0x40 kernel/bpf/hashtab.c:1491
       bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348
       __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456
       __se_sys_bpf kernel/bpf/syscall.c:3355 [inline]
       __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&htab->buckets[i].lock){....}:
       check_prev_add kernel/locking/lockdep.c:2475 [inline]
       check_prevs_add kernel/locking/lockdep.c:2580 [inline]
       validate_chain kernel/locking/lockdep.c:2970 [inline]
       __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954
       lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159
       htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593
       __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 [inline]
       __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266
       bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:340 [inline]
       bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline]
       bpf_lru_pop_free+0x87c/0x1670 kernel/bpf/bpf_lru_list.c:499
       prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132
       __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069
       bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585
       bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181
       generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319
       bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348
       __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460
       __se_sys_bpf kernel/bpf/syscall.c:3355 [inline]
       __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
  &htab->buckets[i].lock --> &loc_l->lock --> &l->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&l->lock);
                               lock(&loc_l->lock);
                               lock(&l->lock);
  lock(&htab->buckets[i].lock);

 *** DEADLOCK ***

3 locks held by syz-executor253/9751:
 #0: ffffffff89bac240 (rcu_read_lock){....}, at: bpf_percpu_hash_update+0x0/0x210 kernel/bpf/hashtab.c:1565
 #1: ffffe8ffffc784c8 (&loc_l->lock){....}, at: bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:443 [inline]
 #1: ffffe8ffffc784c8 (&loc_l->lock){....}, at: bpf_lru_pop_free+0x32b/0x1670 kernel/bpf/bpf_lru_list.c:499
 #2: ffff8880973ee218 (&l->lock){....}, at: bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:325 [inline]
 #2: ffff8880973ee218 (&l->lock){....}, at: bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline]
 #2: ffff8880973ee218 (&l->lock){....}, at: bpf_lru_pop_free+0x67f/0x1670 kernel/bpf/bpf_lru_list.c:499

stack backtrace:
CPU: 0 PID: 9751 Comm: syz-executor253 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1684
 check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1808
 check_prev_add kernel/locking/lockdep.c:2475 [inline]
 check_prevs_add kernel/locking/lockdep.c:2580 [inline]
 validate_chain kernel/locking/lockdep.c:2970 [inline]
 __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954
 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159
 htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593
 __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 [inline]
 __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266
 bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:340 [inline]
 bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline]
 bpf_lru_pop_free+0x87c/0x1670 kernel/bpf/bpf_lru_list.c:499
 prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132
 __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069
 bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585
 bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181
 generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319
 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348
 __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460
 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline]
 __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x449499
Code: e8 ec 0e 03 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fa6d86dcdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00000000006dec28 RCX: 0000000000449499
RDX: 0000000000000038 RSI: 0000000020000040 RDI: 000000000000001a
RBP: 00000000006dec20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dec2c
R13: 00007ffc6bea610f R14: 00007fa6d86dd9c0 R15: 0000000000000000
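
Note on the report above: lockdep records an ordering edge each time one lock class is taken while another is held, and warns when a new edge would close a cycle. The sketch below is only an illustration of that idea, not lockdep's implementation. The lock class names are copied from the report; the find_cycle helper and the BUCKET/LOCAL/GLOBAL constants are hypothetical names introduced here.

```python
# Illustrative model of the lock-order cycle reported above.
# Lock class names come from the report; everything else is hypothetical.

def find_cycle(edges):
    """Return one cycle in a directed graph as a list of nodes, or None."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, done
    colour = {node: WHITE for node in graph}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if colour[nxt] == GREY:       # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in graph:
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


BUCKET = "&htab->buckets[i].lock"
LOCAL = "&loc_l->lock"
GLOBAL = "&l->lock"

# "Chain exists of" section: orderings lockdep had already recorded.
existing = [(BUCKET, LOCAL), (LOCAL, GLOBAL)]

# Edge from the failing acquisition: htab_lru_map_delete_node takes the
# bucket lock while &l->lock is held.
new_edge = (GLOBAL, BUCKET)

assert find_cycle(existing) is None       # no cycle before the new edge
cycle = find_cycle(existing + [new_edge])
print(" --> ".join(cycle))
```

With the new edge added, the three lock classes form a loop, which matches the CPU0/CPU1 scenario lockdep prints: each side holds one lock in the loop and waits for the next.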

Crashes (264):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets Manager Title
2020/02/18 22:00 upstream b1da3acc781c 012fbc32 .config console log report syz C ci-upstream-kasan-gce-root
2020/02/16 15:44 net-old 2019fc96af22 cf914200 .config console log report syz C ci-upstream-net-this-kasan-gce
2020/02/16 13:48 bpf eecd618b4516 cf914200 .config console log report syz C ci-upstream-bpf-kasan-gce
2020/02/16 17:02 bpf-next 2019fc96af22 cf914200 .config console log report syz C ci-upstream-bpf-next-kasan-gce
2020/02/16 16:05 net-next-old 2019fc96af22 cf914200 .config console log report syz C ci-upstream-net-kasan-gce
2020/02/21 19:04 upstream ca7e1fd1026c bd2a74a3 .config console log report syz ci-upstream-kasan-gce-root
2020/02/20 18:16 upstream ca7e1fd1026c 81230308 .config console log report syz ci-upstream-kasan-gce-selinux-root
2020/02/19 14:45 upstream 0a44cac81050 135c18aa .config console log report syz ci-upstream-kasan-gce-smack-root
2020/02/19 01:40 net-old 9b64208f74fb 012fbc32 .config console log report syz ci-upstream-net-this-kasan-gce
2020/02/18 23:37 net-next-old b182a66792fe 012fbc32 .config console log report syz ci-upstream-net-kasan-gce
2020/02/22 00:42 upstream b0dd1eb220c0 2ffa6679 .config console log report ci-upstream-kasan-gce-smack-root
2020/02/21 20:17 upstream ca7e1fd1026c 2ffa6679 .config console log report ci-upstream-kasan-gce-selinux-root
2020/02/21 13:38 upstream ca7e1fd1026c bd2a74a3 .config console log report ci-upstream-kasan-gce-selinux-root
2020/02/20 21:59 upstream ca7e1fd1026c bd2a74a3 .config console log report ci-upstream-kasan-gce-selinux-root
2020/02/20 07:21 upstream ca7e1fd1026c 81230308 .config console log report ci-upstream-kasan-gce-root
2020/02/19 20:31 upstream 4b205766d8fc b690a6e3 .config console log report ci-upstream-kasan-gce-selinux-root
2020/02/19 19:15 upstream 4b205766d8fc b690a6e3 .config console log report ci-upstream-kasan-gce-root
2020/02/19 15:59 net-old 9facfdb54673 b690a6e3 .config console log report ci-upstream-net-this-kasan-gce
2020/02/19 12:18 bpf 113e6b7e15e2 135c18aa .config console log report ci-upstream-bpf-kasan-gce
2020/02/22 11:09 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/22 10:39 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/22 09:07 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/22 06:46 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/22 05:25 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/22 04:18 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/22 03:14 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/22 00:39 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 23:32 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 21:35 bpf-next e42da4c62abb 2ffa6679 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 16:08 net-next-old 5f9721a2d119 bd2a74a3 .config console log report ci-upstream-net-kasan-gce
2020/02/21 14:45 bpf-next 5327644614a1 bd2a74a3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 11:55 bpf-next 5327644614a1 bd2a74a3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 10:54 net-next-old 2e92a2d0e450 bd2a74a3 .config console log report ci-upstream-net-kasan-gce
2020/02/21 08:58 bpf-next 5327644614a1 bd2a74a3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 07:26 bpf-next 5327644614a1 bd2a74a3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 06:26 bpf-next 5327644614a1 bd2a74a3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 04:44 bpf-next 5327644614a1 bd2a74a3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 03:36 bpf-next 5327644614a1 bd2a74a3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/21 00:48 bpf-next 5327644614a1 bd2a74a3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 20:31 bpf-next 5327644614a1 bd2a74a3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 19:23 bpf-next 500897804a36 81230308 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 17:02 bpf-next 500897804a36 81230308 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 15:33 bpf-next 500897804a36 81230308 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 13:52 bpf-next 500897804a36 81230308 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 12:23 bpf-next 500897804a36 81230308 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 10:36 bpf-next 500897804a36 81230308 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 10:00 bpf-next 500897804a36 81230308 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 06:03 bpf-next 500897804a36 b690a6e3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 04:15 bpf-next 2f14b2d9dd80 b690a6e3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 03:03 bpf-next 2f14b2d9dd80 b690a6e3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/20 01:15 bpf-next 2f14b2d9dd80 b690a6e3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/19 23:28 bpf-next 2f14b2d9dd80 b690a6e3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/19 22:27 bpf-next 2f14b2d9dd80 b690a6e3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/19 21:42 bpf-next 2f14b2d9dd80 b690a6e3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/19 17:46 bpf-next 2f14b2d9dd80 b690a6e3 .config console log report ci-upstream-bpf-next-kasan-gce
2020/02/19 16:38 net-next-old 00796b929ce8 b690a6e3 .config console log report ci-upstream-net-kasan-gce
2020/02/20 21:39 linux-next f4aba10148cd bd2a74a3 .config console log report ci-upstream-linux-next-kasan-gce-root