syzbot


possible deadlock in sock_hash_free

Status: fixed on 2018/07/09 18:05
Subsystems: bpf
Reported-by: syzbot+83bdee62c80cc044cb1a@syzkaller.appspotmail.com
Fix commit: e9db4ef6bf4c bpf: sockhash fix omitted bucket lock in sock_close
First crash: 2131 days ago, last: 2130 days ago
Discussions (1)
Title | Replies (including bot) | Last reply
possible deadlock in sock_hash_free | 1 (2) | 2018/07/02 18:50

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
4.17.0-rc6+ #25 Not tainted
------------------------------------------------------
kworker/1:3/4633 is trying to acquire lock:
000000008079bfc5 (clock-AF_INET6){++..}, at: sock_hash_free+0x377/0x700 kernel/bpf/sockmap.c:2089

but task is already holding lock:
00000000c31833aa (&htab->buckets[i].lock){+...}, at: sock_hash_free+0x1d4/0x700 kernel/bpf/sockmap.c:2083

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&htab->buckets[i].lock){+...}:
       __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
       _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
       bpf_tcp_close+0x822/0x10b0 kernel/bpf/sockmap.c:285
       inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
       inet6_release+0x50/0x70 net/ipv6/af_inet6.c:459
       sock_release+0x96/0x1b0 net/socket.c:594
       sock_close+0x16/0x20 net/socket.c:1149
       __fput+0x34d/0x890 fs/file_table.c:209
       ____fput+0x15/0x20 fs/file_table.c:243
       task_work_run+0x1e4/0x290 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x1aee/0x2730 kernel/exit.c:865
       do_group_exit+0x16f/0x430 kernel/exit.c:968
       __do_sys_exit_group kernel/exit.c:979 [inline]
       __se_sys_exit_group kernel/exit.c:977 [inline]
       __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:977
       do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (clock-AF_INET6){++..}:
       lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
       __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline]
       _raw_write_lock_bh+0x31/0x40 kernel/locking/spinlock.c:312
       sock_hash_free+0x377/0x700 kernel/bpf/sockmap.c:2089
       bpf_map_free_deferred+0xba/0xf0 kernel/bpf/syscall.c:261
       process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
       worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
       kthread+0x345/0x410 kernel/kthread.c:240
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&htab->buckets[i].lock);
                               lock(clock-AF_INET6);
                               lock(&htab->buckets[i].lock);
  lock(clock-AF_INET6);

 *** DEADLOCK ***

4 locks held by kworker/1:3/4633:
 #0: 000000004e52b89d ((wq_completion)"events"){+.+.}, at: __write_once_size include/linux/compiler.h:215 [inline]
 #0: 000000004e52b89d ((wq_completion)"events"){+.+.}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: 000000004e52b89d ((wq_completion)"events"){+.+.}, at: atomic64_set include/asm-generic/atomic-instrumented.h:40 [inline]
 #0: 000000004e52b89d ((wq_completion)"events"){+.+.}, at: atomic_long_set include/asm-generic/atomic-long.h:57 [inline]
 #0: 000000004e52b89d ((wq_completion)"events"){+.+.}, at: set_work_data kernel/workqueue.c:617 [inline]
 #0: 000000004e52b89d ((wq_completion)"events"){+.+.}, at: set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
 #0: 000000004e52b89d ((wq_completion)"events"){+.+.}, at: process_one_work+0xaef/0x1b50 kernel/workqueue.c:2116
 #1: 00000000eb3f65af ((work_completion)(&map->work)){+.+.}, at: process_one_work+0xb46/0x1b50 kernel/workqueue.c:2120
 #2: 00000000ed2f7c6d (rcu_read_lock){....}, at: sock_hash_free+0x0/0x700 include/net/sock.h:2178
 #3: 00000000c31833aa (&htab->buckets[i].lock){+...}, at: sock_hash_free+0x1d4/0x700 kernel/bpf/sockmap.c:2083

stack backtrace:
CPU: 1 PID: 4633 Comm: kworker/1:3 Not tainted 4.17.0-rc6+ #25
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events bpf_map_free_deferred
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 print_circular_bug.isra.36.cold.54+0x1bd/0x27d kernel/locking/lockdep.c:1223
 check_prev_add kernel/locking/lockdep.c:1863 [inline]
 check_prevs_add kernel/locking/lockdep.c:1976 [inline]
 validate_chain kernel/locking/lockdep.c:2417 [inline]
 __lock_acquire+0x343e/0x5140 kernel/locking/lockdep.c:3431
 lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
 __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline]
 _raw_write_lock_bh+0x31/0x40 kernel/locking/spinlock.c:312
 sock_hash_free+0x377/0x700 kernel/bpf/sockmap.c:2089
 bpf_map_free_deferred+0xba/0xf0 kernel/bpf/syscall.c:261
 process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
 worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
 kthread+0x345/0x410 kernel/kthread.c:240
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
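
The "possible unsafe locking scenario" in the report above is a lock-order inversion: one path takes &htab->buckets[i].lock while holding clock-AF_INET6, and sock_hash_free takes them in the opposite order. The sketch below is a toy model of that check, not the kernel's lockdep implementation. The LockOrderGraph class and its method names are hypothetical and exist only for illustration; the two lock class names are copied from the report.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order validator (hypothetical; not the kernel's lockdep)."""

    def __init__(self):
        # edges[a] holds every lock class that was acquired while a was held.
        self.edges = defaultdict(set)

    def _reaches(self, src, dst):
        """Return True if dst is reachable from src along recorded edges."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record taking `new` while `held` is held.

        Returns False (and records nothing) if the new edge would close
        a cycle, i.e. `held` is already known to be taken after `new`.
        """
        if self._reaches(new, held):
            return False
        self.edges[held].add(new)
        return True


# Lock class names as they appear in the report.
CALLBACK_LOCK = "clock-AF_INET6"
BUCKET_LOCK = "&htab->buckets[i].lock"

graph = LockOrderGraph()

# Chain entry #1 (bpf_tcp_close): bucket lock taken under clock-AF_INET6.
close_path_ok = graph.acquire(held=CALLBACK_LOCK, new=BUCKET_LOCK)

# Chain entry #0 (sock_hash_free): clock-AF_INET6 taken under the bucket lock.
free_path_ok = graph.acquire(held=BUCKET_LOCK, new=CALLBACK_LOCK)

print(close_path_ok, free_path_ok)  # True False
```

The second acquisition is rejected because the first one already recorded the opposite ordering, which mirrors the "which lock already depends on the new lock" line in the report. No real deadlock has to occur for the inversion to be flagged; observing both orderings is enough.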

Crashes (45):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets Manager Title
2018/05/29 00:22 bpf-next 7a1a98c171ea f48c20b8 .config console log report syz C ci-upstream-bpf-next-kasan-gce
2018/05/28 18:42 bpf-next 7a1a98c171ea f48c20b8 .config console log report syz C ci-upstream-bpf-next-kasan-gce
2018/05/28 17:18 bpf-next 7a1a98c171ea f48c20b8 .config console log report syz C ci-upstream-bpf-next-kasan-gce
2018/05/28 14:51 bpf-next 7a1a98c171ea f48c20b8 .config console log report syz C ci-upstream-bpf-next-kasan-gce
2018/05/28 14:04 bpf-next 7a1a98c171ea f48c20b8 .config console log report syz C ci-upstream-bpf-next-kasan-gce
2018/05/28 13:36 bpf-next 7a1a98c171ea f48c20b8 .config console log report syz C ci-upstream-bpf-next-kasan-gce
2018/05/29 00:27 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/29 00:15 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/29 00:11 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/29 00:04 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 23:39 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 23:27 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 23:22 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 22:53 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 22:05 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 21:54 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 21:50 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 21:45 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 20:42 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 20:11 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 19:55 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 19:28 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 19:21 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 19:20 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 19:08 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 18:55 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 18:54 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 18:50 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 18:00 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 17:53 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 16:34 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 16:13 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 13:52 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 13:49 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 12:10 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 12:09 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 12:06 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 12:06 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 12:06 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
2018/05/28 12:06 bpf-next 7a1a98c171ea f48c20b8 .config console log report ci-upstream-bpf-next-kasan-gce
* Struck through repros no longer work on HEAD.