syzbot


possible deadlock in hci_conn_hash_flush

Status: auto-obsoleted due to no activity on 2023/06/17 14:27
Subsystems: bluetooth
Reported-by: syzbot+76a9cc07a77bc3e48ef7@syzkaller.appspotmail.com
First crash: 733d, last: 602d
Discussions (1)
Title | Replies (including bot) | Last reply
[syzbot] possible deadlock in hci_conn_hash_flush | 0 (1) | 2022/10/10 07:18
Similar bugs (4)
Kernel | Title | Repro | Cause bisect | Fix bisect | Count | Last | Reported | Patched | Status
linux-6.1 | possible deadlock in hci_conn_hash_flush (2) origin:lts-only | syz | done | | 13 | 151d | 227d | 0/3 | upstream: reported syz repro on 2024/02/27 07:43
linux-5.15 | possible deadlock in hci_conn_hash_flush | | | | 9 | 419d | 496d | 0/3 | auto-obsoleted due to no activity on 2023/11/27 03:55
linux-5.15 | possible deadlock in hci_conn_hash_flush (2) origin:lts-only | syz | | | 5 | 137d | 262d | 0/3 | upstream: reported syz repro on 2024/01/23 23:43
linux-6.1 | possible deadlock in hci_conn_hash_flush | | | | 5 | 443d | 548d | 0/3 | auto-obsoleted due to no activity on 2023/11/03 12:59

Sample crash report:
Bluetooth: hci4: Controller not accepting commands anymore: ncmd = 0
Bluetooth: hci4: Injecting HCI hardware error event
Bluetooth: hci4: hardware error 0x00
======================================================
WARNING: possible circular locking dependency detected
6.2.0-rc8-syzkaller-00098-gec35307e18ba #0 Not tainted
------------------------------------------------------
kworker/u5:2/5095 is trying to acquire lock:
ffff888023c2be70 ((work_completion)(&(&conn->timeout_work)->work)){+.+.}-{0:0}, at: __flush_work+0xdd/0xaf0 kernel/workqueue.c:3066

but task is already holding lock:
ffffffff8e2fefa8 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_disconn_cfm include/net/bluetooth/hci_core.h:1790 [inline]
ffffffff8e2fefa8 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_conn_hash_flush+0xd9/0x260 net/bluetooth/hci_conn.c:2437

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (hci_cb_list_lock){+.+.}-{3:3}:
       __mutex_lock_common kernel/locking/mutex.c:603 [inline]
       __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
       hci_connect_cfm include/net/bluetooth/hci_core.h:1775 [inline]
       hci_remote_features_evt+0x49e/0xa20 net/bluetooth/hci_event.c:3768
       hci_event_func net/bluetooth/hci_event.c:7485 [inline]
       hci_event_packet+0x956/0xfd0 net/bluetooth/hci_event.c:7537
       hci_rx_work+0xaeb/0x1230 net/bluetooth/hci_core.c:4043
       process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
       worker_thread+0x669/0x1090 kernel/workqueue.c:2436
       kthread+0x2e8/0x3a0 kernel/kthread.c:376
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

-> #2 (&hdev->lock){+.+.}-{3:3}:
       __mutex_lock_common kernel/locking/mutex.c:603 [inline]
       __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
       sco_sock_connect+0x1ea/0xa60 net/bluetooth/sco.c:593
       __sys_connect_file+0x153/0x1a0 net/socket.c:1979
       __sys_connect+0x165/0x1a0 net/socket.c:1996
       __do_sys_connect net/socket.c:2006 [inline]
       __se_sys_connect net/socket.c:2003 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:2003
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}:
       lock_sock_nested+0x3a/0xf0 net/core/sock.c:3471
       lock_sock include/net/sock.h:1725 [inline]
       sco_sock_timeout+0xd1/0x290 net/bluetooth/sco.c:97
       process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
       worker_thread+0x669/0x1090 kernel/workqueue.c:2436
       kthread+0x2e8/0x3a0 kernel/kthread.c:376
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

-> #0 ((work_completion)(&(&conn->timeout_work)->work)){+.+.}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3097 [inline]
       check_prevs_add kernel/locking/lockdep.c:3216 [inline]
       validate_chain kernel/locking/lockdep.c:3831 [inline]
       __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
       lock_acquire kernel/locking/lockdep.c:5668 [inline]
       lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
       __flush_work+0x109/0xaf0 kernel/workqueue.c:3069
       __cancel_work_timer+0x3f9/0x570 kernel/workqueue.c:3160
       sco_conn_del+0x1b5/0x2b0 net/bluetooth/sco.c:205
       sco_disconn_cfm+0x75/0xb0 net/bluetooth/sco.c:1379
       hci_disconn_cfm include/net/bluetooth/hci_core.h:1793 [inline]
       hci_conn_hash_flush+0x126/0x260 net/bluetooth/hci_conn.c:2437
       hci_dev_close_sync+0x5fb/0x1200 net/bluetooth/hci_sync.c:4850
       hci_dev_do_close+0x31/0x70 net/bluetooth/hci_core.c:554
       hci_error_reset+0xa2/0x140 net/bluetooth/hci_core.c:1059
       process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
       worker_thread+0x669/0x1090 kernel/workqueue.c:2436
       kthread+0x2e8/0x3a0 kernel/kthread.c:376
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

other info that might help us debug this:

Chain exists of:
  (work_completion)(&(&conn->timeout_work)->work) --> &hdev->lock --> hci_cb_list_lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(hci_cb_list_lock);
                               lock(&hdev->lock);
                               lock(hci_cb_list_lock);
  lock((work_completion)(&(&conn->timeout_work)->work));

 *** DEADLOCK ***

5 locks held by kworker/u5:2/5095:
 #0: ffff888072435938 ((wq_completion)hci4){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888072435938 ((wq_completion)hci4){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline]
 #0: ffff888072435938 ((wq_completion)hci4){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1280 [inline]
 #0: ffff888072435938 ((wq_completion)hci4){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:636 [inline]
 #0: ffff888072435938 ((wq_completion)hci4){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:663 [inline]
 #0: ffff888072435938 ((wq_completion)hci4){+.+.}-{0:0}, at: process_one_work+0x86d/0x1710 kernel/workqueue.c:2260
 #1: ffffc900035efda8 ((work_completion)(&hdev->error_reset)){+.+.}-{0:0}, at: process_one_work+0x8a1/0x1710 kernel/workqueue.c:2264
 #2: ffff88802022d028 (&hdev->req_lock){+.+.}-{3:3}, at: hci_dev_do_close+0x29/0x70 net/bluetooth/hci_core.c:552
 #3: ffff88802022c078 (&hdev->lock){+.+.}-{3:3}, at: hci_dev_close_sync+0x306/0x1200 net/bluetooth/hci_sync.c:4837
 #4: ffffffff8e2fefa8 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_disconn_cfm include/net/bluetooth/hci_core.h:1790 [inline]
 #4: ffffffff8e2fefa8 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_conn_hash_flush+0xd9/0x260 net/bluetooth/hci_conn.c:2437

stack backtrace:
CPU: 1 PID: 5095 Comm: kworker/u5:2 Not tainted 6.2.0-rc8-syzkaller-00098-gec35307e18ba #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
Workqueue: hci4 hci_error_reset
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
 check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2177
 check_prev_add kernel/locking/lockdep.c:3097 [inline]
 check_prevs_add kernel/locking/lockdep.c:3216 [inline]
 validate_chain kernel/locking/lockdep.c:3831 [inline]
 __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
 lock_acquire kernel/locking/lockdep.c:5668 [inline]
 lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
 __flush_work+0x109/0xaf0 kernel/workqueue.c:3069
 __cancel_work_timer+0x3f9/0x570 kernel/workqueue.c:3160
 sco_conn_del+0x1b5/0x2b0 net/bluetooth/sco.c:205
 sco_disconn_cfm+0x75/0xb0 net/bluetooth/sco.c:1379
 hci_disconn_cfm include/net/bluetooth/hci_core.h:1793 [inline]
 hci_conn_hash_flush+0x126/0x260 net/bluetooth/hci_conn.c:2437
 hci_dev_close_sync+0x5fb/0x1200 net/bluetooth/hci_sync.c:4850
 hci_dev_do_close+0x31/0x70 net/bluetooth/hci_core.c:554
 hci_error_reset+0xa2/0x140 net/bluetooth/hci_core.c:1059
 process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
 worker_thread+0x669/0x1090 kernel/workqueue.c:2436
 kthread+0x2e8/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>
Bluetooth: hci4: Opcode 0x c03 failed: -110
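
The dependency chain in the report above can be read as a directed graph in which an edge A -> B means "B was acquired while A was held". The following is a minimal sketch of that idea, not lockdep's actual algorithm. The edge list is an interpretation of entries #0 to #3 of the report, the shortened lock names are illustrative, and `find_cycle` is a hypothetical helper written for this example.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of nodes (first == last), or None.

    Plain depth-first search with three colours; nodes that are still on
    the current DFS path are GREY, so reaching a GREY node closes a cycle.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: cycle found.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Dependencies already recorded before the warning (entries #1..#3),
# as interpreted from the report; names are shortened for readability.
existing = {
    "timeout_work": ["sk_lock"],         # #1: sco_sock_timeout takes sk_lock
    "sk_lock": ["hdev->lock"],           # #2: sco_sock_connect takes hdev->lock
    "hdev->lock": ["hci_cb_list_lock"],  # #3: hci_connect_cfm takes hci_cb_list_lock
    "hci_cb_list_lock": [],
}

# Entry #0: with hci_cb_list_lock held, sco_conn_del cancels and flushes
# conn->timeout_work, which adds the edge that closes the loop.
with_new_edge = {name: list(deps) for name, deps in existing.items()}
with_new_edge["hci_cb_list_lock"].append("timeout_work")

cycle = find_cycle(with_new_edge)

if __name__ == "__main__":
    print("before new edge:", find_cycle(existing))
    print("after new edge: ", " -> ".join(cycle or []))
```

Under these assumptions the graph is acyclic until the edge from `hci_cb_list_lock` to the work item is added, which matches the "which lock already depends on the new lock" line in the report.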

Crashes (4):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2023/02/17 14:27 | upstream | ec35307e18ba | 3e7039f4 | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-kasan-gce-selinux-root | possible deadlock in hci_conn_hash_flush
2022/12/13 15:14 | upstream | 764822972d64 | 67be1ae7 | .config | console log | report | | | info | | ci-upstream-kasan-gce-selinux-root | possible deadlock in hci_conn_hash_flush
2022/11/30 20:49 | upstream | 04aa64375f48 | 4c2a66e8 | .config | console log | report | | | info | | ci-upstream-kasan-gce-selinux-root | possible deadlock in hci_conn_hash_flush
2022/10/09 09:35 | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci | bbed346d5a96 | aea5da89 | .config | console log | report | | | info | [disk image] [vmlinux] | ci-upstream-gce-arm64 | possible deadlock in hci_conn_hash_flush