syzbot


possible deadlock in kvm_arch_pm_notifier

Status: upstream: reported on 2025/01/06 17:13
Subsystems: kvm
Reported-by: syzbot+352e553a86e0d75f5120@syzkaller.appspotmail.com
Fix commit: KVM: x86: Don't take kvm->lock when iterating over vCPUs in suspend notifier
Patched on: [ci-upstream-linux-next-kasan-gce-root], missing on: [ci-qemu-gce-upstream-auto ci-qemu-native-arm64-kvm ci-qemu-upstream ci-qemu-upstream-386 ci-qemu2-arm32 ci-qemu2-arm64 ci-qemu2-arm64-compat ci-qemu2-arm64-mte ci-qemu2-riscv64 ci-snapshot-upstream-root ci-upstream-bpf-kasan-gce ci-upstream-bpf-next-kasan-gce ci-upstream-gce-arm64 ci-upstream-gce-leak ci-upstream-kasan-badwrites-root ci-upstream-kasan-gce ci-upstream-kasan-gce-386 ci-upstream-kasan-gce-root ci-upstream-kasan-gce-selinux-root ci-upstream-kasan-gce-smack-root ci-upstream-kmsan-gce-386-root ci-upstream-kmsan-gce-root ci-upstream-net-kasan-gce ci-upstream-net-this-kasan-gce ci2-upstream-fs ci2-upstream-kcsan-gce ci2-upstream-usb]
First crash: 43d, last: 13d
Discussions (3)
Title | Replies (including bot) | Last reply
[PATCH v2 00/11] KVM: x86: pvclock fixes and cleanups | 20 (20) | 2025/02/15 00:50
[PATCH 00/10] KVM: x86: pvclock fixes and cleanups | 30 (30) | 2025/01/21 18:45
[syzbot] [kvm?] possible deadlock in kvm_arch_pm_notifier | 1 (2) | 2025/01/07 15:18
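
The splat below flags the suspend path of this notifier: kvm_arch_suspend_notifier() (arch/x86/kvm/x86.c:6910) takes kvm->lock while (pm_chain_head).rwsem is already held, and an existing chain already orders kvm->lock before that rwsem (kvm->lock -> vcpu->mutex -> ctx->uring_lock -> tty->ldisc_sem -> (pm_chain_head).rwsem), so lockdep reports a possible circular dependency. A minimal C sketch of the pre-fix pattern, reconstructed from the traces rather than quoted from the upstream source (the vCPU walk and the kvm_set_guest_paused() call are assumptions from context):

static int kvm_arch_suspend_notifier(struct kvm *kvm)
{
	struct kvm_vcpu *vcpu;
	unsigned long i;
	int ret = 0;

	/*
	 * Called from the PM notifier chain, i.e. with
	 * (pm_chain_head).rwsem already held. Taking kvm->lock here
	 * orders that rwsem before kvm->lock and closes the cycle.
	 */
	mutex_lock(&kvm->lock);
	kvm_for_each_vcpu(i, vcpu, kvm) {
		/* Pause the guest's paravirt clock on each vCPU. */
		ret = kvm_set_guest_paused(vcpu);
		if (ret)
			break;
	}
	mutex_unlock(&kvm->lock);

	return ret ? NOTIFY_BAD : NOTIFY_DONE;
}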

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
6.13.0-syzkaller-09760-g69e858e0b8b2 #0 Not tainted
------------------------------------------------------
syz.4.4437/23112 is trying to acquire lock:
ffffc900051beb58 (&kvm->lock){+.+.}-{4:4}, at: kvm_arch_suspend_notifier arch/x86/kvm/x86.c:6910 [inline]
ffffc900051beb58 (&kvm->lock){+.+.}-{4:4}, at: kvm_arch_pm_notifier+0xda/0x370 arch/x86/kvm/x86.c:6932

but task is already holding lock:
ffffffff8e80c2d0 ((pm_chain_head).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain_robust+0xac/0x1e0 kernel/notifier.c:344

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 ((pm_chain_head).rwsem){++++}-{4:4}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
       down_write+0x99/0x220 kernel/locking/rwsem.c:1577
       __blocking_notifier_chain_register kernel/notifier.c:263 [inline]
       blocking_notifier_chain_register+0x50/0xc0 kernel/notifier.c:282
       hci_register_dev+0x6f4/0x8b0 net/bluetooth/hci_core.c:2632
       hci_uart_register_dev drivers/bluetooth/hci_ldisc.c:686 [inline]
       hci_uart_set_proto drivers/bluetooth/hci_ldisc.c:710 [inline]
       hci_uart_tty_ioctl+0x821/0x9e0 drivers/bluetooth/hci_ldisc.c:762
       tty_ioctl+0x998/0xdc0 drivers/tty/tty_io.c:2811
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:906 [inline]
       __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #3 (&tty->ldisc_sem){++++}-{0:0}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
       __ldsem_down_read_nested+0xb1/0x9a0 drivers/tty/tty_ldsem.c:300
       tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
       tty_poll+0x6a/0x170 drivers/tty/tty_io.c:2204
       vfs_poll include/linux/poll.h:82 [inline]
       __io_arm_poll_handler+0x34b/0x9d0 io_uring/poll.c:578
       io_poll_add+0xe5/0x240 io_uring/poll.c:888
       io_issue_sqe+0x37f/0x12b0 io_uring/io_uring.c:1735
       io_queue_sqe io_uring/io_uring.c:1945 [inline]
       io_submit_sqe io_uring/io_uring.c:2200 [inline]
       io_submit_sqes+0xa75/0x1d60 io_uring/io_uring.c:2317
       __do_sys_io_uring_enter io_uring/io_uring.c:3368 [inline]
       __se_sys_io_uring_enter+0x2c8/0x3390 io_uring/io_uring.c:3303
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #2 (&ctx->uring_lock){+.+.}-{4:4}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
       __mutex_lock_common kernel/locking/mutex.c:585 [inline]
       __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
       io_handle_tw_list+0x1b2/0x500 io_uring/io_uring.c:1054
       tctx_task_work_run+0x9a/0x370 io_uring/io_uring.c:1121
       tctx_task_work+0x9a/0x100 io_uring/io_uring.c:1139
       task_work_run+0x24f/0x310 kernel/task_work.c:227
       resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
       xfer_to_guest_mode_work kernel/entry/kvm.c:20 [inline]
       xfer_to_guest_mode_handle_work+0x88/0xd0 kernel/entry/kvm.c:47
       vcpu_run+0xfb2/0x89e0 arch/x86/kvm/x86.c:11275
       kvm_arch_vcpu_ioctl_run+0xa68/0x1940 arch/x86/kvm/x86.c:11572
       kvm_vcpu_ioctl+0x996/0x1020 virt/kvm/kvm_main.c:4385
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:906 [inline]
       __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #1 (&vcpu->mutex){+.+.}-{4:4}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
       __mutex_lock_common kernel/locking/mutex.c:585 [inline]
       __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
       kvm_vm_ioctl_create_vcpu+0x55f/0x8b0 virt/kvm/kvm_main.c:4147
       kvm_vm_ioctl+0x7e2/0xd30 virt/kvm/kvm_main.c:5064
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:906 [inline]
       __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (&kvm->lock){+.+.}-{4:4}:
       check_prev_add kernel/locking/lockdep.c:3163 [inline]
       check_prevs_add kernel/locking/lockdep.c:3282 [inline]
       validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
       __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
       __mutex_lock_common kernel/locking/mutex.c:585 [inline]
       __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
       kvm_arch_suspend_notifier arch/x86/kvm/x86.c:6910 [inline]
       kvm_arch_pm_notifier+0xda/0x370 arch/x86/kvm/x86.c:6932
       notifier_call_chain+0x1a5/0x3f0 kernel/notifier.c:85
       notifier_call_chain_robust kernel/notifier.c:120 [inline]
       blocking_notifier_call_chain_robust+0xe8/0x1e0 kernel/notifier.c:345
       pm_notifier_call_chain_robust+0x2c/0x60 kernel/power/main.c:102
       snapshot_open+0x19b/0x280 kernel/power/user.c:77
       misc_open+0x2cc/0x340 drivers/char/misc.c:179
       chrdev_open+0x521/0x600 fs/char_dev.c:414
       do_dentry_open+0xdec/0x1960 fs/open.c:955
       vfs_open+0x3b/0x370 fs/open.c:1085
       do_open fs/namei.c:3830 [inline]
       path_openat+0x2c81/0x3590 fs/namei.c:3989
       do_filp_open+0x27f/0x4e0 fs/namei.c:4016
       do_sys_openat2+0x13e/0x1d0 fs/open.c:1427
       do_sys_open fs/open.c:1442 [inline]
       __do_sys_openat fs/open.c:1458 [inline]
       __se_sys_openat fs/open.c:1453 [inline]
       __x64_sys_openat+0x247/0x2a0 fs/open.c:1453
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
  &kvm->lock --> &tty->ldisc_sem --> (pm_chain_head).rwsem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rlock((pm_chain_head).rwsem);
                               lock(&tty->ldisc_sem);
                               lock((pm_chain_head).rwsem);
  lock(&kvm->lock);

 *** DEADLOCK ***

3 locks held by syz.4.4437/23112:
 #0: ffffffff8f17f248 (misc_mtx){+.+.}-{4:4}, at: misc_open+0x54/0x340 drivers/char/misc.c:143
 #1: ffffffff8e7eca48 (system_transition_mutex){+.+.}-{4:4}, at: lock_system_sleep+0x60/0xa0 kernel/power/main.c:56
 #2: ffffffff8e80c2d0 ((pm_chain_head).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain_robust+0xac/0x1e0 kernel/notifier.c:344

stack backtrace:
CPU: 1 UID: 0 PID: 23112 Comm: syz.4.4437 Not tainted 6.13.0-syzkaller-09760-g69e858e0b8b2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2076
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2208
 check_prev_add kernel/locking/lockdep.c:3163 [inline]
 check_prevs_add kernel/locking/lockdep.c:3282 [inline]
 validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
 __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
 __mutex_lock_common kernel/locking/mutex.c:585 [inline]
 __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
 kvm_arch_suspend_notifier arch/x86/kvm/x86.c:6910 [inline]
 kvm_arch_pm_notifier+0xda/0x370 arch/x86/kvm/x86.c:6932
 notifier_call_chain+0x1a5/0x3f0 kernel/notifier.c:85
 notifier_call_chain_robust kernel/notifier.c:120 [inline]
 blocking_notifier_call_chain_robust+0xe8/0x1e0 kernel/notifier.c:345
 pm_notifier_call_chain_robust+0x2c/0x60 kernel/power/main.c:102
 snapshot_open+0x19b/0x280 kernel/power/user.c:77
 misc_open+0x2cc/0x340 drivers/char/misc.c:179
 chrdev_open+0x521/0x600 fs/char_dev.c:414
 do_dentry_open+0xdec/0x1960 fs/open.c:955
 vfs_open+0x3b/0x370 fs/open.c:1085
 do_open fs/namei.c:3830 [inline]
 path_openat+0x2c81/0x3590 fs/namei.c:3989
 do_filp_open+0x27f/0x4e0 fs/namei.c:4016
 do_sys_openat2+0x13e/0x1d0 fs/open.c:1427
 do_sys_open fs/open.c:1442 [inline]
 __do_sys_openat fs/open.c:1458 [inline]
 __se_sys_openat fs/open.c:1453 [inline]
 __x64_sys_openat+0x247/0x2a0 fs/open.c:1453
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f583078cda9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f58315de038 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 00007f58309a6240 RCX: 00007f583078cda9
RDX: 00000000001c5100 RSI: 00000000200002c0 RDI: ffffffffffffff9c
RBP: 00007f583080e2a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f58309a6240 R15: 00007fff820aeb68
 </TASK>
Bluetooth: hci0: Opcode 0x0c1a failed: -4
Bluetooth: hci0: Opcode 0x0406 failed: -4
Bluetooth: hci2: Opcode 0x0c1a failed: -4
Bluetooth: hci2: Opcode 0x0406 failed: -4
Bluetooth: hci5: Opcode 0x0c1a failed: -4
Bluetooth: hci5: Opcode 0x0406 failed: -4
Bluetooth: hci5: Opcode 0x0406 failed: -4
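
The reverse edges in the chain above come from unrelated subsystems: KVM_CREATE_VCPU takes vcpu->mutex under kvm->lock, io_uring task work runs with vcpu->mutex held and takes ctx->uring_lock, polling a tty through io_uring takes tty->ldisc_sem, and hci_uart line-discipline setup registers a PM notifier under ldisc_sem, acquiring (pm_chain_head).rwsem. The fix commit listed in the header removes the forward edge instead. A sketch of the post-fix shape, assuming only what the commit title states (the vCPU walk no longer takes kvm->lock); the actual patch may differ in detail:

static int kvm_arch_suspend_notifier(struct kvm *kvm)
{
	struct kvm_vcpu *vcpu;
	unsigned long i;
	int ret = 0;

	/*
	 * No kvm->lock: vCPUs are never removed once created, so
	 * kvm_for_each_vcpu() can walk the vCPU array without it, and
	 * (pm_chain_head).rwsem is never ordered before kvm->lock.
	 */
	kvm_for_each_vcpu(i, vcpu, kvm) {
		ret = kvm_set_guest_paused(vcpu);
		if (ret)
			break;
	}

	return ret ? NOTIFY_BAD : NOTIFY_DONE;
}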

Crashes (2):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets (help?) | Manager | Title
2025/02/02 20:15 | upstream | 69e858e0b8b2 | 568559e4 | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-kasan-gce-smack-root | possible deadlock in kvm_arch_pm_notifier
2025/01/03 07:24 | upstream | 0bc21e701a6f | d3ccff63 | .config | console log | report | | | info | [disk image (non-bootable)] [vmlinux] [kernel image] | ci-qemu-upstream-386 | possible deadlock in kvm_arch_pm_notifier