syzbot


possible deadlock in wake_up_all_idle_cpus

Status: fixed on 2022/03/08 16:11
Subsystems: perf
Reported-by: syzbot+d5b23b18d2f4feae8a67@syzkaller.appspotmail.com
Fix commit: 96611c26dc35 sched: Improve wake_up_all_idle_cpus() take #2
First crash: 924d, last: 915d
Cause bisection: introduced by (bisect log):
commit 8850cb663b5cda04d33f9cfbc38889d73d3c8e24
Author: Peter Zijlstra <peterz@infradead.org>
Date: Tue Sep 21 20:16:02 2021 +0000

  sched: Simplify wake_up_*idle*()

Crash: possible deadlock in wake_up_all_idle_cpus (log)
Repro: C syz .config
  
Discussions (3)
Title                                                             Replies (including bot)  Last reply
[tip: sched/core] sched: Improve wake_up_all_idle_cpus() take #2  1 (1)                    2021/10/22 15:42
[syzbot] possible deadlock in wake_up_all_idle_cpus               3 (6)                    2021/10/20 06:32
5.15-rc on x86-32: chromium dies with floating point exception    9 (9)                    2021/10/18 14:52

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
5.15.0-rc5-next-20211015-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor097/6554 is trying to acquire lock:
ffffffff8ba2e370 (cpu_hotplug_lock){++++}-{0:0}, at: wake_up_all_idle_cpus+0x13/0x80 kernel/smp.c:1173

but task is already holding lock:
ffff888079003228 (&mm->mmap_lock#2){++++}-{3:3}, at: mmap_write_lock_killable include/linux/mmap_lock.h:87 [inline]
ffff888079003228 (&mm->mmap_lock#2){++++}-{3:3}, at: vm_mmap_pgoff+0x15c/0x290 mm/util.c:517

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (&mm->mmap_lock#2){++++}-{3:3}:
       __might_fault mm/memory.c:5244 [inline]
       __might_fault+0x104/0x170 mm/memory.c:5229
       _copy_from_user+0x27/0x180 lib/usercopy.c:13
       copy_from_user include/linux/uaccess.h:192 [inline]
       memdup_user+0x69/0xc0 mm/util.c:177
       strndup_user+0x70/0xe0 mm/util.c:232
       perf_event_set_filter kernel/events/core.c:10512 [inline]
       _perf_ioctl+0x1a2/0x1f00 kernel/events/core.c:5659
       perf_ioctl+0x76/0xb0 kernel/events/core.c:5730
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:874 [inline]
       __se_sys_ioctl fs/ioctl.c:860 [inline]
       __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #2 (&cpuctx_mutex){+.+.}-{3:3}:
       __mutex_lock_common kernel/locking/mutex.c:599 [inline]
       __mutex_lock+0x12f/0x12f0 kernel/locking/mutex.c:732
       perf_event_init_cpu+0x172/0x3e0 kernel/events/core.c:13295
       perf_event_init+0x39d/0x408 kernel/events/core.c:13342
       start_kernel+0x2bb/0x49b init/main.c:1059
       secondary_startup_64_no_verify+0xb0/0xbb

-> #1 (pmus_lock){+.+.}-{3:3}:
       __mutex_lock_common kernel/locking/mutex.c:599 [inline]
       __mutex_lock+0x12f/0x12f0 kernel/locking/mutex.c:732
       perf_event_init_cpu+0xc4/0x3e0 kernel/events/core.c:13289
       cpuhp_invoke_callback+0x3b5/0x9a0 kernel/cpu.c:190
       cpuhp_invoke_callback_range kernel/cpu.c:665 [inline]
       cpuhp_up_callbacks kernel/cpu.c:693 [inline]
       _cpu_up+0x3b0/0x790 kernel/cpu.c:1368
       cpu_up kernel/cpu.c:1404 [inline]
       cpu_up+0xfe/0x1a0 kernel/cpu.c:1376
       bringup_nonboot_cpus+0xfe/0x130 kernel/cpu.c:1470
       smp_init+0x2e/0x145 kernel/smp.c:1092
       kernel_init_freeable+0x477/0x73a init/main.c:1614
       kernel_init+0x1a/0x1d0 init/main.c:1511
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

-> #0 (cpu_hotplug_lock){++++}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3063 [inline]
       check_prevs_add kernel/locking/lockdep.c:3186 [inline]
       validate_chain kernel/locking/lockdep.c:3801 [inline]
       __lock_acquire+0x2a07/0x54a0 kernel/locking/lockdep.c:5027
       lock_acquire kernel/locking/lockdep.c:5637 [inline]
       lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5602
       percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
       cpus_read_lock+0x3e/0x140 kernel/cpu.c:308
       wake_up_all_idle_cpus+0x13/0x80 kernel/smp.c:1173
       cpu_latency_qos_apply kernel/power/qos.c:249 [inline]
       cpu_latency_qos_remove_request.part.0+0xc4/0x2f0 kernel/power/qos.c:328
       cpu_latency_qos_remove_request+0x65/0x80 kernel/power/qos.c:330
       snd_pcm_hw_params+0x1481/0x1990 sound/core/pcm_native.c:784
       snd_pcm_kernel_ioctl+0x164/0x310 sound/core/pcm_native.c:3355
       snd_pcm_oss_change_params_locked+0x1936/0x3a60 sound/core/oss/pcm_oss.c:947
       snd_pcm_oss_change_params sound/core/oss/pcm_oss.c:1091 [inline]
       snd_pcm_oss_mmap+0x442/0x550 sound/core/oss/pcm_oss.c:2910
       call_mmap include/linux/fs.h:2164 [inline]
       mmap_region+0xd8c/0x1650 mm/mmap.c:1787
       do_mmap+0x869/0xfb0 mm/mmap.c:1575
       vm_mmap_pgoff+0x1b7/0x290 mm/util.c:519
       ksys_mmap_pgoff+0x49f/0x620 mm/mmap.c:1624
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

Chain exists of:
  cpu_hotplug_lock --> &cpuctx_mutex --> &mm->mmap_lock#2

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&mm->mmap_lock#2);
                               lock(&cpuctx_mutex);
                               lock(&mm->mmap_lock#2);
  lock(cpu_hotplug_lock);

 *** DEADLOCK ***

2 locks held by syz-executor097/6554:
 #0: ffff888079003228 (&mm->mmap_lock#2){++++}-{3:3}, at: mmap_write_lock_killable include/linux/mmap_lock.h:87 [inline]
 #0: ffff888079003228 (&mm->mmap_lock#2){++++}-{3:3}, at: vm_mmap_pgoff+0x15c/0x290 mm/util.c:517
 #1: ffff88802317a440 (&runtime->oss.params_lock){+.+.}-{3:3}, at: snd_pcm_oss_change_params sound/core/oss/pcm_oss.c:1086 [inline]
 #1: ffff88802317a440 (&runtime->oss.params_lock){+.+.}-{3:3}, at: snd_pcm_oss_mmap+0x424/0x550 sound/core/oss/pcm_oss.c:2910

stack backtrace:
CPU: 0 PID: 6554 Comm: syz-executor097 Not tainted 5.15.0-rc5-next-20211015-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2143
 check_prev_add kernel/locking/lockdep.c:3063 [inline]
 check_prevs_add kernel/locking/lockdep.c:3186 [inline]
 validate_chain kernel/locking/lockdep.c:3801 [inline]
 __lock_acquire+0x2a07/0x54a0 kernel/locking/lockdep.c:5027
 lock_acquire kernel/locking/lockdep.c:5637 [inline]
 lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5602
 percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
 cpus_read_lock+0x3e/0x140 kernel/cpu.c:308
 wake_up_all_idle_cpus+0x13/0x80 kernel/smp.c:1173
 cpu_latency_qos_apply kernel/power/qos.c:249 [inline]
 cpu_latency_qos_remove_request.part.0+0xc4/0x2f0 kernel/power/qos.c:328
 cpu_latency_qos_remove_request+0x65/0x80 kernel/power/qos.c:330
 snd_pcm_hw_params+0x1481/0x1990 sound/core/pcm_native.c:784
 snd_pcm_kernel_ioctl+0x164/0x310 sound/core/pcm_native.c:3355
 snd_pcm_oss_change_params_locked+0x1936/0x3a60 sound/core/oss/pcm_oss.c:947
 snd_pcm_oss_change_params sound/core/oss/pcm_oss.c:1091 [inline]
 snd_pcm_oss_mmap+0x442/0x550 sound/core/oss/pcm_oss.c:2910
 call_mmap include/linux/fs.h:2164 [inline]
 mmap_region+0xd8c/0x1650 mm/mmap.c:1787
 do_mmap+0x869/0xfb0 mm/mmap.c:1575
 vm_mmap_pgoff+0x1b7/0x290 mm/util.c:519
 ksys_mmap_pgoff+0x49f/0x620 mm/mmap.c:1624
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fc0cb6221c9
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fffcd48ab28 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc0cb6221c9
RDX: 0000000001800003 RSI: 0000000000800000 RDI: 0000000020000000
RBP: 00007fc0cb5e61b0 R08: 0000000000000004 R09: 0000000000000000
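
The circular dependency in the report above can be illustrated with a small graph model. This is a minimal sketch, not lockdep's actual implementation: it assumes each entry #1 to #3 of the dependency chain contributes one "held lock -> acquired lock" edge, and the names `EXISTING_EDGES`, `find_path` and `check_new_dependency` are hypothetical helpers introduced here for illustration. It checks whether the new acquisition from entry #0 (taking `cpu_hotplug_lock` while `mmap_lock` is held) would close a cycle.

```python
from typing import Dict, List, Optional

# Edges read "lock already held -> lock acquired while holding it".
# Transcribed from entries #1..#3 of the dependency chain in the report;
# the lock class names are simplified (e.g. "&mm->mmap_lock#2" -> "mmap_lock").
EXISTING_EDGES: Dict[str, List[str]] = {
    "cpu_hotplug_lock": ["pmus_lock"],   # #1: _cpu_up -> perf_event_init_cpu
    "pmus_lock": ["cpuctx_mutex"],       # #2: perf_event_init_cpu
    "cpuctx_mutex": ["mmap_lock"],       # #3: perf ioctl -> copy_from_user
}


def find_path(graph: Dict[str, List[str]], src: str, dst: str) -> Optional[List[str]]:
    """Return one path from src to dst (iterative DFS), or None if unreachable."""
    stack = [(src, [src])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


def check_new_dependency(graph: Dict[str, List[str]], held: str, wanted: str) -> Optional[List[str]]:
    """Return the cycle that the new edge held -> wanted would close, or None."""
    # A cycle exists if 'held' is already reachable from 'wanted'.
    back_path = find_path(graph, wanted, held)
    if back_path is None:
        return None
    return [held] + back_path


# Entry #0: the mmap path holds mmap_lock and asks for cpu_hotplug_lock.
cycle = check_new_dependency(EXISTING_EDGES, "mmap_lock", "cpu_hotplug_lock")
print(" -> ".join(cycle) if cycle else "no cycle")
```

In this model the reverse order (holding `cpu_hotplug_lock` and then taking `mmap_lock`) closes no cycle, which matches the report: only the acquisition in `wake_up_all_idle_cpus` under `mmap_lock` is flagged.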

Crashes (14):
Time              Kernel      Commit        Syzkaller  Config   Log          Report  Syz repro  C repro  VM info  Assets  Manager                                Title
2021/10/16 18:06  linux-next  7c832d2f9b95  0c5d9412   .config  console log  report  syz        C        -        -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/22 23:09  linux-next  cf6c9d12750c  282f03fb   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/21 01:01  linux-next  51dba6e335ff  f111d03b   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/20 23:19  linux-next  51dba6e335ff  f111d03b   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/20 08:49  linux-next  51dba6e335ff  466b7db1   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/19 20:59  linux-next  60e8840126bd  466b7db1   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/19 16:48  linux-next  60e8840126bd  24dc29db   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/18 10:35  linux-next  60e8840126bd  0c5d9412   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/18 09:16  linux-next  7c832d2f9b95  0c5d9412   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/17 23:49  linux-next  7c832d2f9b95  0c5d9412   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/17 21:58  linux-next  7c832d2f9b95  0c5d9412   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/17 04:20  linux-next  7c832d2f9b95  0c5d9412   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/16 17:16  linux-next  7c832d2f9b95  0c5d9412   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
2021/10/13 23:38  linux-next  8006b911c90a  5462d470   .config  console log  report  -          -        info     -       ci-upstream-linux-next-kasan-gce-root  possible deadlock in wake_up_all_idle_cpus
* Struck through repros no longer work on HEAD.