syzbot


possible deadlock in try_to_wake_up (6)

Status: upstream: reported on 2024/09/01 05:41
Subsystems: kernel
Reported-by: syzbot+353ce01560bf76a2c560@syzkaller.appspotmail.com
First crash: 19d, last: 19d
Discussions (1)
Title | Replies (including bot) | Last reply
[syzbot] [kernel?] possible deadlock in try_to_wake_up (6) | 0 (1) | 2024/09/01 05:41
Similar bugs (8)
Kernel | Title | Repro | Cause bisect / Fix bisect | Count | Last | Reported | Patched | Status
linux-6.1 | possible deadlock in try_to_wake_up (2) origin:upstream | C | | 20 | 11d | 130d | 0/3 | upstream: reported C repro on 2024/05/09 05:15
linux-6.1 | possible deadlock in try_to_wake_up | C | done | 1 | 173d | 173d | 3/3 | fixed on 2024/04/29 07:11
linux-5.15 | possible deadlock in try_to_wake_up origin:upstream | C | | 31 | 18d | 164d | 0/3 | upstream: reported C repro on 2024/04/05 17:03
upstream | possible deadlock in try_to_wake_up (2) mm | | | 1 | 729d | 725d | 0/28 | auto-obsoleted due to no activity on 2023/01/16 12:10
upstream | possible deadlock in try_to_wake_up (4) bpf net | C | error | 19 | 119d | 182d | 25/28 | fixed on 2024/05/22 23:36
upstream | possible deadlock in try_to_wake_up (5) mm | C | | 88 | 36d | 109d | 27/28 | fixed on 2024/08/14 03:44
upstream | possible deadlock in try_to_wake_up (3) net | | | 103 | 334d | 344d | 0/28 | auto-obsoleted due to no activity on 2023/11/27 02:05
upstream | possible deadlock in try_to_wake_up mm | | | 39 | 2072d | 2104d | 0/28 | auto-closed as invalid on 2019/07/13 09:55

Sample crash report:
=====================================================
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
6.11.0-rc5-next-20240827-syzkaller #0 Not tainted
-----------------------------------------------------
syz.0.48/5524 [HC1[1]:SC0[0]:HE0:SE1] is trying to acquire:
ffffffff8e8151e0 (console_lock){+.+.}-{0:0}, at: _printk+0xd5/0x120 kernel/printk/printk.c:2424

and this task is already holding:
ffff8880119ec618 (&p->pi_lock){-.-.}-{2:2}, at: class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
ffff8880119ec618 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4150
which would create a new lock dependency:
 (&p->pi_lock){-.-.}-{2:2} -> (console_lock){+.+.}-{0:0}

but this new dependency connects a HARDIRQ-irq-safe lock:
 (&p->pi_lock){-.-.}-{2:2}

... which became HARDIRQ-irq-safe at:
  lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
  _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
  class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
  try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4150
  autoremove_wake_function+0x16/0x110 kernel/sched/wait.c:384
  __wake_up_common kernel/sched/wait.c:89 [inline]
  __wake_up_common_lock+0x130/0x1e0 kernel/sched/wait.c:106
  irq_work_single+0xe2/0x240 kernel/irq_work.c:221
  irq_work_run_list kernel/irq_work.c:252 [inline]
  irq_work_run+0x18b/0x350 kernel/irq_work.c:261
  __sysvec_irq_work+0xb8/0x430 arch/x86/kernel/irq_work.c:22
  instr_sysvec_irq_work arch/x86/kernel/irq_work.c:17 [inline]
  sysvec_irq_work+0x9e/0xc0 arch/x86/kernel/irq_work.c:17
  asm_sysvec_irq_work+0x1a/0x20 arch/x86/include/asm/idtentry.h:738
  __wrmsr arch/x86/include/asm/msr.h:96 [inline]
  native_write_msr arch/x86/include/asm/msr.h:147 [inline]
  wrmsr arch/x86/include/asm/msr.h:256 [inline]
  native_apic_msr_write+0x39/0x50 arch/x86/include/asm/apic.h:212
  __apic_send_IPI_self arch/x86/include/asm/apic.h:455 [inline]
  arch_irq_work_raise+0x6f/0x80 arch/x86/kernel/irq_work.c:31
  irq_work_queue+0xa7/0x150 kernel/irq_work.c:124
  __kfence_alloc+0x241/0x370 mm/kfence/core.c:1112
  kfence_alloc include/linux/kfence.h:129 [inline]
  slab_alloc_node mm/slub.c:4073 [inline]
  __do_kmalloc_node mm/slub.c:4209 [inline]
  __kmalloc_noprof+0x374/0x400 mm/slub.c:4222
  kmalloc_noprof include/linux/slab.h:685 [inline]
  kzalloc_noprof include/linux/slab.h:817 [inline]
  __alloc_workqueue+0x10a/0x1f20 kernel/workqueue.c:5654
  alloc_workqueue+0xd6/0x210 kernel/workqueue.c:5757
  init_mm_internals+0x17/0x120 mm/vmstat.c:2183
  kernel_init_freeable+0x403/0x5d0 init/main.c:1566
  kernel_init+0x1d/0x2b0 init/main.c:1469
  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

to a HARDIRQ-irq-unsafe lock:
 (console_lock){+.+.}-{0:0}

... which became HARDIRQ-irq-unsafe at:
...
  lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
  console_lock+0x164/0x1b0 kernel/printk/printk.c:2792
  con_init+0x1c/0x9c0 drivers/tty/vt/vt.c:3627
  console_init+0x1b8/0x6f0 kernel/printk/printk.c:3919
  start_kernel+0x2d8/0x500 init/main.c:1040
  x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:507
  x86_64_start_kernel+0x9f/0xa0 arch/x86/kernel/head64.c:488
  common_startup_64+0x13e/0x147

other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(console_lock);
                               local_irq_disable();
                               lock(&p->pi_lock);
                               lock(console_lock);
  <Interrupt>
    lock(&p->pi_lock);

 *** DEADLOCK ***

6 locks held by syz.0.48/5524:
 #0: ffff8881427e3d58 (&x->wait#13){-.-.}-{2:2}, at: complete_with_flags kernel/sched/completion.c:20 [inline]
 #0: ffff8881427e3d58 (&x->wait#13){-.-.}-{2:2}, at: complete+0x28/0x1c0 kernel/sched/completion.c:47
 #1: ffff8880119ec618 (&p->pi_lock){-.-.}-{2:2}, at: class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
 #1: ffff8880119ec618 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4150
 #2: ffffffff8e8151e0 (console_lock){+.+.}-{0:0}, at: _printk+0xd5/0x120 kernel/printk/printk.c:2424
 #3: ffffffff8e814df0 (console_srcu){....}-{0:0}, at: rcu_try_lock_acquire include/linux/rcupdate.h:342 [inline]
 #3: ffffffff8e814df0 (console_srcu){....}-{0:0}, at: srcu_read_lock_nmisafe include/linux/srcu.h:267 [inline]
 #3: ffffffff8e814df0 (console_srcu){....}-{0:0}, at: console_srcu_read_lock kernel/printk/printk.c:287 [inline]
 #3: ffffffff8e814df0 (console_srcu){....}-{0:0}, at: console_flush_all+0x147/0xf50 kernel/printk/printk.c:3068
 #4: ffffffff8e815180 (console_owner){-...}-{0:0}, at: rcu_try_lock_acquire include/linux/rcupdate.h:342 [inline]
 #4: ffffffff8e815180 (console_owner){-...}-{0:0}, at: srcu_read_lock_nmisafe include/linux/srcu.h:267 [inline]
 #4: ffffffff8e815180 (console_owner){-...}-{0:0}, at: console_srcu_read_lock kernel/printk/printk.c:287 [inline]
 #4: ffffffff8e815180 (console_owner){-...}-{0:0}, at: console_flush_all+0x147/0xf50 kernel/printk/printk.c:3068
 #5: ffffffff952e0eb8 (&port_lock_key){-.-.}-{2:2}, at: uart_port_lock_irqsave include/linux/serial_core.h:711 [inline]
 #5: ffffffff952e0eb8 (&port_lock_key){-.-.}-{2:2}, at: serial8250_console_write+0x1a7/0x1ed0 drivers/tty/serial/8250/8250_port.c:3352

the dependencies between HARDIRQ-irq-safe lock and the holding lock:
-> (&p->pi_lock){-.-.}-{2:2} {
   IN-HARDIRQ-W at:
                    lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
                    __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
                    _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
                    class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
                    try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4150
                    autoremove_wake_function+0x16/0x110 kernel/sched/wait.c:384
                    __wake_up_common kernel/sched/wait.c:89 [inline]
                    __wake_up_common_lock+0x130/0x1e0 kernel/sched/wait.c:106
                    irq_work_single+0xe2/0x240 kernel/irq_work.c:221
                    irq_work_run_list kernel/irq_work.c:252 [inline]
                    irq_work_run+0x18b/0x350 kernel/irq_work.c:261
                    __sysvec_irq_work+0xb8/0x430 arch/x86/kernel/irq_work.c:22
                    instr_sysvec_irq_work arch/x86/kernel/irq_work.c:17 [inline]
                    sysvec_irq_work+0x9e/0xc0 arch/x86/kernel/irq_work.c:17
                    asm_sysvec_irq_work+0x1a/0x20 arch/x86/include/asm/idtentry.h:738
                    __wrmsr arch/x86/include/asm/msr.h:96 [inline]
                    native_write_msr arch/x86/include/asm/msr.h:147 [inline]
                    wrmsr arch/x86/include/asm/msr.h:256 [inline]
                    native_apic_msr_write+0x39/0x50 arch/x86/include/asm/apic.h:212
                    __apic_send_IPI_self arch/x86/include/asm/apic.h:455 [inline]
                    arch_irq_work_raise+0x6f/0x80 arch/x86/kernel/irq_work.c:31
                    irq_work_queue+0xa7/0x150 kernel/irq_work.c:124
                    __kfence_alloc+0x241/0x370 mm/kfence/core.c:1112
                    kfence_alloc include/linux/kfence.h:129 [inline]
                    slab_alloc_node mm/slub.c:4073 [inline]
                    __do_kmalloc_node mm/slub.c:4209 [inline]
                    __kmalloc_noprof+0x374/0x400 mm/slub.c:4222
                    kmalloc_noprof include/linux/slab.h:685 [inline]
                    kzalloc_noprof include/linux/slab.h:817 [inline]
                    __alloc_workqueue+0x10a/0x1f20 kernel/workqueue.c:5654
                    alloc_workqueue+0xd6/0x210 kernel/workqueue.c:5757
                    init_mm_internals+0x17/0x120 mm/vmstat.c:2183
                    kernel_init_freeable+0x403/0x5d0 init/main.c:1566
                    kernel_init+0x1d/0x2b0 init/main.c:1469
                    ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
                    ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
   IN-SOFTIRQ-W at:
                    lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
                    __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
                    _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
                    class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
                    try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4150
                    call_timer_fn+0x18e/0x650 kernel/time/timer.c:1794
                    expire_timers kernel/time/timer.c:1845 [inline]
                    __run_timers kernel/time/timer.c:2419 [inline]
                    __run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2430
                    run_timer_base kernel/time/timer.c:2439 [inline]
                    run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2449
                    handle_softirqs+0x2c5/0x980 kernel/softirq.c:554
                    __do_softirq kernel/softirq.c:588 [inline]
                    invoke_softirq kernel/softirq.c:428 [inline]
                    __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
                    irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
                    instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1037 [inline]
                    sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1037
                    asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
                    native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
                    arch_safe_halt arch/x86/include/asm/irqflags.h:106 [inline]
                    default_idle+0x13/0x20 arch/x86/kernel/process.c:742
                    default_idle_call+0x74/0xb0 kernel/sched/idle.c:117
                    cpuidle_idle_call kernel/sched/idle.c:185 [inline]
                    do_idle+0x22f/0x5d0 kernel/sched/idle.c:326
                    cpu_startup_entry+0x42/0x60 kernel/sched/idle.c:424
                    rest_init+0x2dc/0x300 init/main.c:747
                    start_kernel+0x47f/0x500 init/main.c:1105
                    x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:507
                    x86_64_start_kernel+0x9f/0xa0 arch/x86/kernel/head64.c:488
                    common_startup_64+0x13e/0x147
   INITIAL USE at:
                   lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
                   __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
                   _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
                   sched_cgroup_fork+0x33/0x420 kernel/sched/core.c:4731
                   copy_process+0x21c1/0x3d50 kernel/fork.c:2481
                   kernel_clone+0x226/0x8f0 kernel/fork.c:2784
                   user_mode_thread+0x132/0x1a0 kernel/fork.c:2862
                   rest_init+0x23/0x300 init/main.c:712
                   start_kernel+0x47f/0x500 init/main.c:1105
                   x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:507
                   x86_64_start_kernel+0x9f/0xa0 arch/x86/kernel/head64.c:488
                   common_startup_64+0x13e/0x147
 }
 ... key      at: [<ffffffff9319b980>] rt_mutex_init_task.__key+0x0/0x20

the dependencies between the lock to be acquired
 and HARDIRQ-irq-unsafe lock:
-> (console_lock){+.+.}-{0:0} {
   HARDIRQ-ON-W at:
                    lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
                    console_lock+0x164/0x1b0 kernel/printk/printk.c:2792
                    con_init+0x1c/0x9c0 drivers/tty/vt/vt.c:3627
                    console_init+0x1b8/0x6f0 kernel/printk/printk.c:3919
                    start_kernel+0x2d8/0x500 init/main.c:1040
                    x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:507
                    x86_64_start_kernel+0x9f/0xa0 arch/x86/kernel/head64.c:488
                    common_startup_64+0x13e/0x147
   SOFTIRQ-ON-W at:
                    lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
                    console_lock+0x164/0x1b0 kernel/printk/printk.c:2792
                    con_init+0x1c/0x9c0 drivers/tty/vt/vt.c:3627
                    console_init+0x1b8/0x6f0 kernel/printk/printk.c:3919
                    start_kernel+0x2d8/0x500 init/main.c:1040
                    x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:507
                    x86_64_start_kernel+0x9f/0xa0 arch/x86/kernel/head64.c:488
                    common_startup_64+0x13e/0x147
   INITIAL USE at:
 }
 ... key      at: [<ffffffff8e8151e0>] console_lock_dep_map+0x0/0x60
 ... acquired at:
   lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
   _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
   raw_spin_rq_lock_nested+0xb0/0x140 kernel/sched/core.c:595
   raw_spin_rq_lock kernel/sched/sched.h:1488 [inline]
   __task_rq_lock+0xdf/0x3e0 kernel/sched/core.c:665
   ttwu_runnable kernel/sched/core.c:3728 [inline]
   try_to_wake_up+0x182/0x1480 kernel/sched/core.c:4180
   swake_up_locked kernel/sched/swait.c:29 [inline]
   complete_with_flags kernel/sched/completion.c:24 [inline]
   complete+0xac/0x1c0 kernel/sched/completion.c:47
   random_recv_done+0x138/0x1e0 drivers/char/hw_random/virtio-rng.c:48
   vring_interrupt+0x21d/0x380 drivers/virtio/virtio_ring.c:2595
   __handle_irq_event_percpu+0x29a/0xa80 kernel/irq/handle.c:158
   handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
   handle_irq_event+0x89/0x1f0 kernel/irq/handle.c:210
   handle_edge_irq+0x25f/0xc20 kernel/irq/chip.c:831
   generic_handle_irq_desc include/linux/irqdesc.h:173 [inline]
   handle_irq arch/x86/kernel/irq.c:247 [inline]
   call_irq_handler arch/x86/kernel/irq.c:259 [inline]
   __common_interrupt+0x136/0x230 arch/x86/kernel/irq.c:285
   common_interrupt+0xb4/0xd0 arch/x86/kernel/irq.c:278


stack backtrace:
CPU: 1 UID: 0 PID: 5524 Comm: syz.0.48 Not tainted 6.11.0-rc5-next-20240827-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_bad_irq_dependency kernel/locking/lockdep.c:2647 [inline]
 check_irq_usage kernel/locking/lockdep.c:2888 [inline]
 check_prev_add kernel/locking/lockdep.c:3165 [inline]
 check_prevs_add kernel/locking/lockdep.c:3280 [inline]
 validate_chain+0x4ebd/0x5920 kernel/locking/lockdep.c:3904
 __lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5202
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
 _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
 raw_spin_rq_lock_nested+0xb0/0x140 kernel/sched/core.c:595
 raw_spin_rq_lock kernel/sched/sched.h:1488 [inline]
 __task_rq_lock+0xdf/0x3e0 kernel/sched/core.c:665
 ttwu_runnable kernel/sched/core.c:3728 [inline]
 try_to_wake_up+0x182/0x1480 kernel/sched/core.c:4180
 swake_up_locked kernel/sched/swait.c:29 [inline]
 complete_with_flags kernel/sched/completion.c:24 [inline]
 complete+0xac/0x1c0 kernel/sched/completion.c:47
 random_recv_done+0x138/0x1e0 drivers/char/hw_random/virtio-rng.c:48
 vring_interrupt+0x21d/0x380 drivers/virtio/virtio_ring.c:2595
 __handle_irq_event_percpu+0x29a/0xa80 kernel/irq/handle.c:158
 handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
 handle_irq_event+0x89/0x1f0 kernel/irq/handle.c:210
 handle_edge_irq+0x25f/0xc20 kernel/irq/chip.c:831
 generic_handle_irq_desc include/linux/irqdesc.h:173 [inline]
 handle_irq arch/x86/kernel/irq.c:247 [inline]
 call_irq_handler arch/x86/kernel/irq.c:259 [inline]
 __common_interrupt+0x136/0x230 arch/x86/kernel/irq.c:285
 common_interrupt+0xb4/0xd0 arch/x86/kernel/irq.c:278
 </IRQ>
 <TASK>
 </TASK>
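
Note on reading the lock annotations: each lock class in the report carries a four-character usage string in braces, e.g. `(&p->pi_lock){-.-.}` and `(console_lock){+.+.}`. The sketch below decodes those strings using the character meanings and position order described in the kernel's lockdep design notes (hardirq, hardirq-read, softirq, softirq-read). The function names are hypothetical helpers written for illustration, and `bad_dependency` is a deliberately simplified stand-in for lockdep's real check, which walks the whole dependency graph rather than comparing a single pair of locks.

```python
# Illustrative decoder for lockdep usage strings such as "{-.-.}".
# Not kernel code: helper names are hypothetical and the check is simplified.

CHAR_MEANING = {
    ".": "acquired while irqs disabled and not in irq context",
    "-": "acquired in irq context",
    "+": "acquired with irqs enabled",
    "?": "acquired in irq context with irqs enabled",
}

# Position order of the four usage characters.
POSITIONS = ("hardirq", "hardirq-read", "softirq", "softirq-read")


def decode_usage(annotation):
    """Map each position name to its usage character."""
    chars = annotation.strip("{}")
    if len(chars) != len(POSITIONS):
        raise ValueError("expected 4 usage characters, got %r" % annotation)
    for ch in chars:
        if ch not in CHAR_MEANING:
            raise ValueError("unknown usage character %r" % ch)
    return dict(zip(POSITIONS, chars))


def is_irq_safe(annotation, state="hardirq"):
    """True if the lock was ever taken in `state` irq context ('-' or '?')."""
    return decode_usage(annotation)[state] in ("-", "?")


def is_irq_unsafe(annotation, state="hardirq"):
    """True if the lock was ever taken with `state` irqs enabled ('+' or '?')."""
    return decode_usage(annotation)[state] in ("+", "?")


def bad_dependency(held, wanted, state="hardirq"):
    """Simplified check: holding an irq-safe lock while taking an irq-unsafe one."""
    return is_irq_safe(held, state) and is_irq_unsafe(wanted, state)


if __name__ == "__main__":
    pi_lock = "{-.-.}"       # annotation of &p->pi_lock in this report
    console_lock = "{+.+.}"  # annotation of console_lock in this report
    for name, ann in (("pi_lock", pi_lock), ("console_lock", console_lock)):
        for position, ch in decode_usage(ann).items():
            print(name, position, ch, CHAR_MEANING[ch])
    print("bad dependency:", bad_dependency(pi_lock, console_lock))
```

Applied to this report, `bad_dependency("{-.-.}", "{+.+.}")` evaluates to true, which corresponds to the "HARDIRQ-safe -> HARDIRQ-unsafe lock order detected" warning: `&p->pi_lock` has been taken in hardirq context, while `console_lock` has been taken with hardirqs enabled.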

Crashes (1):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2024/08/28 05:34 | linux-next | 6f923748057a | 6c853ff9 | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-linux-next-kasan-gce-root | possible deadlock in try_to_wake_up