bisecting fixing commit since 7cc2a8ea104820dd9e702202621e8fd4d9f6c8cf
building syzkaller on 510951950dc0ee69cfdaf746061d3dbe31b49fd8
testing commit 7cc2a8ea104820dd9e702202621e8fd4d9f6c8cf with gcc (GCC) 8.4.1 20210217
kernel signature: 1a1d2595ef8b42cb606e33e16b65cfdcc5769cddd33209112c5bdb2507607901
run #0: crashed: BUG: workqueue lockup
run #1: crashed: BUG: soft lockup in smp_call_function
run #2: crashed: BUG: workqueue lockup
run #3: crashed: BUG: soft lockup in smp_call_function
run #4: crashed: BUG: soft lockup in smp_call_function
run #5: crashed: BUG: workqueue lockup
run #6: crashed: BUG: soft lockup in smp_call_function
run #7: crashed: BUG: workqueue lockup
run #8: crashed: BUG: soft lockup in smp_call_function
run #9: crashed: BUG: workqueue lockup
run #10: crashed: BUG: soft lockup in smp_call_function
run #11: crashed: BUG: soft lockup in smp_call_function
run #12: crashed: BUG: soft lockup in smp_call_function
run #13: crashed: no output from test machine
run #14: crashed: no output from test machine
run #15: crashed: no output from test machine
run #16: crashed: no output from test machine
run #17: crashed: no output from test machine
run #18: crashed: no output from test machine
run #19: crashed: no output from test machine
testing current HEAD f88cd3fb9df228e5ce4e13ec3dbad671ddb2146e
testing commit f88cd3fb9df228e5ce4e13ec3dbad671ddb2146e with gcc (GCC) 10.2.1 20210217
kernel signature: b7a86e7763392bd7001413ed4c7a7e5de21ab44973d651845b87ce06de56cc53
run #0: crashed: BUG: soft lockup in smp_call_function
run #1: crashed: BUG: soft lockup in smp_call_function
run #2: crashed: BUG: soft lockup in smp_call_function
run #3: crashed: BUG: soft lockup in smp_call_function
run #4: crashed: BUG: soft lockup in smp_call_function
run #5: crashed: BUG: soft lockup in smp_call_function
run #6: crashed: BUG: soft lockup in smp_call_function
run #7: crashed: BUG: soft lockup in smp_call_function
run #8: crashed: BUG: soft lockup in smp_call_function
run #9: crashed: INFO: rcu detected stall in do_idle
revisions tested: 2, total time: 26m52.775296375s (build: 13m3.498161437s, test: 13m3.153472493s)
the crash still happens on HEAD
commit msg: Merge tag 'vfio-v5.13-rc5' of git://github.com/awilliam/linux-vfio
crash: INFO: rcu detected stall in do_idle
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 0-...!: (0 ticks this GP) idle=736/1/0x4000000000000000 softirq=8444/8444 fqs=0
(detected by 1, t=10502 jiffies, g=7133, q=4128)
============================================
WARNING: possible recursive locking detected
5.13.0-rc4-syzkaller #0 Not tainted
--------------------------------------------
swapper/1/0 is trying to acquire lock:
ffffffff894cc258 (rcu_node_0){-.-.}-{2:2}, at: rcu_dump_cpu_stacks+0xbb/0x360 kernel/rcu/tree_stall.h:336
but task is already holding lock:
ffffffff894cc258 (rcu_node_0){-.-.}-{2:2}, at: print_other_cpu_stall kernel/rcu/tree_stall.h:542 [inline]
ffffffff894cc258 (rcu_node_0){-.-.}-{2:2}, at: check_cpu_stall kernel/rcu/tree_stall.h:708 [inline]
ffffffff894cc258 (rcu_node_0){-.-.}-{2:2}, at: rcu_pending kernel/rcu/tree.c:3911 [inline]
ffffffff894cc258 (rcu_node_0){-.-.}-{2:2}, at: rcu_sched_clock_irq+0xc3c/0x1f30 kernel/rcu/tree.c:2649
other info that might help us debug this:
 Possible unsafe locking scenario:
       CPU0
       ----
  lock(rcu_node_0);
  lock(rcu_node_0);
 *** DEADLOCK ***
 May be due to missing lock nesting notation
1 lock held by swapper/1/0:
#0: ffffffff894cc258 (rcu_node_0){-.-.}-{2:2}, at: print_other_cpu_stall kernel/rcu/tree_stall.h:542 [inline]
#0: ffffffff894cc258 (rcu_node_0){-.-.}-{2:2}, at: check_cpu_stall kernel/rcu/tree_stall.h:708 [inline]
#0: ffffffff894cc258 (rcu_node_0){-.-.}-{2:2}, at: rcu_pending kernel/rcu/tree.c:3911 [inline]
#0: ffffffff894cc258 (rcu_node_0){-.-.}-{2:2}, at: rcu_sched_clock_irq+0xc3c/0x1f30 kernel/rcu/tree.c:2649
stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.13.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x10c/0x14b lib/dump_stack.c:120
print_deadlock_bug kernel/locking/lockdep.c:2831 [inline]
check_deadlock kernel/locking/lockdep.c:2874 [inline]
validate_chain kernel/locking/lockdep.c:3663 [inline]
__lock_acquire.cold+0x227/0x3a6 kernel/locking/lockdep.c:4902
lock_acquire kernel/locking/lockdep.c:5512 [inline]
lock_acquire+0x212/0x850 kernel/locking/lockdep.c:5477
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
rcu_dump_cpu_stacks+0xbb/0x360 kernel/rcu/tree_stall.h:336
print_other_cpu_stall kernel/rcu/tree_stall.h:560 [inline]
check_cpu_stall kernel/rcu/tree_stall.h:708 [inline]
rcu_pending kernel/rcu/tree.c:3911 [inline]
rcu_sched_clock_irq+0x1b28/0x1f30 kernel/rcu/tree.c:2649
update_process_times+0x131/0x1a0 kernel/time/timer.c:1796
tick_sched_handle+0x6f/0x130 kernel/time/tick-sched.c:226
tick_sched_timer+0x132/0x210 kernel/time/tick-sched.c:1373
__run_hrtimer kernel/time/hrtimer.c:1537 [inline]
__hrtimer_run_queues+0x19e/0xb90 kernel/time/hrtimer.c:1601
hrtimer_interrupt+0x2ef/0x900 kernel/time/hrtimer.c:1663
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
__sysvec_apic_timer_interrupt+0x13e/0x520 arch/x86/kernel/apic/apic.c:1106
sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1100
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:647
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:52
Code: e8 27 a7 2e fa e9 e4 fe ff ff 48 89 df e8 1a a7 2e fa eb 9c cc cc cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d e4 f9 50 00 fb f4 90 e9 07 00 00 00 0f 00 2d d4 f9 50 00 f4 c3 cc cc 55 53 e8 89
RSP: 0018:ffffc9000010fd98 EFLAGS: 00000282
RAX: 1ffffffff126e861 RBX: ffff8881048fc065 RCX: 1ffffffff1487bb1
RDX: dffffc0000000000 RSI: ffffffff87cb42c0 RDI: ffffffff881346a0
RBP: ffff8881008523c0 R08: 0000000000000001 R09: 0000000000000001
R10: ffffed102010a478 R11: 0000000000000001 R12: 0000000000000001
R13: ffffffff89a8bdc0 R14: ffff8881048fc064 R15: ffff888107daa004
arch_safe_halt arch/x86/include/asm/paravirt.h:167 [inline]
acpi_safe_halt drivers/acpi/processor_idle.c:108 [inline]
acpi_idle_do_entry+0x189/0x290 drivers/acpi/processor_idle.c:513
acpi_idle_enter+0x2d4/0x4a0 drivers/acpi/processor_idle.c:648
cpuidle_enter_state+0x145/0x990 drivers/cpuidle/cpuidle.c:237
cpuidle_enter+0x45/0xa0 drivers/cpuidle/cpuidle.c:351
call_cpuidle kernel/sched/idle.c:158 [inline]
cpuidle_idle_call kernel/sched/idle.c:239 [inline]
do_idle+0x497/0x730 kernel/sched/idle.c:306
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:403
secondary_startup_64_no_verify+0xb0/0xbb