bisecting fixing commit since d6765985a42a660f078896d5c5b27f97c580a490
building syzkaller on 9d2ab5dfe7727dfea4b9b279f4edf731acb386ef
testing commit d6765985a42a660f078896d5c5b27f97c580a490
compiler: gcc (GCC) 10.2.1 20210217, GNU ld (GNU Binutils for Debian) 2.35.2
kernel signature: 65241d6ac4131102e3bb9c9a61f5d375abcc730a52f149fb3b1cdca04a394b59
run #0: crashed: INFO: rcu detected stall in mac80211_hwsim_beacon
run #1: crashed: INFO: rcu detected stall in mac80211_hwsim_beacon
run #2: crashed: INFO: rcu detected stall in smp_call_function
run #3: crashed: INFO: rcu detected stall in newlstat
run #4: crashed: INFO: rcu detected stall in tc_modify_qdisc
run #5: crashed: BUG: soft lockup in tc_modify_qdisc
run #6: crashed: BUG: soft lockup in mac80211_hwsim_beacon
run #7: crashed: BUG: soft lockup in mld_ifc_work
run #8: crashed: INFO: rcu detected stall in mac80211_hwsim_beacon
run #9: crashed: INFO: rcu detected stall in mac80211_hwsim_beacon
run #10: crashed: BUG: soft lockup in mac80211_hwsim_beacon
run #11: crashed: BUG: soft lockup in net_tx_action
run #12: crashed: INFO: rcu detected stall in mac80211_hwsim_beacon
run #13: crashed: BUG: soft lockup in tc_modify_qdisc
run #14: crashed: BUG: soft lockup in net_tx_action
run #15: crashed: INFO: rcu detected stall in netlink_release
run #16: crashed: INFO: rcu detected stall in mac80211_hwsim_beacon
run #17: crashed: INFO: rcu detected stall in addrconf_rs_timer
run #18: crashed: INFO: rcu detected stall in tx
run #19: crashed: no output from test machine
testing current HEAD 9dbe33cf371bd70330858370bdbc35c7668f00c3
testing commit 9dbe33cf371bd70330858370bdbc35c7668f00c3
compiler: gcc (GCC) 10.2.1 20210217, GNU ld (GNU Binutils for Debian) 2.35.2
kernel signature: 574dbecd7da653c83f5a0b9004298c42e5a15a3e1aac010f6ad7d1892e6dac17
run #0: crashed: INFO: rcu detected stall in wb_workfn
run #1: crashed: INFO: rcu detected stall in tc_modify_qdisc
run #2: crashed: INFO: rcu detected stall in security_file_open
run #3: crashed: INFO: rcu detected stall in do_idle
run #4: crashed: INFO: rcu detected stall in wait4
run #5: crashed: BUG: soft lockup in tc_modify_qdisc
run #6: crashed: INFO: rcu detected stall in smp_call_function
run #7: crashed: INFO: rcu detected stall in corrupted
run #8: crashed: BUG: soft lockup in wg_packet_handshake_receive_worker
run #9: crashed: INFO: rcu detected stall in gc_worker
revisions tested: 2, total time: 29m39.055261953s (build: 12m57.307636249s, test: 15m53.391158625s)
the crash still happens on HEAD
commit msg: mdio: aspeed: Fix "Link is Down" issue
crash: INFO: rcu detected stall in gc_worker
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 0-...!: (1 GPs behind) idle=1b5/1/0x4000000000000000 softirq=10215/10225 fqs=0
(detected by 1, t=15580 jiffies, g=8417, q=1235)
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 3002 Comm: kworker/0:4 Not tainted 5.16.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_power_efficient gc_worker
RIP: 0010:lockdep_recursion_finish kernel/locking/lockdep.c:438 [inline]
RIP: 0010:lock_is_held_type+0xe4/0x140 kernel/locking/lockdep.c:5681
Code: f6 47 22 03 0f 95 c0 45 31 ed 44 39 f0 41 0f 94 c5 48 c7 c7 00 58 eb 88 e8 29 0c 00 00 b8 ff ff ff ff 65 0f c1 05 0c 31 7a 77 <83> f8 01 75 29 9c 58 f6 c4 02 75 3d 48 f7 04 24 00 02 00 00 74 01
RSP: 0018:ffffc90000007c00 EFLAGS: 00000057
RAX: 0000000000000001 RBX: 0000000000000004 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffff88eb5800 RDI: ffffffff89416460
RBP: ffffffff8ad78780 R08: 0000000000000000 R09: ffffffff8ca1d5d7
R10: fffffbfff1943aba R11: 0000000000000001 R12: ffff888079389cc0
R13: 0000000000000000 R14: 00000000ffffffff R15: ffff88807938a758
FS: 0000000000000000(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f543911e5b8 CR3: 0000000017f57000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
lock_is_held include/linux/lockdep.h:283 [inline]
rcu_read_lock_sched_held+0x3a/0x70 kernel/rcu/update.c:125
trace_lock_acquire include/trace/events/lock.h:13 [inline]
lock_acquire+0x442/0x510 kernel/locking/lockdep.c:5608
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x39/0x50 kernel/locking/spinlock.c:162
debug_object_deactivate lib/debugobjects.c:735 [inline]
debug_object_deactivate+0x101/0x300 lib/debugobjects.c:723
debug_hrtimer_deactivate kernel/time/hrtimer.c:425 [inline]
debug_deactivate kernel/time/hrtimer.c:481 [inline]
__run_hrtimer kernel/time/hrtimer.c:1653 [inline]
__hrtimer_run_queues+0x337/0xb00 kernel/time/hrtimer.c:1749
hrtimer_interrupt+0x2f5/0x780 kernel/time/hrtimer.c:1811
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
__sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:__seqprop_spinlock_sequence include/linux/seqlock.h:277 [inline]
RIP: 0010:nf_conntrack_get_ht include/net/netfilter/nf_conntrack.h:326 [inline]
RIP: 0010:gc_worker+0x190/0xbf0 net/netfilter/nf_conntrack_core.c:1441
Code: 48 8b b4 24 80 00 00 00 48 c7 c7 08 7a a2 8c e8 f6 48 77 fa 58 9c 58 f6 c4 02 0f 85 ac 06 00 00 48 85 db 74 05 fb eb 02 f3 90 <8b> 05 0a 45 c6 05 a8 01 75 f4 8b 35 3c 45 c6 05 48 8b 0d 49 45 c6
RSP: 0018:ffffc90001a8fca8 EFLAGS: 00000206
RAX: 0000000000000002 RBX: 0000000000000200 RCX: 1ffffffff1e151be
RDX: 0000000000000000 RSI: ffffffff81445d0f RDI: ffffffff89416460
RBP: dffffc0000000000 R08: 0000000000000001 R09: ffffffff8f0739cf
R10: 0000000000000001 R11: 000000000007c000 R12: ffff888018902840
R13: ffffffff8aa0f180 R14: ffff888018902780 R15: 00000000ffffd5d3
process_one_work+0x87f/0x1450 kernel/workqueue.c:2298
worker_thread+0x598/0x1040 kernel/workqueue.c:2445
kthread+0x3ab/0x480 kernel/kthread.c:327
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
rcu: rcu_preempt kthread timer wakeup didn't happen for 15579 jiffies! g8417 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
rcu: Possible timer handling issue on cpu=1 timer-softirq=4170
rcu: rcu_preempt kthread starved for 15580 jiffies! g8417 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:I stack:28760 pid: 14 ppid: 2 flags:0x00004000
Call Trace:
context_switch kernel/sched/core.c:4972 [inline]
__schedule+0x90d/0x26c0 kernel/sched/core.c:6253
schedule+0xd2/0x260 kernel/sched/core.c:6326
schedule_timeout+0x11d/0x250 kernel/time/timer.c:1881
rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1955
rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2128
kthread+0x3ab/0x480 kernel/kthread.c:327
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
rcu: Stack dump where RCU GP kthread last ran:
NMI backtrace for cpu 1
CPU: 1 PID: 1084 Comm: kworker/u4:5 Not tainted 5.16.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_tt_purge
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x57/0x7d lib/dump_stack.c:106
nmi_cpu_backtrace.cold+0x30/0xc0 lib/nmi_backtrace.c:111
nmi_trigger_cpumask_backtrace+0x11f/0x170 lib/nmi_backtrace.c:62
trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
rcu_check_gp_kthread_starvation.cold+0x1fb/0x200 kernel/rcu/tree_stall.h:481
print_other_cpu_stall kernel/rcu/tree_stall.h:586 [inline]
check_cpu_stall kernel/rcu/tree_stall.h:729 [inline]
rcu_pending kernel/rcu/tree.c:3878 [inline]
rcu_sched_clock_irq+0x2125/0x2200 kernel/rcu/tree.c:2597
update_process_times+0x13b/0x1c0 kernel/time/timer.c:1785
tick_sched_handle+0x6f/0x130 kernel/time/tick-sched.c:226
tick_sched_timer+0x132/0x210 kernel/time/tick-sched.c:1421
__run_hrtimer kernel/time/hrtimer.c:1685 [inline]
__hrtimer_run_queues+0x18a/0xb00 kernel/time/hrtimer.c:1749
hrtimer_interrupt+0x2f5/0x780 kernel/time/hrtimer.c:1811
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
__sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:__local_bh_enable_ip+0xa8/0x120 kernel/softirq.c:390
Code: 1d cd cb c2 7e 65 8b 05 c6 cb c2 7e a9 00 ff ff 00 74 45 bf 01 00 00 00 e8 c5 fa 07 00 e8 10 c3 30 00 fb 65 8b 05 a8 cb c2 7e <85> c0 74 58 5b 5d c3 65 8b 05 f6 d2 c2 7e 85 c0 75 a2 0f 0b eb 9e
RSP: 0018:ffffc9000502fc08 EFLAGS: 00000202
RAX: 0000000080000000 RBX: 00000000fffffe00 RCX: 1ffffffff1e1b236
RDX: 0000000000000000 RSI: ffffffff88eb5520 RDI: ffffffff89416460
RBP: ffffffff882b425a R08: 0000000000000001 R09: ffffffff8f073a4f
R10: 0000000000000001 R11: 000000000007c08a R12: ffff888017ddda00
R13: dffffc0000000000 R14: 00000000000927c0 R15: ffff888017ddda18
spin_unlock_bh include/linux/spinlock.h:394 [inline]
batadv_tt_local_purge+0x22a/0x310 net/batman-adv/translation-table.c:1357
batadv_tt_purge+0x27/0x990 net/batman-adv/translation-table.c:3561
process_one_work+0x87f/0x1450 kernel/workqueue.c:2298
worker_thread+0x598/0x1040 kernel/workqueue.c:2445
kthread+0x3ab/0x480 kernel/kthread.c:327
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
----------------
Code disassembly (best guess):
0: f6 47 22 03 testb $0x3,0x22(%rdi)
4: 0f 95 c0 setne %al
7: 45 31 ed xor %r13d,%r13d
a: 44 39 f0 cmp %r14d,%eax
d: 41 0f 94 c5 sete %r13b
11: 48 c7 c7 00 58 eb 88 mov $0xffffffff88eb5800,%rdi
18: e8 29 0c 00 00 callq 0xc46
1d: b8 ff ff ff ff mov $0xffffffff,%eax
22: 65 0f c1 05 0c 31 7a xadd %eax,%gs:0x777a310c(%rip) # 0x777a3136
29: 77
* 2a: 83 f8 01 cmp $0x1,%eax <-- trapping instruction
2d: 75 29 jne 0x58
2f: 9c pushfq
30: 58 pop %rax
31: f6 c4 02 test $0x2,%ah
34: 75 3d jne 0x73
36: 48 f7 04 24 00 02 00 testq $0x200,(%rsp)
3d: 00
3e: 74 01 je 0x41