syzbot


INFO: rcu detected stall in wg_expired_send_persistent_keepalive

Status: auto-closed as invalid on 2022/06/25 10:14
Subsystems: kvm
First crash: 731d, last: 730d

Sample crash report:
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	0-...!: (1 GPs behind) idle=22f/1/0x4000000000000000 softirq=17206/17207 fqs=130 
	(t=10500 jiffies g=19329 q=553 ncpus=2)
rcu: rcu_preempt kthread starved for 10204 jiffies! g19329 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:29264 pid:   16 ppid:     2 flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5106 [inline]
 __schedule+0xa9a/0x4cc0 kernel/sched/core.c:6421
 schedule+0xd2/0x1f0 kernel/sched/core.c:6493
 schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1907
 rcu_gp_fqs_loop+0x1c0/0x840 kernel/rcu/tree.c:2078
 rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2267
 kthread+0x2e9/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 6735 Comm: syz-executor.1 Not tainted 5.18.0-rc3-next-20220422-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:kvm_wait+0x98/0x100 arch/x86/kernel/kvm.c:1058
Code: fa 83 e2 07 38 d0 7f 04 84 c0 75 63 0f b6 07 40 38 c6 74 35 48 83 c4 10 c3 c3 e8 23 91 4b 00 eb 07 0f 00 2d da b1 94 08 fb f4 <48> 83 c4 10 c3 89 74 24 0c 48 89 3c 24 e8 56 8f 4b 00 8b 74 24 0c
RSP: 0018:ffffc90002bcf978 EFLAGS: 00000246
RAX: 0000000000000007 RBX: ffff8880b9c3ae40 RCX: 1ffffffff1b73199
RDX: 0000000000000000 RSI: ffffffff81807171 RDI: ffffffff8134dffd
RBP: ffff8880b9d3ae54 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff81807158 R11: 0000000000000000 R12: ffffed10173875ca
R13: ffff8880b9c3ae54 R14: dffffc0000000000 R15: ffff8880b9d3ae40
FS:  0000000000000000(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f36675681b8 CR3: 0000000043189000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 pv_wait arch/x86/include/asm/paravirt.h:603 [inline]
 pv_wait_node kernel/locking/qspinlock_paravirt.h:325 [inline]
 __pv_queued_spin_lock_slowpath+0x6d8/0xb50 kernel/locking/qspinlock.c:476
 pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
 queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
 queued_spin_lock include/asm-generic/qspinlock.h:85 [inline]
 do_raw_spin_lock+0x200/0x2a0 kernel/locking/spinlock_debug.c:115
 spin_lock include/linux/spinlock.h:354 [inline]
 task_lock include/linux/sched/task.h:170 [inline]
 mm_update_next_owner+0x21a/0x7a0 kernel/exit.c:457
 exit_mm kernel/exit.c:509 [inline]
 do_exit+0xa0a/0x2a00 kernel/exit.c:782
 do_group_exit+0xd2/0x2f0 kernel/exit.c:925
 get_signal+0x22df/0x24c0 kernel/signal.c:2864
 arch_do_signal_or_restart+0x82/0x20f0 arch/x86/kernel/signal.c:869
 exit_to_user_mode_loop kernel/entry/common.c:166 [inline]
 exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:201
 __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:294
 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fde7c689132
Code: Unable to access opcode bytes at RIP 0x7fde7c689108.
RSP: 002b:00007ffc460bc4f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: fffffffffffffffc RBX: 0000000000020022 RCX: 00007fde7c689132
RDX: 0000000000000000 RSI: 0000000000021000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 00000000ffffffff R09: 0000000000000000
R10: 0000000000020022 R11: 0000000000000246 R12: 00007ffc460bc700
R13: 00007fde7bdff700 R14: 0000000000000000 R15: 0000000000022000
 </TASK>
NMI backtrace for cpu 0
CPU: 0 PID: 6745 Comm: syz-executor.1 Not tainted 5.18.0-rc3-next-20220422-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111
 nmi_trigger_cpumask_backtrace+0x1e6/0x230 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x262/0x3f0 kernel/rcu/tree_stall.h:369
 print_cpu_stall kernel/rcu/tree_stall.h:665 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:749 [inline]
 rcu_pending kernel/rcu/tree.c:4068 [inline]
 rcu_sched_clock_irq.cold+0x144/0x8fc kernel/rcu/tree.c:2755
 update_process_times+0x16d/0x200 kernel/time/timer.c:1811
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:243
 tick_sched_timer+0xee/0x120 kernel/time/tick-sched.c:1473
 __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
 __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749
 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1095 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1112
 sysvec_apic_timer_interrupt+0x40/0xc0 arch/x86/kernel/apic/apic.c:1106
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:kvm_wait+0x98/0x100 arch/x86/kernel/kvm.c:1058
Code: fa 83 e2 07 38 d0 7f 04 84 c0 75 63 0f b6 07 40 38 c6 74 35 48 83 c4 10 c3 c3 e8 23 91 4b 00 eb 07 0f 00 2d da b1 94 08 fb f4 <48> 83 c4 10 c3 89 74 24 0c 48 89 3c 24 e8 56 8f 4b 00 8b 74 24 0c
RSP: 0000:ffffc90000007608 EFLAGS: 00000246
RAX: 0000000000000007 RBX: 0000000000000000 RCX: 1ffffffff1b73199
RDX: 0000000000000000 RSI: ffffffff81807171 RDI: ffffffff8134dffd
RBP: ffff88801bb00948 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff81807158 R11: 0000000000000001 R12: 0000000000000000
R13: ffffed1003760129 R14: 0000000000000001 R15: ffff8880b9c3ae40
 pv_wait arch/x86/include/asm/paravirt.h:603 [inline]
 pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
 __pv_queued_spin_lock_slowpath+0x8c7/0xb50 kernel/locking/qspinlock.c:511
 pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
 queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
 queued_spin_lock include/asm-generic/qspinlock.h:85 [inline]
 do_raw_spin_lock+0x200/0x2a0 kernel/locking/spinlock_debug.c:115
 spin_lock include/linux/spinlock.h:354 [inline]
 task_lock include/linux/sched/task.h:170 [inline]
 __get_task_comm+0x23/0x50 fs/exec.c:1219
 __set_page_owner_handle mm/page_owner.c:174 [inline]
 __set_page_owner+0x253/0x380 mm/page_owner.c:192
 prep_new_page mm/page_alloc.c:2394 [inline]
 get_page_from_freelist+0xba2/0x3e00 mm/page_alloc.c:4135
 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5356
 alloc_pages+0x1aa/0x310 mm/mempolicy.c:2273
 alloc_slab_page mm/slub.c:1797 [inline]
 allocate_slab+0x26c/0x3c0 mm/slub.c:1942
 new_slab mm/slub.c:2002 [inline]
 ___slab_alloc+0x985/0xd90 mm/slub.c:3002
 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3089
 slab_alloc_node mm/slub.c:3180 [inline]
 kmem_cache_alloc_node+0x122/0x3f0 mm/slub.c:3264
 __alloc_skb+0x215/0x340 net/core/skbuff.c:414
 alloc_skb include/linux/skbuff.h:1337 [inline]
 wg_packet_send_keepalive+0x6e/0x300 drivers/net/wireguard/send.c:226
 wg_expired_send_persistent_keepalive+0x5a/0x70 drivers/net/wireguard/timers.c:141
 call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1444
 expire_timers kernel/time/timer.c:1489 [inline]
 __run_timers.part.0+0x679/0xa80 kernel/time/timer.c:1760
 __run_timers kernel/time/timer.c:1738 [inline]
 run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1773
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1106
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:arch_atomic_try_cmpxchg arch/x86/include/asm/atomic.h:202 [inline]
RIP: 0010:atomic_try_cmpxchg_acquire include/linux/atomic/atomic-instrumented.h:543 [inline]
RIP: 0010:queued_spin_lock include/asm-generic/qspinlock.h:82 [inline]
RIP: 0010:do_raw_spin_lock+0x132/0x2a0 kernel/locking/spinlock_debug.c:115
Code: 00 00 00 00 e8 0f 0c 68 00 be 04 00 00 00 48 8d 7c 24 28 e8 00 0c 68 00 8b 44 24 28 ba 01 00 00 00 89 44 24 04 f0 0f b1 55 00 <0f> 85 91 00 00 00 65 44 8b 35 30 78 a3 7e 48 b8 00 00 00 00 00 fc
RSP: 0000:ffffc90002cef758 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 1ffff9200059deec RCX: ffffffff815ea180
RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffc90002cef780
RBP: ffff88801bb00948 R08: 0000000000000001 R09: 0000000000000003
R10: fffff5200059def0 R11: 0000000000000001 R12: ffff88801bb00950
R13: ffff88801bb00958 R14: ffff8880132c0ac8 R15: 0000000000000007
 spin_lock include/linux/spinlock.h:354 [inline]
 task_lock include/linux/sched/task.h:170 [inline]
 __get_task_comm+0x23/0x50 fs/exec.c:1219
 __set_page_owner_handle mm/page_owner.c:174 [inline]
 __set_page_owner+0x253/0x380 mm/page_owner.c:192
 prep_new_page mm/page_alloc.c:2394 [inline]
 get_page_from_freelist+0xba2/0x3e00 mm/page_alloc.c:4135
 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5356
 alloc_pages_vma+0xf9/0x770 mm/mempolicy.c:2221
 wp_page_copy+0x1f6/0x1e20 mm/memory.c:3105
 do_wp_page+0x389/0x1b60 mm/memory.c:3472
 handle_pte_fault mm/memory.c:4922 [inline]
 __handle_mm_fault+0x1fe6/0x33d0 mm/memory.c:5043
 handle_mm_fault+0x1c8/0x790 mm/memory.c:5141
 do_user_addr_fault+0x489/0x11c0 arch/x86/mm/fault.c:1397
 handle_page_fault arch/x86/mm/fault.c:1484 [inline]
 exc_page_fault+0x9e/0x180 arch/x86/mm/fault.c:1540
 asm_exc_page_fault+0x1e/0x30 arch/x86/include/asm/idtentry.h:570
RIP: 0033:0x7fde7c62fedc
Code: 2a 59 ff ff 41 39 5c 24 2c 7f d3 31 c0 48 8d 3d f5 20 0b 00 e8 15 59 ff ff 48 8b 44 24 08 c7 44 24 1c ff ff ff ff 44 8b 60 78 <c6> 80 c8 00 00 00 00 45 85 e4 0f 8e 83 00 00 00 48 8b 44 24 08 8b
RSP: 002b:00007fde7d896190 EFLAGS: 00010202
RAX: 00007fde7c79bf60 RBX: 0000000000000004 RCX: 00007fde7c6f095d
RDX: 00000000000228b5 RSI: 0000000020000200 RDI: 00007fde7c6e1fbb
RBP: 00007fde7c6e308d R08: 000000000000011b R09: 00007ffc4619f080
R10: 00007ffc4619f090 R11: 000000000000d1c2 R12: 0000000000000000
R13: 00007ffc460bc56f R14: 00007fde7d896300 R15: 0000000000022000
 </TASK>
----------------
Code disassembly (best guess):
   0:	fa                   	cli
   1:	83 e2 07             	and    $0x7,%edx
   4:	38 d0                	cmp    %dl,%al
   6:	7f 04                	jg     0xc
   8:	84 c0                	test   %al,%al
   a:	75 63                	jne    0x6f
   c:	0f b6 07             	movzbl (%rdi),%eax
   f:	40 38 c6             	cmp    %al,%sil
  12:	74 35                	je     0x49
  14:	48 83 c4 10          	add    $0x10,%rsp
  18:	c3                   	retq
  19:	c3                   	retq
  1a:	e8 23 91 4b 00       	callq  0x4b9142
  1f:	eb 07                	jmp    0x28
  21:	0f 00 2d da b1 94 08 	verw   0x894b1da(%rip)        # 0x894b202
  28:	fb                   	sti
  29:	f4                   	hlt
* 2a:	48 83 c4 10          	add    $0x10,%rsp <-- trapping instruction
  2e:	c3                   	retq
  2f:	89 74 24 0c          	mov    %esi,0xc(%rsp)
  33:	48 89 3c 24          	mov    %rdi,(%rsp)
  37:	e8 56 8f 4b 00       	callq  0x4b8f92
  3c:	8b 74 24 0c          	mov    0xc(%rsp),%esi

Crashes (3):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2022/04/26 10:08 | linux-next | e7d6987e09a3 | 1fa34c1b | .config | console log | report | - | - | info | - | ci-upstream-linux-next-kasan-gce-root | INFO: rcu detected stall in wg_expired_send_persistent_keepalive
2022/04/26 07:52 | linux-next | e7d6987e09a3 | 1fa34c1b | .config | console log | report | - | - | info | - | ci-upstream-linux-next-kasan-gce-root | INFO: rcu detected stall in wg_expired_send_persistent_keepalive
2022/04/25 22:23 | linux-next | e7d6987e09a3 | 152baedd | .config | console log | report | - | - | info | - | ci-upstream-linux-next-kasan-gce-root | INFO: rcu detected stall in wg_expired_send_persistent_keepalive