INFO: rcu detected stall in wg_packet_tx_worker (4)

Status: closed as invalid on 2022/02/08 10:10
Reported-by: syzbot+@syzkaller.appspotmail.com
First crash: 379d, last: 379d
similar bugs (3):
Kernel | Title | Repro | Cause bisect / Fix bisect | Count | Last | Reported | Patched | Status
upstream | INFO: rcu detected stall in wg_packet_tx_worker | C | done | 24 | 904d | 957d | 17/24 | fixed on 2020/07/17 17:58
upstream | INFO: rcu detected stall in wg_packet_tx_worker (2) | - | - | 4 | 759d | 854d | 0/24 | auto-closed as invalid on 2021/02/08 20:10
upstream | INFO: rcu detected stall in wg_packet_tx_worker (3) | C | inconclusive | 5 | 480d | 463d | 22/24 | fixed on 2021/11/10 00:50

Sample crash report:
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	1-....: (1 GPs behind) idle=58d/1/0x4000000000000000 softirq=168894/168895 fqs=36 
	(t=10502 jiffies g=328765 q=518)
rcu: rcu_preempt kthread timer wakeup didn't happen for 4818 jiffies! g328765 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
rcu: 	Possible timer handling issue on cpu=0 timer-softirq=149381
rcu: rcu_preempt kthread starved for 4821 jiffies! g328765 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:I stack:28656 pid:   14 ppid:     2 flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:4972 [inline]
 __schedule+0xa9a/0x4940 kernel/sched/core.c:6253
 schedule+0xd2/0x260 kernel/sched/core.c:6326
 schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881
 rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1955
 rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2128
 kthread+0x405/0x4f0 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 1879 Comm: kworker/0:9 Not tainted 5.16.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: wg-crypt-wg0 wg_packet_tx_worker
RIP: 0010:kvm_wait+0x98/0x100 arch/x86/kernel/kvm.c:1001
Code: fa 83 e2 07 38 d0 7f 04 84 c0 75 63 0f b6 07 40 38 c6 74 35 48 83 c4 10 c3 c3 e8 73 56 4a 00 eb 07 0f 00 2d ea 8f 75 08 fb f4 <48> 83 c4 10 c3 89 74 24 0c 48 89 3c 24 e8 36 51 4a 00 8b 74 24 0c
RSP: 0018:ffffc90001a8f508 EFLAGS: 00000206
RAX: 000000000001f8f6 RBX: 0000000000000000 RCX: 1ffffffff1ff892e
RDX: 0000000000000000 RSI: 0000000000000803 RDI: 0000000000000000
RBP: ffff8880149890f0 R08: 0000000000000001 R09: ffffffff8ff71adf
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: ffffed100293121e R14: 0000000000000001 R15: ffff8880b9c3a880
FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555555f106d0 CR3: 0000000072f4b000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 pv_wait arch/x86/include/asm/paravirt.h:603 [inline]
 pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
 __pv_queued_spin_lock_slowpath+0x8b8/0xb40 kernel/locking/qspinlock.c:508
 pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
 queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
 queued_spin_lock include/asm-generic/qspinlock.h:85 [inline]
 do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:115
 spin_lock include/linux/spinlock.h:349 [inline]
 __dev_xmit_skb net/core/dev.c:3844 [inline]
 __dev_queue_xmit+0x1e63/0x3630 net/core/dev.c:4194
 neigh_resolve_output net/core/neighbour.c:1523 [inline]
 neigh_resolve_output+0x50e/0x820 net/core/neighbour.c:1503
 neigh_output include/net/neighbour.h:527 [inline]
 ip6_finish_output2+0x571/0x14e0 net/ipv6/ip6_output.c:126
 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline]
 __ip6_finish_output+0x4c1/0x1050 net/ipv6/ip6_output.c:170
 ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201
 NF_HOOK_COND include/linux/netfilter.h:296 [inline]
 ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224
 dst_output include/net/dst.h:450 [inline]
 ip6_local_out+0xaf/0x1a0 net/ipv6/output_core.c:161
 ip6tunnel_xmit include/net/ip6_tunnel.h:160 [inline]
 udp_tunnel6_xmit_skb+0x72e/0xc90 net/ipv6/ip6_udp_tunnel.c:109
 send6+0x4a2/0xcc0 drivers/net/wireguard/socket.c:152
 wg_socket_send_skb_to_peer+0xf5/0x220 drivers/net/wireguard/socket.c:177
 wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
 wg_packet_tx_worker+0x1a7/0x720 drivers/net/wireguard/send.c:276
 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
 kthread+0x405/0x4f0 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>
NMI backtrace for cpu 1
CPU: 1 PID: 1889 Comm: kworker/1:6 Not tainted 5.16.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: wg-crypt-wg0 wg_packet_tx_worker
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111
 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x25e/0x3f0 kernel/rcu/tree_stall.h:343
 print_cpu_stall kernel/rcu/tree_stall.h:627 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:711 [inline]
 rcu_pending kernel/rcu/tree.c:3878 [inline]
 rcu_sched_clock_irq.cold+0x9d/0x746 kernel/rcu/tree.c:2597
 update_process_times+0x16d/0x200 kernel/time/timer.c:1785
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1421
 __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
 __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749
 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:fq_flow_add_tail net/sched/sch_fq.c:168 [inline]
RIP: 0010:fq_dequeue+0x76e/0x1fe0 net/sched/sch_fq.c:564
Code: 85 ba 16 00 00 4d 8b af d8 02 00 00 49 8d 7d 48 48 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 94 16 00 00 4d 89 65 48 e8 02 e5 1a fa <48> 89 e8 48 c1 e8 03 80 3c 18 00 0f 85 3f 14 00 00 4d 89 a7 d8 02
RSP: 0018:ffffc900028ef570 EFLAGS: 00000293
RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: ffff888078c6d700 RSI: ffffffff875cb45e RDI: ffff888033f5b888
RBP: ffff8880149892d8 R08: 0000000000000000 R09: ffffffff8ff71adf
R10: ffffffff875cb1e3 R11: 0000000000000000 R12: ffff888033f5b840
R13: ffff88807d6b8cc0 R14: ffff8880149892d0 R15: ffff888014989000
 dequeue_skb net/sched/sch_generic.c:292 [inline]
 qdisc_restart net/sched/sch_generic.c:397 [inline]
 __qdisc_run+0x1ae/0x1700 net/sched/sch_generic.c:415
 __dev_xmit_skb net/core/dev.c:3875 [inline]
 __dev_queue_xmit+0x2091/0x3630 net/core/dev.c:4194
 neigh_hh_output include/net/neighbour.h:511 [inline]
 neigh_output include/net/neighbour.h:525 [inline]
 ip6_finish_output2+0xf63/0x14e0 net/ipv6/ip6_output.c:126
 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline]
 __ip6_finish_output+0x4c1/0x1050 net/ipv6/ip6_output.c:170
 ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201
 NF_HOOK_COND include/linux/netfilter.h:296 [inline]
 ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224
 dst_output include/net/dst.h:450 [inline]
 ip6_local_out+0xaf/0x1a0 net/ipv6/output_core.c:161
 ip6tunnel_xmit include/net/ip6_tunnel.h:160 [inline]
 udp_tunnel6_xmit_skb+0x72e/0xc90 net/ipv6/ip6_udp_tunnel.c:109
 send6+0x4a2/0xcc0 drivers/net/wireguard/socket.c:152
 wg_socket_send_skb_to_peer+0xf5/0x220 drivers/net/wireguard/socket.c:177
 wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
 wg_packet_tx_worker+0x1a7/0x720 drivers/net/wireguard/send.c:276
 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
 kthread+0x405/0x4f0 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>
----------------
Code disassembly (best guess):
   0:	fa                   	cli
   1:	83 e2 07             	and    $0x7,%edx
   4:	38 d0                	cmp    %dl,%al
   6:	7f 04                	jg     0xc
   8:	84 c0                	test   %al,%al
   a:	75 63                	jne    0x6f
   c:	0f b6 07             	movzbl (%rdi),%eax
   f:	40 38 c6             	cmp    %al,%sil
  12:	74 35                	je     0x49
  14:	48 83 c4 10          	add    $0x10,%rsp
  18:	c3                   	retq
  19:	c3                   	retq
  1a:	e8 73 56 4a 00       	callq  0x4a5692
  1f:	eb 07                	jmp    0x28
  21:	0f 00 2d ea 8f 75 08 	verw   0x8758fea(%rip)        # 0x8759012
  28:	fb                   	sti
  29:	f4                   	hlt
* 2a:	48 83 c4 10          	add    $0x10,%rsp <-- trapping instruction
  2e:	c3                   	retq
  2f:	89 74 24 0c          	mov    %esi,0xc(%rsp)
  33:	48 89 3c 24          	mov    %rdi,(%rsp)
  37:	e8 36 51 4a 00       	callq  0x4a5172
  3c:	8b 74 24 0c          	mov    0xc(%rsp),%esi

Crashes (1):
Manager | Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Title
ci-upstream-kasan-gce | 2021/11/25 22:00 | upstream | b501b85957de | 63eeac02 | .config | log | report | - | - | info | INFO: rcu detected stall in wg_packet_tx_worker