INFO: rcu detected stall in wg_packet_handshake_receive_worker

Status: upstream: reported C repro on 2021/01/29 04:11
Labels: wireguard
First crash: 863d, last: 161d

Cause bisection: failed (error log, bisect log)

Fix bisection: the fix commit could be any of (bisect log):
  e68061375f79 Merge tag 'irq_urgent_for_v5.11_rc5' of git://
  037c50bfbeb3 Merge tag 'for-5.16-tag' of git://
Discussions (1)
Title: INFO: rcu detected stall in wg_packet_handshake_receive_worker
Replies (including bot): 0 (1)
Last reply: 2021/01/29 04:11
Similar bugs (1)
Kernel: linux-6.1
Title: BUG: soft lockup in wg_packet_handshake_receive_worker origin:upstream
Repro: C
Count: 1
Last: 8d16h
Reported: 8d16h
Patched: 0/3
Status: upstream: reported C repro on 2023/05/29 21:30
Last patch testing requests (3)
Created           Duration  User          Repo      Result
2023/04/14 10:21  14m       retest repro  upstream  report log
2022/12/27 09:31  15m       retest repro  upstream  error
2022/09/18 06:29  14m       retest repro  upstream  report log

Sample crash report:
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	0-...0: (1 GPs behind) idle=c9a/1/0x4000000000000000 softirq=12635/12642 fqs=2 
	(t=18523 jiffies g=12681 q=487)
rcu: rcu_preempt kthread starved for 8264 jiffies! g12681 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:I stack:29176 pid:   13 ppid:     2 flags:0x00004000
Call Trace:
 context_switch kernel/sched/core.c:4327 [inline]
 __schedule+0x90c/0x21a0 kernel/sched/core.c:5078
 schedule+0xcf/0x270 kernel/sched/core.c:5157
 schedule_timeout+0x148/0x250 kernel/time/timer.c:1878
 rcu_gp_fqs_loop kernel/rcu/tree.c:1940 [inline]
 rcu_gp_kthread+0xbbe/0x1d70 kernel/rcu/tree.c:2113
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
NMI backtrace for cpu 0
CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.11.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: wg-kex-wg2 wg_packet_handshake_receive_worker
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x1f4/0x230 kernel/rcu/tree_stall.h:337
 print_cpu_stall kernel/rcu/tree_stall.h:569 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:643 [inline]
 rcu_pending kernel/rcu/tree.c:3751 [inline]
 rcu_sched_clock_irq.cold+0x48e/0xedf kernel/rcu/tree.c:2580
 update_process_times+0x16d/0x200 kernel/time/timer.c:1782
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1369
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x1c0/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_interrupt+0x334/0x940 kernel/time/hrtimer.c:1645
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1085 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1102
 __run_sysvec_on_irqstack arch/x86/include/asm/irq_stack.h:37 [inline]
 run_sysvec_on_irqstack_cond arch/x86/include/asm/irq_stack.h:89 [inline]
 sysvec_apic_timer_interrupt+0xbd/0x100 arch/x86/kernel/apic/apic.c:1096
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:628
RIP: 0010:queue_work_on+0x83/0xd0 kernel/workqueue.c:1530
Code: 31 ff 89 ee e8 1e fe 28 00 40 84 ed 74 46 e8 94 f7 28 00 31 ff 48 89 de e8 7a ff 28 00 48 85 db 75 26 e8 80 f7 28 00 41 56 9d <48> 83 c4 08 44 89 f8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 e8 66 f7 28
RSP: 0018:ffffc90000ca7a78 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000200 RCX: 0000000000000000
RDX: ffff888010d18000 RSI: ffffffff8149c970 RDI: 0000000000000000
RBP: ffff888033f9a000 R08: 0000000000000001 R09: ffffffff8ed3287f
R10: fffffbfff1da650f R11: 0000000000000000 R12: ffffe8ffffd4bcf0
R13: ffff8880149b3c00 R14: 0000000000000293 R15: 0000000000000001
 wg_queue_enqueue_per_device_and_peer drivers/net/wireguard/queueing.h:156 [inline]
 wg_packet_create_data drivers/net/wireguard/send.c:326 [inline]
 wg_packet_send_staged_packets+0xf42/0x1840 drivers/net/wireguard/send.c:396
 wg_packet_send_keepalive+0x4a/0x310 drivers/net/wireguard/send.c:239
 wg_receive_handshake_packet+0x930/0xb40 drivers/net/wireguard/receive.c:193
 wg_packet_handshake_receive_worker+0x45/0x90 drivers/net/wireguard/receive.c:220
 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Crashes (2):
Time: 2021/01/25 04:03 | Kernel: upstream | Commit: e68061375f79 | Syzkaller: 52e37319 | Config: .config | Artifacts: console log, report, syz repro, C repro | Manager: ci-upstream-kasan-gce | Title: INFO: rcu detected stall in wg_packet_handshake_receive_worker
Time: 2022/12/28 09:14 | Kernel: upstream | Commit: 1b929c02afd3 | Syzkaller: 44712fbc | Config: .config | Artifacts: console log, report, info, disk image, vmlinux, kernel image | Manager: ci-upstream-kasan-gce | Title: INFO: rcu detected stall in wg_packet_handshake_receive_worker
* Struck through repros no longer work on HEAD.