syzbot

INFO: rcu detected stall in ip_rcv (2)

Status: auto-closed as invalid on 2021/03/01 14:28
Subsystems: net
First crash: 1272d, last: 1272d
Similar bugs (7)
Kernel     | Title                                          | Repro | Cause bisect | Fix bisect | Count | Last  | Reported | Patched | Status
linux-4.19 | BUG: soft lockup in ip_rcv                     | C     | error        |            | 16    | 450d  | 1118d    | 0/1     | upstream: reported C repro on 2021/05/04 03:32
upstream   | INFO: rcu detected stall in ip_rcv netfilter   |       |              |            | 2     | 1765d | 1928d    | 0/26    | auto-closed as invalid on 2019/10/25 14:11
linux-4.14 | INFO: rcu detected stall in ip_rcv (2)         |       |              |            | 1     | 1282d | 1282d    | 0/1     | auto-closed as invalid on 2021/03/20 21:11
upstream   | INFO: rcu detected stall in ip_rcv (3) kernel  |       |              |            | 1     | 1133d | 1133d    | 0/26    | auto-closed as invalid on 2021/07/17 18:11
linux-4.14 | INFO: rcu detected stall in ip_rcv             |       |              |            | 3     | 1463d | 1481d    | 0/1     | auto-closed as invalid on 2020/09/21 12:55
upstream   | INFO: rcu detected stall in ip_rcv (4) bpf net |       |              |            | 12    | 761d  | 832d     | 0/26    | auto-closed as invalid on 2022/07/25 09:18
upstream   | INFO: rcu detected stall in ip_rcv (5) net     |       |              |            | 2     | 275d  | 287d     | 0/26    | auto-obsoleted due to no activity on 2023/11/23 01:36

Sample crash report:
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
	(detected by 0, t=10502 jiffies, g=57829, q=693)
rcu: All QSes seen, last rcu_preempt kthread activity 10502 (4294989435-4294978933), jiffies_till_next_fqs=1, root ->qsmask 0x0
rcu: rcu_preempt kthread starved for 10502 jiffies! g57829 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:28880 pid:   11 ppid:     2 flags:0x00004000
Call Trace:
 context_switch kernel/sched/core.c:3779 [inline]
 __schedule+0x893/0x2130 kernel/sched/core.c:4528
 schedule+0xcf/0x270 kernel/sched/core.c:4606
 schedule_timeout+0x148/0x250 kernel/time/timer.c:1871
 rcu_gp_fqs_loop kernel/rcu/tree.c:1925 [inline]
 rcu_gp_kthread+0xb4c/0x1c90 kernel/rcu/tree.c:2099
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

================================
WARNING: inconsistent lock state
5.10.0-rc6-syzkaller #0 Not tainted
--------------------------------
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
ksoftirqd/0/10 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffffffff8b33f998 (rcu_node_0){?.-.}-{2:2}, at: print_other_cpu_stall kernel/rcu/tree_stall.h:487 [inline]
ffffffff8b33f998 (rcu_node_0){?.-.}-{2:2}, at: check_cpu_stall kernel/rcu/tree_stall.h:646 [inline]
ffffffff8b33f998 (rcu_node_0){?.-.}-{2:2}, at: rcu_pending kernel/rcu/tree.c:3694 [inline]
ffffffff8b33f998 (rcu_node_0){?.-.}-{2:2}, at: rcu_sched_clock_irq.cold+0xbc/0xee8 kernel/rcu/tree.c:2567
{IN-HARDIRQ-W} state was registered at:
  lock_acquire kernel/locking/lockdep.c:5437 [inline]
  lock_acquire+0x29d/0x740 kernel/locking/lockdep.c:5402
  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
  _raw_spin_lock_irqsave+0x39/0x50 kernel/locking/spinlock.c:159
  rcu_report_exp_cpu_mult+0x72/0x320 kernel/rcu/tree_exp.h:237
  flush_smp_call_function_queue+0x34b/0x640 kernel/smp.c:425
  __sysvec_call_function_single+0x95/0x3d0 arch/x86/kernel/smp.c:248
  asm_call_irq_on_stack+0xf/0x20
  __run_sysvec_on_irqstack arch/x86/include/asm/irq_stack.h:37 [inline]
  run_sysvec_on_irqstack_cond arch/x86/include/asm/irq_stack.h:89 [inline]
  sysvec_call_function_single+0xbd/0x100 arch/x86/kernel/smp.c:243
  asm_sysvec_call_function_single+0x12/0x20 arch/x86/include/asm/idtentry.h:639
  native_restore_fl arch/x86/include/asm/irqflags.h:41 [inline]
  arch_local_irq_restore arch/x86/include/asm/irqflags.h:84 [inline]
  lock_acquire kernel/locking/lockdep.c:5440 [inline]
  lock_acquire+0x2c7/0x740 kernel/locking/lockdep.c:5402
  down_write_killable+0x90/0x170 kernel/locking/rwsem.c:1542
  mmap_write_lock_killable include/linux/mmap_lock.h:26 [inline]
  __bprm_mm_init fs/exec.c:259 [inline]
  bprm_mm_init fs/exec.c:378 [inline]
  alloc_bprm+0x3a8/0x880 fs/exec.c:1506
  kernel_execve+0x55/0x460 fs/exec.c:1936
  call_usermodehelper_exec_async+0x2de/0x580 kernel/umh.c:110
  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
irq event stamp: 34579079
hardirqs last  enabled at (34579078): [<ffffffff812f70fe>] kvm_wait arch/x86/kernel/kvm.c:851 [inline]
hardirqs last  enabled at (34579078): [<ffffffff812f70fe>] kvm_wait+0x8e/0xd0 arch/x86/kernel/kvm.c:831
hardirqs last disabled at (34579079): [<ffffffff88e55bbc>] sysvec_apic_timer_interrupt+0xc/0x100 arch/x86/kernel/apic/apic.c:1091
softirqs last  enabled at (34575758): [<ffffffff814266fd>] run_ksoftirqd kernel/softirq.c:653 [inline]
softirqs last  enabled at (34575758): [<ffffffff814266fd>] run_ksoftirqd+0x2d/0x50 kernel/softirq.c:645
softirqs last disabled at (34575763): [<ffffffff814266fd>] run_ksoftirqd kernel/softirq.c:653 [inline]
softirqs last disabled at (34575763): [<ffffffff814266fd>] run_ksoftirqd+0x2d/0x50 kernel/softirq.c:645

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(rcu_node_0);
  <Interrupt>
    lock(rcu_node_0);

 *** DEADLOCK ***

4 locks held by ksoftirqd/0/10:
 #0: ffffffff8b3378e0 (rcu_read_lock){....}-{1:2}, at: __skb_unlink include/linux/skbuff.h:2067 [inline]
 #0: ffffffff8b3378e0 (rcu_read_lock){....}-{1:2}, at: __skb_dequeue include/linux/skbuff.h:2082 [inline]
 #0: ffffffff8b3378e0 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x1c1/0x6c0 net/core/dev.c:6317
 #1: ffffffff8b3378e0 (rcu_read_lock){....}-{1:2}, at: __skb_pull include/linux/skbuff.h:2298 [inline]
 #1: ffffffff8b3378e0 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x124/0x370 net/ipv4/ip_input.c:228
 #2: ffff8881473e8920 (k-slock-AF_INET6){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
 #2: ffff8881473e8920 (k-slock-AF_INET6){+.-.}-{2:2}, at: sctp_rcv+0xd96/0x2e30 net/sctp/input.c:231
 #3: ffffffff8b33f998 (rcu_node_0){?.-.}-{2:2}, at: print_other_cpu_stall kernel/rcu/tree_stall.h:487 [inline]
 #3: ffffffff8b33f998 (rcu_node_0){?.-.}-{2:2}, at: check_cpu_stall kernel/rcu/tree_stall.h:646 [inline]
 #3: ffffffff8b33f998 (rcu_node_0){?.-.}-{2:2}, at: rcu_pending kernel/rcu/tree.c:3694 [inline]
 #3: ffffffff8b33f998 (rcu_node_0){?.-.}-{2:2}, at: rcu_sched_clock_irq.cold+0xbc/0xee8 kernel/rcu/tree.c:2567

stack backtrace:
CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 5.10.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:118
 print_usage_bug kernel/locking/lockdep.c:3740 [inline]
 valid_state kernel/locking/lockdep.c:3751 [inline]
 mark_lock_irq kernel/locking/lockdep.c:3954 [inline]
 mark_lock.cold+0x31/0x73 kernel/locking/lockdep.c:4411
 mark_held_locks+0x9f/0xe0 kernel/locking/lockdep.c:4012
 __trace_hardirqs_on_caller kernel/locking/lockdep.c:4030 [inline]
 lockdep_hardirqs_on_prepare kernel/locking/lockdep.c:4098 [inline]
 lockdep_hardirqs_on_prepare+0x135/0x400 kernel/locking/lockdep.c:4050
 trace_hardirqs_on+0x5b/0x1c0 kernel/trace/trace_preemptirq.c:49
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631
RIP: 0010:kvm_wait arch/x86/kernel/kvm.c:854 [inline]
RIP: 0010:kvm_wait+0x9c/0xd0 arch/x86/kernel/kvm.c:831
Code: 02 48 89 da 83 e2 07 38 d0 7f 04 84 c0 75 32 0f b6 03 41 38 c4 75 13 e8 f2 51 46 00 e9 07 00 00 00 0f 00 2d 96 06 19 08 fb f4 <e8> df 51 46 00 eb af c3 e9 07 00 00 00 0f 00 2d 80 06 19 08 f4 eb
RSP: 0018:ffffc90000cf7728 EFLAGS: 00000202
RAX: 00000000020fa286 RBX: ffff8881473e8908 RCX: ffffffff8155a937
RDX: 0000000000000000 RSI: 0000000000000101 RDI: 0000000000000000
RBP: 0000000000000246 R08: 0000000000000001 R09: ffffffff8ebae6df
R10: fffffbfff1d75cdb R11: 0000000000000001 R12: 0000000000000003
R13: ffffed1028e7d121 R14: 0000000000000001 R15: ffff8880b9e356c0
 pv_wait arch/x86/include/asm/paravirt.h:564 [inline]
 pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
 __pv_queued_spin_lock_slowpath+0x8b8/0xb40 kernel/locking/qspinlock.c:508
 pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:554 [inline]
 queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
 queued_spin_lock include/asm-generic/qspinlock.h:85 [inline]
 do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:113
 spin_lock include/linux/spinlock.h:354 [inline]
 sctp_rcv+0xd96/0x2e30 net/sctp/input.c:231
 ip_protocol_deliver_rcu+0x5c/0x8a0 net/ipv4/ip_input.c:204
 ip_local_deliver_finish+0x20a/0x370 net/ipv4/ip_input.c:231
 NF_HOOK include/linux/netfilter.h:301 [inline]
 NF_HOOK include/linux/netfilter.h:295 [inline]
 ip_local_deliver+0x1b3/0x200 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:449 [inline]
 ip_rcv_finish+0x1da/0x2f0 net/ipv4/ip_input.c:428
 NF_HOOK include/linux/netfilter.h:301 [inline]
 NF_HOOK include/linux/netfilter.h:295 [inline]
 ip_rcv+0xaa/0xd0 net/ipv4/ip_input.c:539
 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5315
 __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5429
 process_backlog+0x232/0x6c0 net/core/dev.c:6319
 napi_poll net/core/dev.c:6763 [inline]
 net_rx_action+0x4dc/0x1100 net/core/dev.c:6833
 __do_softirq+0x2a0/0x9f6 kernel/softirq.c:298
 run_ksoftirqd kernel/softirq.c:653 [inline]
 run_ksoftirqd+0x2d/0x50 kernel/softirq.c:645
 smpboot_thread_fn+0x655/0x9e0 kernel/smpboot.c:165
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
net_ratelimit: 164 callbacks suppressed
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
GRED: Unable to relocate VQ 0x0 after dequeue, screwing up backlog
softirq: huh, entered softirq 3 NET_RX 00000000e7978df2 with preempt_count 00000100, exited with 00000101?

Crashes (1):
Time             | Kernel   | Commit       | Syzkaller | Config  | Log         | Report | Syz repro | C repro | VM info | Assets | Manager                    | Title
2020/12/01 14:22 | upstream | b65054597872 | 07bfe8a5  | .config | console log | report |           |         | info    |        | ci-upstream-kasan-gce-root |
* Struck through repros no longer work on HEAD.