syzbot


INFO: rcu detected stall in neigh_timer_handler (2)

Status: fixed on 2019/10/09 10:54
Reported-by: syzbot+@syzkaller.appspotmail.com
Fix commit: d4d6ec6dac07 sch_hhf: ensure quantum and hhf_non_hh_weight are non-zero
First crash: 1179d, last: 1177d
similar bugs (7):
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
upstream INFO: rcu detected stall in neigh_timer_handler (6) 1 722d 722d 0/24 auto-closed as invalid on 2021/03/09 03:20
upstream INFO: rcu detected stall in neigh_timer_handler 1 1468d 1468d 0/24 auto-closed as invalid on 2019/05/23 06:47
upstream INFO: rcu detected stall in neigh_timer_handler (4) 1 1092d 1092d 0/24 closed as invalid on 2019/12/04 14:04
upstream INFO: rcu detected stall in neigh_timer_handler (5) 1 824d 824d 0/24 auto-closed as invalid on 2020/11/26 10:25
upstream INFO: rcu detected stall in neigh_timer_handler (3) 2 1097d 1097d 0/24 closed as invalid on 2019/11/29 14:24
upstream INFO: rcu detected stall in neigh_timer_handler (7) 1 597d 597d 0/24 auto-closed as invalid on 2021/07/11 19:14
linux-4.14 INFO: rcu detected stall in neigh_timer_handler 5 942d 1062d 0/1 auto-closed as invalid on 2020/08/30 16:21

Sample crash report:
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	0-....: (1 GPs behind) idle=a3a/1/0x4000000000000004 softirq=59811/59814 fqs=3413 
	(t=10501 jiffies g=88805 q=2510)
rcu: rcu_preempt kthread starved for 3674 jiffies! g88805 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
rcu: RCU grace-period kthread stack dump:
rcu_preempt     I29112    10      2 0x80004000
Call Trace:
 context_switch kernel/sched/core.c:3254 [inline]
 __schedule+0x755/0x1580 kernel/sched/core.c:3880
 schedule+0xd9/0x260 kernel/sched/core.c:3947
 schedule_timeout+0x486/0xc50 kernel/time/timer.c:1807
 rcu_gp_fqs_loop kernel/rcu/tree.c:1611 [inline]
 rcu_gp_kthread+0x9b2/0x18c0 kernel/rcu/tree.c:1768
 kthread+0x361/0x430 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
NMI backtrace for cpu 0
CPU: 0 PID: 10073 Comm: kworker/u4:7 Not tainted 5.3.0-rc7+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue:  0x0 (bat_events)
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x183/0x1cf kernel/rcu/tree_stall.h:254
 print_cpu_stall kernel/rcu/tree_stall.h:455 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:529 [inline]
 rcu_pending kernel/rcu/tree.c:2736 [inline]
 rcu_sched_clock_irq.cold+0x4dd/0xc13 kernel/rcu/tree.c:2183
 update_process_times+0x32/0x80 kernel/time/timer.c:1639
 tick_sched_handle+0xa2/0x190 kernel/time/tick-sched.c:167
 tick_sched_timer+0x53/0x140 kernel/time/tick-sched.c:1296
 __run_hrtimer kernel/time/hrtimer.c:1389 [inline]
 __hrtimer_run_queues+0x364/0xe40 kernel/time/hrtimer.c:1451
 hrtimer_interrupt+0x314/0x770 kernel/time/hrtimer.c:1509
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1106 [inline]
 smp_apic_timer_interrupt+0x160/0x610 arch/x86/kernel/apic/apic.c:1131
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
RIP: 0010:__list_del_entry_valid+0x45/0xf5 lib/list_debug.c:43
Code: 55 48 c1 ea 03 41 54 80 3c 02 00 0f 85 a1 00 00 00 4c 89 f2 4d 8b 66 08 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 <0f> 85 9d 00 00 00 48 b8 00 01 00 00 00 00 ad de 4d 8b 2e 49 39 c5
RSP: 0018:ffff8880ae808820 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: dffffc0000000000 RBX: ffff8880938eec38 RCX: ffffffff85c7dbb9
RDX: 1ffff1101271dd87 RSI: ffffffff85c7e086 RDI: ffff8880938eec40
RBP: ffff8880ae808838 R08: ffff88805d20c340 R09: fffffbfff14e9555
R10: ffff88805d20cce8 R11: ffff88805d20c340 R12: ffff8880938eecd0
R13: ffff8880938ee940 R14: ffff8880938eec38 R15: 0000000000000000
 __list_del_entry include/linux/list.h:131 [inline]
 list_move_tail include/linux/list.h:213 [inline]
 hhf_dequeue+0x5c5/0xa20 net/sched/sch_hhf.c:439
 dequeue_skb net/sched/sch_generic.c:258 [inline]
 qdisc_restart net/sched/sch_generic.c:361 [inline]
 __qdisc_run+0x1e7/0x19d0 net/sched/sch_generic.c:379
 __dev_xmit_skb net/core/dev.c:3533 [inline]
 __dev_queue_xmit+0x16f1/0x3650 net/core/dev.c:3838
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902
 br_dev_queue_push_xmit+0x3f3/0x5c0 net/bridge/br_forward.c:52
 NF_HOOK include/linux/netfilter.h:305 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 br_forward_finish+0xfa/0x400 net/bridge/br_forward.c:65
 NF_HOOK include/linux/netfilter.h:305 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 __br_forward+0x641/0xb00 net/bridge/br_forward.c:109
 br_flood+0x372/0x3d0 net/bridge/br_forward.c:234
 br_dev_xmit+0xd33/0x15a0 net/bridge/br_device.c:84
 __netdev_start_xmit include/linux/netdevice.h:4419 [inline]
 netdev_start_xmit include/linux/netdevice.h:4433 [inline]
 xmit_one net/core/dev.c:3280 [inline]
 dev_hard_start_xmit+0x1a3/0x9c0 net/core/dev.c:3296
 __dev_queue_xmit+0x2b15/0x3650 net/core/dev.c:3869
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902
 arp_xmit_finish net/ipv4/arp.c:630 [inline]
 NF_HOOK include/linux/netfilter.h:305 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 arp_xmit+0xdf/0x3f0 net/ipv4/arp.c:639
 arp_send_dst.part.0+0x202/0x280 net/ipv4/arp.c:317
 arp_send_dst net/ipv4/arp.c:390 [inline]
 arp_solicit+0x752/0x1380 net/ipv4/arp.c:389
 neigh_probe+0xc6/0x110 net/core/neighbour.c:1012
 __neigh_event_send+0x3e3/0x1760 net/core/neighbour.c:1172
 neigh_event_send include/net/neighbour.h:445 [inline]
 neigh_resolve_output+0x5cf/0x970 net/core/neighbour.c:1474
 neigh_output include/net/neighbour.h:511 [inline]
 ip_finish_output2+0x8a5/0x2570 net/ipv4/ip_output.c:228
 __ip_finish_output net/ipv4/ip_output.c:308 [inline]
 __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290
 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip_output+0x21f/0x640 net/ipv4/ip_output.c:432
 dst_output include/net/dst.h:436 [inline]
 ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125
 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1554
 ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1574
 icmp_push_reply+0x350/0x4a0 net/ipv4/icmp.c:389
 __icmp_send+0xc44/0x1410 net/ipv4/icmp.c:738
 ipv4_send_dest_unreach net/ipv4/route.c:1220 [inline]
 ipv4_link_failure+0x54e/0x900 net/ipv4/route.c:1227
 dst_link_failure include/net/dst.h:419 [inline]
 arp_error_report+0xce/0x1a0 net/ipv4/arp.c:293
 neigh_invalidate+0x245/0x570 net/core/neighbour.c:996
 neigh_timer_handler+0xc69/0xf60 net/core/neighbour.c:1082
 call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1322
 expire_timers kernel/time/timer.c:1366 [inline]
 __run_timers kernel/time/timer.c:1685 [inline]
 __run_timers kernel/time/timer.c:1653 [inline]
 run_timer_softirq+0x697/0x17a0 kernel/time/timer.c:1698
 __do_softirq+0x262/0x98c kernel/softirq.c:292
 invoke_softirq kernel/softirq.c:373 [inline]
 irq_exit+0x19b/0x1e0 kernel/softirq.c:413
 exiting_irq arch/x86/include/asm/apic.h:537 [inline]
 smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1133
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
 </IRQ>
RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:169 [inline]
RIP: 0010:_raw_spin_unlock_irq+0x54/0x90 kernel/locking/spinlock.c:199
Code: c0 60 f4 f2 88 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 75 33 48 83 3d e5 f8 b1 01 00 74 20 fb 66 0f 1f 44 00 00 <bf> 01 00 00 00 e8 22 c6 0d fa 65 8b 05 73 02 c1 78 85 c0 74 06 41
RSP: 0018:ffff888069777cf0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
RAX: 1ffffffff11e5e8c RBX: ffff88805d20c340 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffff88805d20cbcc
RBP: ffff888069777cf8 R08: ffff88805d20c340 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880ae835800
R13: ffff88805fc94000 R14: ffff8880938a6180 R15: 0000000000000000
 finish_lock_switch kernel/sched/core.c:3004 [inline]
 finish_task_switch+0x147/0x720 kernel/sched/core.c:3104
 context_switch kernel/sched/core.c:3257 [inline]
 __schedule+0x75d/0x1580 kernel/sched/core.c:3880
 schedule+0xd9/0x260 kernel/sched/core.c:3947
 worker_thread+0x248/0xe40 kernel/workqueue.c:2436
 kthread+0x361/0x430 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Crashes (2):
Manager Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Title
ci-upstream-net-kasan-gce 2019/09/10 18:01 net-next db63864786c7 a60cb4cd .config log report
ci-upstream-net-kasan-gce 2019/09/09 02:29 net-next 6703a605b5ab a60cb4cd .config log report
* Struck through repros no longer work on HEAD.