syzbot


INFO: rcu detected stall in mld_dad_timer_expire

Status: fixed on 2019/12/09 13:28
Reported-by: syzbot+4e4787335de34327667b@syzkaller.appspotmail.com
Fix commit: cc243e2427ce sch_hhf: ensure quantum and hhf_non_hh_weight are non-zero
First crash: 1887d, last: 1887d
Fix bisection: fixed by:
commit cc243e2427cef2a5dd7367cb0e0b846503350ffe
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date: Sun Sep 8 20:40:51 2019 +0000

  sch_hhf: ensure quantum and hhf_non_hh_weight are non-zero
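
The fix title lines up with where both stalled backtraces sit: list_move_tail inside hhf_dequeue (net/sched/sch_hhf.c:438). The HHF qdisc dequeues with a weighted deficit round robin. When the head bucket's deficit is not positive, the deficit is refilled by weight * quantum, the bucket is moved to the tail of the old-buckets list, and the dequeue retries. If either factor is zero the refill adds nothing and the retry never ends, which would explain the stall. Below is a minimal Python sketch of that loop, not the kernel code: the function names, the dict-based bucket, and the max_rounds cap are illustrative assumptions.

```python
from collections import deque


def rounds_until_dequeue(quantum, weight, deficit=0, max_rounds=1000):
    """Model the refill-and-retry loop of a weighted DRR dequeue.

    Returns the number of refill rounds needed before the head bucket has a
    positive deficit, or None if that never happens within max_rounds.
    The cap is an assumption of this sketch; the kernel loop has none.
    """
    old_buckets = deque([{"deficit": deficit}])
    for rounds in range(max_rounds + 1):
        bucket = old_buckets[0]
        if bucket["deficit"] > 0:
            return rounds  # the bucket may now send a packet
        # Refill the deficit, then move the bucket to the tail and retry.
        bucket["deficit"] += weight * quantum
        old_buckets.rotate(-1)  # stands in for list_move_tail()
    return None  # no progress: the uncapped loop would spin here


def params_make_progress(quantum, non_hh_weight):
    """Hypothetical check in the spirit of the fix title: both non-zero."""
    return quantum != 0 and non_hh_weight != 0
```

With these definitions, rounds_until_dequeue(1514, 2) returns 1, while rounds_until_dequeue(0, 2) and rounds_until_dequeue(1514, 0) both return None. Rejecting such parameters at configuration time, as params_make_progress models, keeps the loop from ever being entered with a zero refill.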

  
Similar bugs (3)
Kernel | Title | Repro | Cause bisect | Fix bisect | Count | Last | Reported | Patched | Status
upstream | INFO: rcu detected stall in mld_dad_timer_expire net | | | | 3 | 1891d | 1892d | 13/28 | fixed on 2019/10/09 10:54
linux-4.19 | INFO: rcu detected stall in mld_dad_timer_expire | | | | 1 | 1536d | 1536d | 0/1 | auto-closed as invalid on 2020/12/29 00:26
upstream | INFO: rcu detected stall in mld_dad_timer_expire (2) net | | | | 2 | 1812d | 1812d | 0/28 | closed as invalid on 2019/11/29 14:24

Sample crash report:
INFO: rcu_preempt self-detected stall on CPU
	1-...: (10499 ticks this GP) idle=442/2/0 softirq=10135/10135 fqs=1 
	 (t=10500 jiffies g=1061 c=1060 q=227)
rcu_preempt kthread starved for 10498 jiffies! g1061 c1060 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
rcu_preempt     I29776     8      2 0x80000000
Call Trace:
 context_switch kernel/sched/core.c:2807 [inline]
 __schedule+0x7b8/0x1cd0 kernel/sched/core.c:3383
 schedule+0x92/0x1c0 kernel/sched/core.c:3427
 schedule_timeout+0x43e/0xe10 kernel/time/timer.c:1744
 rcu_gp_kthread+0xbf4/0x1ec0 kernel/rcu/tree.c:2255
 kthread+0x319/0x430 kernel/kthread.c:232
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404
NMI backtrace for cpu 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.143 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x138/0x197 lib/dump_stack.c:53
 nmi_cpu_backtrace.cold+0x57/0x94 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x141/0x189 lib/nmi_backtrace.c:62
 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
 trigger_single_cpu_backtrace include/linux/nmi.h:158 [inline]
 rcu_dump_cpu_stacks+0x186/0x1d2 kernel/rcu/tree.c:1396
 print_cpu_stall kernel/rcu/tree.c:1542 [inline]
 check_cpu_stall kernel/rcu/tree.c:1610 [inline]
 __rcu_pending kernel/rcu/tree.c:3390 [inline]
 rcu_pending kernel/rcu/tree.c:3452 [inline]
 rcu_check_callbacks.cold+0x43d/0xd0a kernel/rcu/tree.c:2792
 update_process_times+0x31/0x70 kernel/time/timer.c:1588
 tick_sched_handle+0x85/0x160 kernel/time/tick-sched.c:161
 tick_sched_timer+0x43/0x130 kernel/time/tick-sched.c:1219
 __run_hrtimer kernel/time/hrtimer.c:1220 [inline]
 __hrtimer_run_queues+0x270/0xbc0 kernel/time/hrtimer.c:1284
 hrtimer_interrupt+0x1d8/0x5d0 kernel/time/hrtimer.c:1318
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1075 [inline]
 smp_apic_timer_interrupt+0x11c/0x5e0 arch/x86/kernel/apic/apic.c:1100
 apic_timer_interrupt+0x96/0xa0 arch/x86/entry/entry_64.S:792
RIP: 0010:__list_del_entry_valid+0xb3/0xf5 lib/list_debug.c:54
RSP: 0018:ffff8880aef071a8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
RAX: dffffc0000000000 RBX: ffff888094f00678 RCX: 0000000000000000
RDX: 1ffff110129e00e3 RSI: ffff888094f00710 RDI: ffff888094f00718
RBP: ffff8880aef071c0 R08: 0000000000000000 R09: ffff8880a9d1ccf8
R10: ffff8880a9d1ccd8 R11: ffff8880a9d1c340 R12: ffff888094f00710
R13: ffff888094f00710 R14: ffff888094f00678 R15: ffff888094f00700
 __list_del_entry include/linux/list.h:117 [inline]
 list_move_tail include/linux/list.h:182 [inline]
 hhf_dequeue+0x57f/0xa60 net/sched/sch_hhf.c:438
 dequeue_skb net/sched/sch_generic.c:148 [inline]
 qdisc_restart net/sched/sch_generic.c:241 [inline]
 __qdisc_run+0x2b8/0xe00 net/sched/sch_generic.c:257
 __dev_xmit_skb net/core/dev.c:3235 [inline]
 __dev_queue_xmit+0x1571/0x25e0 net/core/dev.c:3493
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3558
 br_dev_queue_push_xmit+0x367/0x530 net/bridge/br_forward.c:55
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 br_forward_finish+0xbc/0x320 net/bridge/br_forward.c:67
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 __br_forward+0x560/0x9c0 net/bridge/br_forward.c:111
 deliver_clone+0x61/0xc0 net/bridge/br_forward.c:127
 maybe_deliver net/bridge/br_forward.c:168 [inline]
 maybe_deliver net/bridge/br_forward.c:156 [inline]
 br_flood+0x3c8/0x530 net/bridge/br_forward.c:210
 br_dev_xmit+0x9a4/0xd40 net/bridge/br_device.c:83
 __netdev_start_xmit include/linux/netdevice.h:4033 [inline]
 netdev_start_xmit include/linux/netdevice.h:4042 [inline]
 xmit_one net/core/dev.c:3009 [inline]
 dev_hard_start_xmit+0x18c/0x8b0 net/core/dev.c:3025
 __dev_queue_xmit+0x1d95/0x25e0 net/core/dev.c:3525
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3558
 neigh_hh_output include/net/neighbour.h:490 [inline]
 neigh_output include/net/neighbour.h:498 [inline]
 ip6_finish_output2+0x10bd/0x21b0 net/ipv6/ip6_output.c:120
 ip6_finish_output+0x4f4/0xb50 net/ipv6/ip6_output.c:154
 NF_HOOK_COND include/linux/netfilter.h:239 [inline]
 ip6_output+0x20f/0x6d0 net/ipv6/ip6_output.c:171
 dst_output include/net/dst.h:462 [inline]
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 mld_sendpack+0x8d1/0xd60 net/ipv6/mcast.c:1660
 mld_send_initial_cr.part.0+0x103/0x150 net/ipv6/mcast.c:2077
 mld_send_initial_cr net/ipv6/mcast.c:2061 [inline]
 mld_dad_timer_expire+0x2c/0x180 net/ipv6/mcast.c:2096
 call_timer_fn+0x161/0x670 kernel/time/timer.c:1279
 expire_timers kernel/time/timer.c:1318 [inline]
 __run_timers kernel/time/timer.c:1634 [inline]
 __run_timers kernel/time/timer.c:1602 [inline]
 run_timer_softirq+0x5b4/0x1570 kernel/time/timer.c:1647
 __do_softirq+0x244/0x9a0 kernel/softirq.c:288
 invoke_softirq kernel/softirq.c:368 [inline]
 irq_exit+0x160/0x1b0 kernel/softirq.c:409
 exiting_irq arch/x86/include/asm/apic.h:648 [inline]
 smp_apic_timer_interrupt+0x146/0x5e0 arch/x86/kernel/apic/apic.c:1102
 apic_timer_interrupt+0x96/0xa0 arch/x86/entry/entry_64.S:792
 </IRQ>
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
RSP: 0018:ffff8880a9d2fe70 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
RAX: 1ffffffff0ee2a84 RBX: ffff8880a9d1c340 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff8880a9d1cbbc
RBP: ffff8880a9d2fe98 R08: 1ffffffff104a601 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff87715410
R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880a9d1c340
 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:557
 default_idle_call+0x36/0x90 kernel/sched/idle.c:98
 cpuidle_idle_call kernel/sched/idle.c:156 [inline]
 do_idle+0x262/0x3d0 kernel/sched/idle.c:246
 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:351
 start_secondary+0x346/0x4b0 arch/x86/kernel/smpboot.c:272
 secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:240
INFO: rcu_sched detected stalls on CPUs/tasks:
	1-...: (10501 ticks this GP) idle=442/1/0 softirq=10135/10135 fqs=0 
	(detected by 0, t=10564 jiffies, g=731, c=730, q=7)
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.143 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8880a9d1c340 task.stack: ffff8880a9d28000
RIP: 0010:__list_del_entry_valid+0x10/0xf5 lib/list_debug.c:39
RSP: 0018:ffff8880aef071b8 EFLAGS: 00000246
RAX: dffffc0000000000 RBX: ffff888094f00678 RCX: 0000000000000000
RDX: 0000000000000007 RSI: ffff888094f00710 RDI: ffff888094f00678
RBP: ffff8880aef071c0 R08: 0000000000000000 R09: ffff8880a9d1ccf8
R10: ffff8880a9d1ccd8 R11: ffff8880a9d1c340 R12: dffffc0000000000
R13: ffff888094f00480 R14: 0000000000000000 R15: ffff888094f00700
FS:  0000000000000000(0000) GS:ffff8880aef00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000006d9e70 CR3: 000000000766a000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 __list_del_entry include/linux/list.h:117 [inline]
 list_move_tail include/linux/list.h:182 [inline]
 hhf_dequeue+0x57f/0xa60 net/sched/sch_hhf.c:438
 dequeue_skb net/sched/sch_generic.c:148 [inline]
 qdisc_restart net/sched/sch_generic.c:241 [inline]
 __qdisc_run+0x2b8/0xe00 net/sched/sch_generic.c:257
 __dev_xmit_skb net/core/dev.c:3235 [inline]
 __dev_queue_xmit+0x1571/0x25e0 net/core/dev.c:3493
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3558
 br_dev_queue_push_xmit+0x367/0x530 net/bridge/br_forward.c:55
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 br_forward_finish+0xbc/0x320 net/bridge/br_forward.c:67
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 __br_forward+0x560/0x9c0 net/bridge/br_forward.c:111
 deliver_clone+0x61/0xc0 net/bridge/br_forward.c:127
 maybe_deliver net/bridge/br_forward.c:168 [inline]
 maybe_deliver net/bridge/br_forward.c:156 [inline]
 br_flood+0x3c8/0x530 net/bridge/br_forward.c:210
 br_dev_xmit+0x9a4/0xd40 net/bridge/br_device.c:83
 __netdev_start_xmit include/linux/netdevice.h:4033 [inline]
 netdev_start_xmit include/linux/netdevice.h:4042 [inline]
 xmit_one net/core/dev.c:3009 [inline]
 dev_hard_start_xmit+0x18c/0x8b0 net/core/dev.c:3025
 __dev_queue_xmit+0x1d95/0x25e0 net/core/dev.c:3525
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3558
 neigh_hh_output include/net/neighbour.h:490 [inline]
 neigh_output include/net/neighbour.h:498 [inline]
 ip6_finish_output2+0x10bd/0x21b0 net/ipv6/ip6_output.c:120
 ip6_finish_output+0x4f4/0xb50 net/ipv6/ip6_output.c:154
 NF_HOOK_COND include/linux/netfilter.h:239 [inline]
 ip6_output+0x20f/0x6d0 net/ipv6/ip6_output.c:171
 dst_output include/net/dst.h:462 [inline]
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 mld_sendpack+0x8d1/0xd60 net/ipv6/mcast.c:1660
 mld_send_initial_cr.part.0+0x103/0x150 net/ipv6/mcast.c:2077
 mld_send_initial_cr net/ipv6/mcast.c:2061 [inline]
 mld_dad_timer_expire+0x2c/0x180 net/ipv6/mcast.c:2096
 call_timer_fn+0x161/0x670 kernel/time/timer.c:1279
 expire_timers kernel/time/timer.c:1318 [inline]
 __run_timers kernel/time/timer.c:1634 [inline]
 __run_timers kernel/time/timer.c:1602 [inline]
 run_timer_softirq+0x5b4/0x1570 kernel/time/timer.c:1647
 __do_softirq+0x244/0x9a0 kernel/softirq.c:288
 invoke_softirq kernel/softirq.c:368 [inline]
 irq_exit+0x160/0x1b0 kernel/softirq.c:409
 exiting_irq arch/x86/include/asm/apic.h:648 [inline]
 smp_apic_timer_interrupt+0x146/0x5e0 arch/x86/kernel/apic/apic.c:1102
 apic_timer_interrupt+0x96/0xa0 arch/x86/entry/entry_64.S:792
 </IRQ>
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
RSP: 0018:ffff8880a9d2fe70 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
RAX: 1ffffffff0ee2a84 RBX: ffff8880a9d1c340 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff8880a9d1cbbc
RBP: ffff8880a9d2fe98 R08: 1ffffffff104a601 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff87715410
R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880a9d1c340
 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:557
 default_idle_call+0x36/0x90 kernel/sched/idle.c:98
 cpuidle_idle_call kernel/sched/idle.c:156 [inline]
 do_idle+0x262/0x3d0 kernel/sched/idle.c:246
 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:351
 start_secondary+0x346/0x4b0 arch/x86/kernel/smpboot.c:272
 secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:240
Code: 23 ae fe 48 8b 75 e8 eb 9f 48 89 f7 48 89 75 e8 e8 46 23 ae fe 48 8b 75 e8 eb b2 48 b8 00 00 00 00 00 fc ff df 55 48 89 e5 41 56 <49> 89 fe 48 83 c7 08 48 89 fa 41 55 48 c1 ea 03 41 54 80 3c 02 
rcu_sched kthread starved for 10565 jiffies! g731 c730 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=1
rcu_sched       R  running task    29824     9      2 0x80000000
Call Trace:
 context_switch kernel/sched/core.c:2807 [inline]
 __schedule+0x7b8/0x1cd0 kernel/sched/core.c:3383
 schedule+0x92/0x1c0 kernel/sched/core.c:3427
 schedule_timeout+0x43e/0xe10 kernel/time/timer.c:1744
 rcu_gp_kthread+0xbf4/0x1ec0 kernel/rcu/tree.c:2255
 kthread+0x319/0x430 kernel/kthread.c:232
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404

Crashes (1):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2019/09/14 23:07 | linux-4.14.y | e2cd24b62938 | 32d59357 | .config | console log | report | syz | C | | | ci2-linux-4-14 |
* Struck through repros no longer work on HEAD.