syzbot


INFO: rcu detected stall in mld_ifc_timer_expire
Status: fixed on 2019/12/06 15:57
Reported-by: syzbot+42e1254b6c2097b925ed@syzkaller.appspotmail.com
Fix commit: cc243e2427ce sch_hhf: ensure quantum and hhf_non_hh_weight are non-zero
First crash: 990d, last: 981d

Fix bisection: fixed by (bisect log) :
commit cc243e2427cef2a5dd7367cb0e0b846503350ffe
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date: Sun Sep 8 20:40:51 2019 +0000

  sch_hhf: ensure quantum and hhf_non_hh_weight are non-zero

similar bugs (4):
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
upstream INFO: rcu detected stall in mld_ifc_timer_expire C done 152 981d 989d 14/22 fixed on 2019/10/09 10:54
upstream INFO: rcu detected stall in mld_ifc_timer_expire (3) 1 716d 716d 0/22 auto-closed as invalid on 2020/09/04 20:48
linux-4.19 INFO: rcu detected stall in mld_ifc_timer_expire C done 33 978d 990d 1/1 fixed on 2019/12/07 19:18
upstream INFO: rcu detected stall in mld_ifc_timer_expire (2) 1 902d 902d 0/22 closed as invalid on 2019/12/04 14:04

Sample crash report:
IPv6: ADDRCONF(NETDEV_UP): hsr0: link is not ready
IPv6: ADDRCONF(NETDEV_CHANGE): hsr0: link becomes ready
IPv6: ADDRCONF(NETDEV_UP): vxcan1: link is not ready
8021q: adding VLAN 0 to HW filter on device batadv0
INFO: rcu_preempt self-detected stall on CPU
	1-...: (10499 ticks this GP) idle=4f2/140000000000002/0 softirq=10822/10825 fqs=6 
	 (t=10500 jiffies g=1117 c=1116 q=176)
rcu_preempt kthread starved for 10488 jiffies! g1117 c1116 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
rcu_preempt     I29824     8      2 0x80000000
Call Trace:
 context_switch kernel/sched/core.c:2807 [inline]
 __schedule+0x7b8/0x1cd0 kernel/sched/core.c:3383
 schedule+0x92/0x1c0 kernel/sched/core.c:3427
 schedule_timeout+0x43e/0xe10 kernel/time/timer.c:1744
 rcu_gp_kthread+0xbf4/0x1ec0 kernel/rcu/tree.c:2255
 kthread+0x319/0x430 kernel/kthread.c:232
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404
NMI backtrace for cpu 1
CPU: 1 PID: 6987 Comm: syz-executor523 Not tainted 4.14.144 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x138/0x197 lib/dump_stack.c:53
 nmi_cpu_backtrace.cold+0x57/0x94 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x141/0x189 lib/nmi_backtrace.c:62
 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
 trigger_single_cpu_backtrace include/linux/nmi.h:158 [inline]
 rcu_dump_cpu_stacks+0x186/0x1d2 kernel/rcu/tree.c:1396
 print_cpu_stall kernel/rcu/tree.c:1542 [inline]
 check_cpu_stall kernel/rcu/tree.c:1610 [inline]
 __rcu_pending kernel/rcu/tree.c:3390 [inline]
 rcu_pending kernel/rcu/tree.c:3452 [inline]
 rcu_check_callbacks.cold+0x43d/0xd0a kernel/rcu/tree.c:2792
 update_process_times+0x31/0x70 kernel/time/timer.c:1588
 tick_sched_handle+0x85/0x160 kernel/time/tick-sched.c:161
 tick_sched_timer+0x43/0x130 kernel/time/tick-sched.c:1219
 __run_hrtimer kernel/time/hrtimer.c:1220 [inline]
 __hrtimer_run_queues+0x270/0xbc0 kernel/time/hrtimer.c:1284
 hrtimer_interrupt+0x1d8/0x5d0 kernel/time/hrtimer.c:1318
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1075 [inline]
 smp_apic_timer_interrupt+0x11c/0x5e0 arch/x86/kernel/apic/apic.c:1100
 apic_timer_interrupt+0x96/0xa0 arch/x86/entry/entry_64.S:792
RIP: 0010:__list_add_valid+0x50/0xa0 lib/list_debug.c:26
RSP: 0018:ffff8880aef07198 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
RAX: dffffc0000000000 RBX: ffff8880a1110678 RCX: 0000000000000000
RDX: 1ffff110142220e2 RSI: ffff8880a1110710 RDI: ffff8880a1110718
RBP: ffff8880aef071b0 R08: 0000000000000000 R09: ffff8880807aaeb8
R10: ffff8880807aae98 R11: ffff8880807aa500 R12: ffff8880a1110710
R13: ffff8880a1110678 R14: ffff8880a1110710 R15: ffff8880a1110700
 __list_add include/linux/list.h:60 [inline]
 list_add_tail include/linux/list.h:93 [inline]
 list_move_tail include/linux/list.h:183 [inline]
 hhf_dequeue+0x61b/0xa60 net/sched/sch_hhf.c:438
 dequeue_skb net/sched/sch_generic.c:148 [inline]
 qdisc_restart net/sched/sch_generic.c:241 [inline]
 __qdisc_run+0x2b8/0xe00 net/sched/sch_generic.c:257
 __dev_xmit_skb net/core/dev.c:3235 [inline]
 __dev_queue_xmit+0x1571/0x25e0 net/core/dev.c:3493
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3558
 br_dev_queue_push_xmit+0x367/0x530 net/bridge/br_forward.c:55
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 br_forward_finish+0xbc/0x320 net/bridge/br_forward.c:67
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 __br_forward+0x560/0x9c0 net/bridge/br_forward.c:111
 deliver_clone+0x61/0xc0 net/bridge/br_forward.c:127
 maybe_deliver net/bridge/br_forward.c:168 [inline]
 maybe_deliver net/bridge/br_forward.c:156 [inline]
 br_flood+0x3c8/0x530 net/bridge/br_forward.c:210
 br_dev_xmit+0x9a4/0xd40 net/bridge/br_device.c:83
 __netdev_start_xmit include/linux/netdevice.h:4033 [inline]
 netdev_start_xmit include/linux/netdevice.h:4042 [inline]
 xmit_one net/core/dev.c:3009 [inline]
 dev_hard_start_xmit+0x18c/0x8b0 net/core/dev.c:3025
 __dev_queue_xmit+0x1d95/0x25e0 net/core/dev.c:3525
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3558
 neigh_hh_output include/net/neighbour.h:490 [inline]
 neigh_output include/net/neighbour.h:498 [inline]
 ip6_finish_output2+0x10bd/0x21b0 net/ipv6/ip6_output.c:120
 ip6_finish_output+0x4f4/0xb50 net/ipv6/ip6_output.c:154
 NF_HOOK_COND include/linux/netfilter.h:239 [inline]
 ip6_output+0x20f/0x6d0 net/ipv6/ip6_output.c:171
 dst_output include/net/dst.h:462 [inline]
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 mld_sendpack+0x8d1/0xd60 net/ipv6/mcast.c:1660
 mld_send_cr net/ipv6/mcast.c:1956 [inline]
 mld_ifc_timer_expire+0x3b6/0x7b0 net/ipv6/mcast.c:2455
 call_timer_fn+0x161/0x670 kernel/time/timer.c:1279
 expire_timers kernel/time/timer.c:1318 [inline]
 __run_timers kernel/time/timer.c:1634 [inline]
 __run_timers kernel/time/timer.c:1602 [inline]
 run_timer_softirq+0x5b4/0x1570 kernel/time/timer.c:1647
 __do_softirq+0x244/0x9a0 kernel/softirq.c:288
 invoke_softirq kernel/softirq.c:368 [inline]
 irq_exit+0x160/0x1b0 kernel/softirq.c:409
 exiting_irq arch/x86/include/asm/apic.h:648 [inline]
 smp_apic_timer_interrupt+0x146/0x5e0 arch/x86/kernel/apic/apic.c:1102
 apic_timer_interrupt+0x96/0xa0 arch/x86/entry/entry_64.S:792
 </IRQ>
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:779 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x95/0xe0 kernel/locking/spinlock.c:192
RSP: 0018:ffff88809b537870 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
RAX: 1ffffffff0ee2a81 RBX: 0000000000000286 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: 0000000000000286
RBP: ffff88809b537880 R08: ffff8880807aa500 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff892d6a68
R13: 0000000000000000 R14: ffffffff892d6a68 R15: ffff8880a8966760
 __debug_check_no_obj_freed lib/debugobjects.c:760 [inline]
 debug_check_no_obj_freed+0x2aa/0x7b7 lib/debugobjects.c:776
 free_pages_prepare mm/page_alloc.c:1063 [inline]
 __free_pages_ok+0x23f/0xea0 mm/page_alloc.c:1256
 free_compound_page+0x63/0x80 mm/page_alloc.c:599
 free_transhuge_page+0x1fb/0x2b0 mm/huge_memory.c:2694
 __put_compound_page+0x7a/0xc0 mm/swap.c:95
 release_pages+0x33b/0xb90 mm/swap.c:788
 free_pages_and_swap_cache+0x178/0x210 mm/swap_state.c:322
 tlb_flush_mmu_free+0x7e/0x150 mm/memory.c:260
 tlb_flush_mmu mm/memory.c:269 [inline]
 arch_tlb_finish_mmu+0x9f/0x140 mm/memory.c:284
 tlb_finish_mmu+0x7d/0xd0 mm/memory.c:436
 exit_mmap+0x2a5/0x4e0 mm/mmap.c:3064
 __mmput kernel/fork.c:935 [inline]
 mmput+0x114/0x440 kernel/fork.c:956
 exit_mm kernel/exit.c:545 [inline]
 do_exit+0x71d/0x2c10 kernel/exit.c:861
 do_group_exit+0x111/0x330 kernel/exit.c:977
 SYSC_exit_group kernel/exit.c:988 [inline]
 SyS_exit_group+0x1d/0x20 kernel/exit.c:986
 do_syscall_64+0x1e8/0x640 arch/x86/entry/common.c:292
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x441358
RSP: 002b:00007fffd5720b38 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441358
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004c7ab0 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 00000000004a9e35 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006da5e0 R14: 0000000000000000 R15: 0000000000000000
INFO: rcu_sched detected stalls on CPUs/tasks:
	1-...: (10501 ticks this GP) idle=4f2/140000000000001/0 softirq=10822/10825 fqs=6 
	(detected by 0, t=10579 jiffies, g=800, c=799, q=9)
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 6987 Comm: syz-executor523 Not tainted 4.14.144 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8880807aa500 task.stack: ffff88809b530000
RIP: 0010:__sanitizer_cov_trace_pc+0x28/0x60 kernel/kcov.c:68
RSP: 0018:ffff8880aef071b8 EFLAGS: 00000206
RAX: ffff8880807aa500 RBX: ffff8880a1110678 RCX: ffff8880a1110640
RDX: 0000000000000100 RSI: ffff8880a1110710 RDI: ffff8880a1110688
RBP: ffff8880aef07210 R08: 0000000000000000 R09: ffff8880807aaeb8
R10: ffff8880807aae98 R11: ffff8880807aa500 R12: dffffc0000000000
R13: ffff8880a1110480 R14: 0000000000000001 R15: ffff8880a1110700
FS:  0000000000000000(0000) GS:ffff8880aef00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000006dbb10 CR3: 000000000766a000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 dequeue_skb net/sched/sch_generic.c:148 [inline]
 qdisc_restart net/sched/sch_generic.c:241 [inline]
 __qdisc_run+0x2b8/0xe00 net/sched/sch_generic.c:257
 __dev_xmit_skb net/core/dev.c:3235 [inline]
 __dev_queue_xmit+0x1571/0x25e0 net/core/dev.c:3493
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3558
 br_dev_queue_push_xmit+0x367/0x530 net/bridge/br_forward.c:55
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 br_forward_finish+0xbc/0x320 net/bridge/br_forward.c:67
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 __br_forward+0x560/0x9c0 net/bridge/br_forward.c:111
 deliver_clone+0x61/0xc0 net/bridge/br_forward.c:127
 maybe_deliver net/bridge/br_forward.c:168 [inline]
 maybe_deliver net/bridge/br_forward.c:156 [inline]
 br_flood+0x3c8/0x530 net/bridge/br_forward.c:210
 br_dev_xmit+0x9a4/0xd40 net/bridge/br_device.c:83
 __netdev_start_xmit include/linux/netdevice.h:4033 [inline]
 netdev_start_xmit include/linux/netdevice.h:4042 [inline]
 xmit_one net/core/dev.c:3009 [inline]
 dev_hard_start_xmit+0x18c/0x8b0 net/core/dev.c:3025
 __dev_queue_xmit+0x1d95/0x25e0 net/core/dev.c:3525
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3558
 neigh_hh_output include/net/neighbour.h:490 [inline]
 neigh_output include/net/neighbour.h:498 [inline]
 ip6_finish_output2+0x10bd/0x21b0 net/ipv6/ip6_output.c:120
 ip6_finish_output+0x4f4/0xb50 net/ipv6/ip6_output.c:154
 NF_HOOK_COND include/linux/netfilter.h:239 [inline]
 ip6_output+0x20f/0x6d0 net/ipv6/ip6_output.c:171
 dst_output include/net/dst.h:462 [inline]
 NF_HOOK include/linux/netfilter.h:250 [inline]
 NF_HOOK include/linux/netfilter.h:244 [inline]
 mld_sendpack+0x8d1/0xd60 net/ipv6/mcast.c:1660
 mld_send_cr net/ipv6/mcast.c:1956 [inline]
 mld_ifc_timer_expire+0x3b6/0x7b0 net/ipv6/mcast.c:2455
 call_timer_fn+0x161/0x670 kernel/time/timer.c:1279
 expire_timers kernel/time/timer.c:1318 [inline]
 __run_timers kernel/time/timer.c:1634 [inline]
 __run_timers kernel/time/timer.c:1602 [inline]
 run_timer_softirq+0x5b4/0x1570 kernel/time/timer.c:1647
 __do_softirq+0x244/0x9a0 kernel/softirq.c:288
 invoke_softirq kernel/softirq.c:368 [inline]
 irq_exit+0x160/0x1b0 kernel/softirq.c:409
 exiting_irq arch/x86/include/asm/apic.h:648 [inline]
 smp_apic_timer_interrupt+0x146/0x5e0 arch/x86/kernel/apic/apic.c:1102
 apic_timer_interrupt+0x96/0xa0 arch/x86/entry/entry_64.S:792
 </IRQ>
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:779 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x95/0xe0 kernel/locking/spinlock.c:192
RSP: 0018:ffff88809b537870 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
RAX: 1ffffffff0ee2a81 RBX: 0000000000000286 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: 0000000000000286
RBP: ffff88809b537880 R08: ffff8880807aa500 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff892d6a68
R13: 0000000000000000 R14: ffffffff892d6a68 R15: ffff8880a8966760
 __debug_check_no_obj_freed lib/debugobjects.c:760 [inline]
 debug_check_no_obj_freed+0x2aa/0x7b7 lib/debugobjects.c:776
 free_pages_prepare mm/page_alloc.c:1063 [inline]
 __free_pages_ok+0x23f/0xea0 mm/page_alloc.c:1256
 free_compound_page+0x63/0x80 mm/page_alloc.c:599
 free_transhuge_page+0x1fb/0x2b0 mm/huge_memory.c:2694
 __put_compound_page+0x7a/0xc0 mm/swap.c:95
 release_pages+0x33b/0xb90 mm/swap.c:788
 free_pages_and_swap_cache+0x178/0x210 mm/swap_state.c:322
 tlb_flush_mmu_free+0x7e/0x150 mm/memory.c:260
 tlb_flush_mmu mm/memory.c:269 [inline]
 arch_tlb_finish_mmu+0x9f/0x140 mm/memory.c:284
 tlb_finish_mmu+0x7d/0xd0 mm/memory.c:436
 exit_mmap+0x2a5/0x4e0 mm/mmap.c:3064
 __mmput kernel/fork.c:935 [inline]
 mmput+0x114/0x440 kernel/fork.c:956
 exit_mm kernel/exit.c:545 [inline]
 do_exit+0x71d/0x2c10 kernel/exit.c:861
 do_group_exit+0x111/0x330 kernel/exit.c:977
 SYSC_exit_group kernel/exit.c:988 [inline]
 SyS_exit_group+0x1d/0x20 kernel/exit.c:986
 do_syscall_64+0x1e8/0x640 arch/x86/entry/common.c:292
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x441358
RSP: 002b:00007fffd5720b38 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441358
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004c7ab0 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 00000000004a9e35 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006da5e0 R14: 0000000000000000 R15: 0000000000000000
Code: ec ff ff 65 48 8b 04 25 40 ee 01 00 48 85 c0 74 1a 65 8b 15 cb 0a a5 7e 81 e2 00 01 1f 00 75 0b 8b 90 c8 12 00 00 83 fa 01 74 01 <c3> 55 48 89 e5 48 8b 75 08 48 8b 88 d0 12 00 00 8b 80 cc 12 00 
rcu_sched kthread starved for 10568 jiffies! g800 c799 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
rcu_sched       I29824     9      2 0x80000000
Call Trace:
 context_switch kernel/sched/core.c:2807 [inline]
 __schedule+0x7b8/0x1cd0 kernel/sched/core.c:3383
 schedule+0x92/0x1c0 kernel/sched/core.c:3427
 schedule_timeout+0x43e/0xe10 kernel/time/timer.c:1744
 rcu_gp_kthread+0xbf4/0x1ec0 kernel/rcu/tree.c:2255
 kthread+0x319/0x430 kernel/kthread.c:232
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404

Crashes (9):
Manager Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Title
ci2-linux-4-14 2019/09/16 15:04 linux-4.14.y 968722f5371a 55c50e70 .config log report syz C
ci2-linux-4-14 2019/09/10 03:32 linux-4.14.y 414510bc00a5 a60cb4cd .config log report syz C
ci2-linux-4-14 2019/09/09 10:56 linux-4.14.y 414510bc00a5 a60cb4cd .config log report syz C
ci2-linux-4-14 2019/09/09 08:52 linux-4.14.y 414510bc00a5 a60cb4cd .config log report syz C
ci2-linux-4-14 2019/09/08 20:22 linux-4.14.y 414510bc00a5 a60cb4cd .config log report syz C
ci2-linux-4-14 2019/09/07 15:55 linux-4.14.y 414510bc00a5 a60cb4cd .config log report syz C
ci2-linux-4-14 2019/09/07 10:48 linux-4.14.y 414510bc00a5 a60cb4cd .config log report syz C
ci2-linux-4-14 2019/09/07 07:50 linux-4.14.y 414510bc00a5 a60cb4cd .config log report syz C
ci2-linux-4-14 2019/09/16 09:22 linux-4.14.y 968722f5371a 55c50e70 .config log report