syzbot


INFO: rcu detected stall in mld_dad_work

Status: closed as invalid on 2023/09/22 04:28
Subsystems: net
First crash: 808d, last: 691d
Cause bisection: failed (error log, bisect log)
  
Similar bugs (2)
Kernel | Title | Repro | Cause bisect | Fix bisect | Count | Last | Reported | Patched | Status
upstream | INFO: rcu detected stall in mld_dad_work (2) net | - | - | - | 1 | 505d | 505d | 0/28 | auto-obsoleted due to no activity on 2024/04/28 18:46
android-5-15 | BUG: soft lockup in mld_dad_work origin:lts | syz | - | - | 1 | 278d | 278d | 0/2 | auto-obsoleted due to no activity on 2024/12/11 14:40
Last patch testing requests (4)
Created | Duration | User | Patch | Repo | Result
2023/08/23 13:21 | 28m | retest repro | - | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci | OK log
2023/08/23 13:21 | 22m | retest repro | - | linux-next | OK log
2023/08/23 13:21 | 19m | retest repro | - | upstream | OK log
2023/06/11 01:12 | 20m | retest repro | - | linux-next | error

Sample crash report:
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	0-...!: (1 GPs behind) idle=f46c/1/0x4000000000000000 softirq=9850/9854 fqs=1
rcu: 	(t=10500 jiffies g=7857 q=1306 ncpus=2)
rcu: rcu_preempt kthread starved for 10492 jiffies! g7857 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:28368 pid:16    ppid:2      flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5333 [inline]
 __schedule+0x1d23/0x5650 kernel/sched/core.c:6658
 schedule+0xde/0x1a0 kernel/sched/core.c:6734
 schedule_timeout+0x14e/0x2b0 kernel/time/timer.c:2167
 rcu_gp_fqs_loop+0x190/0x910 kernel/rcu/tree.c:1609
 rcu_gp_kthread+0x23a/0x360 kernel/rcu/tree.c:1808
 kthread+0x33e/0x440 kernel/kthread.c:379
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 12 Comm: kworker/u4:1 Not tainted 6.3.0-rc4-next-20230331-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
Workqueue: events_unbound toggle_allocation_gate
RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0xb/0x70 kernel/kcov.c:207
Code: 0f 1e fa 48 8b be a8 01 00 00 e8 b0 ff ff ff 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 65 8b 05 ad aa 80 7e <89> c1 48 8b 34 24 81 e1 00 01 00 00 65 48 8b 14 25 00 bc 03 00 a9
RSP: 0018:ffffc90000117940 EFLAGS: 00000202
RAX: 0000000000000001 RBX: ffff8880b98453a0 RCX: ffffffff8177e508
RDX: ffff8880167bd7c0 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 0000000000000003 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffffed1017308a75
R13: 0000000000000000 R14: dffffc0000000000 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020001794 CR3: 000000000c571000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 rep_nop arch/x86/include/asm/vdso/processor.h:13 [inline]
 cpu_relax arch/x86/include/asm/vdso/processor.h:18 [inline]
 csd_lock_wait kernel/smp.c:285 [inline]
 smp_call_function_many_cond+0x682/0x1240 kernel/smp.c:828
 on_each_cpu_cond_mask+0x5a/0xa0 kernel/smp.c:996
 on_each_cpu include/linux/smp.h:71 [inline]
 text_poke_sync arch/x86/kernel/alternative.c:1770 [inline]
 text_poke_bp_batch+0x237/0x770 arch/x86/kernel/alternative.c:1970
 text_poke_flush arch/x86/kernel/alternative.c:2161 [inline]
 text_poke_flush arch/x86/kernel/alternative.c:2158 [inline]
 text_poke_finish+0x1a/0x30 arch/x86/kernel/alternative.c:2168
 arch_jump_label_transform_apply+0x17/0x30 arch/x86/kernel/jump_label.c:146
 jump_label_update+0x32f/0x410 kernel/jump_label.c:829
 static_key_enable_cpuslocked+0x1b5/0x270 kernel/jump_label.c:205
 static_key_enable+0x1a/0x20 kernel/jump_label.c:218
 toggle_allocation_gate mm/kfence/core.c:803 [inline]
 toggle_allocation_gate+0xf8/0x230 mm/kfence/core.c:795
 process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405
 worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
 kthread+0x33e/0x440 kernel/kthread.c:379
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.259 msecs
CPU: 0 PID: 4409 Comm: kworker/0:3 Not tainted 6.3.0-rc4-next-20230331-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
Workqueue: mld mld_dad_work
RIP: 0010:pie_calculate_probability+0x2d4/0x7c0 net/sched/sch_pie.c:355
Code: 41 83 ee 01 75 ae 48 89 5c 24 20 4c 8b 74 24 30 4c 8b 6c 24 38 48 8b 5c 24 40 e8 77 3c 4a f9 48 8b 44 24 20 48 0f af 6c 24 10 <48> 0f af 44 24 18 48 01 c5 e8 5e 3c 4a f9 4c 89 f6 bf ca 9a 3b 00
RSP: 0018:ffffc90000007c50 EFLAGS: 00000286
RAX: 00000000002af31d RBX: ffff88806ce0e5b0 RCX: 0000000000000100
RDX: ffff88802b8257c0 RSI: ffffffff8838d4d9 RDI: 0000000000000005
RBP: fffffff0a3da8872 R08: 0000000000000005 R09: 00000000000f4240
R10: 0000000000989680 R11: 0000000000000001 R12: 0000000000989680
R13: ffff888027320b00 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe947b62b8 CR3: 000000002a877000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 fq_pie_timer+0x174/0x2a0 net/sched/sch_fq_pie.c:380
 call_timer_fn+0x1a0/0x580 kernel/time/timer.c:1700
 expire_timers+0x234/0x330 kernel/time/timer.c:1751
 __run_timers kernel/time/timer.c:2022 [inline]
 __run_timers kernel/time/timer.c:1995 [inline]
 run_timer_softirq+0x326/0x910 kernel/time/timer.c:2035
 __do_softirq+0x1d4/0x905 kernel/softirq.c:571
 invoke_softirq kernel/softirq.c:445 [inline]
 __irq_exit_rcu+0x114/0x190 kernel/softirq.c:650
 irq_exit_rcu+0x9/0x20 kernel/softirq.c:662
 sysvec_apic_timer_interrupt+0x97/0xc0 arch/x86/kernel/apic/apic.c:1107
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x3c/0x70 kernel/locking/spinlock.c:194
Code: 74 24 10 e8 86 a7 5a f7 48 89 ef e8 6e 15 5b f7 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> 6f 29 4d f7 65 8b 05 80 82 f9 75 85 c0 74 0a 5b 5d c3 e8 0c 61
RSP: 0018:ffffc9000722f5e8 EFLAGS: 00000206
RAX: 0000000000000006 RBX: 0000000000000200 RCX: 1ffffffff229fba6
RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001
RBP: ffff88813fffacc0 R08: 0000000000000001 R09: ffffffff914fbc47
R10: 0000000000000001 R11: 0000000000000000 R12: ffffea00008a0008
R13: 0000000000000002 R14: ffff8880b984378c R15: ffff8880b9843780
 free_unref_page_commit+0x25f/0x6e0 mm/page_alloc.c:2635
 free_unref_page+0x191/0x370 mm/page_alloc.c:2673
 qlink_free mm/kasan/quarantine.c:166 [inline]
 qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:185
 kasan_quarantine_reduce+0x195/0x220 mm/kasan/quarantine.c:292
 __kasan_slab_alloc+0x63/0x90 mm/kasan/common.c:305
 kasan_slab_alloc include/linux/kasan.h:186 [inline]
 slab_post_alloc_hook mm/slab.h:711 [inline]
 slab_alloc_node mm/slub.c:3452 [inline]
 kmem_cache_alloc_node+0x185/0x3e0 mm/slub.c:3497
 __alloc_skb+0x288/0x330 net/core/skbuff.c:594
 alloc_skb include/linux/skbuff.h:1269 [inline]
 alloc_skb_with_frags+0x9a/0x6c0 net/core/skbuff.c:6316
 sock_alloc_send_pskb+0x7a7/0x930 net/core/sock.c:2734
 sock_alloc_send_skb include/net/sock.h:1860 [inline]
 mld_newpack.isra.0+0x1b9/0x770 net/ipv6/mcast.c:1748
 add_grhead+0x295/0x340 net/ipv6/mcast.c:1851
 add_grec+0x1053/0x1610 net/ipv6/mcast.c:1989
 mld_send_initial_cr.part.0+0xe7/0x260 net/ipv6/mcast.c:2236
 mld_send_initial_cr net/ipv6/mcast.c:1232 [inline]
 mld_dad_work+0x1d7/0x680 net/ipv6/mcast.c:2262
 process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405
 worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
 kthread+0x33e/0x440 kernel/kthread.c:379
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>

Crashes (3):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2023/04/02 00:57 | linux-next | 4b0f4525dc4f | f325deb0 | .config | console log | report | syz | C | - | [disk image] [vmlinux] [kernel image] | ci-upstream-linux-next-kasan-gce-root | INFO: rcu detected stall in mld_dad_work
2023/07/24 03:01 | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci | e40939bbfc68 | 27cbe77f | .config | console log | report | syz | C | - | [disk image] [vmlinux] [kernel image] | ci-upstream-gce-arm64 | BUG: soft lockup in mld_dad_work
2023/07/27 18:22 | upstream | 0a8db05b571a | 92476829 | .config | console log | report | syz | - | - | [disk image] [vmlinux] [kernel image] | ci-upstream-kasan-gce | INFO: rcu detected stall in mld_dad_work
* Struck through repros no longer work on HEAD.