syzbot


INFO: rcu detected stall in __run_timer_base

Status: upstream: reported C repro on 2024/04/14 02:04
Subsystems: kasan mm
Reported-by: syzbot+1acbadd9f48eeeacda29@syzkaller.appspotmail.com
First crash: 20d, last: 20d
Discussions (1)
Title | Replies (including bot) | Last reply
[syzbot] [kasan?] [mm?] INFO: rcu detected stall in __run_timer_base | 1 (3) | 2024/04/14 03:26
Last patch testing requests (2)
Created | Duration | User | Patch | Repo | Result
2024/04/25 08:07 | 18m | retest repro | | upstream | report log
2024/04/14 02:53 | 23m | hdanton@sina.com | patch | https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git fe46a7dd189e | OK log

Sample crash report:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	0-...!: (1 GPs behind) idle=d3cc/1/0x4000000000000000 softirq=6440/6443 fqs=2
rcu: 	(detected by 1, t=10506 jiffies, g=7245, q=210 ncpus=2)
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 5367 Comm: syz-executor780 Not tainted 6.8.0-syzkaller-08951-gfe46a7dd189e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
RIP: 0010:lockdep_recursion_finish kernel/locking/lockdep.c:467 [inline]
RIP: 0010:lock_release+0x5c0/0x9d0 kernel/locking/lockdep.c:5776
Code: 00 fc ff df 4c 8b 64 24 08 48 8b 5c 24 28 49 89 dd 4c 8d b4 24 90 00 00 00 48 c7 c7 60 d3 aa 8b e8 d5 9c 02 0a b8 ff ff ff ff <65> 0f c1 05 28 c5 90 7e 83 f8 01 0f 85 9a 00 00 00 4c 89 f3 48 c1
RSP: 0000:ffffc90000007720 EFLAGS: 00000082
RAX: 00000000ffffffff RBX: 0000000000000046 RCX: ffffc90000007703
RDX: 0000000000000001 RSI: ffffffff8baad360 RDI: ffffffff8bfed300
RBP: ffffc90000007860 R08: ffffffff8f873a6f R09: 1ffffffff1f0e74d
R10: dffffc0000000000 R11: fffffbfff1f0e74e R12: 1ffff92000000ef0
R13: 0000000000000046 R14: ffffc900000077b0 R15: dffffc0000000000
FS:  0000555594caf380(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000600 CR3: 0000000023676000 CR4: 0000000000350ef0
Call Trace:
 <NMI>
 </NMI>
 <IRQ>
 rcu_lock_release include/linux/rcupdate.h:308 [inline]
 rcu_read_unlock include/linux/rcupdate.h:783 [inline]
 advance_sched+0xb37/0xca0 net/sched/sch_taprio.c:987
 __run_hrtimer kernel/time/hrtimer.c:1692 [inline]
 __hrtimer_run_queues+0x597/0xd00 kernel/time/hrtimer.c:1756
 hrtimer_interrupt+0x396/0x990 kernel/time/hrtimer.c:1818
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1032 [inline]
 __sysvec_apic_timer_interrupt+0x109/0x3a0 arch/x86/kernel/apic/apic.c:1049
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
 sysvec_apic_timer_interrupt+0x52/0xc0 arch/x86/kernel/apic/apic.c:1043
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irq+0x29/0x50 kernel/locking/spinlock.c:202
Code: 90 f3 0f 1e fa 53 48 89 fb 48 83 c7 18 48 8b 74 24 08 e8 0a b4 f2 f5 48 89 df e8 c2 f3 f3 f5 e8 1d 19 1d f6 fb bf 01 00 00 00 <e8> 52 e0 e5 f5 65 8b 05 a3 c4 84 74 85 c0 74 06 5b e9 71 40 00 00
RSP: 0000:ffffc90000007cb0 EFLAGS: 00000282
RAX: 49e89c1a0716e600 RBX: ffff8880b942a740 RCX: ffffffff81720c2a
RDX: dffffc0000000000 RSI: ffffffff8baac1e0 RDI: 0000000000000001
RBP: ffffc90000007e10 R08: ffffffff92ce5537 R09: 1ffffffff259caa6
R10: dffffc0000000000 R11: fffffbfff259caa7 R12: ffff8880b942a788
R13: ffffc90000007d60 R14: dffffc0000000000 R15: 00000000ffffdaa5
 __run_timer_base+0x1c0/0x8e0 kernel/time/timer.c:2420
 run_timer_base kernel/time/timer.c:2428 [inline]
 run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2438
 __do_softirq+0x2be/0x943 kernel/softirq.c:554
 invoke_softirq kernel/softirq.c:428 [inline]
 __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
 irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
 sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:srso_safe_ret+0x0/0x20 arch/x86/lib/retpoline.S:208
Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 48 b8 <48> 8d 64 24 08 c3 cc cc 0f ae e8 e8 f0 ff ff ff 0f 0b 66 2e 0f 1f
RSP: 0000:ffffc90004907030 EFLAGS: 00000293
RAX: ffffffff814095ec RBX: 0000000000000000 RCX: ffff888028fe0000
RDX: 0000000000000000 RSI: ffffffff8140c1eb RDI: ffffffff8140c035
RBP: 1ffff92000920e30 R08: ffffffff81409480 R09: 0000000000000000
R10: ffffc90004907180 R11: fffff52000920e3c R12: ffffffff8f9755b0
R13: dffffc0000000000 R14: 1ffff92000920e30 R15: ffffffff9008ea3e
 srso_return_thunk+0x5/0x5f arch/x86/lib/retpoline.S:222
 unwind_next_frame+0x67c/0x2a00 arch/x86/kernel/unwind_orc.c:495
 __unwind_start+0x641/0x7c0 arch/x86/kernel/unwind_orc.c:760
 unwind_start arch/x86/include/asm/unwind.h:64 [inline]
 arch_stack_walk+0x103/0x1b0 arch/x86/kernel/stacktrace.c:24
 stack_trace_save+0x118/0x1d0 kernel/stacktrace.c:122
 save_stack+0xfb/0x1f0 mm/page_owner.c:129
 __set_page_owner+0x29/0x380 mm/page_owner.c:195
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1ea/0x210 mm/page_alloc.c:1533
 prep_new_page mm/page_alloc.c:1540 [inline]
 get_page_from_freelist+0x33ea/0x3580 mm/page_alloc.c:3311
 __alloc_pages+0x256/0x680 mm/page_alloc.c:4569
 alloc_pages_mpol+0x3de/0x650 mm/mempolicy.c:2133
 pagetable_alloc include/linux/mm.h:2842 [inline]
 __pud_alloc_one include/asm-generic/pgalloc.h:169 [inline]
 pud_alloc_one include/asm-generic/pgalloc.h:189 [inline]
 __pud_alloc+0x93/0x4b0 mm/memory.c:5692
 pud_alloc include/linux/mm.h:2799 [inline]
 __handle_mm_fault+0x4472/0x72d0 mm/memory.c:5236
 handle_mm_fault+0x3c2/0x8a0 mm/memory.c:5470
 do_user_addr_fault arch/x86/mm/fault.c:1413 [inline]
 handle_page_fault arch/x86/mm/fault.c:1505 [inline]
 exc_page_fault+0x2a8/0x890 arch/x86/mm/fault.c:1563
 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0033:0x7f37687f9bcc
Code: 00 00 e8 67 52 03 00 48 83 f8 ff 74 07 48 89 05 3a 15 0b 00 31 d2 b9 00 06 00 20 bf 10 00 00 00 48 b8 74 65 61 6d 30 00 00 00 <48> 89 04 25 00 06 00 20 31 c0 48 89 14 25 08 06 00 20 48 8b 35 0b
RSP: 002b:00007ffc3f74a370 EFLAGS: 00010246
RAX: 000000306d616574 RBX: 0000000000000000 RCX: 0000000020000600
RDX: 0000000000000000 RSI: 0000000800000003 RDI: 0000000000000010
RBP: 00000000000f4240 R08: 0000000000000000 R09: 0000000100000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc3f74a3c0
R13: 000000000003239a R14: 00007ffc3f74a38c R15: 0000000000000003
 </TASK>
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 4.146 msecs
rcu: rcu_preempt kthread starved for 10495 jiffies! g7245 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:26256 pid:16    tgid:16    ppid:2      flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5409 [inline]
 __schedule+0x17d3/0x4a20 kernel/sched/core.c:6736
 __schedule_loop kernel/sched/core.c:6813 [inline]
 schedule+0x14b/0x320 kernel/sched/core.c:6828
 schedule_timeout+0x1be/0x310 kernel/time/timer.c:2572
 rcu_gp_fqs_loop+0x2df/0x1370 kernel/rcu/tree.c:1663
 rcu_gp_kthread+0xa7/0x3b0 kernel/rcu/tree.c:1862
 kthread+0x2f2/0x390 kernel/kthread.c:388
 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
CPU: 1 PID: 61 Comm: kworker/u8:4 Not tainted 6.8.0-syzkaller-08951-gfe46a7dd189e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: events_unbound toggle_allocation_gate
RIP: 0010:csd_lock_wait kernel/smp.c:311 [inline]
RIP: 0010:smp_call_function_many_cond+0x1855/0x2960 kernel/smp.c:855
Code: 89 e6 83 e6 01 31 ff e8 d9 d5 0b 00 41 83 e4 01 49 bc 00 00 00 00 00 fc ff df 75 07 e8 84 d1 0b 00 eb 38 f3 90 42 0f b6 04 23 <84> c0 75 11 41 f7 45 00 01 00 00 00 74 1e e8 68 d1 0b 00 eb e4 44
RSP: 0018:ffffc900015c76e0 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 1ffff11017288c0d RCX: ffff88801aadbc00
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffc900015c78e0 R08: ffffffff818923b7 R09: 1ffffffff259caa0
R10: dffffc0000000000 R11: fffffbfff259caa1 R12: dffffc0000000000
R13: ffff8880b9446068 R14: ffff8880b953f480 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555594cafca8 CR3: 000000000df32000 CR4: 0000000000350ef0
Call Trace:
 <IRQ>
 </IRQ>
 <TASK>
 on_each_cpu_cond_mask+0x3f/0x80 kernel/smp.c:1023
 on_each_cpu include/linux/smp.h:71 [inline]
 text_poke_sync arch/x86/kernel/alternative.c:2086 [inline]
 text_poke_bp_batch+0x352/0xb30 arch/x86/kernel/alternative.c:2296
 text_poke_flush arch/x86/kernel/alternative.c:2487 [inline]
 text_poke_finish+0x30/0x50 arch/x86/kernel/alternative.c:2494
 arch_jump_label_transform_apply+0x1c/0x30 arch/x86/kernel/jump_label.c:146
 static_key_enable_cpuslocked+0x136/0x260 kernel/jump_label.c:205
 static_key_enable+0x1a/0x20 kernel/jump_label.c:218
 toggle_allocation_gate+0xb5/0x250 mm/kfence/core.c:826
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0xa02/0x1770 kernel/workqueue.c:3335
 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
 kthread+0x2f2/0x390 kernel/kthread.c:388
 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
 </TASK>

Crashes (1):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2024/04/10 02:02 | upstream | fe46a7dd189e | 56086b24 | .config | console log | report | syz | C | | [disk image] [vmlinux] [kernel image] | ci-upstream-kasan-gce-root | INFO: rcu detected stall in __run_timer_base