syzbot


INFO: rcu detected stall in mrp_join_timer (3)

Status: closed as invalid on 2024/02/01 10:31
Subsystems: net
First crash: 133d, last: 133d
Similar bugs (7)
Kernel | Title | Repro | Cause bisect | Fix bisect | Count | Last | Reported | Patched | Status
linux-4.14 | INFO: rcu detected stall in mrp_join_timer | | | | 7 | 1412d | 1577d | 0/1 | auto-closed as invalid on 2020/10/17 01:22
upstream | INFO: rcu detected stall in mrp_join_timer net | | | | 1 | 1575d | 1575d | 0/26 | closed as invalid on 2020/01/09 08:13
linux-4.19 | INFO: rcu detected stall in mrp_join_timer | | | | 28 | 1263d | 1575d | 0/1 | auto-closed as invalid on 2021/03/15 12:15
upstream | INFO: rcu detected stall in mrp_join_timer (2) net | | | | 57 | 787d | 1536d | 0/26 | auto-closed as invalid on 2022/07/04 18:27
linux-4.14 | BUG: soft lockup in mrp_join_timer | | | | 1 | 1136d | 1136d | 0/1 | auto-closed as invalid on 2021/07/20 15:25
upstream | BUG: soft lockup in mrp_join_timer (2) net | | | | 1 | 212d | 212d | 0/26 | closed as invalid on 2023/11/01 18:32
linux-4.19 | BUG: soft lockup in mrp_join_timer (2) | | | | 58 | 434d | 1111d | 0/1 | upstream: reported on 2021/04/16 18:29

Sample crash report:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	1-...!: (1 GPs behind) idle=3cec/1/0x4000000000000000 softirq=95740/95742 fqs=1
rcu: 	(detected by 0, t=10505 jiffies, g=169225, q=93 ncpus=2)
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 5091 Comm: syz-executor.0 Not tainted 6.7.0-rc6-syzkaller-00022-g55cb5f43689d #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
RIP: 0010:__lock_is_held kernel/locking/lockdep.c:5493 [inline]
RIP: 0010:lock_is_held_type+0x97/0x190 kernel/locking/lockdep.c:5825
Code: fa 48 c7 c7 40 b2 6a 8b e8 26 1a 00 00 65 ff 05 7f 18 eb 74 41 83 bd b8 0a 00 00 00 7e 47 4c 89 eb 48 81 c3 c0 0a 00 00 31 ed <48> 83 fd 31 73 24 48 89 df 4c 89 fe e8 48 02 00 00 85 c0 75 2a 48
RSP: 0018:ffffc900001f06f8 EFLAGS: 00000097
RAX: 000000000000000b RBX: ffff88801e5d4668 RCX: ffff88801e5d3b80
RDX: ffff88801e5d3b80 RSI: ffff8880b992b898 RDI: ffff88801e5d4640
RBP: 0000000000000001 R08: ffffffff817d47df R09: 0000000000000000
R10: ffff88803485f340 R11: ffffed100690be6b R12: 0000000000000046
R13: ffff88801e5d3b80 R14: 00000000ffffffff R15: ffff8880b992b898
FS:  0000555555ce9480(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2e442ffd81 CR3: 0000000033540000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <NMI>
 </NMI>
 <IRQ>
 lock_is_held include/linux/lockdep.h:288 [inline]
 __run_hrtimer kernel/time/hrtimer.c:1654 [inline]
 __hrtimer_run_queues+0x2f5/0xd20 kernel/time/hrtimer.c:1752
 hrtimer_interrupt+0x396/0x980 kernel/time/hrtimer.c:1814
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1065 [inline]
 __sysvec_apic_timer_interrupt+0x104/0x3a0 arch/x86/kernel/apic/apic.c:1082
 sysvec_apic_timer_interrupt+0x43/0xb0 arch/x86/kernel/apic/apic.c:1076
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:queued_spin_lock include/asm-generic/qspinlock.h:109 [inline]
RIP: 0010:do_raw_spin_lock+0x2ca/0x370 kernel/locking/spinlock_debug.c:115
Code: fd ff ff 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c 41 fe ff ff 48 89 df e8 d5 fb 7b 00 48 ba 00 00 00 00 00 fc ff df e9 2a fe ff ff <44> 89 f1 80 e1 07 80 c1 03 38 c1 0f 8c 51 fe ff ff 4c 89 f7 e8 4d
RSP: 0018:ffffc900001f0aa0 EFLAGS: 00000202
RAX: 0000000000000004 RBX: dffffc0000000000 RCX: 0000000000000001
RDX: dffffc0000000000 RSI: 1ffff9200003e15c RDI: ffff88807f8a3cb0
RBP: ffffc900001f0b80 R08: ffffffff90dd743f R09: 1ffffffff21bae87
R10: dffffc0000000000 R11: fffffbfff21bae88 R12: ffff88807f8a3cb0
R13: 1ffff9200003e160 R14: ffffc900001f0b00 R15: 1ffff1100ff14797
 spin_lock include/linux/spinlock.h:351 [inline]
 mrp_join_timer+0xce/0x180 net/802/mrp.c:610
 call_timer_fn+0x17a/0x5f0 kernel/time/timer.c:1700
 expire_timers kernel/time/timer.c:1751 [inline]
 __run_timers+0x64f/0x860 kernel/time/timer.c:2022
 run_timer_softirq+0x67/0xf0 kernel/time/timer.c:2035
 __do_softirq+0x2b8/0x939 kernel/softirq.c:553
 invoke_softirq kernel/softirq.c:427 [inline]
 __irq_exit_rcu+0xf1/0x1b0 kernel/softirq.c:632
 irq_exit_rcu+0x9/0x20 kernel/softirq.c:644
 sysvec_apic_timer_interrupt+0x97/0xb0 arch/x86/kernel/apic/apic.c:1076
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:lock_acquire+0x25a/0x530 kernel/locking/lockdep.c:5758
Code: 2b 00 74 08 4c 89 f7 e8 54 58 7d 00 f6 44 24 61 02 0f 85 8a 01 00 00 41 f7 c7 00 02 00 00 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 25 00 00 00 00 00 43 c7 44 25 09 00 00 00 00 43 c7 44 25
RSP: 0018:ffffc90003e5f120 EFLAGS: 00000206
RAX: 0000000000000001 RBX: 1ffff920007cbe30 RCX: 0000000000000001
RDX: dffffc0000000000 RSI: ffffffff8b6ab6e0 RDI: ffffffff8bbde0a0
RBP: ffffc90003e5f270 R08: ffffffff90dd7367 R09: 1ffffffff21bae6c
R10: dffffc0000000000 R11: fffffbfff21bae6d R12: 1ffff920007cbe2c
R13: dffffc0000000000 R14: ffffc90003e5f180 R15: 0000000000000246
 rcu_lock_acquire include/linux/rcupdate.h:301 [inline]
 rcu_read_lock include/linux/rcupdate.h:747 [inline]
 page_ext_get+0x3d/0x2a0 mm/page_ext.c:508
 page_table_check_set+0x1fd/0x860 mm/page_table_check.c:109
 __page_table_check_ptes_set+0x220/0x280 mm/page_table_check.c:196
 page_table_check_ptes_set include/linux/page_table_check.h:74 [inline]
 set_ptes include/linux/pgtable.h:234 [inline]
 copy_present_pte mm/memory.c:987 [inline]
 copy_pte_range mm/memory.c:1091 [inline]
 copy_pmd_range mm/memory.c:1176 [inline]
 copy_pud_range mm/memory.c:1213 [inline]
 copy_p4d_range mm/memory.c:1237 [inline]
 copy_page_range+0x2ca4/0x43e0 mm/memory.c:1335
 dup_mmap kernel/fork.c:758 [inline]
 dup_mm kernel/fork.c:1691 [inline]
 copy_mm+0x11fc/0x1f10 kernel/fork.c:1740
 copy_process+0x1d6f/0x3fb0 kernel/fork.c:2502
 kernel_clone+0x222/0x840 kernel/fork.c:2907
 __do_sys_clone kernel/fork.c:3050 [inline]
 __se_sys_clone kernel/fork.c:3034 [inline]
 __x64_sys_clone+0x258/0x2a0 kernel/fork.c:3034
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x45/0x110 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f6490279c13
Code: 1f 84 00 00 00 00 00 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 89 c2 85 c0 75 2c 64 48 8b 04 25 10 00 00
RSP: 002b:00007ffe900fdce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f6490279c13
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: 0000555555ce9750 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001
 </TASK>
rcu: rcu_preempt kthread starved for 10500 jiffies! g169225 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:25168 pid:17    tgid:17    ppid:2      flags:0x00004000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5376 [inline]
 __schedule+0x1961/0x4ab0 kernel/sched/core.c:6688
 __schedule_loop kernel/sched/core.c:6763 [inline]
 schedule+0x149/0x260 kernel/sched/core.c:6778
 schedule_timeout+0x1bd/0x300 kernel/time/timer.c:2167
 rcu_gp_fqs_loop+0x30a/0x1500 kernel/rcu/tree.c:1631
 rcu_gp_kthread+0xa7/0x3b0 kernel/rcu/tree.c:1830
 kthread+0x2d3/0x370 kernel/kthread.c:388
 ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
 </TASK>
rcu: Stack dump where RCU GP kthread last ran:
CPU: 0 PID: 13183 Comm: kworker/u4:18 Not tainted 6.7.0-rc6-syzkaller-00022-g55cb5f43689d #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Workqueue: events_unbound toggle_allocation_gate
RIP: 0010:csd_lock_wait kernel/smp.c:311 [inline]
RIP: 0010:smp_call_function_many_cond+0x1832/0x2940 kernel/smp.c:855
Code: 45 8b 65 00 44 89 e6 83 e6 01 31 ff e8 a7 87 0b 00 41 83 e4 01 49 bc 00 00 00 00 00 fc ff df 75 07 e8 e2 83 0b 00 eb 38 f3 90 <42> 0f b6 04 23 84 c0 75 11 41 f7 45 00 01 00 00 00 74 1e e8 c6 83
RSP: 0018:ffffc90014b47720 EFLAGS: 00000293
RAX: ffffffff8182e39a RBX: 1ffff110173282d5 RCX: ffff888033431dc0
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffc90014b47920 R08: ffffffff8182e369 R09: 1ffffffff21bae6c
R10: dffffc0000000000 R11: fffffbfff21bae6d R12: dffffc0000000000
R13: ffff8880b99416a8 R14: ffff8880b983d480 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2e4439d988 CR3: 000000000d731000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 </IRQ>
 <TASK>
 on_each_cpu_cond_mask+0x3f/0x80 kernel/smp.c:1023
 on_each_cpu include/linux/smp.h:71 [inline]
 text_poke_sync arch/x86/kernel/alternative.c:2006 [inline]
 text_poke_bp_batch+0x352/0xb30 arch/x86/kernel/alternative.c:2216
 text_poke_flush arch/x86/kernel/alternative.c:2407 [inline]
 text_poke_finish+0x30/0x50 arch/x86/kernel/alternative.c:2414
 arch_jump_label_transform_apply+0x1c/0x30 arch/x86/kernel/jump_label.c:146
 static_key_enable_cpuslocked+0x132/0x260 kernel/jump_label.c:205
 static_key_enable+0x1a/0x20 kernel/jump_label.c:218
 toggle_allocation_gate+0xb5/0x250 mm/kfence/core.c:830
 process_one_work kernel/workqueue.c:2627 [inline]
 process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2700
 worker_thread+0xa5f/0x1000 kernel/workqueue.c:2781
 kthread+0x2d3/0x370 kernel/kthread.c:388
 ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
 </TASK>

Crashes (1):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2023/12/20 00:59 | upstream | 55cb5f43689d | 3ad490ea | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-kasan-gce-smack-root | INFO: rcu detected stall in mrp_join_timer