syzbot


INFO: rcu detected stall in do_idle (2)

Status: upstream: reported C repro on 2022/04/22 10:09
Reported-by: syzbot+8368796deb9660660296@syzkaller.appspotmail.com
First crash: 895d, last: 766d
Fix bisection: failed (error log, bisect log)
  
Similar bugs (7)
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
linux-5.15 BUG: soft lockup in do_idle origin:upstream C error 232 9h06m 493d 0/3 upstream: reported C repro on 2023/05/29 19:14
android-414 INFO: rcu detected stall in do_idle 1 2211d 2211d 0/1 auto-closed as invalid on 2019/03/13 10:31
linux-4.19 INFO: rcu detected stall in do_idle (2) 2 891d 891d 0/1 auto-closed as invalid on 2022/08/24 07:48
linux-4.19 INFO: rcu detected stall in do_idle 1 1749d 1749d 0/1 auto-closed as invalid on 2020/04/18 00:25
linux-4.14 INFO: rcu detected stall in do_idle 4 1844d 1850d 0/1 auto-closed as invalid on 2020/01/14 08:59
upstream INFO: rcu detected stall in do_idle acpi C done error 2465 1h24m 2182d 0/28 upstream: reported C repro on 2018/10/13 07:31
linux-6.1 BUG: soft lockup in do_idle origin:upstream C 25 19h47m 481d 0/3 upstream: reported C repro on 2023/06/10 08:51
Last patch testing requests (2)
Created Duration User Patch Repo Result
2023/03/05 00:32 11m retest repro linux-4.14.y report log
2023/03/04 23:32 0m retest repro linux-4.14.y error
Fix bisection attempts (5)
Created Duration User Patch Repo Result
2022/10/13 02:09 0m bisect fix linux-4.14.y error job log
2022/08/29 18:25 29m bisect fix linux-4.14.y OK (0) job log log
2022/07/26 15:42 23m bisect fix linux-4.14.y OK (0) job log log
2022/06/26 15:22 19m bisect fix linux-4.14.y OK (0) job log log
2022/05/27 14:52 30m bisect fix linux-4.14.y OK (0) job log log

Sample crash report:
[Firmware Bug]: TSC ADJUST differs: CPU0 0 --> -852877649960. Restoring
clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
INFO: rcu_preempt self-detected stall on CPU
	1-...: (1 GPs behind) idle=d06/1/0 softirq=11874/11881 fqs=0 
	 (t=38766 jiffies g=911 c=910 q=212)
rcu_preempt kthread starved for 38766 jiffies! g911 c910 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
rcu_preempt     I30008     8      2 0x80000000
Call Trace:
 context_switch kernel/sched/core.c:2811 [inline]
 __schedule+0x88b/0x1de0 kernel/sched/core.c:3387
 schedule+0x8d/0x1b0 kernel/sched/core.c:3431
 schedule_timeout+0x4af/0xe90 kernel/time/timer.c:1747
 rcu_gp_kthread+0xc0a/0x1e60 kernel/rcu/tree.c:2255
 kthread+0x30d/0x420 kernel/kthread.c:232
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.276-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffffffff88e74480 task.stack: ffffffff88e00000
RIP: 0010:trace_hardirqs_on_caller+0x20f/0x580 kernel/locking/lockdep.c:2929
RSP: 0018:ffff8880ba4079f8 EFLAGS: 00000046
RAX: 0000000000000007 RBX: ffffffff88e74480 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff87ccff40 RDI: ffffffff88e74d04
RBP: ffffffff87400976 R08: ffff88823fff7018 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880ba400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffdc4d658b8 CR3: 00000000b5727000 CR4: 00000000003406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 trace_hardirqs_on_thunk+0x1a/0x1c
 retint_kernel+0x2d/0x2d
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:780 [inline]
RIP: 0010:console_unlock+0xd81/0xf20 kernel/printk/printk.c:2413
RSP: 0018:ffff8880ba407b18 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff10
RAX: ffffffff88e74480 RBX: 0000000000000200 RCX: 1ffffffff11ce9aa
RDX: 0000000000000100 RSI: ffffffff88e74d30 RDI: 0000000000000206
RBP: 0000000000000001 R08: ffffffff8ba5340c R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff83d28470
R13: ffffffff89620ed0 R14: dffffc0000000000 R15: 000000000000007f
 vprintk_emit+0x224/0x620 kernel/printk/printk.c:1925
 vprintk_func+0x58/0x160 kernel/printk/printk_safe.c:409
 printk+0x9e/0xbc kernel/printk/printk.c:1998
 clocksource_watchdog+0x73e/0x8b0 kernel/time/clocksource.c:226
 call_timer_fn+0x14a/0x650 kernel/time/timer.c:1280
 expire_timers+0x232/0x4d0 kernel/time/timer.c:1319
 __run_timers kernel/time/timer.c:1637 [inline]
 run_timer_softirq+0x1d5/0x5a0 kernel/time/timer.c:1650
 __do_softirq+0x24d/0x9ff kernel/softirq.c:288
 invoke_softirq kernel/softirq.c:368 [inline]
 irq_exit+0x193/0x240 kernel/softirq.c:409
 exiting_irq arch/x86/include/asm/apic.h:638 [inline]
 smp_apic_timer_interrupt+0x141/0x5e0 arch/x86/kernel/apic/apic.c:1106
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:793
 </IRQ>
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
RSP: 0018:ffffffff88e07e78 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
RAX: 1ffffffff11e1314 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffffffff88e74d04
RBP: ffffffff88f09890 R08: 0000000000000047 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: fffffbfff11ce890
R13: ffffffff88e74480 R14: 0000000000000000 R15: 0000000000000000
 arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
 default_idle+0x47/0x370 arch/x86/kernel/process.c:558
 cpuidle_idle_call kernel/sched/idle.c:156 [inline]
 do_idle+0x250/0x3c0 kernel/sched/idle.c:246
 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:351
 start_kernel+0x750/0x770 init/main.c:708
 secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:240
Code: 03 38 d0 7c 08 84 d2 0f 85 33 03 00 00 8b 8b 54 08 00 00 85 c9 0f 85 b3 01 00 00 65 48 8b 1c 25 c0 7f 02 00 48 8d bb 84 08 00 00 <48> b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 0f b6 14 02 
NMI backtrace for cpu 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.276-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x1b2/0x281 lib/dump_stack.c:58
 nmi_cpu_backtrace.cold+0x57/0x93 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x13a/0x180 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:158 [inline]
 rcu_dump_cpu_stacks+0x15f/0x19c kernel/rcu/tree.c:1396
 print_cpu_stall kernel/rcu/tree.c:1542 [inline]
 check_cpu_stall kernel/rcu/tree.c:1610 [inline]
 __rcu_pending kernel/rcu/tree.c:3390 [inline]
 rcu_pending kernel/rcu/tree.c:3452 [inline]
 rcu_check_callbacks.cold+0x464/0xd8d kernel/rcu/tree.c:2792
 update_process_times+0x29/0x60 kernel/time/timer.c:1591
 tick_sched_handle+0x7d/0x150 kernel/time/tick-sched.c:165
 tick_sched_timer+0x92/0x200 kernel/time/tick-sched.c:1223
 __run_hrtimer kernel/time/hrtimer.c:1223 [inline]
 __hrtimer_run_queues+0x30b/0xc80 kernel/time/hrtimer.c:1287
 hrtimer_interrupt+0x1e6/0x5e0 kernel/time/hrtimer.c:1321
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1079 [inline]
 smp_apic_timer_interrupt+0x117/0x5e0 arch/x86/kernel/apic/apic.c:1104
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:793
 </IRQ>
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
RSP: 0018:ffff8880b5487e68 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
RAX: 1ffffffff11e1314 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff8880b5478bc4
RBP: ffffffff88f09890 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffed1016a8f068
R13: ffff8880b5478340 R14: 0000000000000000 R15: 0000000000000000
 arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
 default_idle+0x47/0x370 arch/x86/kernel/process.c:558
 cpuidle_idle_call kernel/sched/idle.c:156 [inline]
 do_idle+0x250/0x3c0 kernel/sched/idle.c:246
 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:351
 start_secondary+0x4db/0x670 arch/x86/kernel/smpboot.c:272
 secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:240
systemd[1]: systemd-udevd.service: Watchdog timeout (limit 3min)!
systemd[1]: systemd-udevd.service: Killing process 4630 (systemd-udevd) with signal SIGABRT.
systemd[1]: Time has been changed
systemd[1]: apt-daily-upgrade.timer: Adding 6min 57.721758s random time.
systemd[1]: apt-daily.timer: Adding 10h 34min 49.367677s random time.
clocksource:                       'kvm-clock' wd_now: b4da0bb863 wd_last: 5a7aa62b22 mask: ffffffffffffffff
clocksource:                       'tsc' cs_now: c6ab555f64 cs_last: c6689f65c0 mask: ffffffffffffffff
tsc: Marking TSC unstable due to clocksource watchdog
systemd[1]: systemd-journald.service: Main process exited, code=killed, status=6/ABRT
systemd[1]: systemd-journald.service: Unit entered failed state.
systemd[1]: systemd-journald.service: Failed with result 'watchdog'.
systemd[1]: systemd-journald.service: Service has no hold-off time, scheduling restart.
systemd[1]: Stopped Flush Journal to Persistent Storage.
systemd-journald[8060]: File /run/log/journal/04d8c135ee6b410280ba31a58c89679d/system.journal corrupted or uncleanly shut down, renaming and replacing.
----------------
Code disassembly (best guess):
   0:	03 38                	add    (%rax),%edi
   2:	d0 7c 08 84          	sarb   -0x7c(%rax,%rcx,1)
   6:	d2 0f                	rorb   %cl,(%rdi)
   8:	85 33                	test   %esi,(%rbx)
   a:	03 00                	add    (%rax),%eax
   c:	00 8b 8b 54 08 00    	add    %cl,0x8548b(%rbx)
  12:	00 85 c9 0f 85 b3    	add    %al,-0x4c7af037(%rbp)
  18:	01 00                	add    %eax,(%rax)
  1a:	00 65 48             	add    %ah,0x48(%rbp)
  1d:	8b 1c 25 c0 7f 02 00 	mov    0x27fc0,%ebx
  24:	48 8d bb 84 08 00 00 	lea    0x884(%rbx),%rdi
* 2b:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax <-- trapping instruction
  32:	fc ff df
  35:	48 89 fa             	mov    %rdi,%rdx
  38:	48 c1 ea 03          	shr    $0x3,%rdx
  3c:	0f b6 14 02          	movzbl (%rdx,%rax,1),%edx

Crashes (4):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets (help?) Manager Title
2022/04/23 02:36 linux-4.14.y 15a1c6b6f516 131df97d .config console log report syz C ci2-linux-4-14 INFO: rcu detected stall in do_idle
2022/04/22 10:08 linux-4.14.y 15a1c6b6f516 2738b391 .config console log report syz ci2-linux-4-14 INFO: rcu detected stall in do_idle
2022/04/27 14:52 linux-4.14.y e3a56aaade89 1fa34c1b .config console log report info ci2-linux-4-14 INFO: rcu detected stall in do_idle
2022/04/26 10:07 linux-4.14.y 15a1c6b6f516 1fa34c1b .config console log report info ci2-linux-4-14 INFO: rcu detected stall in do_idle
* Struck through repros no longer work on HEAD.