INFO: rcu detected stall in iterate_cleanup_work (2)

Status: upstream: reported C repro on 2020/07/07 07:23
Reported-by: syzbot+cc8495ea4052b9b79b72@syzkaller.appspotmail.com
First crash: 815d, last: 54d

Cause bisection: introduced by (bisect log):
commit 5a781ccbd19e4664babcbe4b4ead7aa2b9283d22
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Sat Sep 29 00:59:43 2018 +0000

  tc: Add support for configuring the taprio scheduler

Crash: no output from test machine (log)
Repro: C syz .config

Fix bisection: the fix commit could be any of (bisect log):
  fb893de323e2 Merge tag 'tag-chrome-platform-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
  9e9fb7655ed5 Merge tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Similar bugs (1):
Kernel:   upstream
Title:    INFO: rcu detected stall in iterate_cleanup_work
Count:    1
Last:     992d
Reported: 991d
Patched:  0/24
Status:   closed as invalid on 2020/01/09 08:13

Sample crash report:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	1-...!: (2 ticks this GP) idle=71e/1/0x4000000000000000 softirq=8988/8988 fqs=0 
	(detected by 0, t=10502 jiffies, g=9369, q=287)
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 3915 Comm: kworker/1:3 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events iterate_cleanup_work
RIP: 0010:trace_lock_release include/trace/events/lock.h:58 [inline]
RIP: 0010:lock_release+0x196/0x8e0 kernel/locking/lockdep.c:5022
Code: 3f 0f 87 0a 05 00 00 89 db be 08 00 00 00 48 89 d8 48 c1 f8 06 48 8d 3c c5 c8 ea b1 8a e8 72 9a 59 00 48 0f a3 1d 42 7a 57 09 <0f> 82 54 04 00 00 48 c7 c0 3c 1d b2 8a c7 44 24 40 01 00 00 00 48
RSP: 0018:ffffc90000da8d08 EFLAGS: 00000047
RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffffff815a707e
RDX: fffffbfff1563d5a RSI: 0000000000000008 RDI: ffffffff8ab1eac8
RBP: 1ffff920001b51a3 R08: 0000000000000000 R09: ffffffff8ab1eacf
R10: fffffbfff1563d59 R11: 0000000000000000 R12: ffff8880ae727758
R13: ffffffff816492c1 R14: ffff88808e3de340 R15: dffffc0000000000
FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000610 CR3: 0000000009a8d000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:158 [inline]
 _raw_spin_unlock_irqrestore+0x16/0xe0 kernel/locking/spinlock.c:191
 __run_hrtimer kernel/time/hrtimer.c:1520 [inline]
 __hrtimer_run_queues+0x5d1/0xfc0 kernel/time/hrtimer.c:1588
 hrtimer_interrupt+0x32a/0x930 kernel/time/hrtimer.c:1650
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1080 [inline]
 __sysvec_apic_timer_interrupt+0x142/0x5e0 arch/x86/kernel/apic/apic.c:1097
 asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:706
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 sysvec_apic_timer_interrupt+0xb2/0xf0 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:581
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:770 [inline]
RIP: 0010:lock_acquire+0x27b/0xad0 kernel/locking/lockdep.c:5008
Code: 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 0f 85 f8 06 00 00 48 83 3d 0a ba 5b 08 00 0f 84 a6 05 00 00 48 8b 7c 24 08 57 9d <0f> 1f 44 00 00 48 b8 00 00 00 00 00 fc ff df 48 03 44 24 10 48 c7
RSP: 0018:ffffc90000f07ab8 EFLAGS: 00000286
RAX: 1ffffffff136c689 RBX: ffff888098da25c0 RCX: ffffffff815a221b
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: 0000000000000286
RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8c5eea97
R10: fffffbfff18bdd52 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffff89a6aed8 R14: 0000000000000000 R15: ffff888098da25c0
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 nf_conntrack_lock net/netfilter/nf_conntrack_core.c:91 [inline]
 get_next_corpse net/netfilter/nf_conntrack_core.c:2204 [inline]
 nf_ct_iterate_cleanup+0x102/0x330 net/netfilter/nf_conntrack_core.c:2249
 nf_ct_iterate_cleanup_net net/netfilter/nf_conntrack_core.c:2334 [inline]
 nf_ct_iterate_cleanup_net+0x113/0x170 net/netfilter/nf_conntrack_core.c:2319
 iterate_cleanup_work+0x45/0x130 net/netfilter/nf_nat_masquerade.c:216
 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
 kthread+0x3b5/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 0.000 msecs
rcu: rcu_preempt kthread starved for 10502 jiffies! g9369 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
rcu_preempt     I29080    10      2 0x00004000
Call Trace:
 context_switch kernel/sched/core.c:3778 [inline]
 __schedule+0x8e5/0x21e0 kernel/sched/core.c:4527
 schedule+0xd0/0x2a0 kernel/sched/core.c:4602
 schedule_timeout+0x148/0x250 kernel/time/timer.c:1879
 rcu_gp_fqs_loop kernel/rcu/tree.c:1888 [inline]
 rcu_gp_kthread+0xae5/0x1b50 kernel/rcu/tree.c:2058
 kthread+0x3b5/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

Crashes (4):
Manager Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Title
ci-upstream-kasan-gce 2020/08/13 04:59 upstream fb893de323e2 bc15f7db .config log report syz C
ci-upstream-kasan-gce 2020/07/03 07:17 upstream cd77006e01b3 bed10395 .config log report syz C
ci-upstream-net-this-kasan-gce 2020/08/16 06:02 net 4ca0d9ac3fd8 424dd8e7 .config log report syz C
ci-upstream-kasan-gce-smack-root 2022/08/02 20:42 upstream 7d0d3fa7339e 1c9013ac .config log report info INFO: rcu detected stall in iterate_cleanup_work