syzbot


BUG: soft lockup in run_rebalance_domains

Status: premoderation: reported on 2024/08/05 18:56
Reported-by: syzbot+d50a7c763f0a5a82aacf@syzkaller.appspotmail.com
First crash: 69d, last: 69d

Sample crash report:
watchdog: BUG: soft lockup - CPU#1 stuck for 123s! [syz.0.15:377]
Modules linked in:
CPU: 1 PID: 377 Comm: syz.0.15 Tainted: G        W         5.10.222-syzkaller-01494-gfd58936f3c1f #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
RIP: 0010:cpu_overutilized+0x30d/0x5d0 kernel/sched/fair.c:5715
Code: 14 48 81 f9 00 04 00 00 74 0b 48 89 f2 48 39 f1 0f 93 c1 20 c8 34 01 49 bd 00 00 00 00 00 fc ff df 48 c7 44 24 40 0e 36 e0 45 <48> 8b 4c 24 38 49 c7 44 0d 00 00 00 00 00 65 48 8b 0c 25 28 00 00
RSP: 0018:ffffc90000170740 EFLAGS: 00000202
RAX: 0000000000040001 RBX: ffff8881f7156b70 RCX: 0000000000000001
RDX: 1ffff9200002e0f0 RSI: 0000000000000002 RDI: ffff8881f7156270
RBP: ffffc90000170810 R08: ffffffff82527659 R09: ffffed1020057a85
R10: 0000000000000000 R11: dffffc0000000001 R12: ffffffff85dc36b8
R13: dffffc0000000000 R14: 0000000000140000 R15: 1ffffffff0bb86d7
FS:  0000000000000000(0000) GS:ffff8881f7100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000110c2c0a7f CR3: 00000001464dc000 CR4: 00000000003506a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 update_sg_lb_stats kernel/sched/fair.c:8819 [inline]
 update_sd_lb_stats kernel/sched/fair.c:9359 [inline]
 find_busiest_group kernel/sched/fair.c:9624 [inline]
 load_balance+0x122e/0x76d0 kernel/sched/fair.c:10004
 rebalance_domains+0x4da/0xa90 kernel/sched/fair.c:10442
 run_rebalance_domains+0xfc/0x1b0 kernel/sched/fair.c:11120
 __do_softirq+0x268/0x5bb kernel/softirq.c:309
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0x60/0x80 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:405 [inline]
 __irq_exit_rcu+0x128/0x150 kernel/softirq.c:435
 irq_exit_rcu+0x9/0x10 kernel/softirq.c:447
 sysvec_apic_timer_interrupt+0xbf/0xe0 arch/x86/kernel/apic/apic.c:1094
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:635
RIP: 0010:preempt_schedule_irq+0xc2/0x140 kernel/sched/core.c:5065
Code: 4c 89 e7 e8 90 2a f7 fc f6 44 24 21 02 74 0b 0f 0b 48 f7 03 08 00 00 00 74 4d bf 01 00 00 00 e8 54 f7 97 fc fb bf 01 00 00 00 <e8> 89 e6 ff ff fa bf 01 00 00 00 e8 de f8 97 fc 65 48 8b 1d 06 57
RSP: 0018:ffffc90000d97700 EFLAGS: 00000246
RAX: 1ffff11021abd146 RBX: 1ffff920001b2ee4 RCX: ffffffff84b23e00
RDX: 1ffff11021abd004 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffffc90000d97780 R08: ffffffff87084048 R09: ffffffff87084058
R10: ffffffff87084050 R11: ffffffff87084043 R12: ffffc90000d97720
R13: 0000000000000000 R14: dffffc0000000000 R15: 1ffff920001b2ee0
 irqentry_exit_cond_resched kernel/entry/common.c:365 [inline]
 irqentry_exit+0x4f/0x60 kernel/entry/common.c:395
 sysvec_apic_timer_interrupt+0xcb/0xe0 arch/x86/kernel/apic/apic.c:1094
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:635
RIP: 0010:__vunmap+0x79f/0x8f0 mm/vmalloc.c:2298
Code: 41 8d 5e ff 43 80 7c 25 00 00 74 08 4c 89 ff e8 57 c0 05 00 48 63 db 48 c1 e3 03 49 03 1f 48 89 d8 48 c1 e8 03 42 80 3c 20 00 <74> 08 48 89 df e8 37 c0 05 00 48 8b 3b 48 85 ff 0f 84 28 01 00 00
RSP: 0018:ffffc90000d97878 EFLAGS: 00000246
RAX: 1ffff92000015c54 RBX: ffffc900000ae2a0 RCX: ffff88810d5e8000
RDX: 0000000000000000 RSI: ffff8881f715ab60 RDI: 0000000000000001
RBP: ffffc90000d978f8 R08: dffffc0000000000 R09: 0000000000000003
R10: fffff520001b2ed0 R11: dffffc0000000001 R12: dffffc0000000000
R13: 1ffff110236ba864 R14: 0000000000000655 R15: ffff88811b5d4320
 __vfree mm/vmalloc.c:2349 [inline]
 vfree+0x5c/0x80 mm/vmalloc.c:2380
 kcov_put kernel/kcov.c:408 [inline]
 kcov_close+0x2b/0x50 kernel/kcov.c:510
 __fput+0x33d/0x7b0 fs/file_table.c:281
 ____fput+0x15/0x20 fs/file_table.c:314
 task_work_run+0x129/0x190 kernel/task_work.c:165
 exit_task_work include/linux/task_work.h:32 [inline]
 do_exit+0xc83/0x2a50 kernel/exit.c:863
 do_group_exit+0x141/0x310 kernel/exit.c:985
 get_signal+0x10a0/0x1410 kernel/signal.c:2782
 arch_do_signal_or_restart+0xbd/0x17c0 arch/x86/kernel/signal.c:805
 handle_signal_work kernel/entry/common.c:145 [inline]
 exit_to_user_mode_loop+0x9b/0xd0 kernel/entry/common.c:169
 exit_to_user_mode_prepare kernel/entry/common.c:199 [inline]
 syscall_exit_to_user_mode+0xa2/0x1a0 kernel/entry/common.c:274
 do_syscall_64+0x40/0x70 arch/x86/entry/common.c:56
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f8468a179f9
Code: Unable to access opcode bytes at RIP 0x7f8468a179cf.
RSP: 002b:00007f8467697048 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: 0000000000000000 RBX: 00007f8468ba5f80 RCX: 00007f8468a179f9
RDX: 00000000200006c0 RSI: 0000000000005452 RDI: 0000000000000004
RBP: 00007f8468a858ee R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f8468ba5f80 R15: 00007ffd90180418
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 2179 Comm: syz.2.242 Tainted: G        W         5.10.222-syzkaller-01494-gfd58936f3c1f #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
RIP: 0010:__this_cpu_preempt_check+0x13/0x20 lib/smp_processor_id.c:66
Code: 39 76 ff ff e8 ed 8c ff ff eb ad e8 47 f5 ff ff 0f 1f 80 00 00 00 00 55 48 89 e5 48 89 fe 48 c7 c7 40 20 60 85 e8 ed fe ff ff <5d> c3 cc cc cc cc cc cc cc cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48
RSP: 0000:ffffc900000071d0 EFLAGS: 00000086
RAX: 0000000000000000 RBX: ffffc90000007218 RCX: ffffffff8708b903
RDX: 1ffffffff0e10800 RSI: ffffffff85068d40 RDI: ffffffff85602040
RBP: ffffc900000071d0 R08: ffffffff87084008 R09: ffffffff87084018
R10: ffffffff87084010 R11: ffffffff87084003 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f8b5beca6c0(0000) GS:ffff8881f7000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000122529000 CR4: 00000000003506b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Call Trace:
 <NMI>
 </NMI>
 <IRQ>
 kvm_set_cpu_l1tf_flush_l1d+0x10/0x20 arch/x86/include/asm/hardirq.h:67
 sysvec_apic_timer_interrupt+0x1e/0xe0 arch/x86/kernel/apic/apic.c:1094
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:635
RIP: 0010:skb_zcopy include/linux/skbuff.h:1477 [inline]
RIP: 0010:skb_orphan_frags include/linux/skbuff.h:2852 [inline]
RIP: 0010:____dev_forward_skb include/linux/netdevice.h:4008 [inline]
RIP: 0010:__dev_forward_skb+0x7a/0x700 net/core/dev.c:2232
Code: 48 89 d8 48 c1 e8 03 42 0f b6 04 20 84 c0 4c 89 e2 0f 85 91 05 00 00 44 8b 23 4b 8d 5c 27 03 48 89 d8 48 c1 e8 03 0f b6 04 10 <84> c0 0f 85 99 05 00 00 0f b6 1b 89 de 83 e6 08 31 ff e8 ef 9b a6
RSP: 0000:ffffc900000072c0 EFLAGS: 00000a06
RAX: 0000000000000000 RBX: ffff888147517c83 RCX: ffff88811bb9bb40
RDX: dffffc0000000000 RSI: ffff8881275303c0 RDI: ffff888127530498
RBP: ffffc900000072f0 R08: ffffffff82f38c6b R09: ffffffff83c487db
R10: 0000000000000002 R11: ffff88811bb9bb40 R12: 0000000000000080
R13: ffff8881275303c0 R14: ffff888128cbc000 R15: ffff888147517c00
 veth_forward_skb drivers/net/veth.c:279 [inline]
 veth_xmit+0x242/0x820 drivers/net/veth.c:309
 __netdev_start_xmit include/linux/netdevice.h:4858 [inline]
 netdev_start_xmit include/linux/netdevice.h:4872 [inline]
 xmit_one net/core/dev.c:3607 [inline]
 dev_hard_start_xmit+0x228/0x620 net/core/dev.c:3623
 __dev_queue_xmit+0x16f1/0x28e0 net/core/dev.c:4190
 dev_queue_xmit+0x17/0x20 net/core/dev.c:4223
 neigh_resolve_output+0x6b8/0x760 net/core/neighbour.c:1509
 neigh_output include/net/neighbour.h:517 [inline]
 ip6_finish_output2+0xf21/0x1850 net/ipv6/ip6_output.c:145
 __ip6_finish_output+0x5ec/0x780 net/ipv6/ip6_output.c:216
 ip6_finish_output+0x34/0x1e0 net/ipv6/ip6_output.c:226
 NF_HOOK_COND include/linux/netfilter.h:288 [inline]
 ip6_output+0x1f7/0x4c0 net/ipv6/ip6_output.c:249
 dst_output include/net/dst.h:437 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 ndisc_send_skb+0x6e9/0xc00 net/ipv6/ndisc.c:509
 ndisc_send_rs+0x532/0x6a0 net/ipv6/ndisc.c:703
 addrconf_rs_timer+0x2d1/0x600 net/ipv6/addrconf.c:3963
 call_timer_fn+0x3b/0x2d0 kernel/time/timer.c:1450
 expire_timers kernel/time/timer.c:1495 [inline]
 __run_timers+0x72a/0xa10 kernel/time/timer.c:1789
 run_timer_softirq+0x69/0xf0 kernel/time/timer.c:1802
 __do_softirq+0x268/0x5bb kernel/softirq.c:309
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0x60/0x80 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:405 [inline]
 __irq_exit_rcu+0x128/0x150 kernel/softirq.c:435
 irq_exit_rcu+0x9/0x10 kernel/softirq.c:447
 sysvec_apic_timer_interrupt+0xbf/0xe0 arch/x86/kernel/apic/apic.c:1094
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:635
RIP: 0010:exit_to_user_mode_loop+0x41/0xd0 kernel/entry/common.c:159
Code: 89 fe eb 22 e8 80 49 52 00 e8 1b 8e f0 00 fa 65 48 8b 05 d2 08 a9 7e 48 8b 18 f7 c3 0e 30 08 00 0f 84 87 00 00 00 fb f6 c3 08 <74> 05 e8 58 ab 57 03 f7 c3 00 10 00 00 74 08 4c 89 f7 e8 08 70 34
RSP: 0000:ffffc900016f7ee0 EFLAGS: 00000202
RAX: 0000000000000000 RBX: 0000000000000008 RCX: ffff88811bb9bb40
RDX: 1ffff1102377376c RSI: 0000000000000008 RDI: ffffc900016f7f58
RBP: ffffc900016f7ef0 R08: ffffffff87084008 R09: ffffffff87084018
R10: ffffffff87084010 R11: ffffffff87084003 R12: ffffffff84c00c7a
R13: 0000000000000000 R14: ffffc900016f7f58 R15: 0000000000000000
 exit_to_user_mode_prepare kernel/entry/common.c:199 [inline]
 irqentry_exit_to_user_mode+0x4e/0x80 kernel/entry/common.c:287
 irqentry_exit+0x12/0x60 kernel/entry/common.c:375
 sysvec_apic_timer_interrupt+0xcb/0xe0 arch/x86/kernel/apic/apic.c:1094
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:635
RIP: 0033:0x7f8b5d26b9f9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f8b5beca048 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 00007f8b5d3fa058 RCX: 00007f8b5d26b9f9
RDX: 00000000200006c0 RSI: 0000000000005452 RDI: 000000000000000d
RBP: 00007f8b5d2d98ee R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f8b5d3fa058 R15: 00007ffde16df4a8

Crashes (1):
Time: 2024/08/05 18:55
Kernel: android13-5.10-lts
Commit: fd58936f3c1f
Syzkaller: e35c337f
Config: .config
Log: console log
Report: report
Syz repro: (none listed)
C repro: (none listed)
VM info: info
Assets: [disk image] [vmlinux] [kernel image]
Manager: ci2-android-5-10-perf
Title: BUG: soft lockup in run_rebalance_domains