syzbot


BUG: corrupted list in load_balance

Status: auto-closed as invalid on 2019/02/22 10:29
Subsystems: kernel
Reported-by: syzbot+e3a0a643e134804b7798@syzkaller.appspotmail.com
First crash: 2613d, last: 2613d

Sample crash report:
list_del corruption. prev->next should be ffff880193f92370, but was ffff8801c1c3a1e0
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:53!
invalid opcode: 0000 [#1] SMP KASAN
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.18.0-rc3+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__list_del_entry_valid.cold.1+0x48/0x58 lib/list_debug.c:51
Code: 5f 1a 88 e8 46 79 02 fe 0f 0b 48 89 de 48 c7 c7 20 60 1a 88 e8 35 79 02 fe 0f 0b 48 89 de 48 c7 c7 c0 5f 1a 88 e8 24 79 02 fe <0f> 0b 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 
RSP: 0018:ffff8801dae07598 EFLAGS: 00010086
RAX: 0000000000000054 RBX: ffff880193f92370 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff81631851 RDI: 0000000000000001
RBP: ffff8801dae075b0 R08: ffffffff88e75dc0 R09: ffffed003b5c4fc0
R10: ffffed003b5c4fc0 R11: ffff8801dae27e07 R12: ffff8801c1c3a1b0
R13: ffff8801daf2d490 R14: ffff880193f922c0 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff8801c26cd8d8 CR3: 000000019d7d8000 CR4: 00000000001406f0
DR0: 00000000200001c0 DR1: 0000000020000000 DR2: 0000000020000000
DR3: 0000000020000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Call Trace:
 <IRQ>
 __list_del_entry include/linux/list.h:117 [inline]
 list_move include/linux/list.h:170 [inline]
 detach_tasks kernel/sched/fair.c:7550 [inline]
 load_balance+0x1639/0x3640 kernel/sched/fair.c:8884
 rebalance_domains+0x82a/0xd90 kernel/sched/fair.c:9262
 run_rebalance_domains+0x365/0x4c0 kernel/sched/fair.c:9884
 __do_softirq+0x2e8/0xb17 kernel/softirq.c:288
 invoke_softirq kernel/softirq.c:368 [inline]
 irq_exit+0x1d1/0x200 kernel/softirq.c:408
 exiting_irq arch/x86/include/asm/apic.h:527 [inline]
 smp_apic_timer_interrupt+0x186/0x730 arch/x86/kernel/apic/apic.c:1052
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
 </IRQ>
RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:54
Code: c7 48 89 45 d8 e8 4a 30 26 fa 48 8b 45 d8 e9 d2 fe ff ff 48 89 df e8 39 30 26 fa eb 8a 90 90 90 90 90 90 90 55 48 89 e5 fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 90 90 90 90 90 
RSP: 0018:ffffffff88e07bc0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
RAX: dffffc0000000000 RBX: 1ffffffff11c0f7b RCX: 0000000000000000
RDX: 1ffffffff11e3610 RSI: 0000000000000001 RDI: ffffffff88f1b080
RBP: ffffffff88e07bc0 R08: ffffed003b5c46d7 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffff88e07c78 R14: ffffffff899ec8a0 R15: 0000000000000000
 arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
 default_idle+0xc7/0x450 arch/x86/kernel/process.c:500
 arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:491
 default_idle_call+0x6d/0x90 kernel/sched/idle.c:93
 cpuidle_idle_call kernel/sched/idle.c:153 [inline]
 do_idle+0x3aa/0x570 kernel/sched/idle.c:262
 cpu_startup_entry+0x10c/0x120 kernel/sched/idle.c:368
 rest_init+0xe1/0xe4 init/main.c:442
 start_kernel+0x90e/0x949 init/main.c:738
 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:452
 x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:433
 secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:242
Modules linked in:
Dumping ftrace buffer:
   (ftrace buffer empty)

======================================================
WARNING: possible circular locking dependency detected
4.18.0-rc3+ #1 Not tainted
------------------------------------------------------
swapper/0/0 is trying to acquire lock:
00000000430e5b3f ((console_sem).lock){-.-.}, at: down_trylock+0x13/0x70 kernel/locking/semaphore.c:136

but task is already holding lock:
00000000e662b200 (&rq->lock){-.-.}, at: rq_lock_irqsave kernel/sched/sched.h:1789 [inline]
00000000e662b200 (&rq->lock){-.-.}, at: load_balance+0xb58/0x3640 kernel/sched/fair.c:8877

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&rq->lock){-.-.}:
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:144
       rq_lock kernel/sched/sched.h:1805 [inline]
       task_fork_fair+0x93/0x680 kernel/sched/fair.c:9953
       sched_fork+0x446/0xb40 kernel/sched/core.c:2382
       copy_process.part.39+0x1c09/0x7220 kernel/fork.c:1773
       copy_process kernel/fork.c:1616 [inline]
       _do_fork+0x291/0x12a0 kernel/fork.c:2099
       kernel_thread+0x34/0x40 kernel/fork.c:2158
       rest_init+0x22/0xe4 init/main.c:408
       start_kernel+0x90e/0x949 init/main.c:738
       x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:452
       x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:433
       secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:242

-> #1 (&p->pi_lock){-.-.}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
       try_to_wake_up+0xd2/0x12b0 kernel/sched/core.c:1986
       wake_up_process+0x10/0x20 kernel/sched/core.c:2149
       __up.isra.1+0x1c0/0x2a0 kernel/locking/semaphore.c:262
       up+0x13c/0x1c0 kernel/locking/semaphore.c:187
       __up_console_sem+0xbe/0x1b0 kernel/printk/printk.c:242
       console_unlock+0x7a2/0x10b0 kernel/printk/printk.c:2411
       do_con_write+0x12cc/0x22a0 drivers/tty/vt/vt.c:2435
       con_write+0x25/0xc0 drivers/tty/vt/vt.c:2784
       process_output_block drivers/tty/n_tty.c:580 [inline]
       n_tty_write+0x6c1/0x11a0 drivers/tty/n_tty.c:2317
       do_tty_write drivers/tty/tty_io.c:963 [inline]
       tty_write+0x45f/0xae0 drivers/tty/tty_io.c:1051
       __vfs_write+0x117/0x9f0 fs/read_write.c:485
       vfs_write+0x1f8/0x560 fs/read_write.c:549
       ksys_write+0x101/0x260 fs/read_write.c:598
       __do_sys_write fs/read_write.c:610 [inline]
       __se_sys_write fs/read_write.c:607 [inline]
       __x64_sys_write+0x73/0xb0 fs/read_write.c:607
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 ((console_sem).lock){-.-.}:
       lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
       down_trylock+0x13/0x70 kernel/locking/semaphore.c:136
       __down_trylock_console_sem+0xae/0x200 kernel/printk/printk.c:225
       console_trylock+0x15/0xa0 kernel/printk/printk.c:2230
       console_trylock_spinning kernel/printk/printk.c:1643 [inline]
       vprintk_emit+0x6ad/0xdf0 kernel/printk/printk.c:1906
       vprintk_default+0x28/0x30 kernel/printk/printk.c:1948
       vprintk_func+0x7a/0xe7 kernel/printk/printk_safe.c:382
       printk+0xa7/0xcf kernel/printk/printk.c:1981
       __list_del_entry_valid.cold.1+0x48/0x58 lib/list_debug.c:51
       __list_del_entry include/linux/list.h:117 [inline]
       list_move include/linux/list.h:170 [inline]
       detach_tasks kernel/sched/fair.c:7550 [inline]
       load_balance+0x1639/0x3640 kernel/sched/fair.c:8884
       rebalance_domains+0x82a/0xd90 kernel/sched/fair.c:9262
       run_rebalance_domains+0x365/0x4c0 kernel/sched/fair.c:9884
       __do_softirq+0x2e8/0xb17 kernel/softirq.c:288
       invoke_softirq kernel/softirq.c:368 [inline]
       irq_exit+0x1d1/0x200 kernel/softirq.c:408
       exiting_irq arch/x86/include/asm/apic.h:527 [inline]
       smp_apic_timer_interrupt+0x186/0x730 arch/x86/kernel/apic/apic.c:1052
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
       native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:54
       arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
       default_idle+0xc7/0x450 arch/x86/kernel/process.c:500
       arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:491
       default_idle_call+0x6d/0x90 kernel/sched/idle.c:93
       cpuidle_idle_call kernel/sched/idle.c:153 [inline]
       do_idle+0x3aa/0x570 kernel/sched/idle.c:262
       cpu_startup_entry+0x10c/0x120 kernel/sched/idle.c:368
       rest_init+0xe1/0xe4 init/main.c:442
       start_kernel+0x90e/0x949 init/main.c:738
       x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:452
       x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:433
       secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:242

other info that might help us debug this:

Chain exists of:
  (console_sem).lock --> &p->pi_lock --> &rq->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rq->lock);
                               lock(&p->pi_lock);
                               lock(&rq->lock);
  lock((console_sem).lock);

 *** DEADLOCK ***

2 locks held by swapper/0/0:
 #0: 00000000b8578250 (rcu_read_lock){....}, at: rebalance_domains+0x135/0xd90 kernel/sched/fair.c:9220
 #1: 00000000e662b200 (&rq->lock){-.-.}, at: rq_lock_irqsave kernel/sched/sched.h:1789 [inline]
 #1: 00000000e662b200 (&rq->lock){-.-.}, at: load_balance+0xb58/0x3640 kernel/sched/fair.c:8877

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.18.0-rc3+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 print_circular_bug.isra.36.cold.57+0x1bd/0x27d kernel/locking/lockdep.c:1227
 check_prev_add kernel/locking/lockdep.c:1867 [inline]
 check_prevs_add kernel/locking/lockdep.c:1980 [inline]
 validate_chain kernel/locking/lockdep.c:2421 [inline]
 __lock_acquire+0x3449/0x5020 kernel/locking/lockdep.c:3435
 lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
 down_trylock+0x13/0x70 kernel/locking/semaphore.c:136
 __down_trylock_console_sem+0xae/0x200 kernel/printk/printk.c:225
 console_trylock+0x15/0xa0 kernel/printk/printk.c:2230
 console_trylock_spinning kernel/printk/printk.c:1643 [inline]
 vprintk_emit+0x6ad/0xdf0 kernel/printk/printk.c:1906
 vprintk_default+0x28/0x30 kernel/printk/printk.c:1948
 vprintk_func+0x7a/0xe7 kernel/printk/printk_safe.c:382
 printk+0xa7/0xcf kernel/printk/printk.c:1981
 __list_del_entry_valid.cold.1+0x48/0x58 lib/list_debug.c:51
 __list_del_entry include/linux/list.h:117 [inline]
 list_move include/linux/list.h:170 [inline]
 detach_tasks kernel/sched/fair.c:7550 [inline]
 load_balance+0x1639/0x3640 kernel/sched/fair.c:8884
 rebalance_domains+0x82a/0xd90 kernel/sched/fair.c:9262
 run_rebalance_domains+0x365/0x4c0 kernel/sched/fair.c:9884
 __do_softirq+0x2e8/0xb17 kernel/softirq.c:288
 invoke_softirq kernel/softirq.c:368 [inline]
 irq_exit+0x1d1/0x200 kernel/softirq.c:408
 exiting_irq arch/x86/include/asm/apic.h:527 [inline]
 smp_apic_timer_interrupt+0x186/0x730 arch/x86/kernel/apic/apic.c:1052
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
 </IRQ>
RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:54
Code: c7 48 89 45 d8 e8 4a 30 26 fa 48 8b 45 d8 e9 d2 fe ff ff 48 89 df e8 39 30 26 fa eb 8a 90 90 90 90 90 90 90 55 48 89 e5 fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 90 90 90 90 90 
RSP: 0018:ffffffff88e07bc0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
RAX: dffffc0000000000 RBX: 1ffffffff11c0f7b RCX: 0000000000000000
RDX: 1ffffffff11e3610 RSI: 0000000000000001 RDI: ffffffff88f1b080
RBP: ffffffff88e07bc0 R08: ffffed003b5c46d7 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffff88e07c78 R14: ffffffff899ec8a0 R15: 0000000000000000
 arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
 default_idle+0xc7/0x450 arch/x86/kernel/process.c:500
 arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:491
 default_idle_call+0x6d/0x90 kernel/sched/idle.c:93
 cpuidle_idle_call kernel/sched/idle.c:153 [inline]
 do_idle+0x3aa/0x570 kernel/sched/idle.c:262
 cpu_startup_entry+0x10c/0x120 kernel/sched/idle.c:368
 rest_init+0xe1/0xe4 init/main.c:442
 start_kernel+0x90e/0x949 init/main.c:738
 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:452
 x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:433
 secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:242
---[ end trace 335980085409a356 ]---
RIP: 0010:__list_del_entry_valid.cold.1+0x48/0x58 lib/list_debug.c:51
Code: 5f 1a 88 e8 46 79 02 fe 0f 0b 48 89 de 48 c7 c7 20 60 1a 88 e8 35 79 02 fe 0f 0b 48 89 de 48 c7 c7 c0 5f 1a 88 e8 24 79 02 fe <0f> 0b 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 
RSP: 0018:ffff8801dae07598 EFLAGS: 00010086
RAX: 0000000000000054 RBX: ffff880193f92370 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff81631851 RDI: 0000000000000001
RBP: ffff8801dae075b0 R08: ffffffff88e75dc0 R09: ffffed003b5c4fc0
R10: ffffed003b5c4fc0 R11: ffff8801dae27e07 R12: ffff8801c1c3a1b0
R13: ffff8801daf2d490 R14: ffff880193f922c0 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff8801c26cd8d8 CR3: 000000019d7d8000 CR4: 00000000001406f0
DR0: 00000000200001c0 DR1: 0000000020000000 DR2: 0000000020000000
DR3: 0000000020000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
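
The first line of the report ("list_del corruption. prev->next should be ..., but was ...") is printed by the kernel's list debugging check before an entry is unlinked. The check confirms that both neighbours of the entry still point back at it. Below is a minimal user-space sketch of that idea, not the kernel's code. The names list_node, list_init, list_add_front, list_del_entry_valid and list_del_checked are hypothetical, and the sketch returns false where the kernel in this report hits BUG at lib/list_debug.c:53.

```c
#include <stdbool.h>
#include <stdio.h>

/* Circular doubly linked list node, loosely modelled on struct list_head. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry directly after head. */
static void list_add_front(struct list_node *entry, struct list_node *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Return true only if both neighbours still point back at entry. */
static bool list_del_entry_valid(const struct list_node *entry)
{
    if (entry->prev->next != entry) {
        fprintf(stderr,
                "list_del corruption. prev->next should be %p, but was %p\n",
                (const void *)entry, (const void *)entry->prev->next);
        return false;
    }
    if (entry->next->prev != entry) {
        fprintf(stderr,
                "list_del corruption. next->prev should be %p, but was %p\n",
                (const void *)entry, (const void *)entry->next->prev);
        return false;
    }
    return true;
}

/* Unlink entry only when the check passes; report whether it was unlinked. */
static bool list_del_checked(struct list_node *entry)
{
    if (!list_del_entry_valid(entry))
        return false;
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    return true;
}
```

In this sketch, overwriting the next pointer of an entry's predecessor makes list_del_checked refuse to unlink the entry. The report shows the same kind of mismatch: prev->next was expected to be ffff880193f92370 but held ffff8801c1c3a1e0.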

Crashes (1):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2018/07/06 20:48 | bpf | c48424d993fa | 9636bc93 | .config | console log | report | - | - | - | - | ci-upstream-bpf-kasan-gce | -