BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=-20 stuck for 58s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
  pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=5/256
    pending: defense_work_handler, defense_work_handler, defense_work_handler, defense_work_handler, cache_reap

======================================================
WARNING: possible circular locking dependency detected
4.18.0-rc3+ #31 Not tainted
------------------------------------------------------
ksoftirqd/0/9 is trying to acquire lock:
000000005eca4b1c (console_owner){-.-.}, at: log_next kernel/printk/printk.c:496 [inline]
000000005eca4b1c (console_owner){-.-.}, at: console_unlock+0x54e/0x10b0 kernel/printk/printk.c:2376

but task is already holding lock:
00000000ad57e432 (&(&pool->lock)->rlock){-.-.}, at: show_workqueue_state.cold.48+0xb16/0x15ec kernel/workqueue.c:4542

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (&(&pool->lock)->rlock){-.-.}:
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:144
       spin_lock include/linux/spinlock.h:310 [inline]
       __queue_work+0x352/0x1410 kernel/workqueue.c:1417
       queue_work_on+0x19a/0x1e0 kernel/workqueue.c:1486
       queue_work include/linux/workqueue.h:512 [inline]
       schedule_work include/linux/workqueue.h:570 [inline]
       put_pwq+0x175/0x1c0 kernel/workqueue.c:1090
       put_pwq_unlocked.part.29+0x34/0x70 kernel/workqueue.c:1107
       put_pwq_unlocked kernel/workqueue.c:1101 [inline]
       destroy_workqueue+0x880/0x9d0 kernel/workqueue.c:4202
       ucma_close+0x262/0x300 drivers/infiniband/core/ucma.c:1768
       __fput+0x355/0x8b0 fs/file_table.c:209
       ____fput+0x15/0x20 fs/file_table.c:243
       task_work_run+0x1ec/0x2a0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x1b08/0x2750 kernel/exit.c:865
       do_group_exit+0x177/0x440 kernel/exit.c:968
       get_signal+0x88e/0x1970 kernel/signal.c:2468
       do_signal+0x9c/0x21c0 arch/x86/kernel/signal.c:816
       exit_to_usermode_loop+0x2e0/0x370 arch/x86/entry/common.c:162
       prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
       do_syscall_32_irqs_on arch/x86/entry/common.c:341 [inline]
       do_fast_syscall_32+0xcd5/0xfb2 arch/x86/entry/common.c:397
       entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139

-> #3 (&pool->lock/1){-.-.}:
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:144
       spin_lock include/linux/spinlock.h:310 [inline]
       __queue_work+0x352/0x1410 kernel/workqueue.c:1417
       queue_work_on+0x19a/0x1e0 kernel/workqueue.c:1486
       queue_work include/linux/workqueue.h:512 [inline]
       tty_schedule_flip+0x14c/0x1d0 drivers/tty/tty_buffer.c:408
       tty_flip_buffer_push+0x15/0x20 drivers/tty/tty_buffer.c:547
       pty_write+0x19d/0x1f0 drivers/tty/pty.c:124
       n_tty_write+0xc5b/0x11a0 drivers/tty/n_tty.c:2340
       do_tty_write drivers/tty/tty_io.c:963 [inline]
       tty_write+0x45f/0xae0 drivers/tty/tty_io.c:1051
       __vfs_write+0x117/0x9f0 fs/read_write.c:485
       vfs_write+0x1f8/0x560 fs/read_write.c:549
       ksys_write+0x101/0x260 fs/read_write.c:598
       __do_sys_write fs/read_write.c:610 [inline]
       __se_sys_write fs/read_write.c:607 [inline]
       __x64_sys_write+0x73/0xb0 fs/read_write.c:607
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #2 (&(&port->lock)->rlock){-.-.}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
       tty_port_tty_get+0x20/0x80 drivers/tty/tty_port.c:288
       tty_port_default_wakeup+0x15/0x40 drivers/tty/tty_port.c:47
       tty_port_tty_wakeup+0x5d/0x70 drivers/tty/tty_port.c:390
       uart_write_wakeup+0x44/0x60 drivers/tty/serial/serial_core.c:103
       serial8250_tx_chars+0x4be/0xb60 drivers/tty/serial/8250/8250_port.c:1808
       serial8250_handle_irq.part.25+0x1ee/0x280 drivers/tty/serial/8250/8250_port.c:1881
       serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1867 [inline]
       serial8250_default_handle_irq+0xc8/0x150 drivers/tty/serial/8250/8250_port.c:1897
       serial8250_interrupt+0xfa/0x1d0 drivers/tty/serial/8250/8250_core.c:125
       __handle_irq_event_percpu+0x1c8/0xaf0 kernel/irq/handle.c:149
       handle_irq_event_percpu+0xa0/0x1d0 kernel/irq/handle.c:189
       handle_irq_event+0xa7/0x135 kernel/irq/handle.c:206
       handle_edge_irq+0x20f/0x870 kernel/irq/chip.c:791
       generic_handle_irq_desc include/linux/irqdesc.h:154 [inline]
       handle_irq+0x18c/0x2e7 arch/x86/kernel/irq_64.c:77
       do_IRQ+0x78/0x190 arch/x86/kernel/irq.c:245
       ret_from_intr+0x0/0x1e
       native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:54
       arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
       default_idle+0xc7/0x450 arch/x86/kernel/process.c:500
       arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:491
       default_idle_call+0x6d/0x90 kernel/sched/idle.c:93
       cpuidle_idle_call kernel/sched/idle.c:153 [inline]
       do_idle+0x3aa/0x570 kernel/sched/idle.c:262
       cpu_startup_entry+0x10c/0x120 kernel/sched/idle.c:368
       start_secondary+0x433/0x5d0 arch/x86/kernel/smpboot.c:265
       secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:242

-> #1 (&port_lock_key){-.-.}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
       serial8250_console_write+0x8d5/0xb00 drivers/tty/serial/8250/8250_port.c:3230
       univ8250_console_write+0x5f/0x70 drivers/tty/serial/8250/8250_core.c:590
       call_console_drivers kernel/printk/printk.c:1718 [inline]
       console_unlock+0xab1/0x10b0 kernel/printk/printk.c:2389
       vprintk_emit+0x6c6/0xdf0 kernel/printk/printk.c:1907
       vprintk_default+0x28/0x30 kernel/printk/printk.c:1948
       vprintk_func+0x7a/0xe7 kernel/printk/printk_safe.c:382
       printk+0xa7/0xcf kernel/printk/printk.c:1981
       register_console+0x7e7/0xc00 kernel/printk/printk.c:2704
       univ8250_console_init+0x3f/0x4b drivers/tty/serial/8250/8250_core.c:685
       console_init+0x6e1/0xa54 kernel/printk/printk.c:2788
       start_kernel+0x610/0x949 init/main.c:661
       x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:452
       x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:433
       secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:242

-> #0 (console_owner){-.-.}:
       lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
       console_lock_spinning_enable kernel/printk/printk.c:1581 [inline]
       console_unlock+0x5bb/0x10b0 kernel/printk/printk.c:2386
       vprintk_emit+0x6c6/0xdf0 kernel/printk/printk.c:1907
       vprintk_default+0x28/0x30 kernel/printk/printk.c:1948
       vprintk_func+0x7a/0xe7 kernel/printk/printk_safe.c:382
       printk+0xa7/0xcf kernel/printk/printk.c:1981
       show_pwq kernel/workqueue.c:4449 [inline]
       show_workqueue_state.cold.48+0xcb8/0x15ec kernel/workqueue.c:4544
       wq_watchdog_timer_fn+0x709/0x830 kernel/workqueue.c:5556
       call_timer_fn+0x242/0x970 kernel/time/timer.c:1326
       expire_timers kernel/time/timer.c:1363 [inline]
       __run_timers+0x7a6/0xc70 kernel/time/timer.c:1666
       run_timer_softirq+0x60/0x70 kernel/time/timer.c:1694
       __do_softirq+0x2e8/0xb17 kernel/softirq.c:288
       run_ksoftirqd+0x86/0x100 kernel/softirq.c:649
       smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164
       kthread+0x345/0x410 kernel/kthread.c:240
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

other info that might help us debug this:

Chain exists of:
  console_owner --> &pool->lock/1 --> &(&pool->lock)->rlock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&(&pool->lock)->rlock);
                               lock(&pool->lock/1);
                               lock(&(&pool->lock)->rlock);
  lock(console_owner);

 *** DEADLOCK ***

4 locks held by ksoftirqd/0/9:
 #0: 00000000f5d49146 ((&wq_watchdog_timer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:178 [inline]
 #0: 00000000f5d49146 ((&wq_watchdog_timer)){+.-.}, at: call_timer_fn+0x1cd/0x970 kernel/time/timer.c:1316
 #1: 00000000a2921f9b (rcu_read_lock_sched){....}, at: show_workqueue_state+0x0/0x1d0 kernel/workqueue.c:4408
 #2: 00000000ad57e432 (&(&pool->lock)->rlock){-.-.}, at: show_workqueue_state.cold.48+0xb16/0x15ec kernel/workqueue.c:4542
 #3: 00000000a2e766b3 (console_lock){+.+.}, at: console_trylock_spinning kernel/printk/printk.c:1643 [inline]
 #3: 00000000a2e766b3 (console_lock){+.+.}, at: vprintk_emit+0x6ad/0xdf0 kernel/printk/printk.c:1906

stack backtrace:
CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.18.0-rc3+ #31
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 print_circular_bug.isra.36.cold.57+0x1bd/0x27d kernel/locking/lockdep.c:1227
 check_prev_add kernel/locking/lockdep.c:1867 [inline]
 check_prevs_add kernel/locking/lockdep.c:1980 [inline]
 validate_chain kernel/locking/lockdep.c:2421 [inline]
 __lock_acquire+0x3449/0x5020 kernel/locking/lockdep.c:3435
 lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
 console_lock_spinning_enable kernel/printk/printk.c:1581 [inline]
 console_unlock+0x5bb/0x10b0 kernel/printk/printk.c:2386
 vprintk_emit+0x6c6/0xdf0 kernel/printk/printk.c:1907
 vprintk_default+0x28/0x30 kernel/printk/printk.c:1948
 vprintk_func+0x7a/0xe7 kernel/printk/printk_safe.c:382
 printk+0xa7/0xcf kernel/printk/printk.c:1981
 show_pwq kernel/workqueue.c:4449 [inline]
 show_workqueue_state.cold.48+0xcb8/0x15ec kernel/workqueue.c:4544
 wq_watchdog_timer_fn+0x709/0x830 kernel/workqueue.c:5556
 call_timer_fn+0x242/0x970 kernel/time/timer.c:1326
 expire_timers kernel/time/timer.c:1363 [inline]
 __run_timers+0x7a6/0xc70 kernel/time/timer.c:1666
 run_timer_softirq+0x60/0x70 kernel/time/timer.c:1694
 __do_softirq+0x2e8/0xb17 kernel/softirq.c:288
 run_ksoftirqd+0x86/0x100 kernel/softirq.c:649
 smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164
 ? trace
Lost 5 message(s)!
  pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=8/256
    pending: defense_work_handler, defense_work_handler, defense_work_handler, defense_work_handler, defense_work_handler, defense_work_handler, check_corruption, cache_reap
workqueue events_power_efficient: flags=0x80
  pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
    pending: gc_worker, do_cache_clean
  pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
    pending: neigh_periodic_work
workqueue netns: flags=0xe000a
  pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/1
    in-flight: 833:cleanup_net
    delayed: cleanup_net
workqueue kblockd: flags=0x18
  pwq 1: cpus=0 node=0 flags=0x0 nice=-20 active=2/256
    pending: blk_mq_requeue_work, blk_mq_timeout_work
workqueue dm_bufio_cache: flags=0x8
  pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
    pending: work_fn
pool 4: cpus=0-1 flags=0x4 nice=0 hung=40s workers=6 idle: 23 46 89 7137 7
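The two-CPU diagram lockdep prints above is the usual AB-BA inversion: one path holds a worker pool lock and then printk()s (which wants console_owner), while another path holds the console and ends up queueing work (which wants a pool lock). Below is a minimal userspace sketch of that pattern, not kernel code; the thread and mutex names are placeholders chosen only to mirror the roles in the report, and with the sleeps in place the program deliberately deadlocks on most runs.

```c
/* Userspace analogue of the reported lock cycle; names are illustrative only. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;    /* plays the role of &pool->lock */
static pthread_mutex_t console_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the role of console_owner */

/* "CPU0": holds the pool lock, then tries to print (needs the console). */
static void *cpu0(void *arg)
{
	pthread_mutex_lock(&pool_lock);
	sleep(1);                          /* widen the race window */
	pthread_mutex_lock(&console_lock); /* blocks: cpu1 already holds it */
	pthread_mutex_unlock(&console_lock);
	pthread_mutex_unlock(&pool_lock);
	return NULL;
}

/* "CPU1": holds the console, then tries to queue work (needs the pool lock). */
static void *cpu1(void *arg)
{
	pthread_mutex_lock(&console_lock);
	sleep(1);
	pthread_mutex_lock(&pool_lock);    /* blocks: cpu0 already holds it */
	pthread_mutex_unlock(&pool_lock);
	pthread_mutex_unlock(&console_lock);
	return NULL;
}

int main(void)
{
	pthread_t t0, t1;

	pthread_create(&t0, NULL, cpu0, NULL);
	pthread_create(&t1, NULL, cpu1, NULL);
	/* Each thread ends up waiting for the lock the other holds,
	 * so these joins normally never return - the cycle lockdep flags. */
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	puts("no deadlock this run");
	return 0;
}
```

Build with `cc -pthread deadlock.c`; the kernel differs in that the cycle runs through printk/console_owner and __queue_work rather than plain mutexes, but the ordering inversion is the same.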