syzbot


possible deadlock in __wake_up

Status: upstream: reported C repro on 2024/04/01 17:05
Bug presence: origin:lts-only
Reported-by: syzbot+307a1c20dacb44bdcf84@syzkaller.appspotmail.com
First crash: 220d, last: 179d
Fix commit to backport (bisect log):
tree: upstream
commit bca39f39058567643487cd654970717705784ba3
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Mon Jan 9 14:46:09 2023 +0000

  io_uring: add lazy poll_wq activation

  
Bug presence (2)
Date        Name               Commit        Repro  Result
2024/04/30  linux-6.1.y (ToT)  dcbc050cb0d3  C      [report] possible deadlock in __wake_up
2024/04/30  upstream (ToT)     50dffbf77180  C      Didn't crash
Fix bisection attempts (2)
Created Duration User Patch Repo Result
2024/08/20 23:21 4h27m fix candidate upstream OK (1) job log
2024/05/30 18:42 10m fix candidate upstream error job log

Sample crash report:
============================================
WARNING: possible recursive locking detected
6.1.83-syzkaller #0 Not tainted
--------------------------------------------
syz-executor285/3540 is trying to acquire lock:
ffff888029235378 (&ctx->cq_wait){....}-{2:2}, at: __wake_up_common_lock kernel/sched/wait.c:137 [inline]
ffff888029235378 (&ctx->cq_wait){....}-{2:2}, at: __wake_up+0xfd/0x1c0 kernel/sched/wait.c:160

but task is already holding lock:
ffff888029235378 (&ctx->cq_wait){....}-{2:2}, at: __wake_up_common_lock kernel/sched/wait.c:137 [inline]
ffff888029235378 (&ctx->cq_wait){....}-{2:2}, at: __wake_up+0xfd/0x1c0 kernel/sched/wait.c:160

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&ctx->cq_wait);
  lock(&ctx->cq_wait);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by syz-executor285/3540:
 #0: ffff8880292350a8 (&ctx->uring_lock){+.+.}-{3:3}, at: __do_sys_io_uring_enter io_uring/io_uring.c:3280 [inline]
 #0: ffff8880292350a8 (&ctx->uring_lock){+.+.}-{3:3}, at: __se_sys_io_uring_enter+0x336/0x2750 io_uring/io_uring.c:3213
 #1: ffff888029235378 (&ctx->cq_wait){....}-{2:2}, at: __wake_up_common_lock kernel/sched/wait.c:137 [inline]
 #1: ffff888029235378 (&ctx->cq_wait){....}-{2:2}, at: __wake_up+0xfd/0x1c0 kernel/sched/wait.c:160

stack backtrace:
CPU: 1 PID: 3540 Comm: syz-executor285 Not tainted 6.1.83-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
 print_deadlock_bug kernel/locking/lockdep.c:2983 [inline]
 check_deadlock kernel/locking/lockdep.c:3026 [inline]
 validate_chain+0x4711/0x5950 kernel/locking/lockdep.c:3812
 __lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5049
 lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
 __wake_up_common_lock kernel/sched/wait.c:137 [inline]
 __wake_up+0xfd/0x1c0 kernel/sched/wait.c:160
 __io_cqring_wake io_uring/io_uring.h:224 [inline]
 io_req_local_work_add io_uring/io_uring.c:1117 [inline]
 __io_req_task_work_add+0x3c7/0x5c0 io_uring/io_uring.c:1128
 io_poll_wake+0x351/0x430 io_uring/poll.c:465
 __wake_up_common+0x2a0/0x4e0 kernel/sched/wait.c:107
 __wake_up_common_lock kernel/sched/wait.c:138 [inline]
 __wake_up+0x11a/0x1c0 kernel/sched/wait.c:160
 io_queue_sqe io_uring/io_uring.c:1910 [inline]
 io_submit_sqe io_uring/io_uring.c:2162 [inline]
 io_submit_sqes+0xf29/0x1e70 io_uring/io_uring.c:2275
 __do_sys_io_uring_enter io_uring/io_uring.c:3281 [inline]
 __se_sys_io_uring_enter+0x341/0x2750 io_uring/io_uring.c:3213
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f7acc82a529
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc546b11c8 EFLAGS: 00000216 ORIG_RAX: 00000000000001aa
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7acc82a529
RDX: 0000000000000000 RSI: 00000000000053f8 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
R13: 00007ffc546b1448 R14: 0000000000000001 R15: 0000000000000001
 </TASK>
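
The call trace above shows __wake_up() taking &ctx->cq_wait, running the io_poll_wake() callback while that lock is held, and that callback reaching __wake_up() on the same &ctx->cq_wait again. The following is a minimal userspace sketch of that pattern, not kernel code: WaitQueue, RecursiveLockError and demo_recursive_wake are hypothetical names invented for illustration, and a non-blocking acquire stands in for lockdep so the example reports the second acquisition instead of hanging.

```python
import threading


class RecursiveLockError(RuntimeError):
    """Raised instead of hanging when the non-reentrant lock is taken twice."""


class WaitQueue:
    """Toy stand-in for a wait queue head: one lock plus a list of wake callbacks."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # non-reentrant, loosely like a spinlock
        self.waiters = []

    def add_waiter(self, callback):
        self.waiters.append(callback)

    def wake_up(self):
        # A real spinlock would spin forever on re-acquisition. In this
        # single-threaded toy, a failed non-blocking acquire can only mean
        # the current call path already holds the lock.
        if not self.lock.acquire(blocking=False):
            raise RecursiveLockError(
                f"lock({self.name}) taken twice on one call path")
        try:
            # Callbacks run with the lock held, as in the trace above.
            for callback in list(self.waiters):
                callback()
        finally:
            self.lock.release()


def demo_recursive_wake():
    """Register a waiter whose callback wakes the same queue again."""
    cq_wait = WaitQueue("cq_wait")
    cq_wait.add_waiter(cq_wait.wake_up)  # callback re-enters wake_up()
    try:
        cq_wait.wake_up()
    except RecursiveLockError as exc:
        return str(exc)
    return "no recursion"


def demo_separate_queues():
    """Waking a different queue from the callback does not re-take the lock."""
    outer = WaitQueue("outer")
    inner = WaitQueue("inner")
    outer.add_waiter(inner.wake_up)
    try:
        outer.wake_up()
    except RecursiveLockError as exc:
        return str(exc)
    return "no recursion"
```

Calling demo_recursive_wake() reports the double acquisition, while demo_separate_queues() completes normally because the callback takes a different lock. This only illustrates the shape of the warning; whether the real path can actually deadlock depends on the kernel code at the listed source lines.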

Crashes (7):
Time              Kernel       Commit        Syzkaller  Config   Log          Report  Syz repro  C repro  VM info  Assets                                 Manager                    Title
2024/04/01 17:39  linux-6.1.y  e5cd595e23c1  6baf5069   .config  console log  report  syz        C        -        [disk image] [vmlinux] [kernel image]  ci2-linux-6-1-kasan        possible deadlock in __wake_up
2024/04/06 19:52  linux-6.1.y  347385861c50  ca620dd8   .config  console log  report  -          -        info     [disk image] [vmlinux] [kernel image]  ci2-linux-6-1-kasan        possible deadlock in __wake_up
2024/04/03 18:46  linux-6.1.y  347385861c50  51c4dcff   .config  console log  report  -          -        info     [disk image] [vmlinux] [kernel image]  ci2-linux-6-1-kasan        possible deadlock in __wake_up
2024/04/01 17:08  linux-6.1.y  e5cd595e23c1  6baf5069   .config  console log  report  -          -        info     [disk image] [vmlinux] [kernel image]  ci2-linux-6-1-kasan        possible deadlock in __wake_up
2024/04/01 17:04  linux-6.1.y  e5cd595e23c1  6baf5069   .config  console log  report  -          -        info     [disk image] [vmlinux] [kernel image]  ci2-linux-6-1-kasan        possible deadlock in __wake_up
2024/05/12 00:23  linux-6.1.y  909ba1f1b414  9026e142   .config  console log  report  -          -        info     [disk image] [vmlinux] [kernel image]  ci2-linux-6-1-kasan-arm64  possible deadlock in __wake_up
2024/04/24 15:34  linux-6.1.y  6741e066ec76  21339d7b   .config  console log  report  -          -        info     [disk image] [vmlinux] [kernel image]  ci2-linux-6-1-kasan-arm64  possible deadlock in __wake_up