syzbot


possible deadlock in rds_wake_sk_sleep

Status: upstream: reported C repro on 2021/12/15 07:20
Reported-by: syzbot+3b4069868f81d1bf6df1@syzkaller.appspotmail.com
First crash: 291d, last: 291d

Fix bisection: failed
similar bugs (4):
Kernel   | Title                                      | Repro | Cause bisect | Fix bisect | Count | Last  | Reported | Patched | Status
upstream | possible deadlock in rds_wake_sk_sleep (2) |       |              |            | 1     | 973d  | 971d     | 0/24    | auto-closed as invalid on 2020/05/31 18:56
upstream | possible deadlock in rds_wake_sk_sleep (4) | C     | inconclusive |            | 4     | 37d   | 136d     | 0/24    | upstream: reported C repro on 2022/05/19 03:35
upstream | possible deadlock in rds_wake_sk_sleep (3) |       |              |            | 3     | 760d  | 831d     | 0/24    | auto-closed as invalid on 2020/12/31 09:16
upstream | possible deadlock in rds_wake_sk_sleep     |       |              |            | 8     | 1417d | 1516d    | 0/24    | auto-closed as invalid on 2019/05/14 04:22

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
4.19.211-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor728/12995 is trying to acquire lock:
000000001eb09c18 (&rs->rs_recv_lock){....}, at: rds_wake_sk_sleep+0x1d/0xc0 net/rds/af_rds.c:109

but task is already holding lock:
0000000063aa4bb2 (&(&rm->m_rs_lock)->rlock){....}, at: rds_send_remove_from_sock+0x278/0x8b0 net/rds/send.c:618

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&(&rm->m_rs_lock)->rlock){....}:
       rds_message_purge net/rds/message.c:138 [inline]
       rds_message_put+0x198/0xd00 net/rds/message.c:180
       rds_inc_put+0xf9/0x140 net/rds/recv.c:87
       rds_clear_recv_queue+0x147/0x350 net/rds/recv.c:762
       rds_release+0xc6/0x350 net/rds/af_rds.c:73
       __sock_release+0xcd/0x2a0 net/socket.c:599
       sock_close+0x15/0x20 net/socket.c:1214
       __fput+0x2ce/0x890 fs/file_table.c:278
       task_work_run+0x148/0x1c0 kernel/task_work.c:113
       tracehook_notify_resume include/linux/tracehook.h:193 [inline]
       exit_to_usermode_loop+0x251/0x2a0 arch/x86/entry/common.c:167
       prepare_exit_to_usermode arch/x86/entry/common.c:198 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:271 [inline]
       do_syscall_64+0x538/0x620 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&rs->rs_recv_lock){....}:
       __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
       _raw_read_lock_irqsave+0x93/0xd0 kernel/locking/spinlock.c:224
       rds_wake_sk_sleep+0x1d/0xc0 net/rds/af_rds.c:109
       rds_send_remove_from_sock+0xb1/0x8b0 net/rds/send.c:624
       rds_send_path_drop_acked+0x2de/0x3c0 net/rds/send.c:700
       rds_tcp_write_space+0x199/0x650 net/rds/tcp_send.c:203
       tcp_new_space net/ipv4/tcp_input.c:5167 [inline]
       tcp_check_space+0x407/0x6f0 net/ipv4/tcp_input.c:5178
       tcp_data_snd_check net/ipv4/tcp_input.c:5188 [inline]
       tcp_rcv_established+0x916/0x1ef0 net/ipv4/tcp_input.c:5681
       tcp_v4_do_rcv+0x5d6/0x870 net/ipv4/tcp_ipv4.c:1547
       sk_backlog_rcv include/net/sock.h:952 [inline]
       __release_sock+0x134/0x3a0 net/core/sock.c:2362
       release_sock+0x54/0x1b0 net/core/sock.c:2901
       do_tcp_setsockopt.constprop.0+0x42e/0x2340 net/ipv4/tcp.c:3098
       tcp_setsockopt net/ipv4/tcp.c:3110 [inline]
       tcp_setsockopt+0xb2/0xd0 net/ipv4/tcp.c:3102
       kernel_setsockopt+0x106/0x1c0 net/socket.c:3563
       rds_tcp_cork net/rds/tcp_send.c:43 [inline]
       rds_tcp_xmit_path_complete+0xbf/0x100 net/rds/tcp_send.c:57
       rds_send_xmit+0x13b5/0x2290 net/rds/send.c:410
       rds_sendmsg+0x289d/0x2ea0 net/rds/send.c:1367
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg+0xc3/0x120 net/socket.c:661
       __sys_sendto+0x21a/0x320 net/socket.c:1899
       __do_sys_sendto net/socket.c:1911 [inline]
       __se_sys_sendto net/socket.c:1907 [inline]
       __x64_sys_sendto+0xdd/0x1b0 net/socket.c:1907
       do_syscall_64+0xf9/0x620 arch/x86/entry/common.c:293
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&(&rm->m_rs_lock)->rlock);
                               lock(&rs->rs_recv_lock);
                               lock(&(&rm->m_rs_lock)->rlock);
  lock(&rs->rs_recv_lock);

 *** DEADLOCK ***

3 locks held by syz-executor728/12995:
 #0: 00000000589d3912 (k-sk_lock-AF_INET){+.+.}, at: lock_sock include/net/sock.h:1512 [inline]
 #0: 00000000589d3912 (k-sk_lock-AF_INET){+.+.}, at: do_tcp_setsockopt.constprop.0+0x13f/0x2340 net/ipv4/tcp.c:2816
 #1: 00000000b5d5c10f (k-clock-AF_INET){++.-}, at: rds_tcp_write_space+0x25/0x650 net/rds/tcp_send.c:189
 #2: 0000000063aa4bb2 (&(&rm->m_rs_lock)->rlock){....}, at: rds_send_remove_from_sock+0x278/0x8b0 net/rds/send.c:618

stack backtrace:
CPU: 1 PID: 12995 Comm: syz-executor728 Not tainted 4.19.211-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1fc/0x2ef lib/dump_stack.c:118
 print_circular_bug.constprop.0.cold+0x2d7/0x41e kernel/locking/lockdep.c:1222
 check_prev_add kernel/locking/lockdep.c:1866 [inline]
 check_prevs_add kernel/locking/lockdep.c:1979 [inline]
 validate_chain kernel/locking/lockdep.c:2420 [inline]
 __lock_acquire+0x30c9/0x3ff0 kernel/locking/lockdep.c:3416
 lock_acquire+0x170/0x3c0 kernel/locking/lockdep.c:3908
 __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
 _raw_read_lock_irqsave+0x93/0xd0 kernel/locking/spinlock.c:224
 rds_wake_sk_sleep+0x1d/0xc0 net/rds/af_rds.c:109
 rds_send_remove_from_sock+0xb1/0x8b0 net/rds/send.c:624
 rds_send_path_drop_acked+0x2de/0x3c0 net/rds/send.c:700
 rds_tcp_write_space+0x199/0x650 net/rds/tcp_send.c:203
 tcp_new_space net/ipv4/tcp_input.c:5167 [inline]
 tcp_check_space+0x407/0x6f0 net/ipv4/tcp_input.c:5178
 tcp_data_snd_check net/ipv4/tcp_input.c:5188 [inline]
 tcp_rcv_established+0x916/0x1ef0 net/ipv4/tcp_input.c:5681
 tcp_v4_do_rcv+0x5d6/0x870 net/ipv4/tcp_ipv4.c:1547
 sk_backlog_rcv include/net/sock.h:952 [inline]
 __release_sock+0x134/0x3a0 net/core/sock.c:2362
 release_sock+0x54/0x1b0 net/core/sock.c:2901
 do_tcp_setsockopt.constprop.0+0x42e/0x2340 net/ipv4/tcp.c:3098
 tcp_setsockopt net/ipv4/tcp.c:3110 [inline]
 tcp_setsockopt+0xb2/0xd0 net/ipv4/tcp.c:3102
 kernel_setsockopt+0x106/0x1c0 net/socket.c:3563
 rds_tcp_cork net/rds/tcp_send.c:43 [inline]
 rds_tcp_xmit_path_complete+0xbf/0x100 net/rds/tcp_send.c:57
 rds_send_xmit+0x13b5/0x2290 net/rds/send.c:410
 rds_sendmsg+0x289d/0x2ea0 net/rds/send.c:1367
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg+0xc3/0x120 net/socket.c:661
 __sys_sendto+0x21a/0x320 net/socket.c:1899
 __do_sys_sendto net/socket.c:1911 [inline]
 __se_sys_sendto net/socket.c:1907 [inline]
 __x64_sys_sendto+0xdd/0x1b0 net/socket.c:1907
 do_syscall_64+0xf9/0x620 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f31e6ecc079
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f31e6e71308 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f31e6f53268 RCX: 00007f31e6ecc079
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 00007f31e6f53260 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f31e6f1a74c
R13: 00007ffed006e09f R14: 00007f31e6e71400 R15: 0000000000022000
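
The "Possible unsafe locking scenario" in the report above is a lock-order inversion: one path takes rs_recv_lock and then m_rs_lock, while the other takes m_rs_lock and then rs_recv_lock. The following is a minimal userspace sketch of how such a cycle can be detected from recorded "acquired while held" edges. It is an illustration only, not lockdep's actual implementation; the names lock_graph, graph_init, reachable, record_acquire, MAX_CLASSES and the lock_class enum are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8	/* hypothetical upper bound for this sketch */

/* Hypothetical IDs standing in for the two lock classes in the report. */
enum lock_class { RS_RECV_LOCK, M_RS_LOCK };

/* dep[a][b] is true when class b was acquired while class a was held. */
struct lock_graph {
	bool dep[MAX_CLASSES][MAX_CLASSES];
};

void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record that `acquired` was taken while `held` was held.
 * Returns false, and records nothing, if the new edge held -> acquired
 * would close a cycle, i.e. `held` is already reachable from `acquired`.
 */
bool record_acquire(struct lock_graph *g, int held, int acquired)
{
	bool seen[MAX_CLASSES] = { false };

	assert(held >= 0 && held < MAX_CLASSES);
	assert(acquired >= 0 && acquired < MAX_CLASSES);

	if (reachable(g, acquired, held, seen))
		return false;
	g->dep[held][acquired] = true;
	return true;
}
```

Fed the two orderings from the report, the first call (rs_recv_lock held, m_rs_lock acquired, as in dependency #1) is accepted, and the second call (m_rs_lock held, rs_recv_lock acquired, as in dependency #0) is rejected because it would close the cycle.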

Crashes (1):
Manager        | Time             | Kernel       | Commit       | Syzkaller | Config  | Log | Report | Syz repro | C repro | VM info | Title
ci2-linux-4-19 | 2021/12/15 07:19 | linux-4.19.y | 3f8a27f9e27b | f752fb53  | .config | log | report | syz       | C       |         | possible deadlock in rds_wake_sk_sleep