syzbot


possible deadlock in rds_wake_sk_sleep (3)

Status: auto-closed as invalid on 2020/12/31 09:16
Reported-by: syzbot+4670352c72e1f1994dc3@syzkaller.appspotmail.com
First crash: 782d, last: 707d
similar bugs (4):
Kernel      Title                                       Repro  Cause bisect  Fix bisect  Count  Last   Reported  Patched  Status
upstream    possible deadlock in rds_wake_sk_sleep (2)  -      -             -           1      921d   919d      0/23     auto-closed as invalid on 2020/05/31 18:56
upstream    possible deadlock in rds_wake_sk_sleep (4)  C      inconclusive  -           4      14d    83d       0/23     upstream: reported C repro on 2022/05/19 03:35
linux-4.19  possible deadlock in rds_wake_sk_sleep      C      error         -           1      238d   238d      0/1      upstream: reported C repro on 2021/12/15 07:20
upstream    possible deadlock in rds_wake_sk_sleep      -      -             -           8      1364d  1464d     0/23     auto-closed as invalid on 2019/05/14 04:22

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
5.8.0-rc2-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.2/3146 is trying to acquire lock:
ffff88808f835d98 (&rs->rs_recv_lock){..--}-{2:2}, at: rds_wake_sk_sleep+0x1f/0xe0 net/rds/af_rds.c:109

but task is already holding lock:
ffff88809fd53100 (&rm->m_rs_lock){..-.}-{2:2}, at: rds_send_remove_from_sock+0x340/0x9e0 net/rds/send.c:628

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rm->m_rs_lock){..-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x8c/0xc0 kernel/locking/spinlock.c:159
       rds_message_purge net/rds/message.c:138 [inline]
       rds_message_put+0x1d8/0xe30 net/rds/message.c:180
       rds_inc_put+0x13a/0x1a0 net/rds/recv.c:82
       rds_clear_recv_queue+0x147/0x350 net/rds/recv.c:770
       rds_release+0xd4/0x3b0 net/rds/af_rds.c:73
       __sock_release+0xcd/0x280 net/socket.c:605
       sock_close+0x18/0x20 net/socket.c:1278
       __fput+0x33c/0x880 fs/file_table.c:281
       task_work_run+0xdd/0x190 kernel/task_work.c:123
       tracehook_notify_resume include/linux/tracehook.h:188 [inline]
       exit_to_usermode_loop arch/x86/entry/common.c:216 [inline]
       __prepare_exit_to_usermode+0x1e9/0x1f0 arch/x86/entry/common.c:246
       do_syscall_64+0x6c/0xe0 arch/x86/entry/common.c:368
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (&rs->rs_recv_lock){..--}-{2:2}:
       check_prev_add kernel/locking/lockdep.c:2496 [inline]
       check_prevs_add kernel/locking/lockdep.c:2601 [inline]
       validate_chain kernel/locking/lockdep.c:3218 [inline]
       __lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
       lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
       __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
       _raw_read_lock_irqsave+0x93/0xd0 kernel/locking/spinlock.c:231
       rds_wake_sk_sleep+0x1f/0xe0 net/rds/af_rds.c:109
       rds_send_remove_from_sock+0xb9/0x9e0 net/rds/send.c:634
       rds_send_path_drop_acked+0x2ef/0x3d0 net/rds/send.c:710
       rds_tcp_write_space+0x1a7/0x658 net/rds/tcp_send.c:198
       tcp_new_space net/ipv4/tcp_input.c:5244 [inline]
       tcp_check_space+0x178/0x730 net/ipv4/tcp_input.c:5255
       tcp_data_snd_check net/ipv4/tcp_input.c:5265 [inline]
       tcp_rcv_established+0x13dd/0x1e70 net/ipv4/tcp_input.c:5672
       tcp_v4_do_rcv+0x5d1/0x870 net/ipv4/tcp_ipv4.c:1641
       sk_backlog_rcv include/net/sock.h:996 [inline]
       __release_sock+0x134/0x3a0 net/core/sock.c:2550
       release_sock+0x54/0x1b0 net/core/sock.c:3066
       rds_send_xmit+0x142f/0x24c0 net/rds/send.c:422
       rds_sendmsg+0x27b2/0x2fd0 net/rds/send.c:1382
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:672
       __sys_sendto+0x21c/0x320 net/socket.c:1995
       __do_sys_sendto net/socket.c:2007 [inline]
       __se_sys_sendto net/socket.c:2003 [inline]
       __x64_sys_sendto+0xdd/0x1b0 net/socket.c:2003
       do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:359
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rm->m_rs_lock);
                               lock(&rs->rs_recv_lock);
                               lock(&rm->m_rs_lock);
  lock(&rs->rs_recv_lock);

 *** DEADLOCK ***

3 locks held by syz-executor.2/3146:
 #0: ffff8880a68db360 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1576 [inline]
 #0: ffff8880a68db360 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_sock_set_cork+0x16/0x90 net/ipv4/tcp.c:2885
 #1: ffff8880a68db608 (clock-AF_INET6){++.-}-{2:2}, at: rds_tcp_write_space+0x25/0x658 net/rds/tcp_send.c:184
 #2: ffff88809fd53100 (&rm->m_rs_lock){..-.}-{2:2}, at: rds_send_remove_from_sock+0x340/0x9e0 net/rds/send.c:628

stack backtrace:
CPU: 0 PID: 3146 Comm: syz-executor.2 Not tainted 5.8.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 check_noncircular+0x324/0x3e0 kernel/locking/lockdep.c:1827
 check_prev_add kernel/locking/lockdep.c:2496 [inline]
 check_prevs_add kernel/locking/lockdep.c:2601 [inline]
 validate_chain kernel/locking/lockdep.c:3218 [inline]
 __lock_acquire+0x2acb/0x56e0 kernel/locking/lockdep.c:4380
 lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:4959
 __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
 _raw_read_lock_irqsave+0x93/0xd0 kernel/locking/spinlock.c:231
 rds_wake_sk_sleep+0x1f/0xe0 net/rds/af_rds.c:109
 rds_send_remove_from_sock+0xb9/0x9e0 net/rds/send.c:634
 rds_send_path_drop_acked+0x2ef/0x3d0 net/rds/send.c:710
 rds_tcp_write_space+0x1a7/0x658 net/rds/tcp_send.c:198
 tcp_new_space net/ipv4/tcp_input.c:5244 [inline]
 tcp_check_space+0x178/0x730 net/ipv4/tcp_input.c:5255
 tcp_data_snd_check net/ipv4/tcp_input.c:5265 [inline]
 tcp_rcv_established+0x13dd/0x1e70 net/ipv4/tcp_input.c:5672
 tcp_v4_do_rcv+0x5d1/0x870 net/ipv4/tcp_ipv4.c:1641
 sk_backlog_rcv include/net/sock.h:996 [inline]
 __release_sock+0x134/0x3a0 net/core/sock.c:2550
 release_sock+0x54/0x1b0 net/core/sock.c:3066
 rds_send_xmit+0x142f/0x24c0 net/rds/send.c:422
 rds_sendmsg+0x27b2/0x2fd0 net/rds/send.c:1382
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 __sys_sendto+0x21c/0x320 net/socket.c:1995
 __do_sys_sendto net/socket.c:2007 [inline]
 __se_sys_sendto net/socket.c:2003 [inline]
 __x64_sys_sendto+0xdd/0x1b0 net/socket.c:2003
 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:359
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45cb29
Code: Bad RIP value.
RSP: 002b:00007f347cf16c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000502a20 RCX: 000000000045cb29
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
RBP: 000000000078bf00 R08: 0000000020000080 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000000a4e R14: 00000000004cd338 R15: 00007f347cf176d4
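
Illustration of the reported cycle (not part of the syzbot output): the report describes two acquisition orders over the same pair of lock classes. In chain #1, rds_clear_recv_queue reaches rds_message_purge, which takes m_rs_lock while rs_recv_lock is held. In chain #0, rds_send_remove_from_sock calls rds_wake_sk_sleep, which takes rs_recv_lock while m_rs_lock is held. The sketch below is a toy model of that check, assuming a plain directed graph of "held -> acquired" edges. The class LockOrderGraph and its methods are hypothetical names for illustration only, and the real lockdep validator is far more involved (IRQ-state tracking, read/write semantics, chain caching).

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order validator: stores 'held -> acquired' edges between lock classes."""

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded dependency edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record taking `new` while holding `held`.

        Returns True when `held` is already reachable from `new`, i.e. the
        new edge would close a cycle (a possible circular dependency).
        """
        circular = self._reachable(new, held)
        self.edges[held].add(new)
        return circular


graph = LockOrderGraph()
# Chain #1 in the report: m_rs_lock taken while rs_recv_lock is held.
first = graph.acquire("rs_recv_lock", "m_rs_lock")
# Chain #0 in the report: rs_recv_lock taken while m_rs_lock is held.
second = graph.acquire("m_rs_lock", "rs_recv_lock")
print(first, second)
```

In this model the first acquisition order is recorded without complaint, and the second one is flagged because it reverses an edge that already exists, which mirrors the "which lock already depends on the new lock" line in the report.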

Crashes (3):
Manager                         Time              Kernel    Commit        Syzkaller  Config   Log  Report  Syz repro  C repro  VM info  Title
ci-upstream-net-this-kasan-gce  2020/07/05 14:12  net       1ca0fafd73c5  51095195   .config  log  report  -          -        -        -
ci-upstream-net-kasan-gce       2020/09/02 09:15  net-next  dc1a9bf2c816  abf9ba4f   .config  log  report  -          -        -        -
ci-upstream-net-kasan-gce       2020/06/19 06:18  net-next  cb8e59cc8720  bc258b50   .config  log  report  -          -        -        -