possible deadlock in rds_wake_sk_sleep (2)

Status: auto-closed as invalid on 2020/05/31 18:56
Subsystems: rds
Reported-by: syzbot+fafe7ab87492bb36ac5b@syzkaller.appspotmail.com
First crash: 1822d, last: 1822d
Discussions (1)
Title | Replies (including bot) | Last reply
possible deadlock in rds_wake_sk_sleep (2) | 0 (1) | 2020/02/03 17:28
Similar bugs (6)
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
linux-6.1 possible deadlock in rds_wake_sk_sleep origin:upstream missing-backport C done 1 183d 595d 3/3 fixed on 2024/09/02 20:54
linux-5.15 possible deadlock in rds_wake_sk_sleep origin:upstream missing-backport C done 1 160d 602d 3/3 fixed on 2024/09/30 11:45
upstream possible deadlock in rds_wake_sk_sleep (4) rds C error 16 356d 984d 25/28 fixed on 2024/03/27 19:12
linux-4.19 possible deadlock in rds_wake_sk_sleep C error 2 742d 1139d 0/1 upstream: reported C repro on 2021/12/15 07:20
upstream possible deadlock in rds_wake_sk_sleep (3) rds 3 1608d 1679d 0/28 auto-closed as invalid on 2020/12/31 09:16
upstream possible deadlock in rds_wake_sk_sleep rds 8 2265d 2365d 0/28 auto-closed as invalid on 2019/05/14 04:22

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
5.5.0-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u4:4/231 is trying to acquire lock:
ffff8880a7e4de10 (&rs->rs_recv_lock){..--}, at: rds_wake_sk_sleep+0x24/0xe0 net/rds/af_rds.c:109

but task is already holding lock:
ffff888096fed900 (&(&rm->m_rs_lock)->rlock){..-.}, at: rds_send_remove_from_sock+0x352/0x9d0 net/rds/send.c:628

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&(&rm->m_rs_lock)->rlock){..-.}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159
       rds_message_purge net/rds/message.c:138 [inline]
       rds_message_put net/rds/message.c:180 [inline]
       rds_message_put+0x1d9/0xda0 net/rds/message.c:173
       rds_loop_inc_free+0x16/0x20 net/rds/loop.c:115
       rds_inc_put+0x148/0x1b0 net/rds/recv.c:82
       rds_clear_recv_queue+0x157/0x380 net/rds/recv.c:770
       rds_release+0x117/0x430 net/rds/af_rds.c:73
       __sock_release+0xce/0x280 net/socket.c:605
       sock_close+0x1e/0x30 net/socket.c:1283
       __fput+0x2ff/0x890 fs/file_table.c:280
       ____fput+0x16/0x20 fs/file_table.c:313
       task_work_run+0x145/0x1c0 kernel/task_work.c:113
       tracehook_notify_resume include/linux/tracehook.h:188 [inline]
       exit_to_usermode_loop+0x316/0x380 arch/x86/entry/common.c:164
       prepare_exit_to_usermode arch/x86/entry/common.c:195 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:278 [inline]
       do_syscall_64+0x676/0x790 arch/x86/entry/common.c:304
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&rs->rs_recv_lock){..--}:
       check_prev_add kernel/locking/lockdep.c:2475 [inline]
       check_prevs_add kernel/locking/lockdep.c:2580 [inline]
       validate_chain kernel/locking/lockdep.c:2970 [inline]
       __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954
       lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
       __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
       _raw_read_lock_irqsave+0x98/0xd0 kernel/locking/spinlock.c:231
       rds_wake_sk_sleep+0x24/0xe0 net/rds/af_rds.c:109
       rds_send_remove_from_sock+0xc0/0x9d0 net/rds/send.c:634
       rds_send_path_drop_acked+0x330/0x430 net/rds/send.c:710
       rds_tcp_write_space+0x1bb/0x6a1 net/rds/tcp_send.c:203
       tcp_new_space net/ipv4/tcp_input.c:5217 [inline]
       tcp_check_space+0x18f/0x760 net/ipv4/tcp_input.c:5228
       tcp_data_snd_check net/ipv4/tcp_input.c:5238 [inline]
       tcp_rcv_established+0x188b/0x1e90 net/ipv4/tcp_input.c:5646
       tcp_v4_do_rcv+0x619/0x8d0 net/ipv4/tcp_ipv4.c:1619
       tcp_v4_rcv+0x307f/0x3b40 net/ipv4/tcp_ipv4.c:2001
       ip_protocol_deliver_rcu+0x5a/0x880 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x23b/0x380 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:307 [inline]
       NF_HOOK include/linux/netfilter.h:301 [inline]
       ip_local_deliver+0x1e9/0x520 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x1db/0x2f0 net/ipv4/ip_input.c:428
       NF_HOOK include/linux/netfilter.h:307 [inline]
       NF_HOOK include/linux/netfilter.h:301 [inline]
       ip_rcv+0xe8/0x3f0 net/ipv4/ip_input.c:538
       __netif_receive_skb_one_core+0x113/0x1a0 net/core/dev.c:5198
       __netif_receive_skb+0x2c/0x1d0 net/core/dev.c:5312
       process_backlog+0x206/0x750 net/core/dev.c:6144
       napi_poll net/core/dev.c:6582 [inline]
       net_rx_action+0x508/0x1120 net/core/dev.c:6650
       __do_softirq+0x262/0x98c kernel/softirq.c:292
       do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
       do_softirq.part.0+0x11a/0x170 kernel/softirq.c:337
       do_softirq kernel/softirq.c:329 [inline]
       __local_bh_enable_ip+0x211/0x270 kernel/softirq.c:189
       local_bh_enable include/linux/bottom_half.h:32 [inline]
       rcu_read_unlock_bh include/linux/rcupdate.h:690 [inline]
       ip_finish_output2+0x957/0x2610 net/ipv4/ip_output.c:229
       __ip_finish_output net/ipv4/ip_output.c:306 [inline]
       __ip_finish_output+0x604/0xc00 net/ipv4/ip_output.c:288
       ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:316
       NF_HOOK_COND include/linux/netfilter.h:296 [inline]
       ip_output+0x22b/0x680 net/ipv4/ip_output.c:430
       dst_output include/net/dst.h:436 [inline]
       ip_local_out+0xbb/0x1b0 net/ipv4/ip_output.c:125
       __ip_queue_xmit+0x878/0x1c20 net/ipv4/ip_output.c:530
       ip_queue_xmit+0x5a/0x70 include/net/ip.h:237
       __tcp_transmit_skb+0x1ac9/0x3900 net/ipv4/tcp_output.c:1234
       __tcp_send_ack.part.0+0x3c6/0x5b0 net/ipv4/tcp_output.c:3771
       __tcp_send_ack net/ipv4/tcp_output.c:3777 [inline]
       tcp_send_ack net/ipv4/tcp_output.c:3777 [inline]
       tcp_send_delayed_ack+0x361/0x460 net/ipv4/tcp_output.c:3725
       __tcp_ack_snd_check+0x6b0/0x980 net/ipv4/tcp_input.c:5268
       tcp_rcv_established+0x1789/0x1e90 net/ipv4/tcp_input.c:5694
       tcp_v4_do_rcv+0x619/0x8d0 net/ipv4/tcp_ipv4.c:1619
       sk_backlog_rcv include/net/sock.h:938 [inline]
       __release_sock+0x129/0x390 net/core/sock.c:2437
       release_sock+0x59/0x1c0 net/core/sock.c:2953
       tcp_sendmsg+0x3b/0x50 net/ipv4/tcp.c:1434
       inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:672
       kernel_sendmsg+0x44/0x50 net/socket.c:692
       rds_tcp_sendmsg+0xc7/0x100 net/rds/tcp_send.c:71
       rds_tcp_xmit+0x6e2/0xa40 net/rds/tcp_send.c:109
       rds_send_xmit+0x1354/0x2970 net/rds/send.c:367
       rds_send_worker+0x9c/0x2a0 net/rds/threads.c:200
       process_one_work+0xa05/0x17a0 kernel/workqueue.c:2264
       worker_thread+0x98/0xe40 kernel/workqueue.c:2410
       kthread+0x361/0x430 kernel/kthread.c:255
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&(&rm->m_rs_lock)->rlock);
                               lock(&rs->rs_recv_lock);
                               lock(&(&rm->m_rs_lock)->rlock);
  lock(&rs->rs_recv_lock);

 *** DEADLOCK ***

9 locks held by kworker/u4:4/231:
 #0: ffff888099596928 ((wq_completion)krdsd){+.+.}, at: __write_once_size include/linux/compiler.h:226 [inline]
 #0: ffff888099596928 ((wq_completion)krdsd){+.+.}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888099596928 ((wq_completion)krdsd){+.+.}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
 #0: ffff888099596928 ((wq_completion)krdsd){+.+.}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
 #0: ffff888099596928 ((wq_completion)krdsd){+.+.}, at: set_work_data kernel/workqueue.c:615 [inline]
 #0: ffff888099596928 ((wq_completion)krdsd){+.+.}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
 #0: ffff888099596928 ((wq_completion)krdsd){+.+.}, at: process_one_work+0x8dd/0x17a0 kernel/workqueue.c:2235
 #1: ffffc90001717dc0 ((work_completion)(&(&cp->cp_send_w)->work)){+.+.}, at: process_one_work+0x917/0x17a0 kernel/workqueue.c:2239
 #2: ffff8880873b0150 (k-sk_lock-AF_INET){+.+.}, at: lock_sock include/net/sock.h:1516 [inline]
 #2: ffff8880873b0150 (k-sk_lock-AF_INET){+.+.}, at: tcp_sendmsg+0x22/0x50 net/ipv4/tcp.c:1432
 #3: ffffffff89babf40 (rcu_read_lock){....}, at: sock_net include/net/sock.h:2459 [inline]
 #3: ffffffff89babf40 (rcu_read_lock){....}, at: __ip_queue_xmit+0x42/0x1c20 net/ipv4/ip_output.c:455
 #4: ffffffff89babf40 (rcu_read_lock){....}, at: __write_once_size include/linux/compiler.h:226 [inline]
 #4: ffffffff89babf40 (rcu_read_lock){....}, at: __skb_unlink include/linux/skbuff.h:2034 [inline]
 #4: ffffffff89babf40 (rcu_read_lock){....}, at: __skb_dequeue include/linux/skbuff.h:2049 [inline]
 #4: ffffffff89babf40 (rcu_read_lock){....}, at: process_backlog+0x195/0x750 net/core/dev.c:6142
 #5: ffffffff89babf40 (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2265 [inline]
 #5: ffffffff89babf40 (rcu_read_lock){....}, at: ip_local_deliver_finish+0x13a/0x380 net/ipv4/ip_input.c:228
 #6: ffff88805b9c4e60 (k-slock-AF_INET6/1){+.-.}, at: tcp_v4_rcv+0x2e1a/0x3b40 net/ipv4/tcp_ipv4.c:1995
 #7: ffff88805b9c5150 (clock-AF_INET6){++.-}, at: rds_tcp_write_space+0x28/0x6a1 net/rds/tcp_send.c:189
 #8: ffff888096fed900 (&(&rm->m_rs_lock)->rlock){..-.}, at: rds_send_remove_from_sock+0x352/0x9d0 net/rds/send.c:628

stack backtrace:
CPU: 1 PID: 231 Comm: kworker/u4:4 Not tainted 5.5.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: krdsd rds_send_worker
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1684
 check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1808
 check_prev_add kernel/locking/lockdep.c:2475 [inline]
 check_prevs_add kernel/locking/lockdep.c:2580 [inline]
 validate_chain kernel/locking/lockdep.c:2970 [inline]
 __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954
 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
 __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
 _raw_read_lock_irqsave+0x98/0xd0 kernel/locking/spinlock.c:231
 rds_wake_sk_sleep+0x24/0xe0 net/rds/af_rds.c:109
 rds_send_remove_from_sock+0xc0/0x9d0 net/rds/send.c:634
 rds_send_path_drop_acked+0x330/0x430 net/rds/send.c:710
 rds_tcp_write_space+0x1bb/0x6a1 net/rds/tcp_send.c:203
 tcp_new_space net/ipv4/tcp_input.c:5217 [inline]
 tcp_check_space+0x18f/0x760 net/ipv4/tcp_input.c:5228
 tcp_data_snd_check net/ipv4/tcp_input.c:5238 [inline]
 tcp_rcv_established+0x188b/0x1e90 net/ipv4/tcp_input.c:5646
 tcp_v4_do_rcv+0x619/0x8d0 net/ipv4/tcp_ipv4.c:1619
 tcp_v4_rcv+0x307f/0x3b40 net/ipv4/tcp_ipv4.c:2001
 ip_protocol_deliver_rcu+0x5a/0x880 net/ipv4/ip_input.c:204
 ip_local_deliver_finish+0x23b/0x380 net/ipv4/ip_input.c:231
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_local_deliver+0x1e9/0x520 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:442 [inline]
 ip_rcv_finish+0x1db/0x2f0 net/ipv4/ip_input.c:428
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_rcv+0xe8/0x3f0 net/ipv4/ip_input.c:538
 __netif_receive_skb_one_core+0x113/0x1a0 net/core/dev.c:5198
 __netif_receive_skb+0x2c/0x1d0 net/core/dev.c:5312
 process_backlog+0x206/0x750 net/core/dev.c:6144
 napi_poll net/core/dev.c:6582 [inline]
 net_rx_action+0x508/0x1120 net/core/dev.c:6650
 __do_softirq+0x262/0x98c kernel/softirq.c:292
 do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
 </IRQ>
 do_softirq.part.0+0x11a/0x170 kernel/softirq.c:337
 do_softirq kernel/softirq.c:329 [inline]
 __local_bh_enable_ip+0x211/0x270 kernel/softirq.c:189
 local_bh_enable include/linux/bottom_half.h:32 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:690 [inline]
 ip_finish_output2+0x957/0x2610 net/ipv4/ip_output.c:229
 __ip_finish_output net/ipv4/ip_output.c:306 [inline]
 __ip_finish_output+0x604/0xc00 net/ipv4/ip_output.c:288
 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:316
 NF_HOOK_COND include/linux/netfilter.h:296 [inline]
 ip_output+0x22b/0x680 net/ipv4/ip_output.c:430
 dst_output include/net/dst.h:436 [inline]
 ip_local_out+0xbb/0x1b0 net/ipv4/ip_output.c:125
 __ip_queue_xmit+0x878/0x1c20 net/ipv4/ip_output.c:530
 ip_queue_xmit+0x5a/0x70 include/net/ip.h:237
 __tcp_transmit_skb+0x1ac9/0x3900 net/ipv4/tcp_output.c:1234
 __tcp_send_ack.part.0+0x3c6/0x5b0 net/ipv4/tcp_output.c:3771
 __tcp_send_ack net/ipv4/tcp_output.c:3777 [inline]
 tcp_send_ack net/ipv4/tcp_output.c:3777 [inline]
 tcp_send_delayed_ack+0x361/0x460 net/ipv4/tcp_output.c:3725
 __tcp_ack_snd_check+0x6b0/0x980 net/ipv4/tcp_input.c:5268
 tcp_rcv_established+0x1789/0x1e90 net/ipv4/tcp_input.c:5694
 tcp_v4_do_rcv+0x619/0x8d0 net/ipv4/tcp_ipv4.c:1619
 sk_backlog_rcv include/net/sock.h:938 [inline]
 __release_sock+0x129/0x390 net/core/sock.c:2437
 release_sock+0x59/0x1c0 net/core/sock.c:2953
 tcp_sendmsg+0x3b/0x50 net/ipv4/tcp.c:1434
 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:672
 kernel_sendmsg+0x44/0x50 net/socket.c:692
 rds_tcp_sendmsg+0xc7/0x100 net/rds/tcp_send.c:71
 rds_tcp_xmit+0x6e2/0xa40 net/rds/tcp_send.c:109
 rds_send_xmit+0x1354/0x2970 net/rds/send.c:367
 rds_send_worker+0x9c/0x2a0 net/rds/threads.c:200
 process_one_work+0xa05/0x17a0 kernel/workqueue.c:2264
 worker_thread+0x98/0xe40 kernel/workqueue.c:2410
 kthread+0x361/0x430 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
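
Note on the report above: the "Possible unsafe locking scenario" is an AB-BA ordering inversion. One path takes rs_recv_lock and then m_rs_lock, while the other takes m_rs_lock and then rs_recv_lock. The sketch below is only a toy model of that check, not the kernel's lockdep implementation. The class LockOrderGraph and its methods are hypothetical names introduced for illustration; only the two lock names come from the report. It records an "acquired while holding" edge for each pair and rejects an edge that would close a cycle.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order validator (hypothetical; not the kernel's lockdep)."""

    def __init__(self):
        # edges[a] is the set of locks that were acquired while a was held.
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded ordering edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record 'new taken while held'. Return False if that closes a cycle."""
        if self._reachable(new, held):
            # 'new' already leads to 'held', so held -> new would be circular.
            return False
        self.edges[held].add(new)
        return True


graph = LockOrderGraph()
# Dependency #1 in the report: m_rs_lock taken while rs_recv_lock is held.
first = graph.acquire("rs_recv_lock", "m_rs_lock")
# Dependency #0 in the report: rs_recv_lock taken while m_rs_lock is held.
second = graph.acquire("m_rs_lock", "rs_recv_lock")
print(first, second)
```

The first ordering is accepted because no dependency exists yet. The second is rejected because m_rs_lock is already reachable from rs_recv_lock, which mirrors the circular dependency lockdep warns about here.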

Crashes (1):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets Manager Title
2020/02/01 18:55 net-next-old 9f68e3655aae 0eb59c27 .config console log report ci-upstream-net-kasan-gce