syzbot


possible deadlock in red_adaptative_timer

Status: fixed on 2021/04/21 07:52
Reported-by: syzbot+3612cf9819657cef64d0@syzkaller.appspotmail.com
Fix commit: 43c9bffda3a2 net: sched: validate stab values
First crash: 1434d, last: 1343d
Fix bisection: fixed by (bisect log) :
commit 43c9bffda3a21f363c0beab06f24c1974e1d4b9f
Author: Eric Dumazet <edumazet@google.com>
Date: Wed Mar 10 16:26:41 2021 +0000

  net: sched: validate stab values

  
Similar bugs (2)
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
linux-4.19 possible deadlock in red_adaptative_timer C error 10 1410d 1579d 0/1 upstream: reported C repro on 2020/07/28 13:48
linux-6.1 possible deadlock in red_adaptative_timer 1 466d 466d 0/3 auto-obsoleted due to no activity on 2023/11/23 10:12
Fix bisection attempts (5)
Created Duration User Patch Repo Result
2021/04/20 20:54 3h01m bisect fix linux-4.14.y OK (1) job log
2021/03/21 17:50 20m bisect fix linux-4.14.y OK (0) job log log
2021/02/19 16:55 21m bisect fix linux-4.14.y OK (0) job log log
2021/02/17 12:13 18m bisect fix linux-4.14.y error job log
2021/01/30 07:14 1m bisect fix linux-4.14.y error job log

Sample crash report:
IPVS: ftp: loaded support on port[0] = 21
======================================================
WARNING: possible circular locking dependency detected
4.14.213-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u4:2/34 is trying to acquire lock:
 (&qdisc_rx_lock){+.-.}, at: [<ffffffff85f37772>] spin_lock include/linux/spinlock.h:317 [inline]
 (&qdisc_rx_lock){+.-.}, at: [<ffffffff85f37772>] red_adaptative_timer+0x72/0x570 net/sched/sch_red.c:233

but task is already holding lock:
 (((&q->adapt_timer))){+.-.}, at: [<ffffffff814a45c8>] lockdep_copy_map include/linux/lockdep.h:174 [inline]
 (((&q->adapt_timer))){+.-.}, at: [<ffffffff814a45c8>] call_timer_fn+0xb8/0x650 kernel/time/timer.c:1270

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (((&q->adapt_timer))){+.-.}:
       del_timer_sync+0xaf/0x240 kernel/time/timer.c:1236
       red_destroy+0x15/0x50 net/sched/sch_red.c:154
       qdisc_destroy+0x13c/0x310 net/sched/sch_generic.c:726
       red_change+0x4a4/0xf80 net/sched/sch_red.c:205
       qdisc_change net/sched/sch_api.c:1144 [inline]
       tc_modify_qdisc+0xd4e/0x1250 net/sched/sch_api.c:1410
       rtnetlink_rcv_msg+0x3be/0xb10 net/core/rtnetlink.c:4316
       netlink_rcv_skb+0x125/0x390 net/netlink/af_netlink.c:2433
       netlink_unicast_kernel net/netlink/af_netlink.c:1287 [inline]
       netlink_unicast+0x437/0x610 net/netlink/af_netlink.c:1313
       netlink_sendmsg+0x62e/0xb80 net/netlink/af_netlink.c:1878
       sock_sendmsg_nosec net/socket.c:646 [inline]
       sock_sendmsg+0xb5/0x100 net/socket.c:656
       ___sys_sendmsg+0x6c8/0x800 net/socket.c:2062
       __sys_sendmsg+0xa3/0x120 net/socket.c:2096
       SYSC_sendmsg net/socket.c:2107 [inline]
       SyS_sendmsg+0x27/0x40 net/socket.c:2103
       do_syscall_64+0x1d5/0x640 arch/x86/entry/common.c:292
       entry_SYSCALL_64_after_hwframe+0x46/0xbb

-> #0 (&qdisc_rx_lock){+.-.}:
       lock_acquire+0x170/0x3f0 kernel/locking/lockdep.c:3998
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:152
       spin_lock include/linux/spinlock.h:317 [inline]
       red_adaptative_timer+0x72/0x570 net/sched/sch_red.c:233
       call_timer_fn+0x14a/0x650 kernel/time/timer.c:1280
       expire_timers+0x232/0x4d0 kernel/time/timer.c:1319
       __run_timers kernel/time/timer.c:1637 [inline]
       run_timer_softirq+0x1d5/0x5a0 kernel/time/timer.c:1650
       __do_softirq+0x254/0xa1d kernel/softirq.c:288
       invoke_softirq kernel/softirq.c:368 [inline]
       irq_exit+0x193/0x240 kernel/softirq.c:409
       exiting_irq arch/x86/include/asm/apic.h:648 [inline]
       smp_apic_timer_interrupt+0x141/0x5e0 arch/x86/kernel/apic/apic.c:1102
       apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:793
       preempt_count_equals kernel/sched/core.c:5972 [inline]
       ___might_sleep+0x6c/0x2b0 kernel/sched/core.c:6005
       inet_twsk_purge+0x9a/0x5a0 net/ipv4/inet_timewait_sock.c:268
       ops_exit_list+0xf9/0x150 net/core/net_namespace.c:145
       cleanup_net+0x3b3/0x840 net/core/net_namespace.c:484
       process_one_work+0x793/0x14a0 kernel/workqueue.c:2116
       worker_thread+0x5cc/0xff0 kernel/workqueue.c:2250
       kthread+0x30d/0x420 kernel/kthread.c:232
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(((&q->adapt_timer)));
                               lock(&qdisc_rx_lock);
                               lock(((&q->adapt_timer)));
  lock(&qdisc_rx_lock);

 *** DEADLOCK ***

4 locks held by kworker/u4:2/34:
 #0:  ("%s""netns"){+.+.}, at: [<ffffffff81373630>] process_one_work+0x6b0/0x14a0 kernel/workqueue.c:2087
 #1:  (net_cleanup_work){+.+.}, at: [<ffffffff81373666>] process_one_work+0x6e6/0x14a0 kernel/workqueue.c:2091
 #2:  (net_mutex){+.+.}, at: [<ffffffff85dc8bf0>] cleanup_net+0x110/0x840 net/core/net_namespace.c:450
 #3:  (((&q->adapt_timer))){+.-.}, at: [<ffffffff814a45c8>] lockdep_copy_map include/linux/lockdep.h:174 [inline]
 #3:  (((&q->adapt_timer))){+.-.}, at: [<ffffffff814a45c8>] call_timer_fn+0xb8/0x650 kernel/time/timer.c:1270

stack backtrace:
CPU: 0 PID: 34 Comm: kworker/u4:2 Not tainted 4.14.213-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x1b2/0x283 lib/dump_stack.c:58
 print_circular_bug.constprop.0.cold+0x2d7/0x41e kernel/locking/lockdep.c:1258
 check_prev_add kernel/locking/lockdep.c:1905 [inline]
 check_prevs_add kernel/locking/lockdep.c:2022 [inline]
 validate_chain kernel/locking/lockdep.c:2464 [inline]
 __lock_acquire+0x2e0e/0x3f20 kernel/locking/lockdep.c:3491
 lock_acquire+0x170/0x3f0 kernel/locking/lockdep.c:3998
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:152
 spin_lock include/linux/spinlock.h:317 [inline]
 red_adaptative_timer+0x72/0x570 net/sched/sch_red.c:233
 call_timer_fn+0x14a/0x650 kernel/time/timer.c:1280
 expire_timers+0x232/0x4d0 kernel/time/timer.c:1319
 __run_timers kernel/time/timer.c:1637 [inline]
 run_timer_softirq+0x1d5/0x5a0 kernel/time/timer.c:1650
 __do_softirq+0x254/0xa1d kernel/softirq.c:288
 invoke_softirq kernel/softirq.c:368 [inline]
 irq_exit+0x193/0x240 kernel/softirq.c:409
 exiting_irq arch/x86/include/asm/apic.h:648 [inline]
 smp_apic_timer_interrupt+0x141/0x5e0 arch/x86/kernel/apic/apic.c:1102
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:793
 </IRQ>
RIP: 0010:preempt_count_equals kernel/sched/core.c:5972 [inline]
RIP: 0010:___might_sleep+0x6c/0x2b0 kernel/sched/core.c:6005
RSP: 0018:ffff8880b56c7bc8 EFLAGS: 00000a02 ORIG_RAX: ffffffffffffff10
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000001
RDX: 1ffff11016ad70a8 RSI: 00000000ffffffff RDI: ffff8880b56b8540
RBP: ffffffff885c9540 R08: 0000000000000000 R09: 0000000000000003
R10: 0000000000000000 R11: ffff8880b56b81c0 R12: 000000000000010c
R13: 0000000000000000 R14: ffff8880b56b81c0 R15: dffffc0000000000
 inet_twsk_purge+0x9a/0x5a0 net/ipv4/inet_timewait_sock.c:268
 ops_exit_list+0xf9/0x150 net/core/net_namespace.c:145
 cleanup_net+0x3b3/0x840 net/core/net_namespace.c:484
 process_one_work+0x793/0x14a0 kernel/workqueue.c:2116
 worker_thread+0x5cc/0xff0 kernel/workqueue.c:2250
 kthread+0x30d/0x420 kernel/kthread.c:232
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21

Crashes (2):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets (help?) Manager Title
2020/12/31 07:14 linux-4.14.y 1752938529c6 5cc121d6 .config console log report syz C ci2-linux-4-14
2020/12/20 16:19 linux-4.14.y 3f2ecb86cb90 04201c06 .config console log report syz ci2-linux-4-14
* Struck through repros no longer work on HEAD.