syzbot


possible deadlock in sch_direct_xmit (3)

Status: fixed on 2024/04/10 16:40
Subsystems: net
Fix commit: 0bef512012b1 UPSTREAM: net: add netdev_lockdep_set_classes() to virtual drivers
First crash: 75d, last: 75d
Similar bugs (11)
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
linux-6.1 possible deadlock in sch_direct_xmit (2) origin:lts-only C 9 1d07h 108d 0/3 upstream: reported C repro on 2024/01/09 18:28
android-44 possible deadlock in sch_direct_xmit C 240 1606d 1843d 0/2 public: reported C repro on 2019/04/11 08:44
upstream possible deadlock in sch_direct_xmit (2) net C done unreliable 109 285d 1459d 0/26 auto-obsoleted due to no activity on 2024/01/14 06:05
linux-4.19 possible deadlock in sch_direct_xmit (2) C error 15 427d 944d 0/1 upstream: reported C repro on 2021/09/26 01:30
upstream possible deadlock in sch_direct_xmit net C done done 1548 1614d 2292d 15/26 fixed on 2020/04/17 19:57
linux-5.15 possible deadlock in sch_direct_xmit (2) 4 2d22h 64d 0/3 upstream: reported on 2024/02/22 19:25
linux-4.14 possible deadlock in sch_direct_xmit 1 1790d 1790d 0/1 auto-closed as invalid on 2019/10/25 08:40
linux-4.14 possible deadlock in sch_direct_xmit (2) 1 1623d 1623d 0/1 auto-closed as invalid on 2020/03/15 19:58
linux-4.19 possible deadlock in sch_direct_xmit 1 1791d 1791d 0/1 auto-closed as invalid on 2019/10/25 08:50
linux-5.15 possible deadlock in sch_direct_xmit 1 351d 351d 0/3 auto-obsoleted due to no activity on 2023/08/23 09:09
linux-6.1 possible deadlock in sch_direct_xmit 2 359d 398d 0/3 auto-obsoleted due to no activity on 2023/08/23 09:10

Sample crash report:
============================================
WARNING: possible recursive locking detected
6.8.0-rc4-next-20240212-syzkaller #0 Not tainted
--------------------------------------------
syz-executor.0/19016 is trying to acquire lock:
ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline]
ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340

but task is already holding lock:
ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline]
ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(_xmit_ETHER#2);
  lock(_xmit_ETHER#2);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

9 locks held by syz-executor.0/19016:
 #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
 #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x82c/0x1040 net/core/rtnetlink.c:6603
 #1: ffffc90000a08c00 ((&in_dev->mr_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x600 kernel/time/timer.c:1697
 #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline]
 #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline]
 #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228
 #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
 #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline]
 #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284
 #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline]
 #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline]
 #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline]
 #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325
 #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
 #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline]
 #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340
 #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline]
 #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline]
 #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228
 #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
 #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline]
 #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284
 #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline]
 #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline]
 #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline]
 #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325

stack backtrace:
CPU: 1 PID: 19016 Comm: syz-executor.0 Not tainted 6.8.0-rc4-next-20240212-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 check_deadlock kernel/locking/lockdep.c:3062 [inline]
 validate_chain+0x15c1/0x58e0 kernel/locking/lockdep.c:3856
 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
 lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
 _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
 spin_lock include/linux/spinlock.h:351 [inline]
 __netif_tx_lock include/linux/netdevice.h:4452 [inline]
 sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340
 __dev_xmit_skb net/core/dev.c:3784 [inline]
 __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325
 neigh_output include/net/neighbour.h:542 [inline]
 ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235
 iptunnel_xmit+0x540/0x9b0 net/ipv4/ip_tunnel_core.c:82
 ip_tunnel_xmit+0x20ee/0x2960 net/ipv4/ip_tunnel.c:831
 erspan_xmit+0x9de/0x1460 net/ipv4/ip_gre.c:720
 __netdev_start_xmit include/linux/netdevice.h:4989 [inline]
 netdev_start_xmit include/linux/netdevice.h:5003 [inline]
 xmit_one net/core/dev.c:3555 [inline]
 dev_hard_start_xmit+0x242/0x770 net/core/dev.c:3571
 sch_direct_xmit+0x2b6/0x5f0 net/sched/sch_generic.c:342
 __dev_xmit_skb net/core/dev.c:3784 [inline]
 __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325
 neigh_output include/net/neighbour.h:542 [inline]
 ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235
 igmpv3_send_cr net/ipv4/igmp.c:723 [inline]
 igmp_ifc_timer_expire+0xb71/0xd90 net/ipv4/igmp.c:813
 call_timer_fn+0x17e/0x600 kernel/time/timer.c:1700
 expire_timers kernel/time/timer.c:1751 [inline]
 __run_timers+0x621/0x830 kernel/time/timer.c:2038
 run_timer_softirq+0x67/0xf0 kernel/time/timer.c:2051
 __do_softirq+0x2bc/0x943 kernel/softirq.c:554
 invoke_softirq kernel/softirq.c:428 [inline]
 __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
 irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1076 [inline]
 sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1076
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:resched_offsets_ok kernel/sched/core.c:10127 [inline]
RIP: 0010:__might_resched+0x16f/0x780 kernel/sched/core.c:10142
Code: 00 4c 89 e8 48 c1 e8 03 48 ba 00 00 00 00 00 fc ff df 48 89 44 24 38 0f b6 04 10 84 c0 0f 85 87 04 00 00 41 8b 45 00 c1 e0 08 <01> d8 44 39 e0 0f 85 d6 00 00 00 44 89 64 24 1c 48 8d bc 24 a0 00
RSP: 0018:ffffc9000ee069e0 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8880296a9e00
RDX: dffffc0000000000 RSI: ffff8880296a9e00 RDI: ffffffff8bfe8fa0
RBP: ffffc9000ee06b00 R08: ffffffff82326877 R09: 1ffff11002b5ad1b
R10: dffffc0000000000 R11: ffffed1002b5ad1c R12: 0000000000000000
R13: ffff8880296aa23c R14: 000000000000062a R15: 1ffff92001dc0d44
 down_write+0x19/0x50 kernel/locking/rwsem.c:1578
 kernfs_activate fs/kernfs/dir.c:1403 [inline]
 kernfs_add_one+0x4af/0x8b0 fs/kernfs/dir.c:819
 __kernfs_create_file+0x22e/0x2e0 fs/kernfs/file.c:1056
 sysfs_add_file_mode_ns+0x24a/0x310 fs/sysfs/file.c:307
 create_files fs/sysfs/group.c:64 [inline]
 internal_create_group+0x4f4/0xf20 fs/sysfs/group.c:152
 internal_create_groups fs/sysfs/group.c:192 [inline]
 sysfs_create_groups+0x56/0x120 fs/sysfs/group.c:218
 create_dir lib/kobject.c:78 [inline]
 kobject_add_internal+0x472/0x8d0 lib/kobject.c:240
 kobject_add_varg lib/kobject.c:374 [inline]
 kobject_init_and_add+0x124/0x190 lib/kobject.c:457
 netdev_queue_add_kobject net/core/net-sysfs.c:1706 [inline]
 netdev_queue_update_kobjects+0x1f3/0x480 net/core/net-sysfs.c:1758
 register_queue_kobjects net/core/net-sysfs.c:1819 [inline]
 netdev_register_kobject+0x265/0x310 net/core/net-sysfs.c:2059
 register_netdevice+0x1191/0x19c0 net/core/dev.c:10298
 bond_newlink+0x3b/0x90 drivers/net/bonding/bond_netlink.c:576
 rtnl_newlink_create net/core/rtnetlink.c:3506 [inline]
 __rtnl_newlink net/core/rtnetlink.c:3726 [inline]
 rtnl_newlink+0x158f/0x20a0 net/core/rtnetlink.c:3739
 rtnetlink_rcv_msg+0x885/0x1040 net/core/rtnetlink.c:6606
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543
 netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
 netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367
 netlink_sendmsg+0xa3c/0xd70 net/netlink/af_netlink.c:1908
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:745
 __sys_sendto+0x3a4/0x4f0 net/socket.c:2191
 __do_sys_sendto net/socket.c:2203 [inline]
 __se_sys_sendto net/socket.c:2199 [inline]
 __x64_sys_sendto+0xde/0x100 net/socket.c:2199
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7fc3fa87fa9c
Code: 1a 51 02 00 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 60 51 02 00 48 8b
RSP: 002b:00007ffc6382a760 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007fc3fb4d4620 RCX: 00007fc3fa87fa9c
RDX: 0000000000000038 RSI: 00007fc3fb4d4670 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00007ffc6382a7b4 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003
R13: 0000000000000000 R14: 00007fc3fb4d4670 R15: 0000000000000000
 </TASK>
vkms_vblank_simulate: vblank timer overrun
vkms_vblank_simulate: vblank timer overrun
----------------
Code disassembly (best guess):
   0:	00 4c 89 e8          	add    %cl,-0x18(%rcx,%rcx,4)
   4:	48 c1 e8 03          	shr    $0x3,%rax
   8:	48 ba 00 00 00 00 00 	movabs $0xdffffc0000000000,%rdx
   f:	fc ff df
  12:	48 89 44 24 38       	mov    %rax,0x38(%rsp)
  17:	0f b6 04 10          	movzbl (%rax,%rdx,1),%eax
  1b:	84 c0                	test   %al,%al
  1d:	0f 85 87 04 00 00    	jne    0x4aa
  23:	41 8b 45 00          	mov    0x0(%r13),%eax
  27:	c1 e0 08             	shl    $0x8,%eax
* 2a:	01 d8                	add    %ebx,%eax <-- trapping instruction
  2c:	44 39 e0             	cmp    %r12d,%eax
  2f:	0f 85 d6 00 00 00    	jne    0x10b
  35:	44 89 64 24 1c       	mov    %r12d,0x1c(%rsp)
  3a:	48                   	rex.W
  3b:	8d                   	.byte 0x8d
  3c:	bc                   	.byte 0xbc
  3d:	24 a0                	and    $0xa0,%al
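
Note on the report above: lockdep compares lock classes, not lock instances. The two addresses in the splat (ffff8880162cb298 and ffff8880223db4d8) are different tx locks, but both belong to the class _xmit_ETHER#2, and the second is taken while the first is held because erspan_xmit re-enters __dev_queue_xmit for another device. The fix commit title suggests the remedy is to give virtual drivers their own lock classes through netdev_lockdep_set_classes(). The following is a toy Python model of that class-based check, not kernel code. The names ToyLockdep, Lock, nested_xmit and the class string "erspan_xmit" are hypothetical and exist only for illustration.

```python
class Lock:
    """A lock instance tagged with a lock class, as lockdep would see it."""

    def __init__(self, name, lock_class):
        self.name = name
        self.lock_class = lock_class


class ToyLockdep:
    """Tracks locks held by one context and checks by class, not by instance."""

    def __init__(self):
        self.held = []
        self.warnings = []

    def acquire(self, lock):
        # Warn if any lock already held shares a class with the new lock,
        # even when the two are distinct instances.
        for other in self.held:
            if other.lock_class == lock.lock_class:
                self.warnings.append(
                    f"possible recursive locking: {lock.name} vs {other.name} "
                    f"(class {lock.lock_class})"
                )
                break
        self.held.append(lock)

    def release(self, lock):
        self.held.remove(lock)


def nested_xmit(upper_class, lower_class):
    """Model an upper device's xmit path re-entering xmit on a lower device."""
    dep = ToyLockdep()
    upper = Lock("upper_dev tx lock", upper_class)
    lower = Lock("lower_dev tx lock", lower_class)
    dep.acquire(upper)  # outer sch_direct_xmit takes the upper device's tx lock
    dep.acquire(lower)  # tunnel xmit re-enters sch_direct_xmit for the lower device
    dep.release(lower)
    dep.release(upper)
    return dep.warnings


# Same class on both devices: the model reports one warning, as in the splat.
same_class = nested_xmit("_xmit_ETHER", "_xmit_ETHER")
# Distinct (hypothetical) class for the virtual device: no warning.
split_class = nested_xmit("erspan_xmit", "_xmit_ETHER")
```

In this model the nesting itself is unchanged between the two calls; only the class assigned to the upper device's lock differs, which is the distinction the "missing lock nesting notation" hint in the report points at.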

Crashes (1):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets Manager Title
2024/02/12 07:33 linux-next ae00c445390b 77b23aa1 .config console log report info [disk image] [vmlinux] [kernel image] ci-upstream-linux-next-kasan-gce-root possible deadlock in sch_direct_xmit