syzbot


possible deadlock in flush_workqueue

Status: fixed on 2019/12/18 17:48
Reported-by: syzbot+e3f421b94470bd51217c@syzkaller.appspotmail.com
Fix commit: 4df728651b8a nbd: verify socket is supported during setup
First crash: 1923d, last: 1830d
Fix bisection: fixed by
commit 4df728651b8a99693c69962d8e5a5b9e5a3bbcc7
Author: Mike Christie <mchristi@redhat.com>
Date: Thu Oct 17 21:27:34 2019 +0000

  nbd: verify socket is supported during setup

  
Similar bugs (7)
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
linux-5.15 possible deadlock in flush_workqueue 1 589d 589d 0/3 auto-obsoleted due to no activity on 2023/08/09 18:38
android-414 possible deadlock in flush_workqueue 1 1837d 1837d 0/1 auto-closed as invalid on 2020/03/10 11:29
linux-5.15 possible deadlock in flush_workqueue (2) origin:lts-only syz inconclusive 51 118d 252d 0/3 auto-obsoleted due to no activity on 2024/10/05 10:09
upstream possible deadlock in flush_workqueue (2) C done done 256 1790d 2223d 15/28 fixed on 2020/01/31 18:49
linux-4.14 possible deadlock in flush_workqueue (2) 3 1793d 1798d 0/1 auto-closed as invalid on 2020/04/22 20:54
upstream possible deadlock in flush_workqueue net C 73762 2239d 2282d 11/28 fixed on 2018/10/11 14:33
linux-4.19 possible deadlock in flush_workqueue 3 1851d 1862d 0/1 auto-closed as invalid on 2020/02/25 05:02

Sample crash report:
IPv6: ADDRCONF(NETDEV_CHANGE): vxcan1: link becomes ready
8021q: adding VLAN 0 to HW filter on device batadv0
block nbd0: Receive control failed (result -22)
block nbd0: shutting down sockets
============================================
WARNING: possible recursive locking detected
4.14.151 #0 Not tainted
--------------------------------------------
kworker/u5:1/6981 is trying to acquire lock:
 ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813c88ca>] flush_workqueue+0xda/0x1400 kernel/workqueue.c:2613

but task is already holding lock:
 ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] work_static include/linux/workqueue.h:199 [inline]
 ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] set_work_data kernel/workqueue.c:619 [inline]
 ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] set_work_pool_and_clear_pending kernel/workqueue.c:646 [inline]
 ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] process_one_work+0x76e/0x1600 kernel/workqueue.c:2085

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock("knbd%d-recv"nbd->index);
  lock("knbd%d-recv"nbd->index);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by kworker/u5:1/6981:
 #0:  ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] work_static include/linux/workqueue.h:199 [inline]
 #0:  ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] set_work_data kernel/workqueue.c:619 [inline]
 #0:  ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] set_work_pool_and_clear_pending kernel/workqueue.c:646 [inline]
 #0:  ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] process_one_work+0x76e/0x1600 kernel/workqueue.c:2085
 #1:  ((&args->work)){+.+.}, at: [<ffffffff813cf92b>] process_one_work+0x7ab/0x1600 kernel/workqueue.c:2089
 #2:  (&nbd->config_lock){+.+.}, at: [<ffffffff82d77ed9>] refcount_dec_and_mutex_lock lib/refcount.c:312 [inline]
 #2:  (&nbd->config_lock){+.+.}, at: [<ffffffff82d77ed9>] refcount_dec_and_mutex_lock+0x49/0x6c lib/refcount.c:307

stack backtrace:
CPU: 0 PID: 6981 Comm: kworker/u5:1 Not tainted 4.14.151 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: knbd0-recv recv_work
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x138/0x197 lib/dump_stack.c:53
 print_deadlock_bug kernel/locking/lockdep.c:1796 [inline]
 check_deadlock kernel/locking/lockdep.c:1843 [inline]
 validate_chain kernel/locking/lockdep.c:2444 [inline]
 __lock_acquire.cold+0x2bf/0x8dc kernel/locking/lockdep.c:3487
 lock_acquire+0x16f/0x430 kernel/locking/lockdep.c:3994
 flush_workqueue+0x109/0x1400 kernel/workqueue.c:2616
 drain_workqueue+0x177/0x3e0 kernel/workqueue.c:2781
 destroy_workqueue+0x21/0x620 kernel/workqueue.c:4088
 nbd_config_put+0x43c/0x7a0 drivers/block/nbd.c:1124
 recv_work+0x18d/0x1f0 drivers/block/nbd.c:724
 process_one_work+0x863/0x1600 kernel/workqueue.c:2114
 worker_thread+0x5d9/0x1050 kernel/workqueue.c:2248
 kthread+0x319/0x430 kernel/kthread.c:232
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404
kobject: 'batman_adv' (ffff88808df04280): kobject_uevent_env
kobject: 'batman_adv' (ffff88808df04280): kobject_uevent_env: filter function caused the event to drop!
kobject: 'batman_adv' (ffff88808df04280): kobject_cleanup, parent           (null)
kobject: 'batman_adv' (ffff88808df04280): calling ktype release
kobject: (ffff88808df04280): dynamic_kobj_release
kobject: 'batman_adv': free name
kobject: 'rx-0' (ffff88809ad0ce50): kobject_cleanup, parent ffff8880a1295048
kobject: 'rx-0' (ffff88809ad0ce50): auto cleanup 'remove' event
kobject: 'rx-0' (ffff88809ad0ce50): kobject_uevent_env
kobject: 'rx-0' (ffff88809ad0ce50): fill_kobj_path: path = '/devices/virtual/net/syz_tun/queues/rx-0'
kobject: 'rx-0' (ffff88809ad0ce50): auto cleanup kobject_del
kobject: 'rx-0' (ffff88809ad0ce50): calling ktype release
kobject: 'rx-0': free name
kobject: 'tx-0' (ffff8880917c5058): kobject_cleanup, parent ffff8880a1295048
kobject: 'tx-0' (ffff8880917c5058): auto cleanup 'remove' event
kobject: 'tx-0' (ffff8880917c5058): kobject_uevent_env
kobject: 'tx-0' (ffff8880917c5058): fill_kobj_path: path = '/devices/virtual/net/syz_tun/queues/tx-0'
kobject: 'tx-0' (ffff8880917c5058): auto cleanup kobject_del
kobject: 'tx-0' (ffff8880917c5058): calling ktype release
kobject: 'tx-0': free name
kobject: 'queues' (ffff8880a1295048): kobject_cleanup, parent           (null)
kobject: 'queues' (ffff8880a1295048): calling ktype release
kobject: 'queues' (ffff8880a1295048): kset_release
kobject: 'queues': free name
kobject: 'syz_tun' (ffff8880a9490a70): kobject_uevent_env
kobject: 'syz_tun' (ffff8880a9490a70): fill_kobj_path: path = '/devices/virtual/net/syz_tun'
IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
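
Illustrative sketch (not kernel code): the call trace above shows recv_work, running on the knbd0-recv workqueue, reaching destroy_workqueue via nbd_config_put, which then flushes that same workqueue. The toy Python model below mimics this pattern in userspace. ToyWorkqueue, SelfFlushError and demo are hypothetical names made up for this sketch, and the check in flush() is only a rough stand-in for the lockdep warning, under the assumption that a flush issued from a work item on the same queue can never complete.

```python
import threading


class SelfFlushError(RuntimeError):
    """Raised when a work item tries to flush the queue it is running on."""


class ToyWorkqueue:
    """Toy, single-threaded stand-in for a workqueue (hypothetical, not the kernel API)."""

    # Per-thread record of the queue whose work item is currently executing.
    _current = threading.local()

    def __init__(self, name):
        self.name = name
        self._pending = []

    def queue_work(self, fn):
        self._pending.append(fn)

    def run_pending(self):
        # While a work item runs, mark this queue as "held" by the caller,
        # loosely mirroring the lock that process_one_work holds in the report.
        while self._pending:
            fn = self._pending.pop(0)
            prev = getattr(self._current, "wq", None)
            self._current.wq = self
            try:
                fn()
            finally:
                self._current.wq = prev

    def flush(self):
        # A flush waits for in-flight work. If the caller is itself in-flight
        # work on this queue, the wait would never end, so report it instead.
        if getattr(self._current, "wq", None) is self:
            raise SelfFlushError(f"work item on {self.name} flushes {self.name}")
        self.run_pending()

    def destroy(self):
        # Teardown flushes first, like destroy_workqueue -> flush_workqueue
        # in the call trace.
        self.flush()


def demo():
    recv_wq = ToyWorkqueue("knbd0-recv")
    errors = []

    def recv_work():
        # Models the last config reference being dropped inside the work
        # item, which triggers teardown of the queue the item is running on.
        try:
            recv_wq.destroy()
        except SelfFlushError as exc:
            errors.append(str(exc))

    recv_wq.queue_work(recv_work)
    recv_wq.run_pending()
    return errors
```

Calling demo() returns a one-element list describing the self-flush. Destroying the queue from outside any of its work items goes through without an error, which is the situation the teardown path needs to be in.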

Crashes (15):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets Manager Title
2019/11/05 10:28 linux-4.14.y ddef1e8e3f6e 76630fc9 .config console log report syz C ci2-linux-4-14
2019/11/18 09:43 linux-4.14.y 775d01b65b5d d5696d51 .config console log report ci2-linux-4-14
2019/11/10 20:40 linux-4.14.y c9fda4f22428 dc438b91 .config console log report ci2-linux-4-14
2019/11/07 16:16 linux-4.14.y c9fda4f22428 f39aff9e .config console log report ci2-linux-4-14
2019/11/01 11:49 linux-4.14.y ddef1e8e3f6e a41ca8fa .config console log report ci2-linux-4-14
2019/11/01 03:11 linux-4.14.y ddef1e8e3f6e a41ca8fa .config console log report ci2-linux-4-14
2019/10/26 12:03 linux-4.14.y b98aebd29824 25bb509e .config console log report ci2-linux-4-14
2019/10/24 08:07 linux-4.14.y b98aebd29824 d01bb02a .config console log report ci2-linux-4-14
2019/10/20 16:03 linux-4.14.y b98aebd29824 8c88c9c1 .config console log report ci2-linux-4-14
2019/10/20 02:00 linux-4.14.y b98aebd29824 8c88c9c1 .config console log report ci2-linux-4-14
2019/10/20 00:35 linux-4.14.y b98aebd29824 8c88c9c1 .config console log report ci2-linux-4-14
2019/10/19 12:35 linux-4.14.y b98aebd29824 8c88c9c1 .config console log report ci2-linux-4-14
2019/10/19 01:29 linux-4.14.y b98aebd29824 8c88c9c1 .config console log report ci2-linux-4-14
2019/10/14 21:29 linux-4.14.y e132c8d7b58d a6aef847 .config console log report ci2-linux-4-14
2019/08/17 14:11 linux-4.14.y 45f092f9e9cb 55bf8926 .config console log report ci2-linux-4-14
* Struck through repros no longer work on HEAD.