syzbot


possible deadlock in refcount_dec_and_mutex_lock

Status: fixed on 2020/01/25 22:00
Reported-by: syzbot+6921901d8a8ec5b51d40@syzkaller.appspotmail.com
Fix commit: 4df728651b8a nbd: verify socket is supported during setup
First crash: 1660d, last: 1588d
Fix bisection: fixed by (bisect log):
commit 4df728651b8a99693c69962d8e5a5b9e5a3bbcc7
Author: Mike Christie <mchristi@redhat.com>
Date: Thu Oct 17 21:27:34 2019 +0000

  nbd: verify socket is supported during setup

  
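The one-line commit subject leaves the mechanics implicit: nbd's receive path relies on kernel_sock_shutdown() to force recv_work() out of a blocking receive, so socket setup must reject sockets whose shutdown callout is the no-op stub. A condensed sketch of that check, modeled on the upstream commit (the helper name nbd_get_socket and the exact error handling shown here may differ from the 4.14.y backport):

/* Sketch: fail setup early for sockets nbd cannot shut down later. */
static struct socket *nbd_get_socket(struct nbd_device *nbd, unsigned long fd,
				     int *err)
{
	struct socket *sock;

	*err = 0;
	sock = sockfd_lookup(fd, err);
	if (!sock)
		return NULL;

	/* nbd needs kernel_sock_shutdown() to wake recv_work() on teardown */
	if (sock->ops->shutdown == sock_no_shutdown) {
		dev_err(disk_to_dev(nbd->disk),
			"Unsupported socket: shutdown callout must be supported.\n");
		*err = -EINVAL;
		sock_release(sock);
		return NULL;
	}

	return sock;
}

With setup failing up front with -EINVAL, an unsupported socket never reaches the recv workqueue, which removes the failure mode visible in the sample report below.
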
Similar bugs (2)
Kernel      Title                                                  Repro  Cause bisect  Fix bisect  Count  Last   Reported  Patched  Status
upstream    possible deadlock in refcount_dec_and_mutex_lock nbd   -      -             -           1      1684d  1680d     0/26     auto-closed as invalid on 2019/11/20 04:39
linux-4.19  possible deadlock in refcount_dec_and_mutex_lock       C      done          done        5      1585d  1640d     1/1      fixed on 2020/01/28 16:29

Sample crash report:
block nbd0: shutting down sockets
block nbd0: Receive control failed (result -22)
block nbd0: shutting down sockets
block nbd0: Receive control failed (result -22)
======================================================
WARNING: possible circular locking dependency detected
4.14.151 #0 Not tainted
------------------------------------------------------
kworker/u5:0/1143 is trying to acquire lock:
 (&nbd->config_lock){+.+.}, at: [<ffffffff82d77ed9>] refcount_dec_and_mutex_lock lib/refcount.c:312 [inline]
 (&nbd->config_lock){+.+.}, at: [<ffffffff82d77ed9>] refcount_dec_and_mutex_lock+0x49/0x6c lib/refcount.c:307

but task is already holding lock:
 ((&args->work)){+.+.}, at: [<ffffffff813cf92b>] process_one_work+0x7ab/0x1600 kernel/workqueue.c:2089

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 ((&args->work)){+.+.}:
       lock_acquire+0x16f/0x430 kernel/locking/lockdep.c:3994
       process_one_work+0x803/0x1600 kernel/workqueue.c:2090
       worker_thread+0x5d9/0x1050 kernel/workqueue.c:2248
       kthread+0x319/0x430 kernel/kthread.c:232
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404

-> #1 ("knbd%d-recv"nbd->index){+.+.}:
       lock_acquire+0x16f/0x430 kernel/locking/lockdep.c:3994
       flush_workqueue+0x109/0x1400 kernel/workqueue.c:2616
       drain_workqueue+0x177/0x3e0 kernel/workqueue.c:2781
       destroy_workqueue+0x21/0x620 kernel/workqueue.c:4088
       nbd_config_put+0x43c/0x7a0 drivers/block/nbd.c:1124
       nbd_release+0xdb/0x120 drivers/block/nbd.c:1393
       __blkdev_put+0x434/0x7f0 fs/block_dev.c:1791
       blkdev_put+0x88/0x510 fs/block_dev.c:1856
       blkdev_close+0x8b/0xb0 fs/block_dev.c:1863
       __fput+0x275/0x7a0 fs/file_table.c:210
       ____fput+0x16/0x20 fs/file_table.c:244
       task_work_run+0x114/0x190 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x7df/0x2c10 kernel/exit.c:874
       do_group_exit+0x111/0x330 kernel/exit.c:977
       SYSC_exit_group kernel/exit.c:988 [inline]
       SyS_exit_group+0x1d/0x20 kernel/exit.c:986
       do_syscall_64+0x1e8/0x640 arch/x86/entry/common.c:292
       entry_SYSCALL_64_after_hwframe+0x42/0xb7

-> #0 (&nbd->config_lock){+.+.}:
       check_prev_add kernel/locking/lockdep.c:1901 [inline]
       check_prevs_add kernel/locking/lockdep.c:2018 [inline]
       validate_chain kernel/locking/lockdep.c:2460 [inline]
       __lock_acquire+0x2cb3/0x4620 kernel/locking/lockdep.c:3487
       lock_acquire+0x16f/0x430 kernel/locking/lockdep.c:3994
       __mutex_lock_common kernel/locking/mutex.c:756 [inline]
       __mutex_lock+0xe8/0x1470 kernel/locking/mutex.c:893
       mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
       refcount_dec_and_mutex_lock lib/refcount.c:312 [inline]
       refcount_dec_and_mutex_lock+0x49/0x6c lib/refcount.c:307
       nbd_config_put+0x31/0x7a0 drivers/block/nbd.c:1102
       recv_work+0x18d/0x1f0 drivers/block/nbd.c:724
       process_one_work+0x863/0x1600 kernel/workqueue.c:2114
       worker_thread+0x5d9/0x1050 kernel/workqueue.c:2248
       kthread+0x319/0x430 kernel/kthread.c:232
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404

other info that might help us debug this:

Chain exists of:
  &nbd->config_lock --> "knbd%d-recv"nbd->index --> (&args->work)

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((&args->work));
                               lock("knbd%d-recv"nbd->index);
                               lock((&args->work));
  lock(&nbd->config_lock);

 *** DEADLOCK ***

2 locks held by kworker/u5:0/1143:
 #0:  ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] work_static include/linux/workqueue.h:199 [inline]
 #0:  ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] set_work_data kernel/workqueue.c:619 [inline]
 #0:  ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] set_work_pool_and_clear_pending kernel/workqueue.c:646 [inline]
 #0:  ("knbd%d-recv"nbd->index){+.+.}, at: [<ffffffff813cf8ee>] process_one_work+0x76e/0x1600 kernel/workqueue.c:2085
 #1:  ((&args->work)){+.+.}, at: [<ffffffff813cf92b>] process_one_work+0x7ab/0x1600 kernel/workqueue.c:2089

stack backtrace:
CPU: 0 PID: 1143 Comm: kworker/u5:0 Not tainted 4.14.151 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: knbd0-recv recv_work
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x138/0x197 lib/dump_stack.c:53
 print_circular_bug.isra.0.cold+0x1cc/0x28f kernel/locking/lockdep.c:1258
 check_prev_add kernel/locking/lockdep.c:1901 [inline]
 check_prevs_add kernel/locking/lockdep.c:2018 [inline]
 validate_chain kernel/locking/lockdep.c:2460 [inline]
 __lock_acquire+0x2cb3/0x4620 kernel/locking/lockdep.c:3487
 lock_acquire+0x16f/0x430 kernel/locking/lockdep.c:3994
 __mutex_lock_common kernel/locking/mutex.c:756 [inline]
 __mutex_lock+0xe8/0x1470 kernel/locking/mutex.c:893
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
 refcount_dec_and_mutex_lock lib/refcount.c:312 [inline]
 refcount_dec_and_mutex_lock+0x49/0x6c lib/refcount.c:307
 nbd_config_put+0x31/0x7a0 drivers/block/nbd.c:1102
 recv_work+0x18d/0x1f0 drivers/block/nbd.c:724
 process_one_work+0x863/0x1600 kernel/workqueue.c:2114
 worker_thread+0x5d9/0x1050 kernel/workqueue.c:2248
 kthread+0x319/0x430 kernel/kthread.c:232
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404
block nbd0: shutting down sockets
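
Reading the report: the repeated "Receive control failed (result -22)" lines (-22 is -EINVAL, consistent with a socket the driver cannot use, which is what the fix above rejects at setup) show recv_work() erroring out, after which both sides race to drop the last config reference. The cycle is between the nbd config mutex and the receive work item; condensing the two dependency traces into call chains (every function below is taken from the frames in the report):

/*
 * Path A: last close of /dev/nbd0 (dependency #1 in the chain)
 *   nbd_release()
 *     nbd_config_put()            takes nbd->config_lock, then ...
 *       destroy_workqueue()
 *         drain_workqueue()       ... waits for every work item on the
 *                                 "knbd%d-recv" workqueue to finish.
 *
 * Path B: the receive worker (dependency #0 in the chain)
 *   recv_work()                   runs as a work item on "knbd%d-recv"
 *     nbd_config_put()
 *       refcount_dec_and_mutex_lock(&nbd->config_lock)
 *                                 blocks behind Path A's config_lock.
 *
 * Path A holds config_lock and waits for B's work item to complete;
 * Path B holds the work item and waits for config_lock: the circular
 * dependency lockdep flags as *** DEADLOCK ***.
 */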

Crashes (10):
Time              Kernel        Commit        Syzkaller  Syz repro  C repro  Manager
2019/11/03 06:31  linux-4.14.y  ddef1e8e3f6e  d603afc9   syz        C        ci2-linux-4-14
2019/11/03 00:09  linux-4.14.y  ddef1e8e3f6e  d603afc9   syz        -        ci2-linux-4-14
2019/12/26 06:58  linux-4.14.y  e1f7d50ae3a3  be5c2c81   -          -        ci2-linux-4-14
2019/12/25 14:05  linux-4.14.y  e1f7d50ae3a3  be5c2c81   -          -        ci2-linux-4-14
2019/12/16 08:33  linux-4.14.y  a844dc4c5442  eef6e580   -          -        ci2-linux-4-14
2019/11/17 04:35  linux-4.14.y  775d01b65b5d  cdac920b   -          -        ci2-linux-4-14
2019/11/12 21:24  linux-4.14.y  4762bcd451a9  048f2d49   -          -        ci2-linux-4-14
2019/11/03 06:03  linux-4.14.y  ddef1e8e3f6e  d603afc9   -          -        ci2-linux-4-14
2019/11/02 21:13  linux-4.14.y  ddef1e8e3f6e  d603afc9   -          -        ci2-linux-4-14
2019/10/14 22:07  linux-4.14.y  e132c8d7b58d  05ad7292   -          -        ci2-linux-4-14