syzbot


possible deadlock in sd_remove

Status: upstream: reported on 2024/11/06 02:29
Subsystems: scsi
Reported-by: syzbot+566d48f3784973a22771@syzkaller.appspotmail.com
First crash: 33d, last: 15h32m
Discussions (1)
Title | Replies (including bot) | Last reply
[syzbot] [scsi?] possible deadlock in sd_remove | 0 (1) | 2024/11/06 02:29

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
6.13.0-rc1-syzkaller-00025-gfeffde684ac2 #0 Not tainted
------------------------------------------------------
kworker/1:1/46 is trying to acquire lock:
ffff888054e523e0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: touch_work_lockdep_map kernel/workqueue.c:3909 [inline]
ffff888054e523e0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: start_flush_work kernel/workqueue.c:4163 [inline]
ffff888054e523e0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0x46d/0xc30 kernel/workqueue.c:4195

but task is already holding lock:
ffff888028816c78 (&q->q_usage_counter(queue)#53){++++}-{0:0}, at: sd_remove+0x8f/0x150 drivers/scsi/sd.c:4072

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (&q->q_usage_counter(queue)#53){++++}-{0:0}:
       blk_queue_enter+0x50f/0x640 block/blk-core.c:328
       blk_mq_alloc_request+0x59b/0x950 block/blk-mq.c:651
       scsi_alloc_request drivers/scsi/scsi_lib.c:1222 [inline]
       scsi_execute_cmd+0x1eb/0xf40 drivers/scsi/scsi_lib.c:304
       read_capacity_10+0x1d4/0x6d0 drivers/scsi/sd.c:2766
       sd_read_capacity drivers/scsi/sd.c:2834 [inline]
       sd_revalidate_disk.isra.0+0x3145/0xa8d0 drivers/scsi/sd.c:3734
       sd_probe+0x904/0x1000 drivers/scsi/sd.c:4010
       call_driver_probe drivers/base/dd.c:579 [inline]
       really_probe+0x241/0xa90 drivers/base/dd.c:658
       __driver_probe_device+0x1de/0x440 drivers/base/dd.c:800
       driver_probe_device+0x4c/0x1b0 drivers/base/dd.c:830
       __device_attach_driver+0x1df/0x310 drivers/base/dd.c:958
       bus_for_each_drv+0x15a/0x1e0 drivers/base/bus.c:459
       __device_attach_async_helper+0x1d3/0x290 drivers/base/dd.c:987
       async_run_entry_fn+0x9f/0x530 kernel/async.c:129
       process_one_work+0x9c8/0x1ba0 kernel/workqueue.c:3229
       process_scheduled_works kernel/workqueue.c:3310 [inline]
       worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391
       kthread+0x2c4/0x3a0 kernel/kthread.c:389
       ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #2 (&q->limits_lock){+.+.}-{4:4}:
       __mutex_lock_common kernel/locking/mutex.c:585 [inline]
       __mutex_lock+0x19b/0xa60 kernel/locking/mutex.c:735
       queue_limits_start_update include/linux/blkdev.h:949 [inline]
       loop_reconfigure_limits+0x407/0x8c0 drivers/block/loop.c:998
       loop_set_block_size drivers/block/loop.c:1473 [inline]
       lo_simple_ioctl drivers/block/loop.c:1496 [inline]
       lo_ioctl+0x901/0x18b0 drivers/block/loop.c:1559
       blkdev_ioctl+0x279/0x6d0 block/ioctl.c:693
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:906 [inline]
       __se_sys_ioctl fs/ioctl.c:892 [inline]
       __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:892
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #1 (&q->q_usage_counter(io)#23){++++}-{0:0}:
       bio_queue_enter block/blk.h:75 [inline]
       blk_mq_submit_bio+0x1fb6/0x24c0 block/blk-mq.c:3091
       __submit_bio+0x384/0x540 block/blk-core.c:629
       __submit_bio_noacct_mq block/blk-core.c:710 [inline]
       submit_bio_noacct_nocheck+0x698/0xd70 block/blk-core.c:739
       submit_bio_noacct+0x93a/0x1e20 block/blk-core.c:868
       __block_write_full_folio+0x729/0xe00 fs/buffer.c:1904
       block_write_full_folio+0x342/0x400 fs/buffer.c:2743
       write_cache_pages+0xb3/0x130 mm/page-writeback.c:2659
       blkdev_writepages+0xa6/0xf0 block/fops.c:433
       do_writepages+0x1b6/0x820 mm/page-writeback.c:2702
       __writeback_single_inode+0x166/0xfa0 fs/fs-writeback.c:1680
       writeback_sb_inodes+0x606/0xfa0 fs/fs-writeback.c:1976
       __writeback_inodes_wb+0xff/0x2e0 fs/fs-writeback.c:2047
       wb_writeback+0x803/0xb80 fs/fs-writeback.c:2158
       wb_check_background_flush fs/fs-writeback.c:2228 [inline]
       wb_do_writeback fs/fs-writeback.c:2316 [inline]
       wb_workfn+0x730/0xbc0 fs/fs-writeback.c:2343
       process_one_work+0x9c8/0x1ba0 kernel/workqueue.c:3229
       process_scheduled_works kernel/workqueue.c:3310 [inline]
       worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391
       kthread+0x2c4/0x3a0 kernel/kthread.c:389
       ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3161 [inline]
       check_prevs_add kernel/locking/lockdep.c:3280 [inline]
       validate_chain kernel/locking/lockdep.c:3904 [inline]
       __lock_acquire+0x249e/0x3c40 kernel/locking/lockdep.c:5226
       lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5849
       touch_work_lockdep_map kernel/workqueue.c:3909 [inline]
       start_flush_work kernel/workqueue.c:4163 [inline]
       __flush_work+0x477/0xc30 kernel/workqueue.c:4195
       wb_shutdown+0x180/0x240 mm/backing-dev.c:575
       bdi_unregister+0x184/0x640 mm/backing-dev.c:1158
       del_gendisk+0x947/0xb20 block/genhd.c:707
       sd_remove+0x8f/0x150 drivers/scsi/sd.c:4072
       device_remove drivers/base/dd.c:569 [inline]
       device_remove+0x125/0x170 drivers/base/dd.c:561
       __device_release_driver drivers/base/dd.c:1273 [inline]
       device_release_driver_internal+0x44a/0x610 drivers/base/dd.c:1296
       bus_remove_device+0x22f/0x420 drivers/base/bus.c:576
       device_del+0x396/0x9f0 drivers/base/core.c:3854
       __scsi_remove_device+0x307/0x3d0 drivers/scsi/scsi_sysfs.c:1499
       scsi_forget_host+0x138/0x190 drivers/scsi/scsi_scan.c:2068
       scsi_remove_host+0xf8/0x320 drivers/scsi/hosts.c:181
       quiesce_and_remove_host drivers/usb/storage/usb.c:949 [inline]
       usb_stor_disconnect+0x121/0x270 drivers/usb/storage/usb.c:1178
       usb_unbind_interface+0x1e5/0x960 drivers/usb/core/driver.c:458
       device_remove drivers/base/dd.c:569 [inline]
       device_remove+0x125/0x170 drivers/base/dd.c:561
       __device_release_driver drivers/base/dd.c:1273 [inline]
       device_release_driver_internal+0x44a/0x610 drivers/base/dd.c:1296
       bus_remove_device+0x22f/0x420 drivers/base/bus.c:576
       device_del+0x396/0x9f0 drivers/base/core.c:3854
       usb_disable_device+0x36c/0x7f0 drivers/usb/core/message.c:1418
       usb_disconnect+0x2e1/0x920 drivers/usb/core/hub.c:2304
       hub_port_connect drivers/usb/core/hub.c:5361 [inline]
       hub_port_connect_change drivers/usb/core/hub.c:5661 [inline]
       port_event drivers/usb/core/hub.c:5821 [inline]
       hub_event+0x1da5/0x4e10 drivers/usb/core/hub.c:5903
       process_one_work+0x9c8/0x1ba0 kernel/workqueue.c:3229
       process_scheduled_works kernel/workqueue.c:3310 [inline]
       worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391
       kthread+0x2c4/0x3a0 kernel/kthread.c:389
       ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Chain exists of:
  (work_completion)(&(&wb->dwork)->work) --> &q->limits_lock --> &q->q_usage_counter(queue)#53

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&q->q_usage_counter(queue)#53);
                               lock(&q->limits_lock);
                               lock(&q->q_usage_counter(queue)#53);
  lock((work_completion)(&(&wb->dwork)->work));

 *** DEADLOCK ***

9 locks held by kworker/1:1/46:
 #0: ffff88802128e548 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204
 #1: ffffc90000b67d80 ((work_completion)(&hub->events)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205
 #2: ffff888028951190 (&dev->mutex){....}-{4:4}, at: device_lock include/linux/device.h:1014 [inline]
 #2: ffff888028951190 (&dev->mutex){....}-{4:4}, at: hub_event+0x1c1/0x4e10 drivers/usb/core/hub.c:5849
 #3: ffff88805b913190 (&dev->mutex){....}-{4:4}, at: device_lock include/linux/device.h:1014 [inline]
 #3: ffff88805b913190 (&dev->mutex){....}-{4:4}, at: usb_disconnect+0x10a/0x920 drivers/usb/core/hub.c:2295
 #4: ffff88807f287160 (&dev->mutex){....}-{4:4}, at: device_lock include/linux/device.h:1014 [inline]
 #4: ffff88807f287160 (&dev->mutex){....}-{4:4}, at: __device_driver_lock drivers/base/dd.c:1095 [inline]
 #4: ffff88807f287160 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0xa4/0x610 drivers/base/dd.c:1293
 #5: ffff888058bb80e0 (&shost->scan_mutex){+.+.}-{4:4}, at: scsi_remove_host+0x26/0x320 drivers/scsi/hosts.c:169
 #6: ffff8880594b0378 (&dev->mutex){....}-{4:4}, at: device_lock include/linux/device.h:1014 [inline]
 #6: ffff8880594b0378 (&dev->mutex){....}-{4:4}, at: __device_driver_lock drivers/base/dd.c:1095 [inline]
 #6: ffff8880594b0378 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0xa4/0x610 drivers/base/dd.c:1293
 #7: ffff888028816c78 (&q->q_usage_counter(queue)#53){++++}-{0:0}, at: sd_remove+0x8f/0x150 drivers/scsi/sd.c:4072
 #8: ffffffff8e1bb500 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
 #8: ffffffff8e1bb500 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
 #8: ffffffff8e1bb500 (rcu_read_lock){....}-{1:3}, at: start_flush_work kernel/workqueue.c:4137 [inline]
 #8: ffffffff8e1bb500 (rcu_read_lock){....}-{1:3}, at: __flush_work+0x103/0xc30 kernel/workqueue.c:4195

stack backtrace:
CPU: 1 UID: 0 PID: 46 Comm: kworker/1:1 Not tainted 6.13.0-rc1-syzkaller-00025-gfeffde684ac2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: usb_hub_wq hub_event
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 print_circular_bug+0x419/0x5d0 kernel/locking/lockdep.c:2074
 check_noncircular+0x31a/0x400 kernel/locking/lockdep.c:2206
 check_prev_add kernel/locking/lockdep.c:3161 [inline]
 check_prevs_add kernel/locking/lockdep.c:3280 [inline]
 validate_chain kernel/locking/lockdep.c:3904 [inline]
 __lock_acquire+0x249e/0x3c40 kernel/locking/lockdep.c:5226
 lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5849
 touch_work_lockdep_map kernel/workqueue.c:3909 [inline]
 start_flush_work kernel/workqueue.c:4163 [inline]
 __flush_work+0x477/0xc30 kernel/workqueue.c:4195
 wb_shutdown+0x180/0x240 mm/backing-dev.c:575
 bdi_unregister+0x184/0x640 mm/backing-dev.c:1158
 del_gendisk+0x947/0xb20 block/genhd.c:707
 sd_remove+0x8f/0x150 drivers/scsi/sd.c:4072
 device_remove drivers/base/dd.c:569 [inline]
 device_remove+0x125/0x170 drivers/base/dd.c:561
 __device_release_driver drivers/base/dd.c:1273 [inline]
 device_release_driver_internal+0x44a/0x610 drivers/base/dd.c:1296
 bus_remove_device+0x22f/0x420 drivers/base/bus.c:576
 device_del+0x396/0x9f0 drivers/base/core.c:3854
 __scsi_remove_device+0x307/0x3d0 drivers/scsi/scsi_sysfs.c:1499
 scsi_forget_host+0x138/0x190 drivers/scsi/scsi_scan.c:2068
 scsi_remove_host+0xf8/0x320 drivers/scsi/hosts.c:181
 quiesce_and_remove_host drivers/usb/storage/usb.c:949 [inline]
 usb_stor_disconnect+0x121/0x270 drivers/usb/storage/usb.c:1178
 usb_unbind_interface+0x1e5/0x960 drivers/usb/core/driver.c:458
 device_remove drivers/base/dd.c:569 [inline]
 device_remove+0x125/0x170 drivers/base/dd.c:561
 __device_release_driver drivers/base/dd.c:1273 [inline]
 device_release_driver_internal+0x44a/0x610 drivers/base/dd.c:1296
 bus_remove_device+0x22f/0x420 drivers/base/bus.c:576
 device_del+0x396/0x9f0 drivers/base/core.c:3854
 usb_disable_device+0x36c/0x7f0 drivers/usb/core/message.c:1418
 usb_disconnect+0x2e1/0x920 drivers/usb/core/hub.c:2304
 hub_port_connect drivers/usb/core/hub.c:5361 [inline]
 hub_port_connect_change drivers/usb/core/hub.c:5661 [inline]
 port_event drivers/usb/core/hub.c:5821 [inline]
 hub_event+0x1da5/0x4e10 drivers/usb/core/hub.c:5903
 process_one_work+0x9c8/0x1ba0 kernel/workqueue.c:3229
 process_scheduled_works kernel/workqueue.c:3310 [inline]
 worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391
 kthread+0x2c4/0x3a0 kernel/kthread.c:389
 ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
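
The dependency chain in the report above can be read as a directed graph in which an edge A -> B means "B was acquired while A was held". The sketch below is a minimal illustration of that reading, not lockdep's actual implementation: it records the three existing dependencies (#1 to #3) and checks whether the new acquisition (#0, from sd_remove via del_gendisk) closes a cycle. The shortened lock names and the helpers find_path and closes_cycle are hypothetical, and the "held" side of edges #1 to #3 is inferred from the order in which the chain is listed.

```python
from collections import defaultdict

# Each edge is (held_lock, acquired_lock), using shortened names for the
# lock classes printed in the report. These names are illustrative only.
EXISTING_EDGES = [
    ("work_completion(wb->dwork)", "q_usage_counter(io)#23"),  # chain entry #1
    ("q_usage_counter(io)#23", "limits_lock"),                 # chain entry #2
    ("limits_lock", "q_usage_counter(queue)#53"),              # chain entry #3
]

# Chain entry #0: flushing the writeback work while the queue counter is held.
NEW_EDGE = ("q_usage_counter(queue)#53", "work_completion(wb->dwork)")


def find_path(edges, start, goal):
    """Return a list of locks leading from start to goal, or None."""
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)

    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph[node]:
            stack.append((nxt, path + [nxt]))
    return None


def closes_cycle(existing_edges, new_edge):
    """Return the cycle created by new_edge, or None if there is none.

    Adding held -> acquired creates a cycle exactly when the existing
    graph already contains a path from acquired back to held.
    """
    held, acquired = new_edge
    path = find_path(existing_edges, acquired, held)
    return None if path is None else [held] + path


if __name__ == "__main__":
    cycle = closes_cycle(EXISTING_EDGES, NEW_EDGE)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Dropping any one of the three existing edges makes closes_cycle return None, which reflects why all four chain entries are needed for the warning to fire.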

Crashes (4):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2024/12/04 16:52 | upstream | feffde684ac2 | b50eb251 | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-kasan-badwrites-root | possible deadlock in sd_remove
2024/12/03 07:10 | upstream | cdd30ebb1b9f | 578925bc | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-kasan-badwrites-root | possible deadlock in sd_remove
2024/11/03 16:29 | linux-next | c88416ba074a | f00eed24 | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-linux-next-kasan-gce-root | possible deadlock in sd_remove
2024/11/02 02:22 | linux-next | c88416ba074a | f00eed24 | .config | console log | report | | | info | [disk image] [vmlinux] [kernel image] | ci-upstream-linux-next-kasan-gce-root | possible deadlock in sd_remove
* Struck through repros no longer work on HEAD.