syzbot


possible deadlock in reada_start_machine_worker

Status: upstream: reported on 2023/02/03 05:58
Subsystems: btrfs
[Documentation on labels]
Reported-by: syzbot+edbe3db84d16338e2271@syzkaller.appspotmail.com
First crash: 505d, last: 505d

Sample crash report:
BTRFS info (device loop0): using free space tree
BTRFS info (device loop0): has skinny extents
======================================================
WARNING: possible circular locking dependency detected
4.14.304-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u4:5/8933 is trying to acquire lock:
 (&fs_devs->device_list_mutex){+.+.}, at: [<ffffffff82b1f812>] __reada_start_machine fs/btrfs/reada.c:765 [inline]
 (&fs_devs->device_list_mutex){+.+.}, at: [<ffffffff82b1f812>] reada_start_machine_worker+0x1d2/0xa90 fs/btrfs/reada.c:746

but task is already holding lock:
 ((&work->normal_work)){+.+.}, at: [<ffffffff81366016>] process_one_work+0x6e6/0x14a0 kernel/workqueue.c:2092

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 ((&work->normal_work)){+.+.}:
       process_one_work+0x736/0x14a0 kernel/workqueue.c:2093
       worker_thread+0x5cc/0xff0 kernel/workqueue.c:2251
       kthread+0x30d/0x420 kernel/kthread.c:232
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:406

-> #2 ("%s-%s""btrfs", name){+.+.}:
       flush_workqueue+0xfa/0x1310 kernel/workqueue.c:2625
       drain_workqueue+0x177/0x3e0 kernel/workqueue.c:2790
       destroy_workqueue+0x71/0x710 kernel/workqueue.c:4116
       __btrfs_destroy_workqueue fs/btrfs/async-thread.c:436 [inline]
       btrfs_destroy_workqueue+0xf8/0x630 fs/btrfs/async-thread.c:447
       scrub_workers_put+0x90/0x1a0 fs/btrfs/scrub.c:4075
       btrfs_scrub_dev+0x536/0xcd0 fs/btrfs/scrub.c:4219
       btrfs_ioctl_scrub fs/btrfs/ioctl.c:4451 [inline]
       btrfs_ioctl+0xba8/0x5b20 fs/btrfs/ioctl.c:5681
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:500 [inline]
       do_vfs_ioctl+0x75a/0xff0 fs/ioctl.c:684
       SYSC_ioctl fs/ioctl.c:701 [inline]
       SyS_ioctl+0x7f/0xb0 fs/ioctl.c:692
       do_syscall_64+0x1d5/0x640 arch/x86/entry/common.c:292
       entry_SYSCALL_64_after_hwframe+0x5e/0xd3

-> #1 (&fs_info->scrub_lock){+.+.}:
       __mutex_lock_common kernel/locking/mutex.c:756 [inline]
       __mutex_lock+0xc4/0x1310 kernel/locking/mutex.c:893
       btrfs_scrub_dev+0x1f3/0xcd0 fs/btrfs/scrub.c:4150
       btrfs_ioctl_scrub fs/btrfs/ioctl.c:4451 [inline]
       btrfs_ioctl+0xba8/0x5b20 fs/btrfs/ioctl.c:5681
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:500 [inline]
       do_vfs_ioctl+0x75a/0xff0 fs/ioctl.c:684
       SYSC_ioctl fs/ioctl.c:701 [inline]
       SyS_ioctl+0x7f/0xb0 fs/ioctl.c:692
       do_syscall_64+0x1d5/0x640 arch/x86/entry/common.c:292
       entry_SYSCALL_64_after_hwframe+0x5e/0xd3

-> #0 (&fs_devs->device_list_mutex){+.+.}:
       lock_acquire+0x170/0x3f0 kernel/locking/lockdep.c:3998
       __mutex_lock_common kernel/locking/mutex.c:756 [inline]
       __mutex_lock+0xc4/0x1310 kernel/locking/mutex.c:893
       __reada_start_machine fs/btrfs/reada.c:765 [inline]
       reada_start_machine_worker+0x1d2/0xa90 fs/btrfs/reada.c:746
       normal_work_helper+0x304/0x1330 fs/btrfs/async-thread.c:376
       process_one_work+0x793/0x14a0 kernel/workqueue.c:2117
       worker_thread+0x5cc/0xff0 kernel/workqueue.c:2251
       kthread+0x30d/0x420 kernel/kthread.c:232
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:406

other info that might help us debug this:

Chain exists of:
  &fs_devs->device_list_mutex --> "%s-%s""btrfs", name --> (&work->normal_work)

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((&work->normal_work));
                               lock("%s-%s""btrfs", name);
                               lock((&work->normal_work));
  lock(&fs_devs->device_list_mutex);

 *** DEADLOCK ***

2 locks held by kworker/u4:5/8933:
 #0:  ("%s-%s""btrfs", name){+.+.}, at: [<ffffffff81365fe0>] process_one_work+0x6b0/0x14a0 kernel/workqueue.c:2088
 #1:  ((&work->normal_work)){+.+.}, at: [<ffffffff81366016>] process_one_work+0x6e6/0x14a0 kernel/workqueue.c:2092

stack backtrace:
CPU: 0 PID: 8933 Comm: kworker/u4:5 Not tainted 4.14.304-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023
Workqueue: btrfs-readahead btrfs_readahead_helper
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x1b2/0x281 lib/dump_stack.c:58
 print_circular_bug.constprop.0.cold+0x2d7/0x41e kernel/locking/lockdep.c:1258
 check_prev_add kernel/locking/lockdep.c:1905 [inline]
 check_prevs_add kernel/locking/lockdep.c:2022 [inline]
 validate_chain kernel/locking/lockdep.c:2464 [inline]
 __lock_acquire+0x2e0e/0x3f20 kernel/locking/lockdep.c:3491
 lock_acquire+0x170/0x3f0 kernel/locking/lockdep.c:3998
 __mutex_lock_common kernel/locking/mutex.c:756 [inline]
 __mutex_lock+0xc4/0x1310 kernel/locking/mutex.c:893
 __reada_start_machine fs/btrfs/reada.c:765 [inline]
 reada_start_machine_worker+0x1d2/0xa90 fs/btrfs/reada.c:746
 normal_work_helper+0x304/0x1330 fs/btrfs/async-thread.c:376
 process_one_work+0x793/0x14a0 kernel/workqueue.c:2117
 worker_thread+0x5cc/0xff0 kernel/workqueue.c:2251
 kthread+0x30d/0x420 kernel/kthread.c:232
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:406
BTRFS info (device loop0): using free space tree
BTRFS info (device loop0): has skinny extents
BTRFS info (device loop4): using free space tree
BTRFS info (device loop4): has skinny extents
audit: type=1804 audit(1675403865.048:41): pid=11955 uid=0 auid=4294967295 ses=4294967295 op="invalid_pcr" cause="ToMToU" comm="syz-executor.0" name="/root/syzkaller-testdir1667990053/syzkaller.IWvPMh/71/file0/.log" dev="loop4" ino=263 res=1
audit: type=1804 audit(1675403865.088:42): pid=11932 uid=0 auid=4294967295 ses=4294967295 op="invalid_pcr" cause="ToMToU" comm="syz-executor.1" name="/root/syzkaller-testdir764593847/syzkaller.2goUcs/85/file0/.log" dev="loop4" ino=263 res=1
BTRFS info (device loop4): using free space tree
BTRFS info (device loop4): has skinny extents
audit: type=1804 audit(1675403865.668:43): pid=12016 uid=0 auid=4294967295 ses=4294967295 op="invalid_pcr" cause="open_writers" comm="syz-executor.0" name="/root/syzkaller-testdir1667990053/syzkaller.IWvPMh/72/file0/.log" dev="loop4" ino=263 res=1
audit: type=1804 audit(1675403865.698:44): pid=12016 uid=0 auid=4294967295 ses=4294967295 op="invalid_pcr" cause="ToMToU" comm="syz-executor.0" name="/root/syzkaller-testdir1667990053/syzkaller.IWvPMh/72/file0/.log" dev="loop4" ino=263 res=1
audit: type=1804 audit(1675403865.738:45): pid=12029 uid=0 auid=4294967295 ses=4294967295 op="invalid_pcr" cause="ToMToU" comm="syz-executor.1" name="/root/syzkaller-testdir764593847/syzkaller.2goUcs/86/file0/.log" dev="loop4" ino=263 res=1
PF_BRIDGE: br_mdb_parse() with invalid ifindex
Alternate GPT is invalid, using primary GPT.
 loop2: p1 p2 p3
PF_BRIDGE: br_mdb_parse() with invalid ifindex
BTRFS info (device loop4): using free space tree
BTRFS info (device loop4): has skinny extents
loop_reread_partitions: partition scan of loop2 () failed (rc=-16)
print_req_error: I/O error, dev loop2, sector 58
print_req_error: I/O error, dev loop2, sector 108
print_req_error: I/O error, dev loop2, sector 58
print_req_error: I/O error, dev loop2, sector 108
Buffer I/O error on dev loop2p2, logical block 8, async page read
Buffer I/O error on dev loop2p1, logical block 8, async page read
print_req_error: I/O error, dev loop2, sector 1008
print_req_error: I/O error, dev loop2, sector 59
Buffer I/O error on dev loop2p2, logical block 9, async page read
print_req_error: I/O error, dev loop2, sector 109
Buffer I/O error on dev loop2p1, logical block 9, async page read
print_req_error: I/O error, dev loop2, sector 110
Buffer I/O error on dev loop2p1, logical block 10, async page read
print_req_error: I/O error, dev loop2, sector 111
Buffer I/O error on dev loop2p1, logical block 11, async page read
print_req_error: I/O error, dev loop2, sector 112
Buffer I/O error on dev loop2p1, logical block 12, async page read
Buffer I/O error on dev loop2p1, logical block 13, async page read
Buffer I/O error on dev loop2p1, logical block 14, async page read
Buffer I/O error on dev loop2p1, logical block 15, async page read
audit: type=1804 audit(1675403866.328:46): pid=12129 uid=0 auid=4294967295 ses=4294967295 op="invalid_pcr" cause="open_writers" comm="syz-executor.1" name="/root/syzkaller-testdir764593847/syzkaller.2goUcs/87/file0/.log" dev="loop4" ino=263 res=1
audit: type=1804 audit(1675403866.328:47): pid=12129 uid=0 auid=4294967295 ses=4294967295 op="invalid_pcr" cause="ToMToU" comm="syz-executor.1" name="/root/syzkaller-testdir764593847/syzkaller.2goUcs/87/file0/.log" dev="loop4" ino=263 res=1
PF_BRIDGE: br_mdb_parse() with invalid ifindex
PF_BRIDGE: br_mdb_parse() with invalid ifindex
Alternate GPT is invalid, using primary GPT.
Alternate GPT is invalid, using primary GPT.
 loop1: p1 p2 p3
 loop2: p1 p2 p3
Alternate GPT is invalid, using primary GPT.
Alternate GPT is invalid, using primary GPT.
 loop2: p1 p2 p3
 loop1: p1 p2 p3
Alternate GPT is invalid, using primary GPT.
 loop2: p1 p2 p3
Alternate GPT is invalid, using primary GPT.
 loop2: p1 p2 p3
Alternate GPT is invalid, using primary GPT.
 loop1: p1 p2 p3
Alternate GPT is invalid, using primary GPT.
 loop2: p1 p2 p3
IPVS: ftp: loaded support on port[0] = 21
sctp: [Deprecated]: syz-executor.5 (pid 12392) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
sctp: [Deprecated]: syz-executor.5 (pid 12468) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
XFS (loop0): Mounting V4 Filesystem
XFS (loop0): Ending clean mount
XFS (loop0): Quotacheck needed: Please wait.
XFS (loop0): Quotacheck: Done.
sctp: [Deprecated]: syz-executor.4 (pid 12494) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
syz-executor.0 (12449) used greatest stack depth: 24008 bytes left
XFS (loop0): Unmounting Filesystem
sctp: [Deprecated]: syz-executor.5 (pid 12525) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
sctp: [Deprecated]: syz-executor.4 (pid 12538) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
XFS (loop1): Mounting V4 Filesystem
XFS (loop1): Ending clean mount
XFS (loop1): Quotacheck needed: Please wait.
XFS (loop1): Quotacheck: Done.
XFS (loop0): Mounting V4 Filesystem
XFS (loop0): Ending clean mount
XFS (loop0): Quotacheck needed: Please wait.
XFS (loop0): Quotacheck: Done.
XFS (loop1): Unmounting Filesystem
XFS (loop0): Unmounting Filesystem
sctp: [Deprecated]: syz-executor.5 (pid 12573) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
sctp: [Deprecated]: syz-executor.4 (pid 12589) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
sctp: [Deprecated]: syz-executor.2 (pid 12592) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
XFS (loop0): Mounting V4 Filesystem
XFS (loop0): Ending clean mount
XFS (loop0): Quotacheck needed: Please wait.
XFS (loop1): Mounting V4 Filesystem
XFS (loop0): Quotacheck: Done.
XFS (loop1): Ending clean mount
XFS (loop1): Quotacheck needed: Please wait.
XFS (loop1): Quotacheck: Done.
XFS (loop0): Unmounting Filesystem
XFS (loop1): Unmounting Filesystem
sctp: [Deprecated]: syz-executor.2 (pid 12677) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
XFS (loop0): Mounting V4 Filesystem
XFS (loop0): Ending clean mount
XFS (loop0): Quotacheck needed: Please wait.
XFS (loop0): Quotacheck: Done.
XFS (loop1): Mounting V4 Filesystem
XFS (loop1): Ending clean mount
XFS (loop1): Quotacheck needed: Please wait.
XFS (loop1): Quotacheck: Done.
XFS (loop0): Unmounting Filesystem
XFS (loop1): Unmounting Filesystem
bridge0: port 3(ipvlan2) entered blocking state
bridge0: port 3(ipvlan2) entered disabled state
sctp: [Deprecated]: syz-executor.2 (pid 12777) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
bridge0: port 3(ipvlan2) entered blocking state
bridge0: port 3(ipvlan2) entered disabled state
bridge0: port 3(ipvlan2) entered blocking state
bridge0: port 3(ipvlan2) entered disabled state

Crashes (1):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets (help?) Manager Title
2023/02/03 05:57 linux-4.14.y 3949d1610004 33fc5c09 .config console log report info [disk image] [vmlinux] [kernel image] ci2-linux-4-14 possible deadlock in reada_start_machine_worker
* Struck through repros no longer work on HEAD.