syzbot


possible deadlock in gfs2_trans_begin

Status: upstream: reported on 2024/09/24 09:55
Subsystems: gfs2
Reported-by: syzbot+5baab0d4d584f7b68982@syzkaller.appspotmail.com
First crash: 14d, last: 13d
Discussions (2)
Title | Replies (including bot) | Last reply
[syzbot] Monthly gfs2 report (Oct 2024) | 0 (1) | 2024/10/03 09:03
[syzbot] [gfs2?] possible deadlock in gfs2_trans_begin | 0 (1) | 2024/09/24 09:55

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
6.11.0-syzkaller-07462-g1868f9d0260e #0 Not tainted
------------------------------------------------------
kswapd0/78 is trying to acquire lock:
ffff88801fe9a610 (sb_internal#2){.+.+}-{0:0}, at: gfs2_trans_begin+0x71/0xe0 fs/gfs2/trans.c:118

but task is already holding lock:
ffffffff8ea369a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6821 [inline]
ffffffff8ea369a0 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xbf1/0x3720 mm/vmscan.c:7203

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (fs_reclaim){+.+.}-{0:0}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5822
       __fs_reclaim_acquire mm/page_alloc.c:3825 [inline]
       fs_reclaim_acquire+0x88/0x140 mm/page_alloc.c:3839
       might_alloc include/linux/sched/mm.h:334 [inline]
       prepare_alloc_pages+0x147/0x5d0 mm/page_alloc.c:4473
       __alloc_pages_noprof+0x166/0x6c0 mm/page_alloc.c:4691
       alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2263
       alloc_pages_noprof mm/mempolicy.c:2343 [inline]
       folio_alloc_noprof+0x128/0x180 mm/mempolicy.c:2350
       filemap_alloc_folio_noprof+0xdf/0x500 mm/filemap.c:1010
       __filemap_get_folio+0x446/0xbd0 mm/filemap.c:1952
       filemap_grab_folio include/linux/pagemap.h:806 [inline]
       gfs2_unstuff_dinode+0xfb/0x15e0 fs/gfs2/bmap.c:162
       fallocate_chunk fs/gfs2/file.c:1190 [inline]
       __gfs2_fallocate+0xf4e/0x1e00 fs/gfs2/file.c:1337
       gfs2_fallocate+0x35c/0x490 fs/gfs2/file.c:1401
       vfs_fallocate+0x569/0x6e0 fs/open.c:333
       ksys_fallocate fs/open.c:356 [inline]
       __do_sys_fallocate fs/open.c:364 [inline]
       __se_sys_fallocate fs/open.c:362 [inline]
       __x64_sys_fallocate+0xbd/0x110 fs/open.c:362
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #2 (&ip->i_rw_mutex){++++}-{3:3}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5822
       down_write+0x99/0x220 kernel/locking/rwsem.c:1579
       gfs2_unstuff_dinode+0xa0/0x15e0 fs/gfs2/bmap.c:161
       fallocate_chunk fs/gfs2/file.c:1190 [inline]
       __gfs2_fallocate+0xf4e/0x1e00 fs/gfs2/file.c:1337
       gfs2_fallocate+0x35c/0x490 fs/gfs2/file.c:1401
       vfs_fallocate+0x569/0x6e0 fs/open.c:333
       ksys_fallocate fs/open.c:356 [inline]
       __do_sys_fallocate fs/open.c:364 [inline]
       __se_sys_fallocate fs/open.c:362 [inline]
       __x64_sys_fallocate+0xbd/0x110 fs/open.c:362
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #1 (&sdp->sd_log_flush_lock){++++}-{3:3}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5822
       down_read+0xb1/0xa40 kernel/locking/rwsem.c:1526
       __gfs2_trans_begin+0x55d/0x950 fs/gfs2/trans.c:87
       gfs2_trans_begin+0x71/0xe0 fs/gfs2/trans.c:118
       alloc_dinode+0x2ef/0x5e0 fs/gfs2/inode.c:418
       gfs2_create_inode+0xf39/0x1b30 fs/gfs2/inode.c:739
       gfs2_atomic_open+0xe5/0x230 fs/gfs2/inode.c:1315
       atomic_open fs/namei.c:3455 [inline]
       lookup_open fs/namei.c:3566 [inline]
       open_last_lookups fs/namei.c:3694 [inline]
       path_openat+0x101b/0x3590 fs/namei.c:3930
       do_filp_open+0x235/0x490 fs/namei.c:3960
       do_sys_openat2+0x13e/0x1d0 fs/open.c:1415
       do_sys_open fs/open.c:1430 [inline]
       __do_sys_creat fs/open.c:1506 [inline]
       __se_sys_creat fs/open.c:1500 [inline]
       __x64_sys_creat+0x123/0x170 fs/open.c:1500
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (sb_internal#2){.+.+}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3158 [inline]
       check_prevs_add kernel/locking/lockdep.c:3277 [inline]
       validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3901
       __lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5199
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5822
       percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
       __sb_start_write include/linux/fs.h:1715 [inline]
       sb_start_intwrite include/linux/fs.h:1898 [inline]
       __gfs2_trans_begin+0x471/0x950 fs/gfs2/trans.c:76
       gfs2_trans_begin+0x71/0xe0 fs/gfs2/trans.c:118
       gfs2_dirty_inode+0x3e0/0x6b0 fs/gfs2/super.c:520
       __mark_inode_dirty+0x2ee/0xe90 fs/fs-writeback.c:2493
       mark_inode_dirty_sync include/linux/fs.h:2478 [inline]
       iput+0x1f1/0xa50 fs/inode.c:1906
       __dentry_kill+0x20d/0x630 fs/dcache.c:615
       shrink_kill+0xa9/0x2c0 fs/dcache.c:1060
       shrink_dentry_list+0x2c0/0x5b0 fs/dcache.c:1087
       prune_dcache_sb+0x10f/0x180 fs/dcache.c:1168
       super_cache_scan+0x34f/0x4b0 fs/super.c:221
       do_shrink_slab+0x701/0x1160 mm/shrinker.c:435
       shrink_slab+0x1093/0x14d0 mm/shrinker.c:662
       shrink_one+0x43b/0x850 mm/vmscan.c:4795
       shrink_many mm/vmscan.c:4856 [inline]
       lru_gen_shrink_node mm/vmscan.c:4934 [inline]
       shrink_node+0x3799/0x3de0 mm/vmscan.c:5914
       kswapd_shrink_node mm/vmscan.c:6742 [inline]
       balance_pgdat mm/vmscan.c:6934 [inline]
       kswapd+0x1cbc/0x3720 mm/vmscan.c:7203
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Chain exists of:
  sb_internal#2 --> &ip->i_rw_mutex --> fs_reclaim

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&ip->i_rw_mutex);
                               lock(fs_reclaim);
  rlock(sb_internal#2);

 *** DEADLOCK ***

2 locks held by kswapd0/78:
 #0: ffffffff8ea369a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6821 [inline]
 #0: ffffffff8ea369a0 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xbf1/0x3720 mm/vmscan.c:7203
 #1: ffff88801fe9a0e0 (&type->s_umount_key#47){.+.+}-{3:3}, at: super_trylock_shared fs/super.c:562 [inline]
 #1: ffff88801fe9a0e0 (&type->s_umount_key#47){.+.+}-{3:3}, at: super_cache_scan+0x94/0x4b0 fs/super.c:196

stack backtrace:
CPU: 0 UID: 0 PID: 78 Comm: kswapd0 Not tainted 6.11.0-syzkaller-07462-g1868f9d0260e #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:93 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
 print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2203
 check_prev_add kernel/locking/lockdep.c:3158 [inline]
 check_prevs_add kernel/locking/lockdep.c:3277 [inline]
 validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3901
 __lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5199
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5822
 percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
 __sb_start_write include/linux/fs.h:1715 [inline]
 sb_start_intwrite include/linux/fs.h:1898 [inline]
 __gfs2_trans_begin+0x471/0x950 fs/gfs2/trans.c:76
 gfs2_trans_begin+0x71/0xe0 fs/gfs2/trans.c:118
 gfs2_dirty_inode+0x3e0/0x6b0 fs/gfs2/super.c:520
 __mark_inode_dirty+0x2ee/0xe90 fs/fs-writeback.c:2493
 mark_inode_dirty_sync include/linux/fs.h:2478 [inline]
 iput+0x1f1/0xa50 fs/inode.c:1906
 __dentry_kill+0x20d/0x630 fs/dcache.c:615
 shrink_kill+0xa9/0x2c0 fs/dcache.c:1060
 shrink_dentry_list+0x2c0/0x5b0 fs/dcache.c:1087
 prune_dcache_sb+0x10f/0x180 fs/dcache.c:1168
 super_cache_scan+0x34f/0x4b0 fs/super.c:221
 do_shrink_slab+0x701/0x1160 mm/shrinker.c:435
 shrink_slab+0x1093/0x14d0 mm/shrinker.c:662
 shrink_one+0x43b/0x850 mm/vmscan.c:4795
 shrink_many mm/vmscan.c:4856 [inline]
 lru_gen_shrink_node mm/vmscan.c:4934 [inline]
 shrink_node+0x3799/0x3de0 mm/vmscan.c:5914
 kswapd_shrink_node mm/vmscan.c:6742 [inline]
 balance_pgdat mm/vmscan.c:6934 [inline]
 kswapd+0x1cbc/0x3720 mm/vmscan.c:7203
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
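
The dependency chain in the report above can be read as a directed graph: entries #1 to #3 are orderings lockdep had already recorded, and entry #0 is the new acquisition (kswapd holding fs_reclaim and then taking sb_internal) that would close a cycle. The following is a minimal sketch of that reasoning, not lockdep's actual implementation. The shortened lock names, the `EXISTING` table, and the helpers `find_path` and `would_deadlock` are illustrative assumptions transcribed by hand from the report.

```python
from typing import Dict, List, Optional

# Edges read "lock held -> lock acquired next", transcribed by hand from
# entries #1..#3 of the report. The names are shortened for illustration.
EXISTING: Dict[str, List[str]] = {
    "sb_internal": ["sd_log_flush_lock"],   # #1: __gfs2_trans_begin
    "sd_log_flush_lock": ["i_rw_mutex"],    # #2: gfs2_unstuff_dinode
    "i_rw_mutex": ["fs_reclaim"],           # #3: folio allocation under the mutex
}


def find_path(graph: Dict[str, List[str]], src: str, dst: str) -> Optional[List[str]]:
    """Depth-first search for a path src -> ... -> dst, or None if there is none."""
    stack = [(src, [src])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


def would_deadlock(graph: Dict[str, List[str]], held: str, wanted: str) -> Optional[List[str]]:
    """A new edge held -> wanted closes a cycle if wanted already reaches held."""
    return find_path(graph, wanted, held)


# Entry #0: kswapd holds fs_reclaim and tries to take sb_internal.
cycle = would_deadlock(EXISTING, held="fs_reclaim", wanted="sb_internal")
if cycle is not None:
    print("existing chain: " + " --> ".join(cycle))
```

The path found runs from sb_internal to fs_reclaim, which is the chain the report summarizes under "Chain exists of" (the report's summary omits the intermediate sd_log_flush_lock step). The reverse query, holding sb_internal and wanting fs_reclaim, finds no path in this table, which matches the fact that this ordering was accepted earlier without a warning.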

Crashes (4):
Time | Kernel | Commit | Syzkaller | Config | Log | Report | Syz repro | C repro | VM info | Assets | Manager | Title
2024/09/21 13:43 | upstream | 1868f9d0260e | 6f888b75 | .config | console log | report | | | | [disk image (non-bootable)] [vmlinux] [kernel image] | ci-snapshot-upstream-root | possible deadlock in gfs2_trans_begin
2024/09/20 23:37 | upstream | baeb9a7d8b60 | 6f888b75 | .config | console log | report | | | | [disk image (non-bootable)] [vmlinux] [kernel image] | ci-snapshot-upstream-root | possible deadlock in gfs2_trans_begin
2024/09/20 11:32 | upstream | 2004cef11ea0 | 6f888b75 | .config | console log | report | | | | [disk image (non-bootable)] [vmlinux] [kernel image] | ci-snapshot-upstream-root | possible deadlock in gfs2_trans_begin
2024/09/20 09:52 | upstream | 2004cef11ea0 | 6f888b75 | .config | console log | report | | | | [disk image (non-bootable)] [vmlinux] [kernel image] | ci-snapshot-upstream-root | possible deadlock in gfs2_trans_begin
* Struck through repros no longer work on HEAD.