bisecting fixing commit since 2d16cf4817bc6944a2adb5bf4db607c8258e87da
building syzkaller on f3ba1b5b7b3b1e8e178f239c514ba0c2cb50f214
testing commit 2d16cf4817bc6944a2adb5bf4db607c8258e87da with gcc (GCC) 8.4.1 20210217
kernel signature: dcef52b9052584cfc85a7674aca4c4153ffaf345d29a7497928eb55931c75e2a
run #0: crashed: possible deadlock in ext4_file_write_iter
run #1: crashed: possible deadlock in ext4_file_write_iter
run #2: crashed: possible deadlock in __generic_file_fsync
run #3: OK
run #4: OK
run #5: OK
run #6: OK
run #7: OK
run #8: OK
run #9: OK
run #10: OK
run #11: OK
run #12: OK
run #13: OK
run #14: OK
run #15: OK
run #16: OK
run #17: OK
run #18: OK
run #19: OK
reproducer seems to be flaky
testing current HEAD eb575cd5d7f60241d016fdd13a9e86d962093c9b
testing commit eb575cd5d7f60241d016fdd13a9e86d962093c9b with gcc (GCC) 8.4.1 20210217
kernel signature: eb44e970c59e0d267e0d98e8b7c9d4e484e742830898061c4b1243a19ef9e6d1
run #0: crashed: possible deadlock in ext4_file_write_iter
run #1: crashed: possible deadlock in __generic_file_fsync
run #2: OK
run #3: OK
run #4: OK
run #5: OK
run #6: OK
run #7: OK
run #8: OK
run #9: OK
run #10: OK
run #11: OK
run #12: OK
run #13: OK
run #14: OK
run #15: OK
run #16: OK
run #17: OK
run #18: OK
run #19: OK
Reproducer flagged being flaky
revisions tested: 2, total time: 44m28.778592349s (build: 20m28.596994657s, test: 23m28.993371169s)
the crash still happens on HEAD
commit msg: Linux 4.19.195
crash: possible deadlock in __generic_file_fsync
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3d) already exists on: batadv_slave_0
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3d) already exists on: batadv_slave_0
======================================================
WARNING: possible circular locking dependency detected
4.19.195-syzkaller #0 Not tainted
------------------------------------------------------
kworker/0:3/7074 is trying to acquire lock:
0000000010348f6a (&sb->s_type->i_mutex_key#13){+.+.}, at: inode_lock include/linux/fs.h:748 [inline]
0000000010348f6a (&sb->s_type->i_mutex_key#13){+.+.}, at: __generic_file_fsync+0x8a/0x1a0 fs/libfs.c:989
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!

but task is already holding lock:
00000000d50a2c4a ((work_completion)(&dio->complete_work)){+.+.}, at: process_one_work+0x71b/0x15a0 kernel/workqueue.c:2128

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 ((work_completion)(&dio->complete_work)){+.+.}:
       process_one_work+0x76c/0x15a0 kernel/workqueue.c:2129
       worker_thread+0x85/0xb60 kernel/workqueue.c:2296
       kthread+0x347/0x410 kernel/kthread.c:259
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

-> #1 ((wq_completion)"dio/%s"sb->s_id){+.+.}:
       flush_workqueue+0xf2/0x1350 kernel/workqueue.c:2661
       drain_workqueue+0x148/0x3a0 kernel/workqueue.c:2826
       destroy_workqueue+0x68/0x5d0 kernel/workqueue.c:4177
       sb_init_dio_done_wq+0x65/0x80 fs/direct-io.c:634
       do_blockdev_direct_IO fs/direct-io.c:1285 [inline]
       __blockdev_direct_IO+0x5bc/0xc5f0 fs/direct-io.c:1419
       ext4_direct_IO_write fs/ext4/inode.c:3777 [inline]
       ext4_direct_IO+0x87c/0x17d0 fs/ext4/inode.c:3915
       generic_file_direct_write+0x1ee/0x410 mm/filemap.c:3073
       __generic_file_write_iter+0x279/0x590 mm/filemap.c:3252
       ext4_file_write_iter+0x281/0xe50 fs/ext4/file.c:272
       call_write_iter include/linux/fs.h:1821 [inline]
       aio_write+0x2e4/0x560 fs/aio.c:1574
       __io_submit_one fs/aio.c:1858 [inline]
       io_submit_one+0x791/0x1ce0 fs/aio.c:1909
       __do_sys_io_submit fs/aio.c:1953 [inline]
       __se_sys_io_submit+0x10b/0x360 fs/aio.c:1924
       __x64_sys_io_submit+0x6e/0xb0 fs/aio.c:1924
       do_syscall_64+0xd0/0x4e0 arch/x86/entry/common.c:293
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&sb->s_type->i_mutex_key#13){+.+.}:
       lock_acquire+0x180/0x3a0 kernel/locking/lockdep.c:3908
       down_write+0x38/0x90 kernel/locking/rwsem.c:70
       inode_lock include/linux/fs.h:748 [inline]
       __generic_file_fsync+0x8a/0x1a0 fs/libfs.c:989
       ext4_sync_file+0x729/0xf40 fs/ext4/fsync.c:118
       vfs_fsync_range+0xee/0x220 fs/sync.c:197
       generic_write_sync include/linux/fs.h:2750 [inline]
       dio_complete+0x55b/0x970 fs/direct-io.c:329
       dio_aio_complete_work+0x17/0x20 fs/direct-io.c:341
       process_one_work+0x7b9/0x15a0 kernel/workqueue.c:2153
       worker_thread+0x85/0xb60 kernel/workqueue.c:2296
       kthread+0x347/0x410 kernel/kthread.c:259
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

other info that might help us debug this:

Chain exists of:
  &sb->s_type->i_mutex_key#13 --> (wq_completion)"dio/%s"sb->s_id --> (work_completion)(&dio->complete_work)

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&dio->complete_work));
                               lock((wq_completion)"dio/%s"sb->s_id);
                               lock((work_completion)(&dio->complete_work));
  lock(&sb->s_type->i_mutex_key#13);

 *** DEADLOCK ***

2 locks held by kworker/0:3/7074:
 #0: 000000001dade04f ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: process_one_work+0x6e8/0x15a0 kernel/workqueue.c:2124
 #1: 00000000d50a2c4a ((work_completion)(&dio->complete_work)){+.+.}, at: process_one_work+0x71b/0x15a0 kernel/workqueue.c:2128

stack backtrace:
CPU: 0 PID: 7074 Comm: kworker/0:3 Not tainted 4.19.195-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: dio/sda1 dio_aio_complete_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x17c/0x226 lib/dump_stack.c:118
 print_circular_bug.isra.17.cold.34+0x2e3/0x41e kernel/locking/lockdep.c:1222
 check_prev_add kernel/locking/lockdep.c:1866 [inline]
 check_prevs_add kernel/locking/lockdep.c:1979 [inline]
 validate_chain kernel/locking/lockdep.c:2420 [inline]
 __lock_acquire+0x35c2/0x47c0 kernel/locking/lockdep.c:3416
 lock_acquire+0x180/0x3a0 kernel/locking/lockdep.c:3908
 down_write+0x38/0x90 kernel/locking/rwsem.c:70
 inode_lock include/linux/fs.h:748 [inline]
 __generic_file_fsync+0x8a/0x1a0 fs/libfs.c:989
 ext4_sync_file+0x729/0xf40 fs/ext4/fsync.c:118
 vfs_fsync_range+0xee/0x220 fs/sync.c:197
 generic_write_sync include/linux/fs.h:2750 [inline]
 dio_complete+0x55b/0x970 fs/direct-io.c:329
 dio_aio_complete_work+0x17/0x20 fs/direct-io.c:341
 process_one_work+0x7b9/0x15a0 kernel/workqueue.c:2153
 worker_thread+0x85/0xb60 kernel/workqueue.c:2296
 kthread+0x347/0x410 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415
batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3d) already exists on: batadv_slave_0
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3d) already exists on: batadv_slave_0
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
IPv6: ADDRCONF(NETDEV_UP): batadv_slave_0: link is not ready
batman_adv: batadv0: Interface activated: batadv_slave_0
batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3e) already exists on: batadv_slave_1
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3e) already exists on: batadv_slave_1
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3e) already exists on: batadv_slave_1
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3e) already exists on: batadv_slave_1
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3e) already exists on: batadv_slave_1
batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
IPv6: ADDRCONF(NETDEV_UP): batadv_slave_1: link is not ready
batman_adv: batadv0: Interface activated: batadv_slave_1
IPv6: ADDRCONF(NETDEV_CHANGE): macsec0: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): batadv_slave_0: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth0_to_batadv: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): batadv_slave_1: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth1_to_batadv: link becomes ready
print_req_error: I/O error, dev loop0, sector 0
print_req_error: I/O error, dev loop0, sector 0
Buffer I/O error on dev loop0, logical block 0, async page read
print_req_error: I/O error, dev loop0, sector 0
Buffer I/O error on dev loop0, logical block 0, async page read
print_req_error: I/O error, dev loop0, sector 0
print_req_error: I/O error, dev loop0, sector 0
Buffer I/O error on dev loop0, logical block 0, async page read