syzbot


possible deadlock in __mod_timer (2)

Status: fixed on 2023/07/01 16:05
Subsystems: kernel
[Documentation on labels]
Reported-by: syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com
Fix commit: eb799279fb1f debugobjects: Don't wake up kswapd from fill_pool()
First crash: 353d, last: 353d
Duplicate bugs (1)
duplicates (1):
Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
possible deadlock in hrtimer_start_range_ns kernel 1 348d 344d 0/26 closed as dup on 2023/05/20 22:31
Discussions (3)
Title Replies (including bot) Last reply
Re: [PATCH v2] mm/page_alloc: don't wake kswapd from rmqueue() unless __GFP_KSWAPD_RECLAIM is specified 3 (3) 2023/05/22 13:57
[PATCH] debugobject: don't wake up kswapd from fill_pool() 11 (11) 2023/05/22 12:56
[syzbot] [kernel?] possible deadlock in __mod_timer (2) 0 (1) 2023/05/10 22:10
Similar bugs (2)
Kernel Title Repro Cause bisect Fix bisect Count Last Reported Patched Status
upstream possible deadlock in __mod_timer kernel 35 504d 577d 0/26 auto-obsoleted due to no activity on 2023/04/09 11:04
linux-5.15 possible deadlock in __mod_timer 4 3h42m 18d 0/3 upstream: reported on 2024/04/08 15:59

Sample crash report:
======================================================
WARNING: possible circular locking dependency detected
6.4.0-rc1-syzkaller-00011-g1dc3731daf1f #0 Not tainted
------------------------------------------------------
kworker/u16:0/11 is trying to acquire lock:
ffff88807ffdaba0 (&pgdat->kswapd_wait){....}-{2:2}, at: __wake_up_common_lock+0xb8/0x140 kernel/sched/wait.c:137

but task is already holding lock:
ffff88802c7296d8 (&base->lock){-.-.}-{2:2}, at: __mod_timer+0x69c/0xe80 kernel/time/timer.c:1112

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (&base->lock){-.-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
       lock_timer_base+0x5a/0x1f0 kernel/time/timer.c:999
       __mod_timer+0x3f9/0xe80 kernel/time/timer.c:1080
       add_timer+0x62/0x90 kernel/time/timer.c:1244
       __queue_delayed_work+0x1a7/0x270 kernel/workqueue.c:1685
       queue_delayed_work_on+0x109/0x120 kernel/workqueue.c:1710
       psi_task_change+0x1bf/0x300 kernel/sched/psi.c:920
       psi_enqueue kernel/sched/stats.h:139 [inline]
       enqueue_task kernel/sched/core.c:2078 [inline]
       activate_task kernel/sched/core.c:2112 [inline]
       wake_up_new_task+0xc13/0x1000 kernel/sched/core.c:4833
       kernel_clone+0x219/0x890 kernel/fork.c:2949
       user_mode_thread+0xb1/0xf0 kernel/fork.c:2996
       rest_init+0x27/0x2b0 init/main.c:700
       arch_call_rest_init+0x13/0x30 init/main.c:834
       start_kernel+0x3b6/0x490 init/main.c:1088
       x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
       x86_64_start_kernel+0xb3/0xc0 arch/x86/kernel/head64.c:537
       secondary_startup_64_no_verify+0xf4/0xfb

-> #2 (&rq->__lock){-.-.}-{2:2}:
       _raw_spin_lock_nested+0x34/0x40 kernel/locking/spinlock.c:378
       raw_spin_rq_lock_nested+0x2f/0x120 kernel/sched/core.c:558
       raw_spin_rq_lock kernel/sched/sched.h:1366 [inline]
       rq_lock kernel/sched/sched.h:1653 [inline]
       task_fork_fair+0x74/0x4f0 kernel/sched/fair.c:12095
       sched_cgroup_fork+0x3d1/0x540 kernel/sched/core.c:4777
       copy_process+0x4b8a/0x7600 kernel/fork.c:2618
       kernel_clone+0xeb/0x890 kernel/fork.c:2918
       user_mode_thread+0xb1/0xf0 kernel/fork.c:2996
       rest_init+0x27/0x2b0 init/main.c:700
       arch_call_rest_init+0x13/0x30 init/main.c:834
       start_kernel+0x3b6/0x490 init/main.c:1088
       x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
       x86_64_start_kernel+0xb3/0xc0 arch/x86/kernel/head64.c:537
       secondary_startup_64_no_verify+0xf4/0xfb

-> #1 (&p->pi_lock){-.-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
       try_to_wake_up+0xab/0x1c40 kernel/sched/core.c:4191
       autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:419
       __wake_up_common+0x147/0x650 kernel/sched/wait.c:107
       __wake_up_common_lock+0xd4/0x140 kernel/sched/wait.c:138
       wakeup_kswapd+0x3fe/0x5c0 mm/vmscan.c:7797
       rmqueue mm/page_alloc.c:3057 [inline]
       get_page_from_freelist+0x6c5/0x2c00 mm/page_alloc.c:3499
       __alloc_pages+0x1cb/0x4a0 mm/page_alloc.c:4768
       __folio_alloc+0x16/0x40 mm/page_alloc.c:4800
       vma_alloc_folio+0x155/0x890 mm/mempolicy.c:2240
       wp_page_copy mm/memory.c:3070 [inline]
       do_wp_page+0x173d/0x33c0 mm/memory.c:3432
       handle_pte_fault mm/memory.c:4964 [inline]
       __handle_mm_fault+0x1635/0x41c0 mm/memory.c:5089
       handle_mm_fault+0x2af/0x9f0 mm/memory.c:5243
       do_user_addr_fault+0x2ca/0x1210 arch/x86/mm/fault.c:1349
       handle_page_fault arch/x86/mm/fault.c:1534 [inline]
       exc_page_fault+0x98/0x170 arch/x86/mm/fault.c:1590
       asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570

-> #0 (&pgdat->kswapd_wait){....}-{2:2}:
       check_prev_add kernel/locking/lockdep.c:3108 [inline]
       check_prevs_add kernel/locking/lockdep.c:3227 [inline]
       validate_chain kernel/locking/lockdep.c:3842 [inline]
       __lock_acquire+0x2f21/0x5df0 kernel/locking/lockdep.c:5074
       lock_acquire kernel/locking/lockdep.c:5691 [inline]
       lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
       __wake_up_common_lock+0xb8/0x140 kernel/sched/wait.c:137
       wakeup_kswapd+0x3fe/0x5c0 mm/vmscan.c:7797
       wake_all_kswapds+0x182/0x2d0 mm/page_alloc.c:4028
       __alloc_pages_slowpath.constprop.0+0x1724/0x2170 mm/page_alloc.c:4296
       __alloc_pages+0x408/0x4a0 mm/page_alloc.c:4781
       alloc_pages+0x1aa/0x270 mm/mempolicy.c:2279
       alloc_slab_page mm/slub.c:1851 [inline]
       allocate_slab+0x25f/0x390 mm/slub.c:1998
       new_slab mm/slub.c:2051 [inline]
       ___slab_alloc+0xa91/0x1400 mm/slub.c:3192
       __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3291
       __slab_alloc_node mm/slub.c:3344 [inline]
       slab_alloc_node mm/slub.c:3441 [inline]
       slab_alloc mm/slub.c:3459 [inline]
       __kmem_cache_alloc_lru mm/slub.c:3466 [inline]
       kmem_cache_alloc+0x38e/0x3b0 mm/slub.c:3475
       kmem_cache_zalloc include/linux/slab.h:670 [inline]
       fill_pool+0x264/0x5c0 lib/debugobjects.c:168
       debug_objects_fill_pool lib/debugobjects.c:597 [inline]
       debug_object_activate+0xfd/0x400 lib/debugobjects.c:693
       debug_timer_activate kernel/time/timer.c:782 [inline]
       __mod_timer+0x80d/0xe80 kernel/time/timer.c:1119
       add_timer+0x62/0x90 kernel/time/timer.c:1244
       __queue_delayed_work+0x1a7/0x270 kernel/workqueue.c:1685
       queue_delayed_work_on+0x109/0x120 kernel/workqueue.c:1710
       process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405
       worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
       kthread+0x344/0x440 kernel/kthread.c:379
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

other info that might help us debug this:

Chain exists of:
  &pgdat->kswapd_wait --> &rq->__lock --> &base->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&base->lock);
                               lock(&rq->__lock);
                               lock(&base->lock);
  lock(&pgdat->kswapd_wait);

 *** DEADLOCK ***

3 locks held by kworker/u16:0/11:
 #0: ffff8880444ff138 ((wq_completion)bat_events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff8880444ff138 ((wq_completion)bat_events){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline]
 #0: ffff8880444ff138 ((wq_completion)bat_events){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1324 [inline]
 #0: ffff8880444ff138 ((wq_completion)bat_events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:643 [inline]
 #0: ffff8880444ff138 ((wq_completion)bat_events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:670 [inline]
 #0: ffff8880444ff138 ((wq_completion)bat_events){+.+.}-{0:0}, at: process_one_work+0x883/0x15e0 kernel/workqueue.c:2376
 #1: ffffc900003d7db0 ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at: process_one_work+0x8b7/0x15e0 kernel/workqueue.c:2380
 #2: ffff88802c7296d8 (&base->lock){-.-.}-{2:2}, at: __mod_timer+0x69c/0xe80 kernel/time/timer.c:1112

stack backtrace:
CPU: 1 PID: 11 Comm: kworker/u16:0 Not tainted 6.4.0-rc1-syzkaller-00011-g1dc3731daf1f #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Workqueue: bat_events batadv_nc_worker
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
 check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2188
 check_prev_add kernel/locking/lockdep.c:3108 [inline]
 check_prevs_add kernel/locking/lockdep.c:3227 [inline]
 validate_chain kernel/locking/lockdep.c:3842 [inline]
 __lock_acquire+0x2f21/0x5df0 kernel/locking/lockdep.c:5074
 lock_acquire kernel/locking/lockdep.c:5691 [inline]
 lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
 __wake_up_common_lock+0xb8/0x140 kernel/sched/wait.c:137
 wakeup_kswapd+0x3fe/0x5c0 mm/vmscan.c:7797
 wake_all_kswapds+0x182/0x2d0 mm/page_alloc.c:4028
 __alloc_pages_slowpath.constprop.0+0x1724/0x2170 mm/page_alloc.c:4296
 __alloc_pages+0x408/0x4a0 mm/page_alloc.c:4781
 alloc_pages+0x1aa/0x270 mm/mempolicy.c:2279
 alloc_slab_page mm/slub.c:1851 [inline]
 allocate_slab+0x25f/0x390 mm/slub.c:1998
 new_slab mm/slub.c:2051 [inline]
 ___slab_alloc+0xa91/0x1400 mm/slub.c:3192
 __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3291
 __slab_alloc_node mm/slub.c:3344 [inline]
 slab_alloc_node mm/slub.c:3441 [inline]
 slab_alloc mm/slub.c:3459 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3466 [inline]
 kmem_cache_alloc+0x38e/0x3b0 mm/slub.c:3475
 kmem_cache_zalloc include/linux/slab.h:670 [inline]
 fill_pool+0x264/0x5c0 lib/debugobjects.c:168
 debug_objects_fill_pool lib/debugobjects.c:597 [inline]
 debug_object_activate+0xfd/0x400 lib/debugobjects.c:693
 debug_timer_activate kernel/time/timer.c:782 [inline]
 __mod_timer+0x80d/0xe80 kernel/time/timer.c:1119
 add_timer+0x62/0x90 kernel/time/timer.c:1244
 __queue_delayed_work+0x1a7/0x270 kernel/workqueue.c:1685
 queue_delayed_work_on+0x109/0x120 kernel/workqueue.c:1710
 process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405
 worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
 kthread+0x344/0x440 kernel/kthread.c:379
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>

Crashes (1):
Time Kernel Commit Syzkaller Config Log Report Syz repro C repro VM info Assets (help?) Manager Title
2023/05/10 02:36 upstream 1dc3731daf1f 1964022b .config console log report info [disk image (non-bootable)] [vmlinux] [kernel image] ci-qemu-upstream-386 possible deadlock in __mod_timer
* Struck through repros no longer work on HEAD.