bisecting cause commit starting from c6dd78fcb8eefa15dd861889e0f59d301cb5230c building syzkaller on 32329ceb4bbf58a21007c90edf2fb7ed242345db testing commit c6dd78fcb8eefa15dd861889e0f59d301cb5230c with gcc (GCC) 8.1.0 run #0: crashed: INFO: task hung in perf_event_free_task run #1: OK run #2: crashed: INFO: task hung in perf_event_free_task run #3: OK run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK testing release v5.2 testing commit 0ecfebd2b52404ae0c54a878c872bb93363ada36 with gcc (GCC) 8.1.0 all runs: OK # git bisect start c6dd78fcb8eefa15dd861889e0f59d301cb5230c v5.2 Bisecting: 6613 revisions left to test after this (roughly 13 steps) [e786741ff1b52769b044b7f4407f39cd13ee5d2d] Merge tag 'staging-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging testing commit e786741ff1b52769b044b7f4407f39cd13ee5d2d with gcc (GCC) 8.1.0 all runs: OK # git bisect good e786741ff1b52769b044b7f4407f39cd13ee5d2d Bisecting: 3034 revisions left to test after this (roughly 12 steps) [be8454afc50f43016ca8b6130d9673bdd0bd56ec] Merge tag 'drm-next-2019-07-16' of git://anongit.freedesktop.org/drm/drm testing commit be8454afc50f43016ca8b6130d9673bdd0bd56ec with gcc (GCC) 8.1.0 run #0: crashed: INFO: task hung in perf_event_free_task run #1: crashed: INFO: task hung in perf_event_free_task run #2: OK run #3: OK run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad be8454afc50f43016ca8b6130d9673bdd0bd56ec Bisecting: 1793 revisions left to test after this (roughly 11 steps) [44c153671296ecdee7c75aaf778f054ffaf1ee00] Merge tag 'drm-misc-next-fixes-2019-06-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next testing commit 44c153671296ecdee7c75aaf778f054ffaf1ee00 with gcc (GCC) 8.1.0 all runs: OK # git bisect good 44c153671296ecdee7c75aaf778f054ffaf1ee00 Bisecting: 900 revisions left to test after this (roughly 10 steps) [39ceda5ce1b002e30563fcb8ad1bb5ac8e4d59ee] Merge tag 'kbuild-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild testing commit 39ceda5ce1b002e30563fcb8ad1bb5ac8e4d59ee with gcc (GCC) 8.1.0 all runs: OK # git bisect good 39ceda5ce1b002e30563fcb8ad1bb5ac8e4d59ee Bisecting: 441 revisions left to test after this (roughly 9 steps) [d12109291ccbef7e879cc0d0733f31685cd80854] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net testing commit d12109291ccbef7e879cc0d0733f31685cd80854 with gcc (GCC) 8.1.0 all runs: OK # git bisect good d12109291ccbef7e879cc0d0733f31685cd80854 Bisecting: 216 revisions left to test after this (roughly 8 steps) [fde7dc63b1caa6dedf9af7cbf79895589629bc95] Merge tag 'mailbox-v5.3' of git://git.linaro.org/landing-teams/working/fujitsu/integration testing commit fde7dc63b1caa6dedf9af7cbf79895589629bc95 with gcc (GCC) 8.1.0 run #0: crashed: INFO: task hung in perf_event_free_task run #1: crashed: INFO: task hung in perf_event_free_task run #2: OK run #3: OK run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad fde7dc63b1caa6dedf9af7cbf79895589629bc95 Bisecting: 112 revisions left to test after this (roughly 7 steps) [658829dfe75c49e879e0c4c9cbcd3bd1e4fbdcf5] powerpc/cell: set no_llseek in spufs_cntl_fops testing commit 658829dfe75c49e879e0c4c9cbcd3bd1e4fbdcf5 with gcc (GCC) 8.1.0 all runs: OK # git bisect good 658829dfe75c49e879e0c4c9cbcd3bd1e4fbdcf5 Bisecting: 59 revisions left to test after this (roughly 6 steps) [f5a9e488d62360c91c5770bd55a0b40e419a71ce] powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state. testing commit f5a9e488d62360c91c5770bd55a0b40e419a71ce with gcc (GCC) 8.1.0 all runs: OK # git bisect good f5a9e488d62360c91c5770bd55a0b40e419a71ce Bisecting: 29 revisions left to test after this (roughly 5 steps) [8a58ddae23796c733c5dfbd717538d89d036c5bd] perf/core: Fix exclusive events' grouping testing commit 8a58ddae23796c733c5dfbd717538d89d036c5bd with gcc (GCC) 8.1.0 run #0: crashed: INFO: task hung in perf_event_free_task run #1: crashed: INFO: task hung in perf_event_free_task run #2: crashed: INFO: task hung in perf_event_free_task run #3: crashed: INFO: task hung in perf_event_free_task run #4: crashed: INFO: task hung in perf_event_free_task run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad 8a58ddae23796c733c5dfbd717538d89d036c5bd Bisecting: 14 revisions left to test after this (roughly 4 steps) [e56fbc9dc79ce0fdc49ffadd062214ddd02f65b6] perf tools: Use list_del_init() more thorougly testing commit e56fbc9dc79ce0fdc49ffadd062214ddd02f65b6 with gcc (GCC) 8.1.0 all runs: OK # git bisect good e56fbc9dc79ce0fdc49ffadd062214ddd02f65b6 Bisecting: 7 revisions left to test after this (roughly 3 steps) [1334bb94cd8a21217cb0c186925f9bc9d47adafc] perf scripts python: export-to-sqlite.py: Fix DROP VIEW power_events_view testing commit 1334bb94cd8a21217cb0c186925f9bc9d47adafc with gcc (GCC) 8.1.0 all runs: OK # git bisect good 1334bb94cd8a21217cb0c186925f9bc9d47adafc Bisecting: 3 revisions left to test after this (roughly 2 steps) [e5eb08ac81d237e19fc68888bbba2cf88891bbe9] Merge tag 'perf-core-for-mingo-5.3-20190709' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent testing commit e5eb08ac81d237e19fc68888bbba2cf88891bbe9 with gcc (GCC) 8.1.0 all runs: OK # git bisect good e5eb08ac81d237e19fc68888bbba2cf88891bbe9 Bisecting: 1 revision left to test after this (roughly 1 step) [16f4641166b10e199f0d7b68c2c5f004fef0bda3] perf/x86/amd/uncore: Do not set 'ThreadMask' and 'SliceMask' for non-L3 PMCs testing commit 16f4641166b10e199f0d7b68c2c5f004fef0bda3 with gcc (GCC) 8.1.0 run #0: crashed: INFO: task hung in perf_event_free_task run #1: crashed: INFO: task hung in perf_event_free_task run #2: crashed: INFO: task hung in perf_event_free_task run #3: crashed: INFO: task hung in perf_event_free_task run #4: crashed: INFO: task hung in perf_event_free_task run #5: crashed: INFO: task hung in perf_event_free_task run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad 16f4641166b10e199f0d7b68c2c5f004fef0bda3 Bisecting: 0 revisions left to test after this (roughly 0 steps) [1cf8dfe8a661f0462925df943140e9f6d1ea5233] perf/core: Fix race between close() and fork() testing commit 1cf8dfe8a661f0462925df943140e9f6d1ea5233 with gcc (GCC) 8.1.0 run #0: crashed: INFO: task hung in perf_event_free_task run #1: crashed: INFO: task hung in perf_event_free_task run #2: crashed: INFO: task hung in perf_event_free_task run #3: crashed: INFO: task hung in perf_event_free_task run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad 1cf8dfe8a661f0462925df943140e9f6d1ea5233 1cf8dfe8a661f0462925df943140e9f6d1ea5233 is the first bad commit commit 1cf8dfe8a661f0462925df943140e9f6d1ea5233 Author: Peter Zijlstra Date: Sat Jul 13 11:21:25 2019 +0200 perf/core: Fix race between close() and fork() Syzcaller reported the following Use-after-Free bug: close() clone() copy_process() perf_event_init_task() perf_event_init_context() mutex_lock(parent_ctx->mutex) inherit_task_group() inherit_group() inherit_event() mutex_lock(event->child_mutex) // expose event on child list list_add_tail() mutex_unlock(event->child_mutex) mutex_unlock(parent_ctx->mutex) ... goto bad_fork_* bad_fork_cleanup_perf: perf_event_free_task() perf_release() perf_event_release_kernel() list_for_each_entry() mutex_lock(ctx->mutex) mutex_lock(event->child_mutex) // event is from the failing inherit // on the other CPU perf_remove_from_context() list_move() mutex_unlock(event->child_mutex) mutex_unlock(ctx->mutex) mutex_lock(ctx->mutex) list_for_each_entry_safe() // event already stolen mutex_unlock(ctx->mutex) delayed_free_task() free_task() list_for_each_entry_safe() list_del() free_event() _free_event() // and so event->hw.target // is the already freed failed clone() if (event->hw.target) put_task_struct(event->hw.target) // WHOOPSIE, already quite dead Which puts the lie to the the comment on perf_event_free_task(): 'unexposed, unused context' not so much. Which is a 'fun' confluence of fail; copy_process() doing an unconditional free_task() and not respecting refcounts, and perf having creative locking. In particular: 82d94856fa22 ("perf/core: Fix lock inversion between perf,trace,cpuhp") seems to have overlooked this 'fun' parade. Solve it by using the fact that detached events still have a reference count on their (previous) context. With this perf_event_free_task() can detect when events have escaped and wait for their destruction. Debugged-by: Alexander Shishkin Reported-by: syzbot+a24c397a29ad22d86c98@syzkaller.appspotmail.com Signed-off-by: Peter Zijlstra (Intel) Acked-by: Mark Rutland Cc: Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Fixes: 82d94856fa22 ("perf/core: Fix lock inversion between perf,trace,cpuhp") Signed-off-by: Ingo Molnar :040000 040000 341f5d75e0a7c3692281e35defc813d111a90e66 46f9d23f46b0375eacaee05dd48223ba229f9568 M kernel revisions tested: 16, total time: 4h25m55.692414123s (build: 1h35m0.820502911s, test: 2h45m11.062992193s) first bad commit: 1cf8dfe8a661f0462925df943140e9f6d1ea5233 perf/core: Fix race between close() and fork() cc: ["acme@redhat.com" "alexander.shishkin@linux.intel.com" "eranian@google.com" "jolsa@redhat.com" "mark.rutland@arm.com" "mingo@kernel.org" "peterz@infradead.org" "tglx@linutronix.de" "torvalds@linux-foundation.org" "vincent.weaver@maine.edu"] crash: INFO: task hung in perf_event_free_task INFO: task syz-executor.2:25341 blocked for more than 143 seconds. Not tainted 5.2.0+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. syz-executor.2 D28928 25341 7605 0x00004006 Call Trace: context_switch kernel/sched/core.c:3252 [inline] __schedule+0x730/0x14d0 kernel/sched/core.c:3878 schedule+0x8c/0x250 kernel/sched/core.c:3942 perf_event_free_task+0x43f/0x630 kernel/events/core.c:11596 copy_process+0x364c/0x64e0 kernel/fork.c:2284 _do_fork+0xec/0xbc0 kernel/fork.c:2370 __do_sys_clone kernel/fork.c:2512 [inline] __se_sys_clone kernel/fork.c:2492 [inline] __x64_sys_clone+0x171/0x230 kernel/fork.c:2492 do_syscall_64+0xd0/0x540 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x459829 Code: dd fe ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc 64 48 8b 0c 25 f8 ff ff ff 48 3b 61 10 76 68 48 83 ec 28 48 89 6c 24 20 48 <8d> 6c 24 20 48 8b 44 24 30 48 89 04 24 48 8b 4c 24 38 48 89 4c 24 RSP: 002b:00007fc9275a1c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000038 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000459829 RDX: 9999999999999999 RSI: 0000000000000000 RDI: 0000002102001ffe RBP: 000000000075bf20 R08: ffffffffffffffff R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc9275a26d4 R13: 00000000004bfce6 R14: 00000000004d17f8 R15: 00000000ffffffff Showing all locks held in the system: 1 lock held by khungtaskd/1052: #0: 000000004c1bc5d9 (rcu_read_lock){....}, at: debug_show_all_locks+0x5b/0x27e kernel/locking/lockdep.c:5257 1 lock held by rsyslogd/7397: #0: 0000000059ec90d9 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xa3/0xc0 fs/file.c:801 2 locks held by getty/7487: #0: 00000000461de154 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x2d/0x40 drivers/tty/tty_ldsem.c:341 #1: 000000008749f352 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x1ee/0x1930 drivers/tty/n_tty.c:2156 2 locks held by getty/7488: #0: 000000007fd039ab (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x2d/0x40 drivers/tty/tty_ldsem.c:341 #1: 000000003306781a (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x1ee/0x1930 drivers/tty/n_tty.c:2156 2 locks held by getty/7489: #0: 000000007e16a53f (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x2d/0x40 drivers/tty/tty_ldsem.c:341 #1: 000000006e08b81e (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x1ee/0x1930 drivers/tty/n_tty.c:2156 2 locks held by getty/7490: #0: 00000000c5d1b7bf (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x2d/0x40 drivers/tty/tty_ldsem.c:341 #1: 000000007ca0f08a (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x1ee/0x1930 drivers/tty/n_tty.c:2156 2 locks held by getty/7491: #0: 00000000c12382d3 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x2d/0x40 drivers/tty/tty_ldsem.c:341 #1: 00000000c92cc89b (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x1ee/0x1930 drivers/tty/n_tty.c:2156 2 locks held by getty/7492: #0: 000000007f53ae3f (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x2d/0x40 drivers/tty/tty_ldsem.c:341 #1: 0000000043c4b2b5 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x1ee/0x1930 drivers/tty/n_tty.c:2156 2 locks held by getty/7493: #0: 00000000c5ca67f6 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x2d/0x40 drivers/tty/tty_ldsem.c:341 #1: 00000000630d4d67 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x1ee/0x1930 drivers/tty/n_tty.c:2156 ============================================= NMI backtrace for cpu 1 CPU: 1 PID: 1052 Comm: khungtaskd Not tainted 5.2.0+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x115/0x167 lib/dump_stack.c:113 nmi_cpu_backtrace.cold.8+0x4b/0x84 lib/nmi_backtrace.c:101 nmi_trigger_cpumask_backtrace+0x16b/0x18d lib/nmi_backtrace.c:62 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline] watchdog+0x59f/0xb60 kernel/hung_task.c:289 kthread+0x331/0x3f0 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.2.0+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events_power_efficient gc_worker RIP: 0010:mark_held_locks+0x25/0x130 kernel/locking/lockdep.c:3273 Code: 00 00 00 00 00 55 48 8d 87 48 08 00 00 41 89 f0 48 89 e5 48 89 c2 41 57 41 56 48 c1 ea 03 49 89 fe 41 55 41 54 53 48 83 ec 18 <48> 89 45 c8 48 b8 00 00 00 00 00 fc ff df 0f b6 04 02 84 c0 74 08 RSP: 0018:ffff8880a9877bc0 EFLAGS: 00000086 RAX: ffff8880a985e988 RBX: ffff8880a985e140 RCX: 0000000000000002 RDX: 1ffff1101530bd31 RSI: 0000000000000006 RDI: ffff8880a985e140 RBP: ffff8880a9877c00 R08: 0000000000000006 R09: fffffbfff1340569 R10: fffffbfff1340568 R11: ffffffff89a02b47 R12: ffffffff855f0fc2 R13: ffff8880aea34bc0 R14: ffff8880a985e140 R15: ffffffff8a898180 FS: 0000000000000000(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000244aff8 CR3: 00000000a1598000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __trace_hardirqs_on_caller kernel/locking/lockdep.c:3314 [inline] lockdep_hardirqs_on+0x424/0x5c0 kernel/locking/lockdep.c:3359 trace_hardirqs_on+0x28/0x1a0 kernel/trace/trace_preemptirq.c:31 seqcount_lockdep_reader_access include/linux/seqlock.h:83 [inline] read_seqcount_begin include/linux/seqlock.h:164 [inline] nf_conntrack_get_ht include/net/netfilter/nf_conntrack.h:305 [inline] gc_worker+0x732/0xae0 net/netfilter/nf_conntrack_core.c:1248 process_one_work+0x856/0x1610 kernel/workqueue.c:2269 worker_thread+0x85/0xb60 kernel/workqueue.c:2415 kthread+0x331/0x3f0 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352