bisecting cause commit starting from 92188b41f1394d5e4399fcb28c13a2933f255255 building syzkaller on 9c8124727c791c492f98fceaebf7b74d9ab78878 testing commit 92188b41f1394d5e4399fcb28c13a2933f255255 with gcc (GCC) 8.1.0 kernel signature: 66b2e856854e42795615c43db05ea26379c9fb0e20573a67bbf033701648948c run #0: crashed: WARNING in ptrace_stop run #1: crashed: WARNING in ptrace_stop run #2: crashed: WARNING in ptrace_stop run #3: crashed: WARNING in ptrace_stop run #4: crashed: WARNING in ptrace_stop run #5: crashed: WARNING in ptrace_stop run #6: OK run #7: crashed: WARNING in ptrace_stop run #8: crashed: WARNING in ptrace_stop run #9: OK testing release v5.7 testing commit 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162 with gcc (GCC) 8.1.0 kernel signature: 449cf79dbfef3eb2d27b792be64593fd6892d4fa98270cee9e03549478108079 all runs: OK # git bisect start 92188b41f1394d5e4399fcb28c13a2933f255255 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162 Bisecting: 8504 revisions left to test after this (roughly 13 steps) [a789d5f8a99a366778a6e42b3700a246244201a6] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid testing commit a789d5f8a99a366778a6e42b3700a246244201a6 with gcc (GCC) 8.1.0 kernel signature: 9ce8732cde389ca6a9179a40b0a009d8c60ca765eb3c6532c2573664eaa35ce7 all runs: OK # git bisect good a789d5f8a99a366778a6e42b3700a246244201a6 Bisecting: 4254 revisions left to test after this (roughly 12 steps) [bf97bac9dc6481e9f68992e52bed5cc4b210e636] net: atm: Remove the error message according to the atomic context testing commit bf97bac9dc6481e9f68992e52bed5cc4b210e636 with gcc (GCC) 8.1.0 kernel signature: 5f8269a2a2329bb9ff2b41dd7c91bb2ab39325f3ba6a46560b7eeafe1afaecd7 run #0: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks run #1: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks run #2: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks run #3: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks run #4: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks run #5: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks run #6: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks run #7: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks run #8: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks run #9: boot failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks # git bisect skip bf97bac9dc6481e9f68992e52bed5cc4b210e636 Bisecting: 4254 revisions left to test after this (roughly 12 steps) [571cfadcc628dd5591444f7289e27445ea732f4c] clk: mediatek: assign the initial value to clk_init_data of mtk_mux testing commit 571cfadcc628dd5591444f7289e27445ea732f4c with gcc (GCC) 8.1.0 kernel signature: 34cdbee45ba8de25914b92658da00d0786799c376863a6dd42f229e52375e2e1 all runs: OK # git bisect good 571cfadcc628dd5591444f7289e27445ea732f4c Bisecting: 4249 revisions left to test after this (roughly 12 steps) [6954a9e4192b86d778fb52b525fd7b62d51b1147] ibmvnic: Flush existing work items before device removal testing commit 6954a9e4192b86d778fb52b525fd7b62d51b1147 with gcc (GCC) 8.1.0 kernel signature: ab19d0912ae9f81928811545c92f7d2d35783b39a156f9b2fdb4630bed615a64 all runs: basic kernel testing failed: BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks # git bisect skip 6954a9e4192b86d778fb52b525fd7b62d51b1147 Bisecting: 4249 revisions left to test after this (roughly 12 steps) [2e919bc446faee429ac862a6cdb5e40017051f6b] net: phylink: ensure manual pause mode configuration takes effect testing commit 2e919bc446faee429ac862a6cdb5e40017051f6b with gcc (GCC) 8.1.0 kernel signature: 94d4ce26c7f2325c231c049890ce5955d59cdc80679ad358d240bcb39d39ef23 all runs: OK # git bisect good 2e919bc446faee429ac862a6cdb5e40017051f6b Bisecting: 844 revisions left to test after this (roughly 10 steps) [4c95ad261cfac120dd66238fcae222766754c219] perf intel-pt: Fix PEBS sample for XMM registers testing commit 4c95ad261cfac120dd66238fcae222766754c219 with gcc (GCC) 8.1.0 kernel signature: 28cecdf809f43d120ccd42395f742a4f693f850bf3204fdb55e6e935f16740c8 run #0: boot failed: can't ssh into the instance run #1: OK run #2: OK run #3: OK run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect good 4c95ad261cfac120dd66238fcae222766754c219 Bisecting: 344 revisions left to test after this (roughly 9 steps) [5a764898afec0bc097003e8c3e727792289f76d6] Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net testing commit 5a764898afec0bc097003e8c3e727792289f76d6 with gcc (GCC) 8.1.0 kernel signature: eb048048714eb805d6c0a7ab5ed92790cdca5d484fed1dd0cf8f9d5c3d5d9c33 all runs: OK # git bisect good 5a764898afec0bc097003e8c3e727792289f76d6 Bisecting: 171 revisions left to test after this (roughly 8 steps) [94fddb7ad019ad9f14d33cd0a6cd159a52a082b8] perf tools: Sync hashmap.h with libbpf's testing commit 94fddb7ad019ad9f14d33cd0a6cd159a52a082b8 with gcc (GCC) 8.1.0 kernel signature: 7a1e7a6e0ca19a8ee020a6714533071179f61bfc8994f1bd427617b5af75c2c0 all runs: OK # git bisect good 94fddb7ad019ad9f14d33cd0a6cd159a52a082b8 Bisecting: 77 revisions left to test after this (roughly 7 steps) [630c183b2de1a94f442564664362fc0171428640] Merge tag 'arm-fixes-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc into master testing commit 630c183b2de1a94f442564664362fc0171428640 with gcc (GCC) 8.1.0 kernel signature: 3de4cbb96a60a51e3a86cb62324061c6741ff7b753a531d4f4f42ed8e7e29160 run #0: OK run #1: OK run #2: OK run #3: OK run #4: boot failed: can't ssh into the instance run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect good 630c183b2de1a94f442564664362fc0171428640 Bisecting: 45 revisions left to test after this (roughly 5 steps) [8c18fc6344568bdc131436be0345d82da512bfef] Merge tag 'dma-mapping-5.8-6' of git://git.infradead.org/users/hch/dma-mapping into master testing commit 8c18fc6344568bdc131436be0345d82da512bfef with gcc (GCC) 8.1.0 kernel signature: 849fb076f138b9cdf6343050acf9d60d86fff0512ba5ccc3631f8ebdf8119a6e all runs: OK # git bisect good 8c18fc6344568bdc131436be0345d82da512bfef Bisecting: 22 revisions left to test after this (roughly 5 steps) [ce20d7bf6e00997496d8d5322b1253584d2a0908] Merge tag 'usb-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb into master testing commit ce20d7bf6e00997496d8d5322b1253584d2a0908 with gcc (GCC) 8.1.0 kernel signature: 35e08acc333563fd6e7c946e9e8c46778d55240400f61dce29ad742f796c69a5 all runs: OK # git bisect good ce20d7bf6e00997496d8d5322b1253584d2a0908 Bisecting: 12 revisions left to test after this (roughly 4 steps) [66e4b63624fcfa47f4d4e0d451f22a8f67902426] Merge tag 'timers-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master testing commit 66e4b63624fcfa47f4d4e0d451f22a8f67902426 with gcc (GCC) 8.1.0 kernel signature: 8d9b9ce1f87c1543031e43d3445bce2d1b2b20d9d0ea1e5b1f6a4c5e9952321c run #0: crashed: WARNING in ptrace_stop run #1: crashed: WARNING in ptrace_stop run #2: crashed: WARNING in ptrace_stop run #3: crashed: WARNING in ptrace_stop run #4: OK run #5: OK run #6: OK run #7: OK run #8: crashed: WARNING in ptrace_stop run #9: crashed: WARNING in ptrace_stop # git bisect bad 66e4b63624fcfa47f4d4e0d451f22a8f67902426 Bisecting: 6 revisions left to test after this (roughly 2 steps) [01cfcde9c26d8555f0e6e9aea9d6049f87683998] sched/fair: handle case of task_h_load() returning 0 testing commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 with gcc (GCC) 8.1.0 kernel signature: 89d9f91c13966ab69cdfd2cb619dc8f0040f69582b7371ce47af706d171b2e06 run #0: crashed: WARNING in ptrace_stop run #1: crashed: WARNING in ptrace_stop run #2: crashed: WARNING in ptrace_stop run #3: crashed: WARNING in ptrace_stop run #4: OK run #5: OK run #6: crashed: WARNING in ptrace_stop run #7: OK run #8: OK run #9: crashed: WARNING in ptrace_stop # git bisect bad 01cfcde9c26d8555f0e6e9aea9d6049f87683998 Bisecting: 0 revisions left to test after this (roughly 1 step) [ce3614daabea8a2d01c1dd17ae41d1ec5e5ae7db] sched: Fix unreliable rseq cpu_id for new tasks testing commit ce3614daabea8a2d01c1dd17ae41d1ec5e5ae7db with gcc (GCC) 8.1.0 kernel signature: e4b905805f54ecbabbc4f6133338e5bac307d0ae4a7466e0d847288fa8ae689a run #0: crashed: WARNING in ptrace_stop run #1: crashed: WARNING in ptrace_stop run #2: OK run #3: crashed: WARNING in ptrace_stop run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad ce3614daabea8a2d01c1dd17ae41d1ec5e5ae7db Bisecting: 0 revisions left to test after this (roughly 0 steps) [dbfb089d360b1cc623c51a2c7cf9b99eff78e0e7] sched: Fix loadavg accounting race testing commit dbfb089d360b1cc623c51a2c7cf9b99eff78e0e7 with gcc (GCC) 8.1.0 kernel signature: 9659bb9d7e4a16a6472c93fbf5396baf8fe25e708a99ea02818d8c0f326ef7a5 run #0: OK run #1: OK run #2: OK run #3: OK run #4: OK run #5: OK run #6: OK run #7: crashed: WARNING in ptrace_stop run #8: OK run #9: OK # git bisect bad dbfb089d360b1cc623c51a2c7cf9b99eff78e0e7 dbfb089d360b1cc623c51a2c7cf9b99eff78e0e7 is the first bad commit commit dbfb089d360b1cc623c51a2c7cf9b99eff78e0e7 Author: Peter Zijlstra Date: Fri Jul 3 12:40:33 2020 +0200 sched: Fix loadavg accounting race The recent commit: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu") moved these lines in ttwu(): p->sched_contributes_to_load = !!task_contributes_to_load(p); p->state = TASK_WAKING; up before: smp_cond_load_acquire(&p->on_cpu, !VAL); into the 'p->on_rq == 0' block, with the thinking that once we hit schedule() the current task cannot change it's ->state anymore. And while this is true, it is both incorrect and flawed. It is incorrect in that we need at least an ACQUIRE on 'p->on_rq == 0' to avoid weak hardware from re-ordering things for us. This can fairly easily be achieved by relying on the control-dependency already in place. The second problem, which makes the flaw in the original argument, is that while schedule() will not change prev->state, it will read it a number of times (arguably too many times since it's marked volatile). The previous condition 'p->on_cpu == 0' was sufficient because that indicates schedule() has completed, and will no longer read prev->state. So now the trick is to make this same true for the (much) earlier 'prev->on_rq == 0' case. Furthermore, in order to make the ordering stick, the 'prev->on_rq = 0' assignment needs to he a RELEASE, but adding additional ordering to schedule() is an unwelcome proposition at the best of times, doubly so for mere accounting. Luckily we can push the prev->state load up before rq->lock, with the only caveat that we then have to re-read the state after. However, we know that if it changed, we no longer have to worry about the blocking path. This gives us the required ordering, if we block, we did the prev->state load before an (effective) smp_mb() and the p->on_rq store needs not change. With this we end up with the effective ordering: LOAD p->state LOAD-ACQUIRE p->on_rq == 0 MB STORE p->on_rq, 0 STORE p->state, TASK_WAKING which ensures the TASK_WAKING store happens after the prev->state load, and all is well again. Fixes: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu") Reported-by: Dave Jones Reported-by: Paul Gortmaker Signed-off-by: Peter Zijlstra (Intel) Tested-by: Dave Jones Tested-by: Paul Gortmaker Link: https://lkml.kernel.org/r/20200707102957.GN117543@hirez.programming.kicks-ass.net include/linux/sched.h | 4 --- kernel/sched/core.c | 67 +++++++++++++++++++++++++++++++++++++++------------ 2 files changed, 51 insertions(+), 20 deletions(-) parent commit dcb7fd82c75ee2d6e6f9d8cc71c52519ed52e258 wasn't tested testing commit dcb7fd82c75ee2d6e6f9d8cc71c52519ed52e258 with gcc (GCC) 8.1.0 kernel signature: 7527e2d2c129e7eec62b877d5f5cb7f12e1325ffbfcd28707d8a482d1be181dc culprit signature: 9659bb9d7e4a16a6472c93fbf5396baf8fe25e708a99ea02818d8c0f326ef7a5 parent signature: 7527e2d2c129e7eec62b877d5f5cb7f12e1325ffbfcd28707d8a482d1be181dc revisions tested: 17, total time: 4h36m6.773693069s (build: 1h37m24.574858162s, test: 2h56m12.166100698s) first bad commit: dbfb089d360b1cc623c51a2c7cf9b99eff78e0e7 sched: Fix loadavg accounting race cc: ["davej@codemonkey.org.uk" "paul.gortmaker@windriver.com" "peterz@infradead.org"] crash: WARNING in ptrace_stop ------------[ cut here ]------------ do not call blocking ops when !TASK_RUNNING; state=8 set at [<000000008c9858e8>] ptrace_stop+0x0/0x2b0 kernel/signal.c:2073 WARNING: CPU: 1 PID: 18055 at kernel/sched/core.c:6886 __might_sleep+0x63/0x70 kernel/sched/core.c:6881 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 18055 Comm: syz-executor359 Not tainted 5.8.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xb3/0xec lib/dump_stack.c:118 panic+0x115/0x2fa kernel/panic.c:231 __warn.cold.13+0x20/0x25 kernel/panic.c:600 report_bug+0xc0/0xf0 lib/bug.c:198 handle_bug+0x35/0x90 arch/x86/kernel/traps.c:235 exc_invalid_op+0x13/0x60 arch/x86/kernel/traps.c:255 asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:542 RIP: 0010:__might_sleep+0x63/0x70 kernel/sched/core.c:6881 Code: 5b 5d 41 5c e9 7e fe ff ff 48 8b 90 10 14 00 00 48 c7 c7 88 9d e4 83 c6 05 49 6b 3c 03 01 48 8b 70 10 48 89 d1 e8 a8 6f fc ff <0f> 0b eb ca 66 0f 1f 84 00 00 00 00 00 89 f6 e9 19 6b 0e 00 66 0f RSP: 0018:ffffc9000b6afdc0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffff83e4634e RCX: 0000000000000001 RDX: 0000000080000001 RSI: ffffffff8400a781 RDI: 00000000ffffffff RBP: 0000000000000039 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000002 try_to_freeze_unsafe include/linux/freezer.h:57 [inline] try_to_freeze include/linux/freezer.h:67 [inline] freezer_count include/linux/freezer.h:128 [inline] freezable_schedule include/linux/freezer.h:173 [inline] ptrace_stop+0x160/0x2b0 kernel/signal.c:2215 ptrace_signal kernel/signal.c:2490 [inline] get_signal+0x719/0xcb0 kernel/signal.c:2639 do_signal+0x2b/0x920 arch/x86/kernel/signal.c:810 exit_to_usermode_loop arch/x86/entry/common.c:235 [inline] __prepare_exit_to_usermode+0x1a7/0x210 arch/x86/entry/common.c:269 do_syscall_64+0x6c/0xe0 arch/x86/entry/common.c:393 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x441719 Code: Bad RIP value. RSP: 002b:00007ffd3b8d9008 EFLAGS: 00000246 ORIG_RAX: 000000000000003d RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 0000000000441719 RDX: 0000000080000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 000000000007b179 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402410 R13: 00000000004024a0 R14: 0000000000000000 R15: 0000000000000000 Kernel Offset: disabled Rebooting in 86400 seconds..