bisecting cause commit starting from 4b972a01a7da614b4796475f933094751a295a2f building syzkaller on 82c13b6b49369ae3f3846e867fe1b8e0c21eefc0 testing commit 4b972a01a7da614b4796475f933094751a295a2f with gcc (GCC) 8.1.0 run #0: crashed: KASAN: use-after-free Read in class_equal run #1: crashed: kernel panic: corrupted stack end in corrupted run #2: crashed: KASAN: use-after-free Read in class_equal run #3: crashed: KASAN: use-after-free Read in class_equal run #4: crashed: kernel panic: corrupted stack end in corrupted run #5: crashed: KASAN: use-after-free Read in class_equal run #6: crashed: KASAN: slab-out-of-bounds Read in class_equal run #7: crashed: KASAN: slab-out-of-bounds Read in class_equal run #8: crashed: KASAN: use-after-free Read in class_equal run #9: crashed: KASAN: use-after-free Read in class_equal testing release v5.1 testing commit e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd with gcc (GCC) 8.1.0 all runs: crashed: WARNING: ODEBUG bug in del_timer testing release v5.0 testing commit 1c163f4c7b3f621efff9b28a47abb36f7378d783 with gcc (GCC) 8.1.0 all runs: crashed: WARNING in strp_done testing release v4.20 testing commit 8fe28cb58bcb235034b64cbbb7550a8a43fd88be with gcc (GCC) 8.1.0 run #0: crashed: KASAN: use-after-free Read in class_equal run #1: crashed: KASAN: use-after-free Read in class_equal run #2: crashed: KASAN: slab-out-of-bounds Read in class_equal run #3: crashed: kernel panic: corrupted stack end in corrupted run #4: crashed: kernel panic: corrupted stack end in corrupted run #5: crashed: KASAN: slab-out-of-bounds Read in class_equal run #6: crashed: KASAN: use-after-free Read in class_equal run #7: crashed: KASAN: use-after-free Read in class_equal run #8: crashed: KASAN: slab-out-of-bounds Read in class_equal run #9: crashed: kernel panic: corrupted stack end in corrupted testing release v4.19 testing commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d with gcc (GCC) 8.1.0 run #0: crashed: KASAN: use-after-free Read in psock_map_pop run #1: crashed: KASAN: use-after-free Read in bpf_tcp_remove run #2: crashed: KASAN: use-after-free Read in psock_map_pop run #3: crashed: KASAN: use-after-free Read in psock_map_pop run #4: crashed: KASAN: use-after-free Read in psock_map_pop run #5: crashed: KASAN: use-after-free Read in bpf_tcp_remove run #6: crashed: KASAN: use-after-free Read in psock_map_pop run #7: OK run #8: OK run #9: OK testing release v4.18 testing commit 94710cac0ef4ee177a63b5227664b38c95bbf703 with gcc (GCC) 8.1.0 run #0: crashed: KASAN: use-after-free Write in bpf_tcp_close run #1: crashed: KASAN: use-after-free Read in psock_map_pop run #2: crashed: KASAN: use-after-free Read in psock_map_pop run #3: crashed: KASAN: use-after-free Write in bpf_tcp_close run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK testing release v4.17 testing commit 29dcea88779c856c7dc92040a0c01233263101d4 with gcc (GCC) 8.1.0 all runs: OK # git bisect start v4.18 v4.17 Bisecting: 7032 revisions left to test after this (roughly 13 steps) [3036bc45364f98515a2c446d7fac2c34dcfbeff4] Merge tag 'media/v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media testing commit 3036bc45364f98515a2c446d7fac2c34dcfbeff4 with gcc (GCC) 8.1.0 run #0: OK run #1: OK run #2: OK run #3: OK run #4: OK run #5: OK run #6: OK run #7: OK run #8: boot failed: KASAN: use-after-free Write in call_usermodehelper_exec_work run #9: boot failed: KASAN: use-after-free Write in call_usermodehelper_exec_work # git bisect good 3036bc45364f98515a2c446d7fac2c34dcfbeff4 Bisecting: 3348 revisions left to test after this (roughly 12 steps) [721afaa2aeb860067decdddadc84ed16f42f2048] Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc testing commit 721afaa2aeb860067decdddadc84ed16f42f2048 with gcc (GCC) 8.1.0 all runs: OK # git bisect good 721afaa2aeb860067decdddadc84ed16f42f2048 Bisecting: 1674 revisions left to test after this (roughly 11 steps) [7b72717a20bba8bdd01b14c0460be7d15061cd6b] iw_cxgb4: correctly enforce the max reg_mr depth testing commit 7b72717a20bba8bdd01b14c0460be7d15061cd6b with gcc (GCC) 8.1.0 all runs: OK # git bisect good 7b72717a20bba8bdd01b14c0460be7d15061cd6b Bisecting: 837 revisions left to test after this (roughly 10 steps) [47f7dc4b845a9fe60c53b84b8c88cf14efd0de7f] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm testing commit 47f7dc4b845a9fe60c53b84b8c88cf14efd0de7f with gcc (GCC) 8.1.0 run #0: crashed: KASAN: use-after-free Write in bpf_tcp_close run #1: crashed: KASAN: use-after-free Write in bpf_tcp_close run #2: crashed: KASAN: use-after-free Write in bpf_tcp_close run #3: crashed: KASAN: use-after-free Write in bpf_tcp_close run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad 47f7dc4b845a9fe60c53b84b8c88cf14efd0de7f Bisecting: 414 revisions left to test after this (roughly 9 steps) [4e33d7d47943aaa84a5904472cf2f9c6d6b0a6ca] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net testing commit 4e33d7d47943aaa84a5904472cf2f9c6d6b0a6ca with gcc (GCC) 8.1.0 run #0: crashed: KASAN: use-after-free Read in psock_map_pop run #1: crashed: KASAN: use-after-free Read in psock_map_pop run #2: crashed: KASAN: use-after-free Write in bpf_tcp_close run #3: OK run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad 4e33d7d47943aaa84a5904472cf2f9c6d6b0a6ca Bisecting: 202 revisions left to test after this (roughly 8 steps) [d7d5388679312b7a7b6377e38e2b8fb06a82d84e] Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip testing commit d7d5388679312b7a7b6377e38e2b8fb06a82d84e with gcc (GCC) 8.1.0 all runs: OK # git bisect good d7d5388679312b7a7b6377e38e2b8fb06a82d84e Bisecting: 101 revisions left to test after this (roughly 7 steps) [d3bc0e67f8525760479e88a51e87bb0c026e40f3] Merge tag 'for-4.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux testing commit d3bc0e67f8525760479e88a51e87bb0c026e40f3 with gcc (GCC) 8.1.0 all runs: OK # git bisect good d3bc0e67f8525760479e88a51e87bb0c026e40f3 Bisecting: 50 revisions left to test after this (roughly 6 steps) [484c016d9392786ce5c74017c206c706f29f823d] bnx2x: Fix receiving tx-timeout in error or recovery state. testing commit 484c016d9392786ce5c74017c206c706f29f823d with gcc (GCC) 8.1.0 all runs: OK # git bisect good 484c016d9392786ce5c74017c206c706f29f823d Bisecting: 29 revisions left to test after this (roughly 5 steps) [bf2b866a2fe2d74558fe4b7bdf63a4bc0afbdf70] Merge branch 'bpf-sockmap-fixes' testing commit bf2b866a2fe2d74558fe4b7bdf63a4bc0afbdf70 with gcc (GCC) 8.1.0 run #0: crashed: KASAN: use-after-free Read in psock_map_pop run #1: crashed: KASAN: use-after-free Write in bpf_tcp_close run #2: crashed: KASAN: use-after-free Write in bpf_tcp_close run #3: crashed: KASAN: use-after-free Write in bpf_tcp_close run #4: crashed: KASAN: use-after-free Write in bpf_tcp_close run #5: crashed: KASAN: use-after-free Write in bpf_tcp_close run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad bf2b866a2fe2d74558fe4b7bdf63a4bc0afbdf70 Bisecting: 10 revisions left to test after this (roughly 3 steps) [dd349c3ffd93d15d01f4f5ace73870ca45ea249d] selftests: bpf: notification about privilege required to run test_lwt_seg6local.sh testing script testing commit dd349c3ffd93d15d01f4f5ace73870ca45ea249d with gcc (GCC) 8.1.0 all runs: OK # git bisect good dd349c3ffd93d15d01f4f5ace73870ca45ea249d Bisecting: 4 revisions left to test after this (roughly 3 steps) [ca09cb04af900768d32c8ba5f807dfc83e9ca4d3] Merge branch 'bpf-fixes' testing commit ca09cb04af900768d32c8ba5f807dfc83e9ca4d3 with gcc (GCC) 8.1.0 all runs: OK # git bisect good ca09cb04af900768d32c8ba5f807dfc83e9ca4d3 Bisecting: 2 revisions left to test after this (roughly 1 step) [54fedb42c6537dcb0102e4a58a88456a6286999d] bpf: sockmap, fix smap_list_map_remove when psock is in many maps testing commit 54fedb42c6537dcb0102e4a58a88456a6286999d with gcc (GCC) 8.1.0 all runs: OK # git bisect good 54fedb42c6537dcb0102e4a58a88456a6286999d Bisecting: 0 revisions left to test after this (roughly 1 step) [caac76a5170eb508529bbff9d9300e9c57126444] bpf: sockhash, add release routine testing commit caac76a5170eb508529bbff9d9300e9c57126444 with gcc (GCC) 8.1.0 run #0: crashed: KASAN: use-after-free Write in bpf_tcp_close run #1: crashed: KASAN: use-after-free Read in psock_map_pop run #2: crashed: KASAN: use-after-free Write in bpf_tcp_close run #3: OK run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad caac76a5170eb508529bbff9d9300e9c57126444 Bisecting: 0 revisions left to test after this (roughly 0 steps) [e9db4ef6bf4ca9894bb324c76e01b8f1a16b2650] bpf: sockhash fix omitted bucket lock in sock_close testing commit e9db4ef6bf4ca9894bb324c76e01b8f1a16b2650 with gcc (GCC) 8.1.0 run #0: crashed: KASAN: use-after-free Write in bpf_tcp_close run #1: crashed: KASAN: use-after-free Read in psock_map_pop run #2: crashed: KASAN: use-after-free Write in bpf_tcp_close run #3: OK run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad e9db4ef6bf4ca9894bb324c76e01b8f1a16b2650 e9db4ef6bf4ca9894bb324c76e01b8f1a16b2650 is the first bad commit commit e9db4ef6bf4ca9894bb324c76e01b8f1a16b2650 Author: John Fastabend Date: Sat Jun 30 06:17:47 2018 -0700 bpf: sockhash fix omitted bucket lock in sock_close First the sk_callback_lock() was being used to protect both the sock callback hooks and the psock->maps list. This got overly convoluted after the addition of sockhash (in sockmap it made some sense because masp and callbacks were tightly coupled) so lets split out a specific lock for maps and only use the callback lock for its intended purpose. This fixes a couple cases where we missed using maps lock when it was in fact needed. Also this makes it easier to follow the code because now we can put the locking closer to the actual code its serializing. Next, in sock_hash_delete_elem() the pattern was as follows, sock_hash_delete_elem() [...] spin_lock(bucket_lock) l = lookup_elem_raw() if (l) hlist_del_rcu() write_lock(sk_callback_lock) .... destroy psock ... write_unlock(sk_callback_lock) spin_unlock(bucket_lock) The ordering is necessary because we only know the {p}sock after dereferencing the hash table which we can't do unless we have the bucket lock held. Once we have the bucket lock and the psock element it is deleted from the hashmap to ensure any other path doing a lookup will fail. Finally, the refcnt is decremented and if zero the psock is destroyed. In parallel with the above (or free'ing the map) a tcp close event may trigger tcp_close(). Which at the moment omits the bucket lock altogether (oops!) where the flow looks like this, bpf_tcp_close() [...] write_lock(sk_callback_lock) for each psock->maps // list of maps this sock is part of hlist_del_rcu(ref_hash_node); .... destroy psock ... write_unlock(sk_callback_lock) Obviously, and demonstrated by syzbot, this is broken because we can have multiple threads deleting entries via hlist_del_rcu(). To fix this we might be tempted to wrap the hlist operation in a bucket lock but that would create a lock inversion problem. In summary to follow locking rules the psocks maps list needs the sk_callback_lock (after this patch maps_lock) but we need the bucket lock to do the hlist_del_rcu. To resolve the lock inversion problem pop the head of the maps list repeatedly and remove the reference until no more are left. If a delete happens in parallel from the BPF API that is OK as well because it will do a similar action, lookup the lock in the map/hash, delete it from the map/hash, and dec the refcnt. We check for this case before doing a destroy on the psock to ensure we don't have two threads tearing down a psock. The new logic is as follows, bpf_tcp_close() e = psock_map_pop(psock->maps) // done with map lock bucket_lock() // lock hash list bucket l = lookup_elem_raw(head, hash, key, key_size); if (l) { //only get here if elmnt was not already removed hlist_del_rcu() ... destroy psock... } bucket_unlock() And finally for all the above to work add missing locking around map operations per above. Then add RCU annotations and use rcu_dereference/rcu_assign_pointer to manage values relying on RCU so that the object is not free'd from sock_hash_free() while it is being referenced in bpf_tcp_close(). Reported-by: syzbot+0ce137753c78f7b6acc1@syzkaller.appspotmail.com Fixes: 81110384441a ("bpf: sockmap, add hash map support") Signed-off-by: John Fastabend Signed-off-by: Daniel Borkmann :040000 040000 f7e8e67f6b97b13e99ace32595bb22001fe04f0e 90569ed4ad1acaf1b63ff5f8062953cdfde54c2a M kernel revisions tested: 21, total time: 5h41m9.995469589s (build: 1h53m57.595803392s, test: 3h41m0.255326821s) first bad commit: e9db4ef6bf4ca9894bb324c76e01b8f1a16b2650 bpf: sockhash fix omitted bucket lock in sock_close cc: ["ast@kernel.org" "daniel@iogearbox.net" "john.fastabend@gmail.com" "linux-kernel@vger.kernel.org" "netdev@vger.kernel.org"] crash: KASAN: use-after-free Write in bpf_tcp_close ================================================================== BUG: KASAN: use-after-free in cmpxchg_size include/asm-generic/atomic-instrumented.h:355 [inline] BUG: KASAN: use-after-free in bpf_tcp_close+0x469/0xb90 kernel/bpf/sockmap.c:344 Write of size 8 at addr ffff88009f0f4880 by task syz-executor.0/17837 CPU: 0 PID: 17837 Comm: syz-executor.0 Not tainted 4.18.0-rc1+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x109/0x15a lib/dump_stack.c:113 print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.9+0x242/0x2fe mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278 cmpxchg_size include/asm-generic/atomic-instrumented.h:355 [inline] bpf_tcp_close+0x469/0xb90 kernel/bpf/sockmap.c:344 inet_release+0xd9/0x1c0 net/ipv4/af_inet.c:427 inet6_release+0x46/0x60 net/ipv6/af_inet6.c:459 __sock_release+0xc2/0x230 net/socket.c:603 sock_close+0x10/0x20 net/socket.c:1186 __fput+0x238/0x780 fs/file_table.c:209 ____fput+0x9/0x10 fs/file_table.c:243 task_work_run+0x111/0x180 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:192 [inline] exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:166 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline] syscall_return_slowpath arch/x86/entry/common.c:268 [inline] do_syscall_64+0x407/0x4d0 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x413201 Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 1b 00 00 c3 48 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01 RSP: 002b:00007fffaac20c40 EFLAGS: 00000293 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000413201 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005 RBP: 0000000000000001 R08: ffffffffffffffff R09: ffffffffffffffff R10: 00007fffaac20d20 R11: 0000000000000293 R12: 0000000000761178 R13: 000000000004a1f8 R14: 000000000004a225 R15: ffffffffffffffff Allocated by task 17843: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 __do_kmalloc_node mm/slab.c:3682 [inline] __kmalloc_node+0x47/0x70 mm/slab.c:3689 kmalloc_node include/linux/slab.h:555 [inline] bpf_map_area_alloc+0x22/0x50 kernel/bpf/syscall.c:147 sock_map_alloc+0x2b4/0x350 kernel/bpf/sockmap.c:1653 find_and_alloc_map kernel/bpf/syscall.c:129 [inline] map_create+0x230/0xc40 kernel/bpf/syscall.c:453 __do_sys_bpf kernel/bpf/syscall.c:2290 [inline] __se_sys_bpf kernel/bpf/syscall.c:2267 [inline] __x64_sys_bpf+0x1d1/0x380 kernel/bpf/syscall.c:2267 do_syscall_64+0xd0/0x4d0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 16131: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kfree+0xcf/0x270 mm/slab.c:3813 kvfree+0x2c/0x30 mm/util.c:442 bpf_map_area_free+0x9/0x10 kernel/bpf/syscall.c:158 sock_map_remove_complete kernel/bpf/sockmap.c:1540 [inline] sock_map_free+0x22e/0x300 kernel/bpf/sockmap.c:1729 bpf_map_free_deferred+0xaf/0xe0 kernel/bpf/syscall.c:262 process_one_work+0x830/0x1650 kernel/workqueue.c:2153 worker_thread+0x85/0xb60 kernel/workqueue.c:2296 kobject: 'loop3' ((____ptrval____)): fill_kobj_path: path = '/devices/virtual/block/loop3' kthread+0x316/0x3d0 kernel/kthread.c:240 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:412 The buggy address belongs to the object at ffff88009f0f4880 which belongs to the cache kmalloc-8192 of size 8192 The buggy address is located 0 bytes inside of 8192-byte region [ffff88009f0f4880, ffff88009f0f6880) The buggy address belongs to the page: page:ffffea00027c3d00 count:1 mapcount:0 mapping:ffff8800aa802080 index:0x0 compound_mapcount: 0 flags: 0x1fffc0000008100(slab|head) raw: 01fffc0000008100 ffffea00021f3b08 ffffea000242d808 ffff8800aa802080 raw: 0000000000000000 ffff88009f0f4880 0000000100000001 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88009f0f4780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88009f0f4800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88009f0f4880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88009f0f4900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88009f0f4980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ==================================================================