bisecting cause commit starting from 244dc2689085d7ff478f7b61841e62e59bea4557 building syzkaller on bc8bc756c272115ed92fad4f716b77f6fb995203 testing commit 244dc2689085d7ff478f7b61841e62e59bea4557 with gcc (GCC) 8.1.0 kernel signature: 4088843844a89dbd8c1052097ec46790c13d53bf all runs: crashed: BUG: unable to handle kernel paging request in do_xdp_generic testing release v5.4 testing commit 219d54332a09e8d8741c1e1982f5eae56099de85 with gcc (GCC) 8.1.0 kernel signature: 49128b75bab75b271ba4ca82680a982f08ce1f74 all runs: crashed: BUG: unable to handle kernel paging request in do_xdp_generic testing release v5.3 testing commit 4d856f72c10ecb060868ed10ff1b1453943fc6c8 with gcc (GCC) 8.1.0 kernel signature: 8d965fe28a84216b7e6ebb591778e15bd51d9cf4 all runs: crashed: BUG: unable to handle kernel paging request in do_xdp_generic testing release v5.2 testing commit 0ecfebd2b52404ae0c54a878c872bb93363ada36 with gcc (GCC) 8.1.0 kernel signature: e6d321cb387f751ff2ceb246bd9af99d1fc2ac19 all runs: crashed: BUG: unable to handle kernel paging request in do_xdp_generic testing release v5.1 testing commit e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd with gcc (GCC) 8.1.0 kernel signature: 4dbc1554a9c0854108c43e8a1c9d4fc91544d12e all runs: crashed: BUG: unable to handle kernel paging request in do_xdp_generic testing release v5.0 testing commit 1c163f4c7b3f621efff9b28a47abb36f7378d783 with gcc (GCC) 8.1.0 kernel signature: 25a05073efed17cd0c724deb71572e3b9d9b8969 all runs: crashed: BUG: unable to handle kernel paging request in do_xdp_generic testing release v4.20 testing commit 8fe28cb58bcb235034b64cbbb7550a8a43fd88be with gcc (GCC) 8.1.0 kernel signature: 2dbc2aa70549190422a4d8085dd9f59219e83bf9 all runs: OK # git bisect start 1c163f4c7b3f621efff9b28a47abb36f7378d783 8fe28cb58bcb235034b64cbbb7550a8a43fd88be Bisecting: 7011 revisions left to test after this (roughly 13 steps) [af7ddd8a627c62a835524b3f5b471edbbbcce025] Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping testing commit af7ddd8a627c62a835524b3f5b471edbbbcce025 with gcc (GCC) 8.1.0 kernel signature: 691c82e788632ee506edaca0cf83e8db5f56ffca all runs: crashed: BUG: unable to handle kernel paging request in do_xdp_generic # git bisect bad af7ddd8a627c62a835524b3f5b471edbbbcce025 Bisecting: 3448 revisions left to test after this (roughly 12 steps) [792bf4d871dea8b69be2aaabdd320d7c6ed15985] Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip testing commit 792bf4d871dea8b69be2aaabdd320d7c6ed15985 with gcc (GCC) 8.1.0 kernel signature: 4f6e6be78b01fb7206cc32eacac202526eee1d99 all runs: OK # git bisect good 792bf4d871dea8b69be2aaabdd320d7c6ed15985 Bisecting: 1729 revisions left to test after this (roughly 11 steps) [aa9d6e0f33aea8a1879e7e53fe0e436943f9ce0c] linux/netlink.h: drop unnecessary extern prefix testing commit aa9d6e0f33aea8a1879e7e53fe0e436943f9ce0c with gcc (GCC) 8.1.0 kernel signature: a7d1f734a4eb22cb9e6d9135a4cb8f9bdaac8037 all runs: crashed: BUG: unable to handle kernel paging request in do_xdp_generic # git bisect bad aa9d6e0f33aea8a1879e7e53fe0e436943f9ce0c Bisecting: 858 revisions left to test after this (roughly 10 steps) [2a95471c3397734ba6869ca3fa084490fb35b40b] Merge branch 'prog_test_run-improvement' testing commit 2a95471c3397734ba6869ca3fa084490fb35b40b with gcc (GCC) 8.1.0 kernel signature: 43073399ed707094af0afc011237995b91ba95d2 all runs: OK # git bisect good 2a95471c3397734ba6869ca3fa084490fb35b40b Bisecting: 412 revisions left to test after this (roughly 9 steps) [addb0679839a1f74da6ec742137558be244dd0e9] Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next testing commit addb0679839a1f74da6ec742137558be244dd0e9 with gcc (GCC) 8.1.0 kernel signature: a59dff2c60db5ca8e7a15dbbd3fd6cf504d8c3a5 all runs: crashed: BUG: unable to handle kernel paging request in do_xdp_generic # git bisect bad addb0679839a1f74da6ec742137558be244dd0e9 Bisecting: 222 revisions left to test after this (roughly 8 steps) [b6f153d3e5a5bfbda17b997bd9e258143aa11809] selftests: mlxsw: Add one-armed router test testing commit b6f153d3e5a5bfbda17b997bd9e258143aa11809 with gcc (GCC) 8.1.0 kernel signature: e952005fb960009669799acda02baf870ddea078 all runs: OK # git bisect good b6f153d3e5a5bfbda17b997bd9e258143aa11809 Bisecting: 119 revisions left to test after this (roughly 7 steps) [d8ed257f313f64e9835e61d1365dea95a0a1c9c6] tcp: handle EOR and FIN conditions the same in tcp_tso_should_defer() testing commit d8ed257f313f64e9835e61d1365dea95a0a1c9c6 with gcc (GCC) 8.1.0 kernel signature: bec7a9362427a2cd3089bde205225de4176799b0 run #0: crashed: BUG: corrupted list in neigh_mark_dead run #1: crashed: BUG: corrupted list in neigh_mark_dead run #2: crashed: KASAN: use-after-free Read in neigh_mark_dead run #3: crashed: BUG: corrupted list in neigh_mark_dead run #4: crashed: KASAN: use-after-free Read in neigh_mark_dead run #5: crashed: BUG: corrupted list in neigh_mark_dead run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad d8ed257f313f64e9835e61d1365dea95a0a1c9c6 Bisecting: 51 revisions left to test after this (roughly 6 steps) [6b241e411607a8f78bee74b96655e9d7835ea8ba] Merge branch 'net-aquantia-add-RSS-configuration' testing commit 6b241e411607a8f78bee74b96655e9d7835ea8ba with gcc (GCC) 8.1.0 kernel signature: c677c5afdf81e2d665bcff6ed79d32a952a7de49 all runs: OK # git bisect good 6b241e411607a8f78bee74b96655e9d7835ea8ba Bisecting: 25 revisions left to test after this (roughly 5 steps) [c3529177db471b964fbe327ffb801266f0482d64] net: hns3: handle hw errors of SSU testing commit c3529177db471b964fbe327ffb801266f0482d64 with gcc (GCC) 8.1.0 kernel signature: 4337ad10e68fc84ddfec8c6b0cad388d61153a8f all runs: OK # git bisect good c3529177db471b964fbe327ffb801266f0482d64 Bisecting: 12 revisions left to test after this (roughly 4 steps) [120d633f199b264e0ad4c98eedc0564a89171c1d] Merge branch 'platform-data-controls-for-mdio-gpio' testing commit 120d633f199b264e0ad4c98eedc0564a89171c1d with gcc (GCC) 8.1.0 kernel signature: d301965e32cab3c1de29096cff599b0033f3ad82 run #0: crashed: BUG: corrupted list in neigh_mark_dead run #1: crashed: KASAN: use-after-free Read in neigh_mark_dead run #2: crashed: BUG: corrupted list in neigh_mark_dead run #3: crashed: BUG: corrupted list in ___neigh_create run #4: crashed: BUG: corrupted list in neigh_mark_dead run #5: crashed: KASAN: use-after-free Read in neigh_mark_dead run #6: crashed: BUG: corrupted list in ___neigh_create run #7: crashed: BUG: corrupted list in neigh_mark_dead run #8: crashed: BUG: corrupted list in neigh_mark_dead run #9: OK # git bisect bad 120d633f199b264e0ad4c98eedc0564a89171c1d Bisecting: 6 revisions left to test after this (roughly 3 steps) [dfe465d33e7fce145746d836ddb31f0e27f28e28] tc-testing: Add new TdcResults module testing commit dfe465d33e7fce145746d836ddb31f0e27f28e28 with gcc (GCC) 8.1.0 kernel signature: a8be95ce0314dbfa4afa6c50c12f7f115f2ee02b run #0: crashed: KASAN: slab-out-of-bounds Read in neigh_mark_dead run #1: crashed: KASAN: use-after-free Read in neigh_mark_dead run #2: crashed: BUG: corrupted list in corrupted run #3: crashed: KASAN: use-after-free Read in neigh_mark_dead run #4: OK run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad dfe465d33e7fce145746d836ddb31f0e27f28e28 Bisecting: 2 revisions left to test after this (roughly 2 steps) [58956317c8de52009d1a38a721474c24aef74fe7] neighbor: Improve garbage collection testing commit 58956317c8de52009d1a38a721474c24aef74fe7 with gcc (GCC) 8.1.0 kernel signature: e831ac09f89acfcf70b81a9bb05748406460358d run #0: crashed: BUG: corrupted list in ___neigh_create run #1: crashed: BUG: corrupted list in neigh_mark_dead run #2: crashed: KASAN: use-after-free Read in neigh_mark_dead run #3: crashed: BUG: corrupted list in ___neigh_create run #4: crashed: KASAN: use-after-free Read in neigh_mark_dead run #5: OK run #6: OK run #7: OK run #8: OK run #9: OK # git bisect bad 58956317c8de52009d1a38a721474c24aef74fe7 Bisecting: 0 revisions left to test after this (roughly 1 step) [12edfdfc79860f42fa493f81518e040376b6a5bc] Merge branch 'hns3-error-handling' testing commit 12edfdfc79860f42fa493f81518e040376b6a5bc with gcc (GCC) 8.1.0 kernel signature: 6af9026e35f43f2a55dfdf997a07bff103387963 all runs: OK # git bisect good 12edfdfc79860f42fa493f81518e040376b6a5bc 58956317c8de52009d1a38a721474c24aef74fe7 is the first bad commit commit 58956317c8de52009d1a38a721474c24aef74fe7 Author: David Ahern Date: Fri Dec 7 12:24:57 2018 -0800 neighbor: Improve garbage collection The existing garbage collection algorithm has a number of problems: 1. The gc algorithm will not evict PERMANENT entries as those entries are managed by userspace, yet the existing algorithm walks the entire hash table which means it always considers PERMANENT entries when looking for entries to evict. In some use cases (e.g., EVPN) there can be tens of thousands of PERMANENT entries leading to wasted CPU cycles when gc kicks in. As an example, with 32k permanent entries, neigh_alloc has been observed taking more than 4 msec per invocation. 2. Currently, when the number of neighbor entries hits gc_thresh2 and the last flush for the table was more than 5 seconds ago gc kicks in walks the entire hash table evicting *all* entries not in PERMANENT or REACHABLE state and not marked as externally learned. There is no discriminator on when the neigh entry was created or if it just moved from REACHABLE to another NUD_VALID state (e.g., NUD_STALE). It is possible for entries to be created or for established neighbor entries to be moved to STALE (e.g., an external node sends an ARP request) right before the 5 second window lapses: -----|---------x|----------|----- t-5 t t+5 If that happens those entries are evicted during gc causing unnecessary thrashing on neighbor entries and userspace caches trying to track them. Further, this contradicts the description of gc_thresh2 which says "Entries older than 5 seconds will be cleared". One workaround is to make gc_thresh2 == gc_thresh3 but that negates the whole point of having separate thresholds. 3. Clearing *all* neigh non-PERMANENT/REACHABLE/externally learned entries when gc_thresh2 is exceeded is over kill and contributes to trashing especially during startup. This patch addresses these problems as follows: 1. Use of a separate list_head to track entries that can be garbage collected along with a separate counter. PERMANENT entries are not added to this list. The gc_thresh parameters are only compared to the new counter, not the total entries in the table. The forced_gc function is updated to only walk this new gc_list looking for entries to evict. 2. Entries are added to the list head at the tail and removed from the front. 3. Entries are only evicted if they were last updated more than 5 seconds ago, adhering to the original intent of gc_thresh2. 4. Forced gc is stopped once the number of gc_entries drops below gc_thresh2. 5. Since gc checks do not apply to PERMANENT entries, gc levels are skipped when allocating a new neighbor for a PERMANENT entry. By extension this means there are no explicit limits on the number of PERMANENT entries that can be created, but this is no different than FIB entries or FDB entries. Signed-off-by: David Ahern Signed-off-by: David S. Miller Documentation/networking/ip-sysctl.txt | 4 +- include/net/neighbour.h | 3 + net/core/neighbour.c | 119 +++++++++++++++++++++++---------- 3 files changed, 90 insertions(+), 36 deletions(-) culprit signature: e831ac09f89acfcf70b81a9bb05748406460358d parent signature: 6af9026e35f43f2a55dfdf997a07bff103387963 revisions tested: 20, total time: 4h39m39.363676946s (build: 2h4m6.848188067s, test: 2h33m3.075716434s) first bad commit: 58956317c8de52009d1a38a721474c24aef74fe7 neighbor: Improve garbage collection cc: ["corbet@lwn.net" "davem@davemloft.net" "dsahern@gmail.com" "linux-doc@vger.kernel.org" "linux-kernel@vger.kernel.org" "netdev@vger.kernel.org"] crash: KASAN: use-after-free Read in neigh_mark_dead A link change request failed with some changes committed already. Interface lo may have been left with an inconsistent configuration, please check. A link change request failed with some changes committed already. Interface lo may have been left with an inconsistent configuration, please check. ================================================================== BUG: KASAN: use-after-free in __list_del_entry_valid+0xf1/0x100 lib/list_debug.c:51 Read of size 8 at addr ffff8880898f9950 by task kworker/0:10/7703 CPU: 0 PID: 7703 Comm: kworker/0:10 Not tainted 4.20.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events_power_efficient neigh_periodic_work Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x16b/0x224 lib/dump_stack.c:113 print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 NOHZ: local_softirq_pending 08 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __list_del_entry_valid+0xf1/0x100 lib/list_debug.c:51 __list_del_entry include/linux/list.h:117 [inline] list_del_init include/linux/list.h:159 [inline] neigh_mark_dead+0x86/0x210 net/core/neighbour.c:125 neigh_periodic_work+0x56f/0x870 net/core/neighbour.c:905 process_one_work+0x830/0x1670 kernel/workqueue.c:2153 worker_thread+0x85/0xb60 kernel/workqueue.c:2296 kthread+0x324/0x3e0 kernel/kthread.c:246 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 Allocated by task 3925: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 __do_kmalloc_node mm/slab.c:3684 [inline] __kmalloc_node_track_caller+0x50/0x70 mm/slab.c:3698 __kmalloc_reserve.isra.43+0x2c/0xc0 net/core/skbuff.c:137 __alloc_skb+0xd7/0x580 net/core/skbuff.c:205 alloc_skb include/linux/skbuff.h:1008 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline] netlink_sendmsg+0x810/0xc40 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xb5/0xf0 net/socket.c:631 ___sys_sendmsg+0x647/0x950 net/socket.c:2116 __sys_sendmsg+0xd9/0x180 net/socket.c:2154 __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg net/socket.c:2161 [inline] __x64_sys_sendmsg+0x73/0xb0 net/socket.c:2161 do_syscall_64+0xd0/0x4d0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 7370: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kfree+0xcf/0x220 mm/slab.c:3817 skb_free_head+0x74/0x90 net/core/skbuff.c:550 skb_release_data+0x510/0x780 net/core/skbuff.c:570 skb_release_all+0x3d/0x50 net/core/skbuff.c:627 __kfree_skb net/core/skbuff.c:641 [inline] consume_skb+0x91/0x270 net/core/skbuff.c:701 skb_free_datagram+0x12/0xc0 net/core/datagram.c:329 netlink_recvmsg+0x5e3/0xe60 net/netlink/af_netlink.c:1996 sock_recvmsg_nosec net/socket.c:794 [inline] sock_recvmsg+0xb7/0xf0 net/socket.c:801 ___sys_recvmsg+0x219/0x530 net/socket.c:2278 __sys_recvmsg+0xd6/0x180 net/socket.c:2327 __do_sys_recvmsg net/socket.c:2337 [inline] __se_sys_recvmsg net/socket.c:2334 [inline] __x64_sys_recvmsg+0x73/0xb0 net/socket.c:2334 do_syscall_64+0xd0/0x4d0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8880898f96c0 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 656 bytes inside of 1024-byte region [ffff8880898f96c0, ffff8880898f9ac0) The buggy address belongs to the page: page:ffffea0002263e00 count:1 mapcount:0 mapping:ffff88812c3f6ac0 index:0x0 compound_mapcount: 0 flags: 0xfffe0000010200(slab|head) raw: 00fffe0000010200 ffffea0002a53f88 ffffea00025d5208 ffff88812c3f6ac0 raw: 0000000000000000 ffff8880898f8040 0000000100000007 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880898f9800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880898f9880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8880898f9900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880898f9980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880898f9a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ==================================================================