bisecting cause commit starting from b4bb878f3eb3e604ebfe83bbc17eb7af8d99cbf4 building syzkaller on 63631df1539816bc62c7be40779c5f3e23b23b2f testing commit b4bb878f3eb3e604ebfe83bbc17eb7af8d99cbf4 with gcc (GCC) 8.1.0 kernel signature: 300d2f6536745dd94a36ceff655a94e9043e97b5d4773e9e577837d1f033ba7e all runs: crashed: BUG: sleeping function called from invalid context in rxe_alloc_nl testing release v5.10 testing commit 2c85ebc57b3e1817b6ce1a6b703928e113a90442 with gcc (GCC) 8.1.0 kernel signature: 395a614079c9030036488f72c030c4708fae86b9aba956ca15d15003efd3a593 all runs: OK # git bisect start b4bb878f3eb3e604ebfe83bbc17eb7af8d99cbf4 2c85ebc57b3e1817b6ce1a6b703928e113a90442 Bisecting: 9314 revisions left to test after this (roughly 13 steps) [009bd55dfcc857d8b00a5bbb17a8db060317af6f] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma testing commit 009bd55dfcc857d8b00a5bbb17a8db060317af6f with gcc (GCC) 8.1.0 kernel signature: 0854d3cbe7ad28d8fd77693bd87ed4bf26d9efdd89516d88a75a859b18b2c091 all runs: OK # git bisect good 009bd55dfcc857d8b00a5bbb17a8db060317af6f Bisecting: 4657 revisions left to test after this (roughly 12 steps) [8a2ba7312a0fa908c5c3603d3438c0fce7611305] mm/huge_memory.c: update tlb entry if pmd is changed testing commit 8a2ba7312a0fa908c5c3603d3438c0fce7611305 with gcc (GCC) 8.1.0 kernel signature: 264fc8b861b851c324bba19597be80c8252bd72a5dd4bd1b544503b2046f837d all runs: OK # git bisect good 8a2ba7312a0fa908c5c3603d3438c0fce7611305 Bisecting: 2339 revisions left to test after this (roughly 11 steps) [e7939c7f0e5789fdf43e2dcf8bccbe76afd626af] Merge remote-tracking branch 'wireless-drivers-next/master' testing commit e7939c7f0e5789fdf43e2dcf8bccbe76afd626af with gcc (GCC) 8.1.0 kernel signature: 30f9ad7407153e187402e6b64840236000bf944722036dba49ac3cc7162afdfb all runs: crashed: BUG: sleeping function called from invalid context in rxe_alloc_nl # git bisect bad e7939c7f0e5789fdf43e2dcf8bccbe76afd626af Bisecting: 1156 revisions left to test after this (roughly 10 steps) [6ca1a413a83005d607b7e30aa302c995654e5f55] Merge remote-tracking branch 'mips/mips-next' testing commit 6ca1a413a83005d607b7e30aa302c995654e5f55 with gcc (GCC) 8.1.0 kernel signature: a7ecfa21716693cf39a78090c571520259a63c6565e786f14cf93d3027e5e171 all runs: OK # git bisect good 6ca1a413a83005d607b7e30aa302c995654e5f55 Bisecting: 520 revisions left to test after this (roughly 9 steps) [433babd5837484734b9e1c347c2410a04bd6b776] Merge remote-tracking branch 'v4l-dvb/master' testing commit 433babd5837484734b9e1c347c2410a04bd6b776 with gcc (GCC) 8.1.0 kernel signature: 5114d21991bf70c4308c5c3945859d25617a029359ff3727ba7ed2b30d9c3505 all runs: OK # git bisect good 433babd5837484734b9e1c347c2410a04bd6b776 Bisecting: 252 revisions left to test after this (roughly 8 steps) [747fdd47ed4fc00a6c08bc59710c469f30c1594c] Merge tag 'linux-can-next-for-5.12-20210114' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next testing commit 747fdd47ed4fc00a6c08bc59710c469f30c1594c with gcc (GCC) 8.1.0 kernel signature: d8b1622773c38fc4649230bcf6de44f98118f4920b997274277c5141aed52520 all runs: OK # git bisect good 747fdd47ed4fc00a6c08bc59710c469f30c1594c Bisecting: 142 revisions left to test after this (roughly 7 steps) [a98c0c47420412ef94d6f45f9ae607258929aa10] net: bridge: check vlan with eth_type_vlan() method testing commit a98c0c47420412ef94d6f45f9ae607258929aa10 with gcc (GCC) 8.1.0 kernel signature: 69a1711f9043e9c345ef398971c108e73f1c4f4459083b8eb6f6d8f0e1136b74 all runs: OK # git bisect good a98c0c47420412ef94d6f45f9ae607258929aa10 Bisecting: 70 revisions left to test after this (roughly 6 steps) [87d10125df779685e7280607335bf7fcfca09aa9] Merge remote-tracking branch 'thermal/thermal/linux-next' testing commit 87d10125df779685e7280607335bf7fcfca09aa9 with gcc (GCC) 8.1.0 kernel signature: 6e3185a62bc27074a3413006eb99953817538bef69da4ae7f085d6cf1c6a7e6f all runs: OK # git bisect good 87d10125df779685e7280607335bf7fcfca09aa9 Bisecting: 35 revisions left to test after this (roughly 5 steps) [266c2f5476a8a306f9800b521f944df07fbc5604] Merge remote-tracking branch 'rdma/for-next' testing commit 266c2f5476a8a306f9800b521f944df07fbc5604 with gcc (GCC) 8.1.0 kernel signature: 2fa6959390bbfccde12f30b57934803d3f7915ad934c5ef187e5a7177077fe66 all runs: crashed: BUG: sleeping function called from invalid context in rxe_alloc_nl # git bisect bad 266c2f5476a8a306f9800b521f944df07fbc5604 Bisecting: 17 revisions left to test after this (roughly 4 steps) [f991fdac813f1598a9bb17b602ce04812ba9148c] RDMA/rtrs-srv: Use sysfs_remove_file_self for disconnect testing commit f991fdac813f1598a9bb17b602ce04812ba9148c with gcc (GCC) 8.1.0 kernel signature: e87f9f926f3c08c5cf395dcfa840c8a61cdf04491b93c2b5a1b3f47f65ef609b all runs: crashed: BUG: sleeping function called from invalid context in rxe_alloc_nl # git bisect bad f991fdac813f1598a9bb17b602ce04812ba9148c Bisecting: 8 revisions left to test after this (roughly 3 steps) [1d11c1b7f9ff28196d66d995f11fcf3101301fe9] RDMA/rxe: Remove unneeded RXE_POOL_ATOMIC flag testing commit 1d11c1b7f9ff28196d66d995f11fcf3101301fe9 with gcc (GCC) 8.1.0 kernel signature: 43a0cef05f0bd37855ea648137fba534ca21405080b3e0c8afe30ec66b677db9 all runs: OK # git bisect good 1d11c1b7f9ff28196d66d995f11fcf3101301fe9 Bisecting: 4 revisions left to test after this (roughly 2 steps) [91a42c5becb685e1ae73554726906dfe74a8afdd] RDMA/rxe: Make add/drop key/index APIs type safe testing commit 91a42c5becb685e1ae73554726906dfe74a8afdd with gcc (GCC) 8.1.0 kernel signature: 9c6a4b30359f905ed44620a0ad5fa2bf6c53da05db939c67c7330fc564edbc88 all runs: OK # git bisect good 91a42c5becb685e1ae73554726906dfe74a8afdd Bisecting: 2 revisions left to test after this (roughly 1 step) [8a48ac7f6c24973863b42d03156d5e34c46c2971] RDMA/rxe: Fix race in rxe_mcast.c testing commit 8a48ac7f6c24973863b42d03156d5e34c46c2971 with gcc (GCC) 8.1.0 kernel signature: b7f1521e62ab63d3403f61dd2cb83ae1d42ada687618bcc8cd6c11d9d1f72178 all runs: crashed: BUG: sleeping function called from invalid context in rxe_alloc_nl # git bisect bad 8a48ac7f6c24973863b42d03156d5e34c46c2971 Bisecting: 0 revisions left to test after this (roughly 0 steps) [3853c35e243d56238159e8365b6aca410bdd4576] RDMA/rxe: Add unlocked versions of pool APIs testing commit 3853c35e243d56238159e8365b6aca410bdd4576 with gcc (GCC) 8.1.0 kernel signature: d721d05cd80453fdc67894ed40de2f5122ff29eb6e07f35161e0dc1c482d0a09 all runs: crashed: BUG: sleeping function called from invalid context in rxe_alloc_nl # git bisect bad 3853c35e243d56238159e8365b6aca410bdd4576 3853c35e243d56238159e8365b6aca410bdd4576 is the first bad commit commit 3853c35e243d56238159e8365b6aca410bdd4576 Author: Bob Pearson Date: Wed Dec 16 17:15:49 2020 -0600 RDMA/rxe: Add unlocked versions of pool APIs The existing pool APIs use the rw_lock pool_lock to protect critical sections that change the pool state. This does not correctly implement a typical sequence like the following elem = if found use elem else elem = Which is racy if multiple threads are attempting to perform this at the same time. We want the second thread to use the elem created by the first thread not create two equivalent elems. This patch adds new APIs that are the same as existing APIs but do not take the pool_lock. A caller can then take the lock and perform a sequence of pool operations and then release the lock. Link: https://lore.kernel.org/r/20201216231550.27224-7-rpearson@hpe.com Signed-off-by: Bob Pearson Signed-off-by: Jason Gunthorpe drivers/infiniband/sw/rxe/rxe_pool.c | 82 +++++++++++++++++++++++++++--------- drivers/infiniband/sw/rxe/rxe_pool.h | 41 +++++++++++++++--- 2 files changed, 97 insertions(+), 26 deletions(-) culprit signature: d721d05cd80453fdc67894ed40de2f5122ff29eb6e07f35161e0dc1c482d0a09 parent signature: 9c6a4b30359f905ed44620a0ad5fa2bf6c53da05db939c67c7330fc564edbc88 revisions tested: 16, total time: 2h58m28.751594872s (build: 1h13m23.20482725s, test: 1h43m26.778865007s) first bad commit: 3853c35e243d56238159e8365b6aca410bdd4576 RDMA/rxe: Add unlocked versions of pool APIs recipients (to): ["jgg@nvidia.com" "rpearson@hpe.com" "rpearsonhpe@gmail.com"] recipients (cc): [] crash: BUG: sleeping function called from invalid context in rxe_alloc_nl infiniband syz2: set active infiniband syz2: added bond_slave_0 BUG: sleeping function called from invalid context at drivers/infiniband/sw/rxe/rxe_pool.c:346 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 10213, name: syz-executor.3 6 locks held by syz-executor.3/10213: #0: ffffffff878bbef8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:164 [inline] #0: ffffffff878bbef8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] #0: ffffffff878bbef8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv+0xa9/0x450 drivers/infiniband/core/netlink.c:259 #1: ffffffff85211b10 (link_ops_rwsem){++++}-{3:3}, at: nldev_newlink+0x11f/0x1f0 drivers/infiniband/core/nldev.c:1545 #2: ffffffff8520e330 (devices_rwsem){++++}-{3:3} , at: enable_device_and_get+0x52/0x1f0 drivers/infiniband/core/device.c:1307 #3: ffffffff8520e230 (clients_rwsem){++++}-{3:3}, at: enable_device_and_get+0x7e/0x1f0 drivers/infiniband/core/device.c:1315 #4: ffff88811f444598 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x122/0x1d0 drivers/infiniband/core/device.c:715 #5: ffff88811f445640 (&pool->pool_lock){....}-{2:2}, at: rxe_alloc+0x13/0x40 drivers/infiniband/sw/rxe/rxe_pool.c:384 irq event stamp: 2548 hardirqs last enabled at (2547): [] __raw_read_unlock_irqrestore include/linux/rwlock_api_smp.h:235 [inline] hardirqs last enabled at (2547): [] _raw_read_unlock_irqrestore+0x42/0x50 kernel/locking/spinlock.c:263 hardirqs last disabled at (2548): [] __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:157 [inline] hardirqs last disabled at (2548): [] _raw_read_lock_irqsave+0x7d/0x80 kernel/locking/spinlock.c:231 softirqs last enabled at (2538): [] __do_softirq+0x3b9/0x566 kernel/softirq.c:370 softirqs last disabled at (2445): [] asm_call_irq_on_stack+0xf/0x20 Preemption disabled at: [<0000000000000000>] 0x0 CPU: 0 PID: 10213 Comm: syz-executor.3 Not tainted 5.11.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x77/0x97 lib/dump_stack.c:120 ___might_sleep.cold.120+0xf2/0x106 kernel/sched/core.c:7901 rxe_alloc_nl+0x145/0x1b0 drivers/infiniband/sw/rxe/rxe_pool.c:346 rxe_alloc+0x1e/0x40 drivers/infiniband/sw/rxe/rxe_pool.c:385 rxe_get_dma_mr+0x1a/0x80 drivers/infiniband/sw/rxe/rxe_verbs.c:870 __ib_alloc_pd+0xb9/0x180 drivers/infiniband/core/verbs.c:299 ib_mad_port_open drivers/infiniband/core/mad.c:2981 [inline] ib_mad_init_device+0x141/0x540 drivers/infiniband/core/mad.c:3092 add_client_context+0x130/0x1d0 drivers/infiniband/core/device.c:717 enable_device_and_get+0xd4/0x1f0 drivers/infiniband/core/device.c:1317 ib_register_device+0x3f9/0x4b0 drivers/infiniband/core/device.c:1399 rxe_register_device+0x10f/0x120 drivers/infiniband/sw/rxe/rxe_verbs.c:1147 rxe_net_add+0x38/0x70 drivers/infiniband/sw/rxe/rxe_net.c:489 rxe_newlink+0x53/0x60 drivers/infiniband/sw/rxe/rxe.c:269 nldev_newlink+0x13f/0x1f0 drivers/infiniband/core/nldev.c:1555 rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline] rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x13d/0x450 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x19a/0x270 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x248/0x480 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0x2b/0x40 net/socket.c:672 ____sys_sendmsg+0x1ed/0x230 net/socket.c:2345 ___sys_sendmsg+0x77/0xb0 net/socket.c:2399 __sys_sendmsg+0x52/0xa0 net/socket.c:2432 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45e219 Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f6f50cc9c68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 000000000045e219 RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000003 RBP: 000000000119bfc0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000119bf8c R13: 00007ffd2d12fb4f R14: 00007f6f50cca9c0 R15: 000000000119bf8c RDS/IB: syz2: added smc: adding ib device syz2 with port count 1 smc: ib device syz2 port 1 has pnetid