syzbot


ID: 10ce9772-673a-4cf2-916e-4c32693bd07c
Workflow: assessment-kcsan
Result: Benign: ✅  Confident: ✅
Bug: KCSAN: data-race in __filemap_remove_folio / folio_mapping (2)
Created: 2026/01/14 20:44
Started: 2026/01/14 20:44
Finished: 2026/01/14 20:47
Revision: a9d6a79219801d2130df3b1a792c57f0e5428e9f

Explanation:
The data race reported by KCSAN occurs between `__filemap_remove_folio` (writing to `folio->mapping`) and `folio_mapping` (reading from `folio->mapping`).

### Analysis

**1. The Write Side:**
In `__filemap_remove_folio`, which is called during folio truncation or inode eviction (as seen in the stack trace from `shmem_evict_inode`), the folio is being removed from the page cache. The code sets `folio->mapping = NULL` at `mm/filemap.c:144` (within the inlined `page_cache_delete`). This operation is performed while holding the folio lock and the `i_pages` lock of the address space.
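For reference, here is the write side as returned by the codesearch tool later in this trajectory (mm/filemap.c, with a marker comment added on the racing store):

```c
static void page_cache_delete(struct address_space *mapping,
			      struct folio *folio, void *shadow)
{
	XA_STATE(xas, &mapping->i_pages, folio->index);
	long nr = 1;

	mapping_set_update(&xas, mapping);

	xas_set_order(&xas, folio->index, folio_order(folio));
	nr = folio_nr_pages(folio);

	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);

	xas_store(&xas, shadow);
	xas_init_marks(&xas);

	folio->mapping = NULL;	/* <-- the racing 8-byte write */
	/* Leave folio->index set: truncation lookup relies upon it */
	mapping->nrpages -= nr;
}
```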

**2. The Read Side:**
In `folio_mapping`, the kernel reads `folio->mapping` to determine which address space the folio belongs to. This is called from `folio_evictable` in `mm/internal.h`, which is used by the multi-generational LRU (MGLRU) reclaimer (`scan_folios` -> `sort_folio`).
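The read side, quoted from the codesearch output for mm/util.c below (marker comment added on the racing load):

```c
struct address_space *folio_mapping(const struct folio *folio)
{
	struct address_space *mapping;

	/* This happens if someone calls flush_dcache_page on slab page */
	if (unlikely(folio_test_slab(folio)))
		return NULL;

	if (unlikely(folio_test_swapcache(folio)))
		return swap_address_space(folio->swap);

	mapping = folio->mapping;	/* <-- the racing 8-byte read */
	if ((unsigned long)mapping & FOLIO_MAPPING_FLAGS)
		return NULL;

	return mapping;
}
```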

**3. Why the race is benign:**
*   **Atomic Access:** On the reported architecture (x86_64), aligned 8-byte pointer writes and reads are atomic. The reclaimer will see either the original valid `address_space` pointer or `NULL`.
*   **Object Lifetime (RCU):** `folio_evictable` wraps the call to `folio_mapping` and the subsequent use of the mapping pointer within an `rcu_read_lock()` / `rcu_read_unlock()` block (see the source quoted after this list). In the Linux kernel, inodes (and their embedded `address_space` structures, especially for shmem) are freed using RCU (`call_rcu` in `destroy_inode`). Thus, even if the reclaimer reads a "stale" mapping pointer while the inode is being evicted, the memory pointed to by that mapping remains valid for the duration of the RCU critical section.
*   **Safety of `NULL`:** If the reclaimer reads `NULL`, the `mapping_unevictable` function (which receives the result of `folio_mapping`) explicitly handles `NULL` by returning `false`.
*   **Logic Impact:** The result of `folio_evictable` is used as a heuristic to decide whether to move a folio to the unevictable LRU list. Since the folio is in the process of being truncated and removed from the page cache anyway, its temporary placement on one LRU list versus another before its final `put` is inconsequential and does not lead to any corruption or inconsistent state.
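
Both the RCU point and the `NULL` point can be verified directly in the source quoted by the codesearch tool later in this trajectory (mm/internal.h and include/linux/pagemap.h):

```c
/* mm/internal.h */
static inline bool folio_evictable(struct folio *folio)
{
	bool ret;

	/* Prevent address_space of inode and swap cache from being freed */
	rcu_read_lock();
	ret = !mapping_unevictable(folio_mapping(folio)) &&
			!folio_test_mlocked(folio);
	rcu_read_unlock();
	return ret;
}

/* include/linux/pagemap.h */
static inline bool mapping_unevictable(const struct address_space *mapping)
{
	return mapping && test_bit(AS_UNEVICTABLE, &mapping->flags);
}
```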

In summary, this is a classic example of a "best-effort" read of a pointer that is being cleared during object teardown. The access is protected by RCU to ensure memory safety, and the logic handles both the old and new values correctly.
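To illustrate the atomicity argument in isolation, here is a minimal userspace C11 sketch (hypothetical code, not from the kernel; names like `folio_mapping_ptr` are made up for illustration): a concurrent reader of an aligned pointer that one thread clears observes either the old value or `NULL`, never a torn value.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

struct address_space { unsigned long flags; };

static struct address_space mapping_obj;
/* Stand-in for folio->mapping: an aligned pointer, cleared during teardown. */
static _Atomic(struct address_space *) folio_mapping_ptr = &mapping_obj;

/* Writer: clears the pointer, as page_cache_delete does for folio->mapping. */
static void *writer(void *arg)
{
	atomic_store_explicit(&folio_mapping_ptr, NULL, memory_order_relaxed);
	return NULL;
}

/* Reader: best-effort load, as folio_mapping does; must tolerate NULL. */
static void *reader(void *arg)
{
	struct address_space *m =
		atomic_load_explicit(&folio_mapping_ptr, memory_order_relaxed);
	/* Either the old valid pointer or NULL -- never a torn mix. */
	assert(m == NULL || m == &mapping_obj);
	return NULL;
}

int main(void)
{
	pthread_t w, r;
	pthread_create(&w, NULL, writer, NULL);
	pthread_create(&r, NULL, reader, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	puts("ok: reader saw a consistent pointer value");
	return 0;
}
```

In the kernel the same guarantee comes from the architecture (aligned word-sized accesses on x86_64) rather than from C11 atomics, which is why KCSAN flags the access pattern even though the analysis above concludes it is benign.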

Final result: **Benign**

Crash report:
==================================================================
BUG: KCSAN: data-race in __filemap_remove_folio / folio_mapping

write to 0xffffea0005fc3398 of 8 bytes by task 14341 on cpu 0:
 page_cache_delete mm/filemap.c:145 [inline]
 __filemap_remove_folio+0x1a5/0x2a0 mm/filemap.c:228
 filemap_remove_folio+0x6d/0x1d0 mm/filemap.c:257
 truncate_inode_folio+0x42/0x50 mm/truncate.c:176
 shmem_undo_range+0x244/0xa80 mm/shmem.c:1133
 shmem_truncate_range mm/shmem.c:1249 [inline]
 shmem_evict_inode+0x12e/0x510 mm/shmem.c:1379
 evict+0x2af/0x510 fs/inode.c:837
 iput_final fs/inode.c:1954 [inline]
 iput+0x4bd/0x650 fs/inode.c:2006
 dentry_unlink_inode+0x24f/0x260 fs/dcache.c:467
 __dentry_kill+0x18d/0x4b0 fs/dcache.c:670
 finish_dput+0x2b/0x200 fs/dcache.c:879
 dput+0x52/0x60 fs/dcache.c:928
 __fput+0x444/0x650 fs/file_table.c:476
 ____fput+0x1c/0x30 fs/file_table.c:496
 task_work_run+0x131/0x1a0 kernel/task_work.c:233
 exit_task_work include/linux/task_work.h:40 [inline]
 do_exit+0x467/0x1590 kernel/exit.c:971
 do_group_exit+0xff/0x140 kernel/exit.c:1112
 get_signal+0xe58/0xf70 kernel/signal.c:3034
 arch_do_signal_or_restart+0x96/0x450 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:41 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:75 [inline]
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
 irqentry_exit_to_user_mode_prepare include/linux/irq-entry-common.h:270 [inline]
 irqentry_exit_to_user_mode include/linux/irq-entry-common.h:339 [inline]
 irqentry_exit+0xf9/0x560 kernel/entry/common.c:196
 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618

read to 0xffffea0005fc3398 of 8 bytes by task 14423 on cpu 1:
 folio_mapping+0xa1/0xe0 mm/util.c:703
 folio_evictable mm/internal.h:499 [inline]
 sort_folio mm/vmscan.c:4445 [inline]
 scan_folios mm/vmscan.c:4573 [inline]
 isolate_folios mm/vmscan.c:4669 [inline]
 evict_folios+0xe0b/0x35a0 mm/vmscan.c:4699
 try_to_shrink_lruvec+0x5b5/0x950 mm/vmscan.c:4874
 lru_gen_shrink_lruvec mm/vmscan.c:5023 [inline]
 shrink_lruvec+0x22e/0x1b50 mm/vmscan.c:5784
 shrink_node_memcgs mm/vmscan.c:6020 [inline]
 shrink_node+0x66c/0x2010 mm/vmscan.c:6061
 shrink_zones mm/vmscan.c:6300 [inline]
 do_try_to_free_pages+0x3f6/0xcd0 mm/vmscan.c:6362
 try_to_free_mem_cgroup_pages+0x1ab/0x410 mm/vmscan.c:6690
 try_charge_memcg+0x383/0xa10 mm/memcontrol.c:2388
 obj_cgroup_charge_pages+0xa6/0x150 mm/memcontrol.c:2823
 __memcg_kmem_charge_page+0x9f/0x170 mm/memcontrol.c:2867
 __alloc_frozen_pages_noprof+0x18f/0x360 mm/page_alloc.c:5227
 alloc_pages_mpol+0xb3/0x260 mm/mempolicy.c:2486
 alloc_frozen_pages_noprof mm/mempolicy.c:2557 [inline]
 alloc_pages_noprof+0x90/0x130 mm/mempolicy.c:2577
 vm_area_alloc_pages mm/vmalloc.c:3718 [inline]
 __vmalloc_area_node mm/vmalloc.c:3863 [inline]
 __vmalloc_node_range_noprof+0xa7b/0x1310 mm/vmalloc.c:4051
 __kvmalloc_node_noprof+0x492/0x6b0 mm/slub.c:7164
 ip_set_alloc+0x24/0x30 net/netfilter/ipset/ip_set_core.c:261
 hash_netiface_create+0x282/0x740 net/netfilter/ipset/ip_set_hash_gen.h:1568
 ip_set_create+0x3cc/0x970 net/netfilter/ipset/ip_set_core.c:1109
 nfnetlink_rcv_msg+0x4c6/0x590 net/netfilter/nfnetlink.c:302
 netlink_rcv_skb+0x123/0x220 net/netlink/af_netlink.c:2550
 nfnetlink_rcv+0x167/0x16c0 net/netfilter/nfnetlink.c:669
 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
 netlink_unicast+0x5c0/0x690 net/netlink/af_netlink.c:1344
 netlink_sendmsg+0x58b/0x6b0 net/netlink/af_netlink.c:1894
 sock_sendmsg_nosec net/socket.c:727 [inline]
 __sock_sendmsg+0x145/0x180 net/socket.c:742
 ____sys_sendmsg+0x31e/0x4a0 net/socket.c:2592
 ___sys_sendmsg+0x17b/0x1d0 net/socket.c:2646
 __sys_sendmsg net/socket.c:2678 [inline]
 __do_sys_sendmsg net/socket.c:2683 [inline]
 __se_sys_sendmsg net/socket.c:2681 [inline]
 __x64_sys_sendmsg+0xd4/0x160 net/socket.c:2681
 x64_sys_call+0x17ba/0x3000 arch/x86/include/generated/asm/syscalls_64.h:47
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xca/0x2b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0xffff88811b46eaa0 -> 0x0000000000000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 14423 Comm: syz.7.15806 Tainted: G        W           syzkaller #0 PREEMPT(voluntary) 
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
==================================================================

Trajectory:
Seq Timestamp Type Name Duration
0/0 2026/01/14 20:44 flow assessment-kcsan
2m Model:
unknown

Results:
map[Benign:true Confident:true Explanation:(verbatim duplicate of the Explanation section above; final result: Benign)]

1/1 2026/01/14 20:44 action kernel-checkouter
0m Model:
unknown

Results:
map[KernelSrc:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/src/45e88afe9f77e1065fe073226d73bb1a7b8fcd49]

2/1 2026/01/14 20:44 action kernel-builder
0m Model:
unknown

Results:
map[KernelObj:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/build/cd032fae0a4a355202951678faca55c2ed45746d]

3/1 2026/01/14 20:44 action codesearch-prepare
0m Model:
unknown

Results:
map[Index:codesearch-index]

4/1 2026/01/14 20:44 agent expert
2m Model:
unknown

Results:
map[Benign:true Confident:true]

Instruction:
You are an experienced Linux kernel developer tasked with determining if the given kernel
data race is benign or not. The data race report is from KCSAN tool.
It contains 2 stack traces of the memory accesses that constitute a data race.

A "benign" data races are on a simple int/bool variable or similar field,
and the accesses are not supposed to be protected by any mutual exclusion primitives.
Common examples of such "benign" data races are accesses to various flags fields,
statistics counters, and similar. A "benign" data race does not lead to memory corruption/crash
with a conservative compiler that compiles memory accesses to primitive types
effectively as atomic.

A non-benign (or "harmful" data race) can lead to corruption/crash even with
a conservative compiler that compiles memory accesses to primitive types
effectively as atomic. A common example of a "harmful" data races is race on
a complex container (list/hashmap/etc), where accesses are supposed to be protected
by a mutual exclusion primitive.

In the final reply, explain why you think the given data race is benign or harmful.

Use the provided tools to confirm any assumptions, the variables/fields being accessed, etc.
In particular, don't make assumptions about the kernel source code;
use the codesearch tools to read the actual source code.


Use the set-results tool to provide the results of the analysis.
It must be called exactly once before the final reply.
Ignore the results of this tool.

Prompt:
The data race report is:

(crash report identical to the one quoted above)


Reply:
(verbatim duplicate of the Explanation section above; final result: Benign)

5/2 2026/01/14 20:44 llm expert
0m Model:
unknown

Thoughts:
**Analyzing a Data Race in the Filemap Subsystem**

Okay, so the report indicates a nasty data race, specifically between `__filemap_remove_folio` and `folio_mapping`. This is concerning. The race is around `0xffffea0005fc3398`, which looks like it's pointing to `folio->mapping` – that's a critical pointer. The value changes from `0xffff88811b46eaa0` to `0`, which is a classic symptom of a corrupted pointer, likely due to a double free or a use-after-free scenario.

Let's dive into the traces. The write side – where the modification is happening – shows that `__filemap_remove_folio` at `mm/filemap.c:228` calls `page_cache_delete`, which is inlined. That's our starting point for the write path. We will start here, and examine how it interacts with the folio.

On the read side, `folio_mapping` at `mm/util.c:703` is where the read is happening. That's the function that is getting the potentially bad value. So I'll need to examine what is calling `folio_mapping`, and how it's using the result.

My immediate next step is to pull up those two sections of code, `__filemap_remove_folio` and `folio_mapping`, and start tracing the logic, paying close attention to the synchronization mechanisms (or lack thereof) around access to `folio->mapping`. We need to understand the conditions under which the write is happening, and where the read is occurring, and how they could possibly overlap. I suspect we'll find a missing lock or improper synchronization around the `folio->mapping` pointer. It will be a matter of careful examination.



6/2 2026/01/14 20:44 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:__filemap_remove_folio SourceFile:mm/filemap.c]

Results:
map[Missing:false SourceCode: 216:	/*
 217:	 * Delete a page from the page cache and free it. Caller has to make
 218:	 * sure the page is locked and that nobody else uses it - or that usage
 219:	 * is safe.  The caller must hold the i_pages lock.
 220:	 */
 221:	void __filemap_remove_folio(struct folio *folio, void *shadow)
 222:	{
 223:		struct address_space *mapping = folio->mapping;
 224:	
 225:		trace_mm_filemap_delete_from_page_cache(folio);
 226:		filemap_unaccount_folio(mapping, folio);
 227:		page_cache_delete(mapping, folio, shadow);
 228:	}
 SourceFile:mm/filemap.c]

7/2 2026/01/14 20:44 llm expert
0m Model:
unknown

Thoughts:
Okay, here's my interpretation of that thought process, summarized as a first-person expert analysis:

**Diving into the `page_cache_delete` Function**

Alright, so line 227 is calling `page_cache_delete`. My immediate reaction is, "Right, let's trace that." I need to understand exactly what's happening within this function to grasp the overall flow and how it's impacting the page cache. I'm assuming, given the name, that we're dealing with cache invalidation or eviction, but the specifics are crucial. I need to know the arguments, any locking involved, and the implications for the data structures managing the cached pages. This is a key operation, so let's dig in and see the code for `page_cache_delete`.



8/2 2026/01/14 20:44 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:page_cache_delete SourceFile:mm/filemap.c]

Results:
map[Missing:false SourceCode: 128:	static void page_cache_delete(struct address_space *mapping,
 129:					   struct folio *folio, void *shadow)
 130:	{
 131:		XA_STATE(xas, &mapping->i_pages, folio->index);
 132:		long nr = 1;
 133:	
 134:		mapping_set_update(&xas, mapping);
 135:	
 136:		xas_set_order(&xas, folio->index, folio_order(folio));
 137:		nr = folio_nr_pages(folio);
 138:	
 139:		VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 140:	
 141:		xas_store(&xas, shadow);
 142:		xas_init_marks(&xas);
 143:	
 144:		folio->mapping = NULL;
 145:		/* Leave folio->index set: truncation lookup relies upon it */
 146:		mapping->nrpages -= nr;
 147:	}
 SourceFile:mm/filemap.c]

9/2 2026/01/14 20:44 llm expert
0m Model:
unknown

Thoughts:
Okay, let's break this down.

**Analyzing the Folio Mapping State**

Alright, so the report flagged a write operation where `folio->mapping` is being set to `NULL`. I remember seeing that, it's right there at line 144. Now, the next logical step is to understand the corresponding read operations. Therefore, I need to focus on how `folio_mapping` is used from this point forward. The implication is, with the `mapping` now being `NULL`, any attempts to access whatever `folio_mapping` might be are going to be critical, and likely the source of our problem. I need to keep this in mind.



10/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:folio_mapping SourceFile:mm/util.c]

Results:
map[Missing:false SourceCode: 679:	/**
 680:	 * folio_mapping - Find the mapping where this folio is stored.
 681:	 * @folio: The folio.
 682:	 *
 683:	 * For folios which are in the page cache, return the mapping that this
 684:	 * page belongs to.  Folios in the swap cache return the swap mapping
 685:	 * this page is stored in (which is different from the mapping for the
 686:	 * swap file or swap device where the data is stored).
 687:	 *
 688:	 * You can call this for folios which aren't in the swap cache or page
 689:	 * cache and it will return NULL.
 690:	 */
 691:	struct address_space *folio_mapping(const struct folio *folio)
 692:	{
 693:		struct address_space *mapping;
 694:	
 695:		/* This happens if someone calls flush_dcache_page on slab page */
 696:		if (unlikely(folio_test_slab(folio)))
 697:			return NULL;
 698:	
 699:		if (unlikely(folio_test_swapcache(folio)))
 700:			return swap_address_space(folio->swap);
 701:	
 702:		mapping = folio->mapping;
 703:		if ((unsigned long)mapping & FOLIO_MAPPING_FLAGS)
 704:			return NULL;
 705:	
 706:		return mapping;
 707:	}
 SourceFile:mm/util.c]

11/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
Here's the summary of my thought process:

**Diving into the Folio Mapping**

Okay, so line 702, specifically `mapping = folio->mapping;`... that rings a bell. It directly corresponds to the read I was looking at in the report. Good. But hold on, the trail gets warmer. Now, I see `folio_mapping` is called from `folio_evictable`, and that's defined in `mm/internal.h`. Right, I need to dig into `folio_evictable` to understand the flow. Let's see what's going on there.



12/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:folio_evictable SourceFile:mm/internal.h]

Results:
map[Missing:false SourceCode: 481:	/**
 482:	 * folio_evictable - Test whether a folio is evictable.
 483:	 * @folio: The folio to test.
 484:	 *
 485:	 * Test whether @folio is evictable -- i.e., should be placed on
 486:	 * active/inactive lists vs unevictable list.
 487:	 *
 488:	 * Reasons folio might not be evictable:
 489:	 * 1. folio's mapping marked unevictable
 490:	 * 2. One of the pages in the folio is part of an mlocked VMA
 491:	 */
 492:	static inline bool folio_evictable(struct folio *folio)
 493:	{
 494:		bool ret;
 495:	
 496:		/* Prevent address_space of inode and swap cache from being freed */
 497:		rcu_read_lock();
 498:		ret = !mapping_unevictable(folio_mapping(folio)) &&
 499:				!folio_test_mlocked(folio);
 500:		rcu_read_unlock();
 501:		return ret;
 502:	}
 SourceFile:mm/internal.h]

13/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
Okay, here's my interpretation of this scenario, formatted as requested:

**Deconstructing `folio_evictable` and Its Implications**

Alright, so I'm looking at `folio_evictable`, and the first thing it does is call `folio_mapping(folio)`. Makes sense; we need to get the associated mapping. Now, here's where it gets interesting: if that `folio_mapping` call returns NULL, meaning no mapping is found, then `mapping_unevictable(NULL)` might be triggered. My immediate thought is, let's drill down into that `mapping_unevictable` function. What's the logic there? Why would we declare a NULL mapping as unevictable? Is it handling a corner case where a folio somehow isn't mapped but also can't be evicted, or is there a more general problem the NULL input would indicate? Time to examine that function's implementation and understand the implications of this possible NULL return from `folio_mapping` within the broader eviction strategy. I need to understand the error-handling and any specific cases this might relate to.



14/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:mapping_unevictable SourceFile:include/linux/pagemap.h]

Results:
map[Missing:false SourceCode: 266:	static inline bool mapping_unevictable(const struct address_space *mapping)
 267:	{
 268:		return mapping && test_bit(AS_UNEVICTABLE, &mapping->flags);
 269:	}
 SourceFile:include/linux/pagemap.h]

15/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
**Thinking Through a Potential Race Condition**

Okay, so `mapping_unevictable` handles `NULL` mappings safely, returning `false`. That's good. But I'm starting to see a race. Here's how I'm breaking it down: We have two tasks, Task 1 and Task 2. Task 1, specifically `__filemap_remove_folio`, is setting `folio->mapping` to `NULL`. Task 2, which I think is `folio_mapping`, is reading `folio->mapping`.

The core issue is timing. If Task 2 reads the *old*, non-`NULL` value of `folio->mapping`, it then proceeds to check `mapping_unevictable(mapping)`. If Task 2 reads `NULL`, then `mapping_unevictable(NULL)` correctly returns `false`. The problem is what happens if Task 2 gets that old, potentially soon-to-be-invalid pointer.

Ah, I see the stack trace now: Task 1 is calling `shmem_evict_inode` which leads to `evict` then `iput_final`, which is probably freeing the inode. That's the eviction process. The crucial part is that the old mapping could be actively being freed.

But wait, `folio_evictable` uses `rcu_read_lock()`. Does a non-`NULL` `folio->mapping` *guarantee* that the `address_space` is still valid? That's what I'm really concerned about. Normally, I believe `folio->mapping` is cleared while holding the `i_pages` lock and while the folio is locked. Task 2 (`scan_folios`) must be holding *some* other lock, or relying on something else, maybe the fact that the folio is on the LRU list, or some other synchronization.

Let me take a look at `scan_folios` in `mm/vmscan.c` to see if that gives me any more clues. I need to be sure that this isn't a lurking concurrency bug.



16/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:scan_folios SourceFile:mm/vmscan.c]

Results:
map[Missing:false SourceCode:4532:	static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
4533:			       struct scan_control *sc, int type, int tier,
4534:			       struct list_head *list)
4535:	{
4536:		int i;
4537:		int gen;
4538:		enum vm_event_item item;
4539:		int sorted = 0;
4540:		int scanned = 0;
4541:		int isolated = 0;
4542:		int skipped = 0;
4543:		int scan_batch = min(nr_to_scan, MAX_LRU_BATCH);
4544:		int remaining = scan_batch;
4545:		struct lru_gen_folio *lrugen = &lruvec->lrugen;
4546:		struct mem_cgroup *memcg = lruvec_memcg(lruvec);
4547:	
4548:		VM_WARN_ON_ONCE(!list_empty(list));
4549:	
4550:		if (get_nr_gens(lruvec, type) == MIN_NR_GENS)
4551:			return 0;
4552:	
4553:		gen = lru_gen_from_seq(lrugen->min_seq[type]);
4554:	
4555:		for (i = MAX_NR_ZONES; i > 0; i--) {
4556:			LIST_HEAD(moved);
4557:			int skipped_zone = 0;
4558:			int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
4559:			struct list_head *head = &lrugen->folios[gen][type][zone];
4560:	
4561:			while (!list_empty(head)) {
4562:				struct folio *folio = lru_to_folio(head);
4563:				int delta = folio_nr_pages(folio);
4564:	
4565:				VM_WARN_ON_ONCE_FOLIO(folio_test_unevictable(folio), folio);
4566:				VM_WARN_ON_ONCE_FOLIO(folio_test_active(folio), folio);
4567:				VM_WARN_ON_ONCE_FOLIO(folio_is_file_lru(folio) != type, folio);
4568:				VM_WARN_ON_ONCE_FOLIO(folio_zonenum(folio) != zone, folio);
4569:	
4570:				scanned += delta;
4571:	
4572:				if (sort_folio(lruvec, folio, sc, tier))
4573:					sorted += delta;
4574:				else if (isolate_folio(lruvec, folio, sc)) {
4575:					list_add(&folio->lru, list);
4576:					isolated += delta;
4577:				} else {
4578:					list_move(&folio->lru, &moved);
4579:					skipped_zone += delta;
4580:				}
4581:	
4582:				if (!--remaining || max(isolated, skipped_zone) >= MIN_LRU_BATCH)
4583:					break;
4584:			}
4585:	
4586:			if (skipped_zone) {
4587:				list_splice(&moved, head);
4588:				__count_zid_vm_events(PGSCAN_SKIP, zone, skipped_zone);
4589:				skipped += skipped_zone;
4590:			}
4591:	
4592:			if (!remaining || isolated >= MIN_LRU_BATCH)
4593:				break;
4594:		}
4595:	
4596:		item = PGSCAN_KSWAPD + reclaimer_offset(sc);
4597:		if (!cgroup_reclaim(sc)) {
4598:			__count_vm_events(item, isolated);
4599:			__count_vm_events(PGREFILL, sorted);
4600:		}
4601:		count_memcg_events(memcg, item, isolated);
4602:		count_memcg_events(memcg, PGREFILL, sorted);
4603:		__count_vm_events(PGSCAN_ANON + type, isolated);
4604:		trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, scan_batch,
4605:					scanned, skipped, isolated,
4606:					type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
4607:		if (type == LRU_GEN_FILE)
4608:			sc->nr.file_taken += isolated;
4609:		/*
4610:		 * There might not be eligible folios due to reclaim_idx. Check the
4611:		 * remaining to prevent livelock if it's not making progress.
4612:		 */
4613:		return isolated || !remaining ? scanned : 0;
4614:	}
 SourceFile:mm/vmscan.c]

17/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
Okay, here's my take on those thoughts, as if I were the one thinking them:

**Diving into the `sort_folio` Function**

Alright, so we've hit line 4572. The core of this operation seems to hinge on `sort_folio`. It's that `if` statement: `if (sort_folio(lruvec, folio, sc, tier))`. So, let's zoom in on that function call. What does `sort_folio` *do*? I need to understand its role in this process – especially given the parameters: `lruvec`, `folio`, `sc`, and `tier`. The names themselves give me some clues, but I need to see the implementation to truly understand the logic. It's obviously critical for how we're handling this particular folio and its associated data structure within this specific tier. Let's see how the sorting happens and what conditions would trigger a success, so the if statement branches accordingly. My immediate next step is to examine the code of the `sort_folio` function.



18/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:sort_folio SourceFile:mm/vmscan.c]

Results:
map[Missing:false SourceCode:4427:	static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_control *sc,
4428:			       int tier_idx)
4429:	{
4430:		bool success;
4431:		bool dirty, writeback;
4432:		int gen = folio_lru_gen(folio);
4433:		int type = folio_is_file_lru(folio);
4434:		int zone = folio_zonenum(folio);
4435:		int delta = folio_nr_pages(folio);
4436:		int refs = folio_lru_refs(folio);
4437:		bool workingset = folio_test_workingset(folio);
4438:		int tier = lru_tier_from_refs(refs, workingset);
4439:		struct lru_gen_folio *lrugen = &lruvec->lrugen;
4440:	
4441:		VM_WARN_ON_ONCE_FOLIO(gen >= MAX_NR_GENS, folio);
4442:	
4443:		/* unevictable */
4444:		if (!folio_evictable(folio)) {
4445:			success = lru_gen_del_folio(lruvec, folio, true);
4446:			VM_WARN_ON_ONCE_FOLIO(!success, folio);
4447:			folio_set_unevictable(folio);
4448:			lruvec_add_folio(lruvec, folio);
4449:			__count_vm_events(UNEVICTABLE_PGCULLED, delta);
4450:			return true;
4451:		}
4452:	
4453:		/* promoted */
4454:		if (gen != lru_gen_from_seq(lrugen->min_seq[type])) {
4455:			list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
4456:			return true;
4457:		}
4458:	
4459:		/* protected */
4460:		if (tier > tier_idx || refs + workingset == BIT(LRU_REFS_WIDTH) + 1) {
4461:			gen = folio_inc_gen(lruvec, folio, false);
4462:			list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
4463:	
4464:			/* don't count the workingset being lazily promoted */
4465:			if (refs + workingset != BIT(LRU_REFS_WIDTH) + 1) {
4466:				int hist = lru_hist_from_seq(lrugen->min_seq[type]);
4467:	
4468:				WRITE_ONCE(lrugen->protected[hist][type][tier],
4469:					   lrugen->protected[hist][type][tier] + delta);
4470:			}
4471:			return true;
4472:		}
4473:	
4474:		/* ineligible */
4475:		if (zone > sc->reclaim_idx) {
4476:			gen = folio_inc_gen(lruvec, folio, false);
4477:			list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
4478:			return true;
4479:		}
4480:	
4481:		dirty = folio_test_dirty(folio);
4482:		writeback = folio_test_writeback(folio);
4483:		if (type == LRU_GEN_FILE && dirty) {
4484:			sc->nr.file_taken += delta;
4485:			if (!writeback)
4486:				sc->nr.unqueued_dirty += delta;
4487:		}
4488:	
4489:		/* waiting for writeback */
4490:		if (writeback || (type == LRU_GEN_FILE && dirty)) {
4491:			gen = folio_inc_gen(lruvec, folio, true);
4492:			list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
4493:			return true;
4494:		}
4495:	
4496:		return false;
4497:	}
 SourceFile:mm/vmscan.c]

19/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
**Debugging a Potential Race Condition in Page Eviction**

Okay, so I'm looking at this code and trying to figure out why a `folio_evictable` check might be behaving unexpectedly. The core of the problem seems to be around how we're determining if a page (folio) can be evicted from the page cache. Specifically, the reported value of the `address_space` associated with the folio is changing, and I need to understand if this is a problem.

Here's the situation: `folio_evictable` is called, and inside it, we're calling `folio_mapping` to get the `address_space` associated with the folio. The report says this `address_space` changed from a valid pointer (`0xffff88811b46eaa0`) to `NULL` (`0x0000000000000000`). This is concerning because if `folio_evictable` reads `NULL`, it will return true unless the folio is mlocked, which I'm not seeing here, and the folio would then be incorrectly considered evictable.

My initial thought was that a stale `mapping` pointer could be the culprit. `folio_evictable` uses `rcu_read_lock()` to protect the `address_space`, but the folio is being removed from the page cache in `__filemap_remove_folio`. The `address_space` is potentially about to be freed, and if the check reads it after that, we could have a use-after-free situation.

The call stack shows `shmem_evict_inode` is in the write stack trace, indicating we're dealing with an inode eviction in shared memory. `evict` is called, which eventually calls `destroy_inode`. However, I see a bit of a wrinkle in the execution order. I expected `shmem_undo_range` to occur *after* `shmem_evict_inode`'s cleanup, but it seems to happen *before*, or rather, during a deeper call chain within `shmem_evict_inode`. But the stack trace is: `evict` -> `shmem_evict_inode` -> `shmem_truncate_range` -> `shmem_undo_range` -> `truncate_inode_folio` -> `filemap_remove_folio` -> `__filemap_remove_folio` -> `page_cache_delete`. This means the inode *is* still alive when `folio_evictable` is called. The `address_space` is part of that inode, so it *should* still be valid.

Let's re-examine `folio_evictable`:

```c
	rcu_read_lock();
	ret = !mapping_unevictable(folio_mapping(folio)) &&
			!folio_test_mlocked(folio);
	rcu_read_unlock();
```

If `folio_mapping` returns a valid `mapping` pointer, `mapping_unevictable` accesses `mapping->flags`, which could be a problem if the underlying memory of `mapping` is being freed. However, since `shmem_evict_inode` is running and the inode is still allocated, the `i_data` (the `address_space`) should be valid. The inode destruction happens later in `evict()`.

So, the next step is to examine `evict()` in `fs/inode.c` to understand the exact timing of the inode's deallocation and confirm this assumption. I'm reasonably sure the problem isn't a simple use-after-free, given the call stack, but it warrants a closer look. I need to make sure the assumptions are all correct!



20/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:evict SourceFile:fs/inode.c]

Results:
map[Missing:false SourceCode: 800:	/*
 801:	 * Free the inode passed in, removing it from the lists it is still connected
 802:	 * to. We remove any pages still attached to the inode and wait for any IO that
 803:	 * is still in progress before finally destroying the inode.
 804:	 *
 805:	 * An inode must already be marked I_FREEING so that we avoid the inode being
 806:	 * moved back onto lists if we race with other code that manipulates the lists
 807:	 * (e.g. writeback_single_inode). The caller is responsible for setting this.
 808:	 *
 809:	 * An inode must already be removed from the LRU list before being evicted from
 810:	 * the cache. This should occur atomically with setting the I_FREEING state
 811:	 * flag, so no inodes here should ever be on the LRU when being evicted.
 812:	 */
 813:	static void evict(struct inode *inode)
 814:	{
 815:		const struct super_operations *op = inode->i_sb->s_op;
 816:	
 817:		BUG_ON(!(inode_state_read_once(inode) & I_FREEING));
 818:		BUG_ON(!list_empty(&inode->i_lru));
 819:	
 820:		inode_io_list_del(inode);
 821:		inode_sb_list_del(inode);
 822:	
 823:		spin_lock(&inode->i_lock);
 824:		inode_wait_for_lru_isolating(inode);
 825:	
 826:		/*
 827:		 * Wait for flusher thread to be done with the inode so that filesystem
 828:		 * does not start destroying it while writeback is still running. Since
 829:		 * the inode has I_FREEING set, flusher thread won't start new work on
 830:		 * the inode.  We just have to wait for running writeback to finish.
 831:		 */
 832:		inode_wait_for_writeback(inode);
 833:		spin_unlock(&inode->i_lock);
 834:	
 835:		if (op->evict_inode) {
 836:			op->evict_inode(inode);
 837:		} else {
 838:			truncate_inode_pages_final(&inode->i_data);
 839:			clear_inode(inode);
 840:		}
 841:		if (S_ISCHR(inode->i_mode) && inode->i_cdev)
 842:			cd_forget(inode);
 843:	
 844:		remove_inode_hash(inode);
 845:	
 846:		/*
 847:		 * Wake up waiters in __wait_on_freeing_inode().
 848:		 *
 849:		 * It is an invariant that any thread we need to wake up is already
 850:		 * accounted for before remove_inode_hash() acquires ->i_lock -- both
 851:		 * sides take the lock and sleep is aborted if the inode is found
 852:		 * unhashed. Thus either the sleeper wins and goes off CPU, or removal
 853:		 * wins and the sleeper aborts after testing with the lock.
 854:		 *
 855:		 * This also means we don't need any fences for the call below.
 856:		 */
 857:		inode_wake_up_bit(inode, __I_NEW);
 858:		BUG_ON(inode_state_read_once(inode) != (I_FREEING | I_CLEAR));
 859:	
 860:		destroy_inode(inode);
 861:	}
 SourceFile:fs/inode.c]

21/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
**Analyzing a Potential Race Condition in Shmem Inode Destruction**

Okay, so I'm looking at this code and trying to pinpoint a possible race. Line 860, `destroy_inode`, that's where the inode itself gets wiped out. But the interesting bit happens on line 836, with the `op->evict_inode(inode)` call. For shmem, that function resolves to `shmem_evict_inode`, right? Right. And `shmem_evict_inode`, as I recall, is where the mapping of folios gets cleared.

Now, if `folio_evictable` is reading the mapping pointer, it's going to access `mapping->flags`. The crucial point here is that `destroy_inode` *hasn't* been called yet at that moment, so the `address_space`, which is part of the inode, is still valid, at least initially. But... wait a minute, aren't shmem inodes destroyed via RCU? I need to double-check `shmem_evict_inode` to be absolutely sure about the timing here. That's where the heart of the problem will reside, that's where any race condition could be hiding. Let's see how they're handling things there.



22/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:shmem_evict_inode SourceFile:mm/shmem.c]

Results:
map[Missing:false SourceCode:1368:	static void shmem_evict_inode(struct inode *inode)
1369:	{
1370:		struct shmem_inode_info *info = SHMEM_I(inode);
1371:		struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
1372:		size_t freed = 0;
1373:	
1374:		if (shmem_mapping(inode->i_mapping)) {
1375:			shmem_unacct_size(info->flags, inode->i_size);
1376:			inode->i_size = 0;
1377:			mapping_set_exiting(inode->i_mapping);
1378:			shmem_truncate_range(inode, 0, (loff_t)-1);
1379:			if (!list_empty(&info->shrinklist)) {
1380:				spin_lock(&sbinfo->shrinklist_lock);
1381:				if (!list_empty(&info->shrinklist)) {
1382:					list_del_init(&info->shrinklist);
1383:					sbinfo->shrinklist_len--;
1384:				}
1385:				spin_unlock(&sbinfo->shrinklist_lock);
1386:			}
1387:			while (!list_empty(&info->swaplist)) {
1388:				/* Wait while shmem_unuse() is scanning this inode... */
1389:				wait_var_event(&info->stop_eviction,
1390:					       !atomic_read(&info->stop_eviction));
1391:				spin_lock(&shmem_swaplist_lock);
1392:				/* ...but beware of the race if we peeked too early */
1393:				if (!atomic_read(&info->stop_eviction))
1394:					list_del_init(&info->swaplist);
1395:				spin_unlock(&shmem_swaplist_lock);
1396:			}
1397:		}
1398:	
1399:		simple_xattrs_free(&info->xattrs, sbinfo->max_inodes ? &freed : NULL);
1400:		shmem_free_inode(inode->i_sb, freed);
1401:		WARN_ON(inode->i_blocks);
1402:		clear_inode(inode);
1403:	#ifdef CONFIG_TMPFS_QUOTA
1404:		dquot_free_inode(inode);
1405:		dquot_drop(inode);
1406:	#endif
1407:	}
 SourceFile:mm/shmem.c]

23/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
Here's my analysis of the potential data race, as I see it:

**Data Race Analysis in `shmem` Inode Eviction**

Okay, so I'm looking at this `shmem_evict_inode` function and the sequence of calls that it triggers, specifically around the `shmem_truncate_range` and `shmem_undo_range` functions, leading down to `page_cache_delete`. My concern is focused on the `folio->mapping` pointer and a potential race condition.

Essentially, `page_cache_delete` sets `folio->mapping` to `NULL`. The concern, as I understand it, is if another task attempts to access the `folio`'s mapping concurrently. The report focuses on a potential read of a garbage value, but it's quickly dismissed because the architecture (x86_64, which is typical) *should* guarantee atomic writes for 8-byte aligned pointers, and `folio->mapping` is an 8-byte pointer. So, a garbage value from an incomplete write is unlikely. I've confirmed that the report mentions an 8-byte write and read.

However, even though we are assured there is an atomic write of NULL, the data structure being accessed may be unsafe. If Task 2 reads the original mapping, it goes on to call `mapping_unevictable` which accesses `mapping->flags`. I know that if the inode is being destroyed, then the `mapping` may be invalid, but the sequence of function calls suggests that `destroy_inode` hasn't happened yet.

Then I paused to think about what is happening on the reclaimer side of things. If the reclaimer has the folio on its LRU list, and this whole thing is supposed to be evicting the inode and freeing pages, then is the folio properly removed from the LRU list?

I double checked my understanding of `__filemap_remove_folio` and noticed it calls `page_cache_delete`. I need to ensure that the folio isn't left on the LRU list after this call. So, I will need to double-check that `shmem_undo_range`, and ultimately `truncate_inode_folio` properly handles LRU list removal.



24/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:truncate_inode_folio SourceFile:mm/truncate.c]

Results:
map[Missing:false SourceCode: 169:	int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
 170:	{
 171:		if (folio->mapping != mapping)
 172:			return -EIO;
 173:	
 174:		truncate_cleanup_folio(folio);
 175:		filemap_remove_folio(folio);
 176:		return 0;
 177:	}
 SourceFile:mm/truncate.c]

25/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
Okay, here's the summary, assuming I'm an expert in this field, and speaking as myself:

**Cleaning Up a Folio**

Alright, let's break this down. When I see `truncate_cleanup_folio(folio)`, my brain immediately goes to data integrity. I'm looking at a function designed to sanitize and prepare a "folio" object for something – probably storage, processing, or a fresh start. "Truncate" suggests we're dealing with potentially messy or oversized data, and "cleanup" confirms that we’re dealing with getting rid of all extraneous items. It means this function is meant to streamline that folio, reducing its size, removing irrelevant information, or conforming it to some standard. So, I need to understand what constitutes "cleanup" in the context of this folio. Is it stripping metadata? Deleting redundant entries? Removing inconsistencies? I need to look at the internals of this function to see what specific transformations are being done. There are two potential aspects here: what we remove, and how we remove it. Let's see what we're going to lose, and how we go about losing it.



26/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:truncate_cleanup_folio SourceFile:mm/truncate.c]

Results:
map[Missing:false SourceCode: 143:	/*
 144:	 * If truncate cannot remove the fs-private metadata from the page, the page
 145:	 * becomes orphaned.  It will be left on the LRU and may even be mapped into
 146:	 * user pagetables if we're racing with filemap_fault().
 147:	 *
 148:	 * We need to bail out if page->mapping is no longer equal to the original
 149:	 * mapping.  This happens a) when the VM reclaimed the page while we waited on
 150:	 * its lock, b) when a concurrent invalidate_mapping_pages got there first and
 151:	 * c) when tmpfs swizzles a page between a tmpfs inode and swapper_space.
 152:	 */
 153:	static void truncate_cleanup_folio(struct folio *folio)
 154:	{
 155:		if (folio_mapped(folio))
 156:			unmap_mapping_folio(folio);
 157:	
 158:		if (folio_needs_release(folio))
 159:			folio_invalidate(folio, 0, folio_size(folio));
 160:	
 161:		/*
 162:		 * Some filesystems seem to re-dirty the page even after
 163:		 * the VM has canceled the dirty bit (eg ext3 journaling).
 164:		 * Hence dirty accounting check is placed after invalidation.
 165:		 */
 166:		folio_cancel_dirty(folio);
 167:	}
 SourceFile:mm/truncate.c]

27/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
**The LRU Conundrum**

Okay, so I'm looking at this code, and something isn't quite clicking with the LRU behavior here. It doesn't seem like the folio is *actually* being removed from the LRU in the way I'd expect. Normally, a folio gets tossed from the LRU when its reference count hits zero, or maybe explicitly in certain scenarios.

Hold on, I see `scan_folios` is iterating over the LRU list. If a folio is on the LRU, it *should* have a refcount of at least one, right? Now, `__filemap_remove_folio` calls `page_cache_delete`, which sets `folio->mapping = NULL`. But, the folio might still be lingering on the LRU. That's a problem.

So, the reclaimer comes along and finds this folio on the LRU and tries to figure out if it's evictable. The real crux of the issue seems to be that race on `folio->mapping`. If the reclaimer sees a non-NULL `mapping`, it plows ahead and tries to use it. But is that safe if the folio is in the process of being detached from that mapping?

The `folio_evictable` function uses `rcu_read_lock()`. That makes me think. Does the `address_space`, which is the `mapping` here, stay valid under RCU? For shmem, that `address_space` is tied to the `inode`. The `inode` is freed via RCU, right? I need to check `destroy_inode` and the `i_callback` functions to confirm that behavior. That should give me the answer.



28/2 2026/01/14 20:45 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:destroy_inode SourceFile:fs/inode.c]

Results:
map[Missing:false SourceCode: 388:	static void destroy_inode(struct inode *inode)
 389:	{
 390:		const struct super_operations *ops = inode->i_sb->s_op;
 391:	
 392:		BUG_ON(!list_empty(&inode->i_lru));
 393:		__destroy_inode(inode);
 394:		if (ops->destroy_inode) {
 395:			ops->destroy_inode(inode);
 396:			if (!ops->free_inode)
 397:				return;
 398:		}
 399:		inode->free_inode = ops->free_inode;
 400:		call_rcu(&inode->i_rcu, i_callback);
 401:	}
 SourceFile:fs/inode.c]

29/2 2026/01/14 20:45 llm expert
0m Model:
unknown

Thoughts:
**Data Race Analysis: `folio->mapping` in Page Cache Eviction**

Okay, so I'm digging into this KCSAN report about a potential data race involving `folio->mapping` during page cache eviction. The core of the issue seems to be that the reclaimer is reading this field without any explicit synchronization, while another task might be setting it to NULL (typically during truncation or inode eviction). My initial thought is that since we're dealing with a pointer, the write to `folio->mapping` is atomic on x86_64, which means the reader will either see the old valid pointer or NULL. However, KCSAN is flagging this as a data race, so I need to carefully consider the potential consequences.

First, I need to understand the role of RCU here. The inode itself is freed using `call_rcu()`, suggesting that the associated `address_space` should be protected during the grace period. This implies that if the reclaimer reads a non-NULL `mapping`, it's guaranteed to be valid until the next RCU grace period. Therefore, direct accesses to `mapping->flags` within the grace period are safe.

The reclaimer uses `folio_evictable`, which eventually calls `mapping_unevictable` to check if a folio can be reclaimed, and it relies on `folio_mapping` to get the mapping. The key question is, what happens if Task 1 has set `folio->mapping = NULL` just before Task 2 calls `folio_mapping`? In this case, `mapping_unevictable` would be called on a stale `mapping` value. But it seems this stale `mapping` is still valid, due to RCU.

The reclaimer might mistakenly put a folio on the wrong LRU list, based on a stale unevictable flag, but since the folio is in the process of being removed from the page cache, it's just a temporary misplacement. The folio's refcount will soon drop and it will be freed, so the fact that it is on a specific LRU doesn't seem critical, as the refcount will ensure its removal.

I'm also considering the code itself, let's look at `folio_mapping` again. I see that it does *not* use `READ_ONCE`. Is this intentional, or a missed optimization? It's a standard pattern to use `READ_ONCE` for `folio->mapping` if races are expected. If this were a regular code review, I'd probably add it. However, given that it's just a pointer on a modern architecture, an atomic read is likely performed anyway.

If the reclaimer reads an old mapping, this means it's likely part of an inode that is being evicted and whose memory is still valid. The `address_space` is usually protected by RCU, or the `i_pages` lock, or simply hasn't yet been freed, so accessing `mapping->flags` should be safe, and `clear_inode` and `destroy_inode` are called on this valid `inode`.

My next step is to analyze `folio_evictable` more deeply. It appears to be the decision point for whether to move a folio to the unevictable list. If `folio_evictable` is called on a folio that is being removed from the page cache, and if it sees an old `mapping`, the folio might incorrectly end up on the unevictable list, but this temporary misplacement is not a fatal error since it is still being truncated.

I'm thinking through all the different scenarios. If the reclaimer sees a valid mapping, it uses that, and that's fine. If it sees NULL, it's being freed. The most interesting case is when the mapping is in-between.

I'm verifying if there's any chance that `folio->mapping` might point to something that is *not* an `address_space`, but the implementation of `folio_mapping` seems to handle those cases (slab, swapcache). If the lower bits are zero, it's almost certainly an `address_space`.

So, the race is:
*   Writer: `folio->mapping = NULL`
*   Reader: `mapping = folio->mapping`

Both `folio->mapping` and `mapping` are 8-byte aligned pointers, so the reader *should* either see the old valid pointer or NULL.

In the case where the reclaimer mistakenly adds a folio to an unevictable list, the reclaimer is holding a reference (or at least it's scanning it), so there is no memory safety issue.

The more likely problem is a potential logic error: if `folio_evictable` returns false (it *is* evictable), but it's *actually


30/2 2026/01/14 20:46 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:folio_mapping SourceFile:mm/util.c]

Results:
map[Missing:false SourceCode: 679:	/**
 680:	 * folio_mapping - Find the mapping where this folio is stored.
 681:	 * @folio: The folio.
 682:	 *
 683:	 * For folios which are in the page cache, return the mapping that this
 684:	 * page belongs to.  Folios in the swap cache return the swap mapping
 685:	 * this page is stored in (which is different from the mapping for the
 686:	 * swap file or swap device where the data is stored).
 687:	 *
 688:	 * You can call this for folios which aren't in the swap cache or page
 689:	 * cache and it will return NULL.
 690:	 */
 691:	struct address_space *folio_mapping(const struct folio *folio)
 692:	{
 693:		struct address_space *mapping;
 694:	
 695:		/* This happens if someone calls flush_dcache_page on slab page */
 696:		if (unlikely(folio_test_slab(folio)))
 697:			return NULL;
 698:	
 699:		if (unlikely(folio_test_swapcache(folio)))
 700:			return swap_address_space(folio->swap);
 701:	
 702:		mapping = folio->mapping;
 703:		if ((unsigned long)mapping & FOLIO_MAPPING_FLAGS)
 704:			return NULL;
 705:	
 706:		return mapping;
 707:	}
 SourceFile:mm/util.c]

31/2 2026/01/14 20:46 llm expert
0m Model:
unknown

Thoughts:
**Analyzing a Potential Race Condition**

Okay, hold on a second. I'm looking at this code and I've got a potential issue flagged in my mind, specifically around `folio_test_swapcache` at line 699. It seems like this function might be racy with changes to `folio->mapping`. The crux of my concern is that `folio_test_swapcache` could be checking bits within `folio->mapping`, and if that's happening concurrently with modifications, we're likely to have a bug.

Let me just double-check what `folio_test_swapcache` actually *does*. My first guess at its definition was:
```c
static inline bool folio_test_swapcache(const struct folio *folio)
{
	return folio_test_reclaim(folio); /* no, this can't be right */
}
```
That guess can't be right, which means my initial concern may be unfounded; I'd better pull up the real definition before drawing any conclusions.



32/2 2026/01/14 20:46 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:folio_test_swapcache SourceFile:include/linux/page-flags.h]

Results:
map[Missing:false SourceCode: 630:	static __always_inline bool folio_test_swapcache(const struct folio *folio)
 631:	{
 632:		return folio_test_swapbacked(folio) &&
 633:				test_bit(PG_swapcache, const_folio_flags(folio, 0));
 634:	}
 SourceFile:include/linux/page-flags.h]

33/2 2026/01/14 20:46 llm expert
0m Model:
unknown

Thoughts:
**My Assessment: Benign Data Race in Folio Mapping**

Okay, so KCSAN flagged a potential data race, specifically related to how `folio->mapping` is accessed during folio eviction and LRU scanning. My initial thought is that this is likely a classic "benign" race condition in the VM, but let's break this down systematically.

The core issue seems to be that a task (Task 1) is clearing the `folio->mapping` field while another task (Task 2, the reclaimer) might be reading it concurrently. The race is between `__filemap_remove_folio` setting it to NULL, and `folio_mapping` reading the value.

The KCSAN report shows exactly this: Task 2 can read `folio->mapping` while it is being cleared and get a "stale" pointer. The crucial question is whether this stale pointer causes a real problem.

First, I need to understand the lifecycle of the data structures involved. The mapping changes from a valid `address_space` pointer to NULL. The `address_space` is embedded in the `inode`, which is being evicted (via `shmem_evict_inode`), but the `inode` is *not* destroyed immediately: its memory, including the `address_space`, is still valid when the reclaimer potentially reads the stale pointer. That is the key point.

Now, let's consider the synchronization. The reclaimer doesn't hold the folio lock when calling `folio_evictable`; that's where the race arises. But it *is* inside `rcu_read_lock()` during this process, and the comment in `mm/internal.h` confirms that developers *expect* `folio->mapping` to be changing underneath the reclaimer.

The crucial code path runs through `folio_evictable` into `folio_mapping`. The purpose of `folio_evictable` is to determine whether a folio is a candidate for eviction, i.e. which LRU list it should be on; it is essentially a heuristic that helps the reclaimer. As noted above, a stale value is handled safely: RCU protection applies, and the object is still alive.

The `folio_mapping` function itself masks the pointer against `FOLIO_MAPPING_FLAGS`, which covers the anonymous-mapping bit, so it is designed to handle anon folios, where the field holds a tagged `anon_vma` pointer, by returning NULL for them.

Therefore, the race sits within the expected design of the VM. The accesses are to a naturally aligned 8-byte field on a 64-bit architecture (the assumed case), so the load is effectively atomic: the reader sees either the old pointer or NULL, and both are safe inputs to `mapping_unevictable()`. `folio_evictable` is called only for folios on an LRU list, and even with a stale value the worst outcome is that a folio being truncated briefly lands on the unevictable list when it could have been evictable, which is no disaster. Worst case, the folio takes a slightly different path through eviction before its eventual release.

In conclusion, I'm confident that this is a benign data race. The RCU protection and lifecycle of the inode ensure memory safety. The use of the read value is a heuristic.



34/2 2026/01/14 20:46 tool set-results
0m Model:
unknown

Args:
map[Benign:true Confident:true]

Results:
map[Benign:true Confident:true]

35/2 2026/01/14 20:46 llm expert
0m Model:
unknown

Thoughts:
**Understanding a Benign Data Race**

Okay, so I've already established the outcome – now it's time to articulate the explanation clearly. I've pinpointed a data race, specifically the interplay between Task 1's clearing of `folio->mapping` in `__filemap_remove_folio` and Task 2's read of it within `folio_mapping`, invoked by `folio_evictable`.

I've determined this is a benign race, and here's why:

First, `folio_mapping` is architected to handle scenarios where the `folio->mapping` pointer might be NULL or have specific flags set. This built-in robustness is crucial.

Second, the `folio_evictable` read is protected by `rcu_read_lock()`. This is fundamental for memory safety in this context.

Third, the `address_space` that `folio->mapping` points to belongs to an inode, which is in the process of eviction but hasn't been completely destroyed. The race happens before the inode is destroyed.

Fourth, the inode destruction itself employs `call_rcu`. This guarantees that the memory remains valid for the duration of any RCU read-side critical section that began before the free was scheduled.

Fifth, on modern 64-bit systems, aligned 8-byte pointer loads and stores are atomic operations. Therefore, the reader will always see either the valid older mapping pointer or NULL, not a corrupted intermediate state.

Sixth, both outcomes, the old mapping pointer or NULL, are handled gracefully by `folio_evictable` and its callers. A stale value merely influences a reclaimer heuristic for a folio that is about to be freed anyway, which is why the race has no lasting effect.

I'm now considering if there are any other aspects to include.

A valid follow-up would be annotating the accesses with `READ_ONCE`/`WRITE_ONCE` to rule out aggressive compiler transformations and to document the intentional race for KCSAN. Such transformations are unlikely to matter in this specific instance, but the annotations would be a cheap improvement. Either way, the core point stands: this race, as it is, does not translate into kernel crashes or memory corruption.


