
Commit b0e4917

Baolin Wang authored and gregkh committed
mm: shmem: fix potential data corruption during shmem swapin
commit 0583135 upstream.

Alex and Kairui reported some issues (system hang or data corruption) when swapping out or swapping in large shmem folios. This is especially easy to reproduce when the tmpfs is mounted with the 'huge=within_size' parameter. Thanks to Kairui's reproducer, the issue can be easily replicated.

The root cause of the problem is that swap readahead may asynchronously swap in order 0 folios into the swap cache, while the shmem mapping can still store large swap entries. Then an order 0 folio is inserted into the shmem mapping without splitting the large swap entry, which overwrites the original large swap entry, leading to data corruption.

When getting a folio from the swap cache, we should split the large swap entry stored in the shmem mapping if the orders do not match, to fix this issue.

Link: https://lkml.kernel.org/r/2fe47c557e74e9df5fe2437ccdc6c9115fa1bf70.1740476943.git.baolin.wang@linux.alibaba.com
Fixes: 809bc86 ("mm: shmem: support large folio swap out")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Reported-by: Kairui Song <ryncsn@gmail.com>
Closes: https://lore.kernel.org/all/1738717785.im3r5g2vxc.none@localhost/
Tested-by: Kairui Song <kasong@tencent.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ hughd: removed skip_swapcache dependencies ]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
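
The order mismatch described above can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: `toy_slot`, `toy_store_large`, `toy_order_mismatch` and `toy_split` are hypothetical names, the flat array is not how the XArray stores multi-index entries, and the split is assumed to go all the way down to order 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 16

/*
 * Hypothetical model: one slot per page index. A large swap entry of
 * order N repeats the same (order, swap_off) pair over 2^N slots.
 */
struct toy_slot {
	unsigned int order;	/* order of the entry covering this index */
	unsigned long swap_off;	/* swap offset of the entry's first page */
};

/* Install one large swap entry over the aligned range starting at base. */
static void toy_store_large(struct toy_slot *map, size_t base,
			    unsigned int order, unsigned long swap_off)
{
	size_t nr = (size_t)1 << order;

	assert(base % nr == 0 && base + nr <= TOY_SLOTS);
	for (size_t i = 0; i < nr; i++) {
		map[base + i].order = order;
		map[base + i].swap_off = swap_off;
	}
}

/* Toy version of the check: does the entry cover as many pages as the folio? */
static bool toy_order_mismatch(const struct toy_slot *map, size_t index,
			       unsigned int folio_order)
{
	return map[index].order != folio_order;
}

/*
 * Split the large entry containing index into order 0 entries, giving
 * each page its own swap offset. Returns the order before the split.
 */
static unsigned int toy_split(struct toy_slot *map, size_t index)
{
	unsigned int old_order = map[index].order;
	size_t nr = (size_t)1 << old_order;
	size_t base = index & ~(nr - 1);
	unsigned long head = map[base].swap_off;

	for (size_t i = 0; i < nr; i++) {
		map[base + i].order = 0;
		map[base + i].swap_off = head + i;
	}
	return old_order;
}
```

In this model, an order 2 entry stored at indices 4 to 7 mismatches an order 0 folio looked up at index 5. After `toy_split()`, each of the four indices holds its own order 0 entry, so an order 0 folio can replace exactly one of them without touching its neighbours.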
1 parent 246bbe3 commit b0e4917

1 file changed

Lines changed: 27 additions & 3 deletions

mm/shmem.c

@@ -2132,7 +2132,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	struct swap_info_struct *si;
 	struct folio *folio = NULL;
 	swp_entry_t swap;
-	int error, nr_pages;
+	int error, nr_pages, order, split_order;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
 	swap = radix_to_swp_entry(*foliop);
@@ -2151,8 +2151,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 
 	/* Look it up and read it in.. */
 	folio = swap_cache_get_folio(swap, NULL, 0);
+	order = xa_get_order(&mapping->i_pages, index);
 	if (!folio) {
-		int split_order;
 
 		/* Or update major stats only when swapin succeeds?? */
 		if (fault_type) {
@@ -2189,13 +2189,37 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			error = -ENOMEM;
 			goto failed;
 		}
+	} else if (order != folio_order(folio)) {
+		/*
+		 * Swap readahead may swap in order 0 folios into swapcache
+		 * asynchronously, while the shmem mapping can still store
+		 * large swap entries. In such cases, we should split the
+		 * large swap entry to prevent possible data corruption.
+		 */
+		split_order = shmem_split_large_entry(inode, index, swap, gfp);
+		if (split_order < 0) {
+			error = split_order;
+			goto failed;
+		}
+
+		/*
+		 * If the large swap entry has already been split, it is
+		 * necessary to recalculate the new swap entry based on
+		 * the old order alignment.
+		 */
+		if (split_order > 0) {
+			pgoff_t offset = index - round_down(index, 1 << split_order);
+
+			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+		}
 	}
 
 	/* We have to do this with folio locked to prevent races */
 	folio_lock(folio);
 	if (!folio_test_swapcache(folio) ||
 	    folio->swap.val != swap.val ||
-	    !shmem_confirm_swap(mapping, index, swap)) {
+	    !shmem_confirm_swap(mapping, index, swap) ||
+	    xa_get_order(&mapping->i_pages, index) != folio_order(folio)) {
 		error = -EEXIST;
 		goto unlock;
 	}
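
The swap entry recalculation in the last hunk can be checked by hand with a small user-space sketch. It is illustrative only and rests on assumptions: `toy_swp_entry`, `toy_round_down` and `toy_sub_entry` are hypothetical stand-ins for `swp_entry_t`, `round_down()` and the inline arithmetic in the patch, the real kernel encoding of a swap entry differs, and the entry read from the mapping is assumed to describe the first page of the large range.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the kernel's swp_entry_t; the real encoding differs. */
struct toy_swp_entry {
	unsigned int type;	/* which swap device */
	uint64_t offset;	/* page slot within that device */
};

/* Round x down to a multiple of align; align must be a power of two. */
static uint64_t toy_round_down(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/*
 * Given the entry for the first page of a range that used to be one
 * large entry of order split_order, return the entry for the single
 * page at index.
 */
static struct toy_swp_entry toy_sub_entry(struct toy_swp_entry head,
					  uint64_t index,
					  unsigned int split_order)
{
	uint64_t nr = (uint64_t)1 << split_order;
	/* Position of index inside its naturally aligned range. */
	uint64_t offset = index - toy_round_down(index, nr);

	head.offset += offset;
	return head;
}
```

For example, with a head entry at swap offset 200 and a split order of 2, index 13 sits one page past the aligned start at index 12, so its entry ends up at swap offset 201. With a split order of 0 the offset is zero and the entry is unchanged, which matches the patch skipping the recalculation in that case.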
