Drivers: hv: mshv_vtl: fix GUP into VTL0 device mappings #141
Open
namancse wants to merge 2 commits into
Since v6.15 (aed877c, d3f7922), GUP no longer takes a pgmap reference for ZONE_DEVICE pages and walks huge entries through the unified folio path. With vmf_insert_pfn_{pmd,pud}() the mapping holds no folio reference, so a zap racing with pin_user_pages_fast() can briefly drop the folio refcount to 0 and trigger a WARN in try_grab_folio() with the I/O failing as -ENOMEM.

Switch the PMD/PUD fault paths to vmf_insert_folio_{pmd,pud}(), mirroring drivers/dax/device.c. Each map takes folio_get(); the matching folio_put() in zap keeps the refcount above 0.

Gate the huge inserters on pfn_valid() + ZONE_DEVICE + MEMORY_DEVICE_GENERIC via mshv_vtl_low_resolve_page(); fall back to VM_FAULT_FALLBACK when the folio order does not match PMD_ORDER/PUD_ORDER or the PFN is not yet pgmap-backed, so the core can retry at smaller order.

Add VM_DONTEXPAND to the VMA to block mremap() growth past the pgmap.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
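The gating described in this commit message can be modelled in plain userspace C. This is only a sketch of the decision logic, not the driver code: every type and function name below (model_page, model_resolve_page, model_huge_fault) is hypothetical, the order constants assume x86-64 with 4K base pages, and the refcount is a bare integer standing in for the folio reference the mapping takes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
#define MODEL_PMD_ORDER 9   /* assumption: 2 MiB / 4 KiB on x86-64 */
#define MODEL_PUD_ORDER 18  /* assumption: 1 GiB / 4 KiB on x86-64 */

enum fault_result { FAULT_INSERTED, FAULT_FALLBACK };

struct model_page {
    bool pfn_valid;           /* models the pfn_valid() check */
    bool zone_device;         /* models the ZONE_DEVICE check */
    bool generic_pgmap;       /* models pgmap type == MEMORY_DEVICE_GENERIC */
    unsigned int folio_order; /* order of the folio backing this PFN */
    int refcount;             /* models the folio reference count */
};

/* Returns the page only if it passes all three gates, otherwise NULL. */
static struct model_page *model_resolve_page(struct model_page *p)
{
    if (p == NULL || !p->pfn_valid || !p->zone_device || !p->generic_pgmap)
        return NULL;
    return p;
}

/*
 * Models a PMD/PUD fault: insert only when the PFN resolves to a
 * pgmap-backed page whose folio order equals the fault order.
 * Anything else falls back so a smaller order can be tried.
 */
static enum fault_result model_huge_fault(struct model_page *p,
                                          unsigned int fault_order)
{
    struct model_page *page = model_resolve_page(p);

    if (page == NULL || page->folio_order != fault_order)
        return FAULT_FALLBACK;

    page->refcount++; /* the mapping now holds its own reference */
    return FAULT_INSERTED;
}
```

In this model a page with folio_order 9 is inserted for a MODEL_PMD_ORDER fault and its refcount goes up by one, while the same page faulted at MODEL_PUD_ORDER, or a page that is not yet pgmap-backed, yields FAULT_FALLBACK and leaves the refcount untouched.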
Pull request overview
This PR updates the mshv_vtl_low mmap fault paths so VTL0 ZONE_DEVICE mappings become GUP-pinnable again after the removal of the pte_devmap fast-path in 6.15. It switches huge faults to folio-aware inserters and makes the 4K fault path insert a refcounted page once the pgmap exists. Faults that arrive before pgmap registration still get a pte_special fallback mapping, and those stale PTEs are zapped once registration completes.
Changes:
- Add a pgmap-backed PFN→struct page resolver and use vmf_insert_page_mkwrite() (4K) / vmf_insert_folio_pmd() / vmf_insert_folio_pud() (huge) so faults install folio-backed entries suitable for GUP.
- Capture the /dev/mshv_vtl_low address_space on first open and invalidate early-fault pte_special mappings after pgmap registration.
- Tighten VMA flags for the mapping (VM_MIXEDMAP + VM_DONTEXPAND) to support the mixed fallback and keep the mapping size pinned.
Extend the folio-aware fault path to the 4K case so GUP into /dev/mshv_vtl_low works after MSHV_ADD_VTL0_MEMORY has registered the range. With the previous vmf_insert_mixed() path the PTE was always pte_special, vm_normal_page() returned NULL during pin_user_pages*(), follow_pfn_pte() returned -EEXIST, and io_uring O_DIRECT surfaced it as "disk io error: io error: File exists (os error 17)" on the first DMA into a freshly-registered VTL0 chunk.

The 4K path now resolves the PFN via mshv_vtl_low_resolve_page(): when backed by an mshv_vtl pgmap the PTE is installed with vmf_insert_page_mkwrite(), giving GUP a normal pinnable page; otherwise it falls back to vmf_insert_mixed() so early CPU accesses (e.g. the VTL2 guest-memory self test reading GPA 0 before any add_vtl0_mem ioctl) still succeed instead of SIGBUSing.

Such fallback PTEs would persist across registration and break later GUP. Capture the cdev's address_space on first open and, on successful MSHV_ADD_VTL0_MEMORY, invalidate the file-offset range via unmap_mapping_range() for both the encrypted (pfn) and decrypted (pfn | DECRYPTED_MASK) aliases that mshv_vtl_low_mmap() exposes. The next access re-faults into the folio path and GUP works.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
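As a rough illustration of the two pieces in this commit, the userspace sketch below models which kind of PTE a 4K fault installs and which two byte ranges would be invalidated after registration. It rests on stated assumptions: that the file offset of a page equals its PFN shifted by the page size, and that the decrypted alias is formed by OR-ing a single high bit into the PFN. MODEL_DECRYPTED_MASK is a made-up value, since the source does not give the real DECRYPTED_MASK, and all names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12
/* Hypothetical alias bit; the real DECRYPTED_MASK value is not in the source. */
#define MODEL_DECRYPTED_MASK (UINT64_C(1) << 35)

enum pte_kind {
    PTE_NORMAL_PAGE, /* models vmf_insert_page_mkwrite(): GUP can pin it */
    PTE_SPECIAL      /* models vmf_insert_mixed(): CPU access only */
};

/* Models the 4K fault: the PTE kind depends on whether the pgmap exists yet. */
static enum pte_kind model_4k_fault(bool pgmap_backed)
{
    return pgmap_backed ? PTE_NORMAL_PAGE : PTE_SPECIAL;
}

struct unmap_range {
    uint64_t start; /* byte offset into the file */
    uint64_t len;   /* length in bytes */
};

/*
 * Computes the two file-offset ranges to invalidate after a chunk of
 * nr_pages starting at start_pfn is registered: out[0] is the encrypted
 * alias, out[1] the decrypted one.
 */
static void model_alias_ranges(uint64_t start_pfn, uint64_t nr_pages,
                               struct unmap_range out[2])
{
    uint64_t len = nr_pages << MODEL_PAGE_SHIFT;

    out[0].start = start_pfn << MODEL_PAGE_SHIFT;
    out[0].len = len;
    out[1].start = (start_pfn | MODEL_DECRYPTED_MASK) << MODEL_PAGE_SHIFT;
    out[1].len = len;
}
```

Under these assumptions, registering 16 pages at PFN 0x100 gives two ranges of 0x10000 bytes each, one starting at byte offset 0x100000 and one at the same offset with bit 47 set. Any PTE_SPECIAL entry inside either range is dropped, so the next access re-faults and takes the PTE_NORMAL_PAGE branch.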
09c2a53 to dce4fdc
Adding the bug link for future reference.
hargar19 reviewed on Jun 5, 2026
Restores GUP (get_user_pages) into VTL0 memory mappings, broken by the 6.15 ZONE_DEVICE / pte_devmap removal (aed877c, d3f7922). After that refactor, GUP only walks PTEs/PMDs/PUDs that point to a real folio with a held reference; the legacy pte_devmap fast-path is gone. mshv_vtl_low was still installing devmap PTEs via vmf_insert_pfn_*, so userspace pins on /dev/mshv_vtl_low mappings silently failed.
Two commits:
- use folio-aware inserters for huge VTL0 mappings: switches the PMD/PUD fault paths to vmf_insert_folio_pmd / vmf_insert_folio_pud, resolving the pfn to its struct page / pgmap folio and verifying the folio order matches the fault order.
- fix GUP into VTL0 mappings on the 4K fault path: adds a folio-aware 4K path using vmf_insert_page_mkwrite once the pgmap is live, with a pte_special fallback (via vmf_insert_mixed) for early faults before devm_memremap_pages has run. Captures the chardev address_space on first open (cmpxchg) and calls unmap_mapping_range for both the encrypted and DECRYPTED_MASK-aliased pfns after pgmap registration so any stale special PTEs are dropped and refaulted as folio-backed. VM_MIXEDMAP | VM_DONTEXPAND are set on the VMA.
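The "capture on first open (cmpxchg)" step is a first-writer-wins publish of a pointer. A minimal userspace analogue using C11 atomics might look like the sketch below. It is not the driver's code: model_mapping, captured_mapping and model_capture_on_open are hypothetical names, and atomic_compare_exchange_strong stands in for the kernel's cmpxchg().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the chardev's address_space. */
struct model_mapping {
    int id;
};

/* NULL until the first open records a mapping. */
static _Atomic(struct model_mapping *) captured_mapping = NULL;

/*
 * Records candidate only if nothing has been captured yet and returns
 * whichever mapping ends up recorded. Concurrent callers all observe
 * the same winner.
 */
static struct model_mapping *model_capture_on_open(struct model_mapping *candidate)
{
    struct model_mapping *expected = NULL;

    if (atomic_compare_exchange_strong(&captured_mapping, &expected, candidate))
        return candidate;

    /* On failure, expected was overwritten with the value already stored. */
    return expected;
}
```

The first caller's pointer is stored and every later call returns that same pointer regardless of its argument, which is what lets the invalidation step after registration find a single, stable mapping without taking a lock in the open path.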