mm/shmem: persist uffd-wp bit across zapping for file-backed

File-backed memory is prone to being unmapped at any time.  It means all
information in the pte will be dropped, including the uffd-wp flag.

To persist the uffd-wp flag, we'll use the pte markers.  This patch
teaches the zap code to understand uffd-wp and know when to keep or drop
the uffd-wp bit.

Add a new flag ZAP_FLAG_DROP_MARKER and set it in zap_details when we
don't want to persist such an information, for example, when destroying
the whole vma, or punching a hole in a shmem file.  For the rest cases we
should never drop the uffd-wp bit, or the wr-protect information will get
lost.

The new ZAP_FLAG_DROP_MARKER needs to be put into mm.h rather than
memory.c because it'll be further referenced in hugetlb files later.

Link: https://lkml.kernel.org/r/20220405014847.14295-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This commit is contained in:
Peter Xu
2022-05-12 20:22:53 -07:00
committed by Andrew Morton
parent 9c28a205c0
commit 999dad824c
4 changed files with 107 additions and 3 deletions

View File

@@ -6,6 +6,8 @@
#include <linux/huge_mm.h>
#include <linux/swap.h>
#include <linux/string.h>
#include <linux/userfaultfd_k.h>
#include <linux/swapops.h>
/**
* folio_is_file_lru - Should the folio be on a file LRU or anon LRU?
@@ -316,5 +318,46 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
return atomic_read(&mm->tlb_flush_pending) > 1;
}
/*
* If this pte is wr-protected by uffd-wp in any form, arm the special pte to
* replace a none pte. NOTE! This should only be called when *pte is already
* cleared so we will never accidentally replace something valuable. Meanwhile
* none pte also means we are not demoting the pte so tlb flushed is not needed.
* E.g., when pte cleared the caller should have taken care of the tlb flush.
*
* Must be called with pgtable lock held so that no thread will see the none
* pte, and if they see it, they'll fault and serialize at the pgtable lock.
*
* This function is a no-op if PTE_MARKER_UFFD_WP is not enabled.
*/
static inline void
pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
pte_t *pte, pte_t pteval)
{
#ifdef CONFIG_PTE_MARKER_UFFD_WP
bool arm_uffd_pte = false;
/* The current status of the pte should be "cleared" before calling */
WARN_ON_ONCE(!pte_none(*pte));
if (vma_is_anonymous(vma) || !userfaultfd_wp(vma))
return;
/* A uffd-wp wr-protected normal pte */
if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval)))
arm_uffd_pte = true;
/*
* A uffd-wp wr-protected swap pte. Note: this should even cover an
* existing pte marker with uffd-wp bit set.
*/
if (unlikely(pte_swp_uffd_wp_any(pteval)))
arm_uffd_pte = true;
if (unlikely(arm_uffd_pte))
set_pte_at(vma->vm_mm, addr, pte,
make_pte_marker(PTE_MARKER_UFFD_WP));
#endif
}
#endif