arm64: Enable LPA2 at boot if supported by the system

Update the early kernel mapping code to take 52-bit virtual addressing
into account based on the LPA2 feature. This is a bit more involved than
LVA (which is supported with 64k pages only), given that some page table
descriptor bits change meaning in this case.

To keep the handling in asm to a minimum, the initial ID map is still
created with 48-bit virtual addressing, which implies that the kernel
image must be loaded into 48-bit addressable physical memory. This is
currently required by the boot protocol, even though we happen to
support placement outside of that for LVA/64k based configurations.

Enabling LPA2 involves more than setting TCR.T1SZ to a lower value,
there is also a DS bit in TCR that needs to be set, and which changes
the meaning of bits [9:8] in all page table descriptors. Since we cannot
enable DS and every live page table descriptor at the same time, let's
pivot through another temporary mapping. This avoids the need to
reintroduce manipulations of the page tables with the MMU and caches
disabled.

To permit the LPA2 feature to be overridden on the kernel command line,
which may be necessary to work around silicon errata, or to deal with
mismatched features on heterogeneous SoC designs, test for CPU feature
overrides first, and only then enable LPA2.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-78-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This commit is contained in:
Ard Biesheuvel
2024-02-14 13:29:19 +01:00
committed by Catalin Marinas
parent 2b6c8f96cc
commit 9684ec186f
11 changed files with 124 additions and 11 deletions

View File

@@ -581,11 +581,17 @@ alternative_endif
* but we have to add an offset so that the TTBR1 address corresponds with the
* pgdir entry that covers the lowest 48-bit addressable VA.
*
* Note that this trick is only used for LVA/64k pages - LPA2/4k pages uses an
* additional paging level, and on LPA2/16k pages, we would end up with a root
* level table with only 2 entries, which is suboptimal in terms of TLB
* utilization, so there we fall back to 47 bits of translation if LPA2 is not
* supported.
*
* orr is used as it can cover the immediate value (and is idempotent).
* ttbr: Value of ttbr to set, modified.
*/
.macro offset_ttbr1, ttbr, tmp
#ifdef CONFIG_ARM64_VA_BITS_52
#if defined(CONFIG_ARM64_VA_BITS_52) && !defined(CONFIG_ARM64_LPA2)
mrs \tmp, tcr_el1
and \tmp, \tmp, #TCR_T1SZ_MASK
cmp \tmp, #TCR_T1SZ(VA_BITS_MIN)