Commit Graph

304 Commits

Author SHA1 Message Date
Linus Torvalds
f1c243fc78 Merge tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel:
 "Core changes:
   - PASID support for the blocked_domain

  ARM-SMMU Updates:
   - SMMUv2:
      - Implement per-client prefetcher configuration on Qualcomm SoCs
      - Support for the Adreno SMMU on Qualcomm's SDM670 SOC
   - SMMUv3:
      - Pretty-printing of event records
      - Drop the ->domain_alloc_paging implementation in favour of
        domain_alloc_paging_flags(flags==0)
   - IO-PGTable:
      - Generalisation of the page-table walker to enable external
        walkers (e.g. for debugging unexpected page-faults from the GPU)
      - Minor fix for handling concatenated PGDs at stage-2 with 16KiB
        pages
   - Misc:
      - Clean-up device probing and replace the crufty probe-deferral
        hack with a more robust implementation of
        arm_smmu_get_by_fwnode()
      - Device-tree binding updates for a bunch of Qualcomm platforms

  Intel VT-d Updates:
   - Remove domain_alloc_paging()
   - Remove capability audit code
   - Draining PRQ in sva unbind path when FPD bit set
   - Link cache tags of same iommu unit together

  AMD-Vi Updates:
   - Use CMPXCHG128 to update DTE
   - Cleanups of the domain_alloc_paging() path

  RiscV IOMMU:
   - Platform MSI support
   - Shutdown support

  Rockchip IOMMU:
   - Add DT bindings for Rockchip RK3576

  More smaller fixes and cleanups"

* tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (66 commits)
  iommu: Use str_enable_disable-like helpers
  iommu/amd: Fully decode all combinations of alloc_paging_flags
  iommu/amd: Move the nid to pdom_setup_pgtable()
  iommu/amd: Change amd_iommu_pgtable to use enum protection_domain_mode
  iommu/amd: Remove type argument from do_iommu_domain_alloc() and related
  iommu/amd: Remove dev == NULL checks
  iommu/amd: Remove domain_alloc()
  iommu/amd: Remove unused amd_iommu_domain_update()
  iommu/riscv: Fixup compile warning
  iommu/arm-smmu-v3: Add missing #include of linux/string_choices.h
  iommu/arm-smmu-v3: Use str_read_write helper w/ logs
  iommu/io-pgtable-arm: Add way to debug pgtable walk
  iommu/io-pgtable-arm: Re-use the pgtable walk for iova_to_phys
  iommu/io-pgtable-arm: Make pgtable walker more generic
  iommu/arm-smmu: Add ACTLR data and support for qcom_smmu_500
  iommu/arm-smmu: Introduce ACTLR custom prefetcher settings
  iommu/arm-smmu: Add support for PRR bit setup
  iommu/arm-smmu: Refactor qcom_smmu structure to include single pointer
  iommu/arm-smmu: Re-enable context caching in smmu reset operation
  iommu/vt-d: Link cache tags of same iommu unit together
  ...
2025-01-24 07:33:46 -08:00
Linus Torvalds
4c551165e7 Merge tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt subsystem updates from Thomas Gleixner:

 - Consolidate the machine_kexec_mask_interrupts() by providing a
   generic implementation and replacing the copy & pasta orgy in the
   relevant architectures.

 - Prevent unconditional operations on interrupt chips during kexec
   shutdown, which can trigger warnings in certain cases when the
   underlying interrupt has been shut down before.

 - Make the enforcement of interrupt handling in interrupt context
   unconditionally available, so that it actually works for non x86
   related interrupt chips. The earlier enablement for ARM GIC chips set
   the required chip flag, but did not notice that the check was hidden
   behind a config switch which is not selected by ARM[64].

 - Decrapify the handling of deferred interrupt affinity setting.

   Some interrupt chips require that affinity changes are made from the
   context of handling an interrupt to avoid certain race conditions.
   For x86 this was the default, but with interrupt remapping this
   requirement was lifted and a flag was introduced which tells the core
   code that affinity changes can be done in any context. Unrestricted
   affinity changes are the default for the majority of interrupt chips.

   RISCV has the requirement to add the deferred mode to one of it's
   interrupt controllers, but with the original implementation this
   would require to add the any context flag to all other RISC-V
   interrupt chips. That's backwards, so reverse the logic and require
   that chips, which need the deferred mode have to be marked
   accordingly. That avoids chasing the 'sane' chips and marking them.

 - Add multi-node support to the Loongarch AVEC interrupt controller
   driver.

 - The usual tiny cleanups, fixes and improvements all over the place.

* tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/generic_chip: Export irq_gc_mask_disable_and_ack_set()
  genirq/timings: Add kernel-doc for a function parameter
  genirq: Remove IRQ_MOVE_PCNTXT and related code
  x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
  genirq: Provide IRQCHIP_MOVE_DEFERRED
  hexagon: Remove GENERIC_PENDING_IRQ leftover
  ARC: Remove GENERIC_PENDING_IRQ
  genirq: Remove handle_enforce_irqctx() wrapper
  genirq: Make handle_enforce_irqctx() unconditionally available
  irqchip/loongarch-avec: Add multi-nodes topology support
  irqchip/ts4800: Replace seq_printf() by seq_puts()
  irqchip/ti-sci-inta : Add module build support
  irqchip/ti-sci-intr: Add module build support
  irqchip/irq-brcmstb-l2: Replace brcmstb_l2_mask_and_ack() by generic function
  irqchip: keystone: Use syscon_regmap_lookup_by_phandle_args
  genirq/kexec: Prevent redundant IRQ masking by checking state before shutdown
  kexec: Consolidate machine_kexec_mask_interrupts() implementation
  genirq: Reuse irq_thread_fn() for forced thread case
  genirq: Move irq_thread_fn() further up in the code
2025-01-21 13:51:07 -08:00
Joerg Roedel
125f34e4c1 Merge branches 'arm/smmu/updates', 'arm/smmu/bindings', 'qualcomm/msm', 'rockchip', 'riscv', 'core', 'intel/vt-d' and 'amd/amd-vi' into next 2025-01-17 09:02:35 +01:00
Jason Gunthorpe
082f1bcae8 iommu/amd: Fully decode all combinations of alloc_paging_flags
Currently AMD does not support
 IOMMU_HWPT_ALLOC_PASID | IOMMU_HWPT_ALLOC_DIRTY_TRACKING

It should be rejected. Instead it creates a V1 domain without dirty
tracking support.

Use a switch to fully decode the flags.

Fixes: ce2cd17546 ("iommu/amd: Enhance amd_iommu_domain_alloc_user()")
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v2-9776c53c2966+1c7-amd_paging_flags_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-01-17 08:59:32 +01:00
Jason Gunthorpe
5a081f7f42 iommu/amd: Move the nid to pdom_setup_pgtable()
The only thing that uses the nid is the io_pgtable code, and it should be
set before calling alloc_io_pgtable_ops() to ensure that the top levels
are allocated on the correct nid.

Since dev is never NULL now we can just do this trivially and remove the
other uses of nid. SVA and identity code paths never use it since they
don't use io_pgtable.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v2-9776c53c2966+1c7-amd_paging_flags_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-01-17 08:59:31 +01:00
Jason Gunthorpe
13b4ec7491 iommu/amd: Change amd_iommu_pgtable to use enum protection_domain_mode
Currently it uses enum io_pgtable_fmt which is from the io pagetable code
and most of the enum values are invalid. protection_domain_mode is
internal the driver and has the only two valid values.

Fix some signatures and variables to use the right type as well.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v2-9776c53c2966+1c7-amd_paging_flags_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-01-17 08:59:30 +01:00
Jason Gunthorpe
55b237dd7f iommu/amd: Remove type argument from do_iommu_domain_alloc() and related
do_iommu_domain_alloc() is only called from
amd_iommu_domain_alloc_paging_flags() so type is always
IOMMU_DOMAIN_UNMANAGED. Remove type and all the dead conditionals checking
it.

IOMMU_DOMAIN_IDENTITY checks are similarly obsolete as the conversion to
the global static identity domain removed those call paths.

The caller of protection_domain_alloc() should set the type, fix the miss
in the SVA code.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v2-9776c53c2966+1c7-amd_paging_flags_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-01-17 08:59:29 +01:00
Jason Gunthorpe
02bcd1a8b9 iommu/amd: Remove dev == NULL checks
This is no longer possible, amd_iommu_domain_alloc_paging_flags() is never
called with dev = NULL from the core code. Similarly
get_amd_iommu_from_dev() can never be NULL either.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v2-9776c53c2966+1c7-amd_paging_flags_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-01-17 08:59:29 +01:00
Jason Gunthorpe
f9b80f941e iommu/amd: Remove domain_alloc()
IOMMU drivers should not be sensitive to the domain type, a paging domain
should be created based only on the flags passed in, the same for all
callers.

AMD was using the domain_alloc() path to force VFIO into a v1 domain type,
because v1 gives higher performance. However now that
IOMMU_HWPT_ALLOC_PASID is present, and a NULL device is not possible,
domain_alloc_paging_flags() will do the right thing for VFIO.

When invoked from VFIO flags will be 0 and the amd_iommu_pgtable type of
domain will be selected. This is v1 by default unless the kernel command
line has overridden it to v2.

If the admin is forcing v2 assume they know what they are doing so force
it everywhere, including for VFIO.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v2-9776c53c2966+1c7-amd_paging_flags_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-01-17 08:59:28 +01:00
Alejandro Jimenez
1a684b099f iommu/amd: Remove unused amd_iommu_domain_update()
All the callers have been removed by the below commit, remove the
implementation and prototypes.

Fixes: 322d889ae7 ("iommu/amd: Remove amd_iommu_domain_update() from page table freeing")
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v2-9776c53c2966+1c7-amd_paging_flags_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-01-17 08:59:28 +01:00
Thomas Gleixner
7d04319a05 x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
Instead of marking individual interrupts as safe to be migrated in
arbitrary contexts, mark the interrupt chips, which require the interrupt
to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED
flag. This makes more sense because this is a per interrupt chip property
and not restricted to individual interrupts.

That flips the logic from the historical opt-out to a opt-in model. This is
simpler to handle for other architectures, which default to unrestricted
affinity setting. It also allows to cleanup the redundant core logic
significantly.

All interrupt chips, which belong to a top-level domain sitting directly on
top of the x86 vector domain are marked accordingly, unless the related
setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/all/20241210103335.563277044@linutronix.de
2025-01-15 21:38:53 +01:00
Yi Liu
5f53638882 iommu/amd: Make the blocked domain support PASID
The blocked domain can be extended to park PASID of a device to be the
DMA blocking state. By this the remove_dev_pasid() op is dropped.

Remove PASID from old domain and device GCR3 table. No need to attach
PASID to the blocked domain as clearing PASID from GCR3 table will make
sure all DMAs for that PASID are blocked.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20241204122928.11987-7-yi.l.liu@intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:39:37 +01:00
Suravee Suthikulpanit
457da57646 iommu/amd: Lock DTE before updating the entry with WRITE_ONCE()
When updating only within a 64-bit tuple of a DTE, just lock the DTE and
use WRITE_ONCE() because it is writing to memory read back by HW.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-9-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:42 +01:00
Suravee Suthikulpanit
66ea3f96ae iommu/amd: Modify clear_dte_entry() to avoid in-place update
By reusing the make_clear_dte() and update_dte256().

Also, there is no need to set TV bit for non-SNP system when clearing DTE
for blocked domain, and no longer need to apply erratum 63 in clear_dte()
since it is already stored in struct ivhd_dte_flags and apply in
set_dte_entry().

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-8-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:42 +01:00
Suravee Suthikulpanit
a2ce608a1e iommu/amd: Introduce helper function get_dte256()
And use it in clone_alias() along with update_dte256().
Also use get_dte256() in dump_dte_entry().

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-7-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:41 +01:00
Suravee Suthikulpanit
fd5dff9de4 iommu/amd: Modify set_dte_entry() to use 256-bit DTE helpers
Also, the set_dte_entry() is used to program several DTE fields (e.g.
stage1 table, stage2 table, domain id, and etc.), which is difficult
to keep track with current implementation.

Therefore, separate logic for clearing DTE (i.e. make_clear_dte) and
another function for setting up the GCR3 Table Root Pointer, GIOV, GV,
GLX, and GuestPagingMode into another function set_dte_gcr3_table().

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-6-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:40 +01:00
Suravee Suthikulpanit
8b3f787338 iommu/amd: Introduce helper function to update 256-bit DTE
The current implementation does not follow 128-bit write requirement
to update DTE as specified in the AMD I/O Virtualization Techonology
(IOMMU) Specification.

Therefore, modify the struct dev_table_entry to contain union of u128 data
array, and introduce a helper functions update_dte256() to update DTE using
two 128-bit cmpxchg operations to update 256-bit DTE with the modified
structure, and take into account the DTE[V, GV] bits when programming
the DTE to ensure proper order of DTE programming and flushing.

In addition, introduce a per-DTE spin_lock struct dev_data.dte_lock to
provide synchronization when updating the DTE to prevent cmpxchg128
failure.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Uros Bizjak <ubizjak@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-5-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:39 +01:00
Jason Gunthorpe
91da87d38c iommu/amd: Add lockdep asserts for domain->dev_list
Add an assertion to all the iteration points that don't obviously
have the lock held already. These all take the locker higher in their
call chains.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/2-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-10 10:12:06 +01:00
Jason Gunthorpe
4a552f7890 iommu/amd: Put list_add/del(dev_data) back under the domain->lock
The list domain->dev_list is protected by the domain->lock spinlock.
Any iteration, addition or removal must be under the lock.

Move the list_del() up into the critical section. pdom_is_sva_capable(),
and destroy_gcr3_table() do not interact with the list element.

Wrap the list_add() in a lock, it would make more sense if this was under
the same critical section as adjusting the refcounts earlier, but that
requires more complications.

Fixes: d6b47dec36 ("iommu/amd: Reduce domain lock scope in attach device path")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/1-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-10 10:12:05 +01:00
Jason Gunthorpe
d53764723e iommu: Rename ops->domain_alloc_user() to domain_alloc_paging_flags()
Now that the main domain allocating path is calling this function it
doesn't make sense to leave it named _user. Change the name to
alloc_paging_flags() to mirror the new iommu_paging_domain_alloc_flags()
function.

A driver should implement only one of ops->domain_alloc_paging() or
ops->domain_alloc_paging_flags(). The former is a simpler interface with
less boiler plate that the majority of drivers use. The latter is for
drivers with a greater feature set (PASID, multiple page table support,
advanced iommufd support, nesting, etc). Additional patches will be needed
to achieve this.

Link: https://patch.msgid.link/r/2-v1-c252ebdeb57b+329-iommu_paging_flags_jgg@nvidia.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-22 14:43:45 -04:00
Vasant Hegde
18f5a6b34b iommu/amd: Improve amd_iommu_release_device()
Previous patch added ops->release_domain support. Core will attach
devices to release_domain->attach_dev() before calling this function.
Devices are already detached their current domain and attached to
blocked domain.

This is mostly dummy function now. Just throw warning if device is still
attached to domain.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-13-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:48 +01:00
Vasant Hegde
a0e086b16e iommu/amd: Add ops->release_domain
In release path, remove device from existing domain and attach it to
blocked domain. So that all DMAs from device is blocked.

Note that soon blocked_domain will support other ops like
set_dev_pasid() but release_domain supports only attach_dev ops.
Hence added separate 'release_domain' variable.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-12-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:47 +01:00
Vasant Hegde
0b136493d3 iommu/amd: Reorder attach device code
Ideally in attach device path, it should take dev_data lock before
making changes to device data including IOPF enablement. So far dev_data
was using spinlock and it was hitting lock order issue when it tries to
enable IOPF. Hence Commit 526606b0a1 ("iommu/amd: Fix Invalid wait
context issue") moved IOPF enablement outside dev_data->lock.

Previous patch converted dev_data lock to mutex. Now its safe to call
amd_iommu_iopf_add_device() with dev_data->mutex. Hence move back PCI
device capability enablement (ATS, PRI, PASID) and IOPF enablement code
inside the lock. Also in attach_device(), update 'dev_data->domain' at
the end so that error handling becomes simple.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-11-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:46 +01:00
Vasant Hegde
e843aedbeb iommu/amd: Convert dev_data lock from spinlock to mutex
Currently in attach device path it takes dev_data->spinlock. But as per
design attach device path can sleep. Also if device is PRI capable then
it adds device to IOMMU fault handler queue which takes mutex. Hence
currently PRI enablement is done outside dev_data lock.

Covert dev_data lock from spinlock to mutex so that it follows the
design and also PRI enablement can be done properly.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:45 +01:00
Vasant Hegde
4b18ef8491 iommu/amd: Rearrange attach device code
attach_device() is just holding lock and calling do_attach(). There is
not need to have another function. Just move do_attach() code to
attach_device(). Similarly move do_detach() code to detach_device().

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:45 +01:00
Vasant Hegde
d6b47dec36 iommu/amd: Reduce domain lock scope in attach device path
Currently attach device path takes protection domain lock followed by
dev_data lock. Most of the operations in this function is specific to
device data except pdom_attach_iommu() where it updates protection
domain structure. Hence reduce the scope of protection domain lock.

Note that this changes the locking order. Now it takes device lock
before taking doamin lock (group->mutex -> dev_data->lock ->
pdom->lock). dev_data->lock is used only in device attachment path.
So changing order is fine. It will not create any issue.

Finally move numa node assignment to pdom_attach_iommu().

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:44 +01:00
Vasant Hegde
07bbd660db iommu/amd: Do not detach devices in domain free path
All devices attached to a protection domain must be freed before
calling domain free. Hence do not try to free devices in domain
free path. Continue to throw warning if pdom->dev_list is not empty
so that any potential issues can be fixed.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:43 +01:00
Vasant Hegde
d16041124d iommu/amd: xarray to track protection_domain->iommu list
Use xarray to track IOMMU attached to protection domain instead of
static array of MAX_IOMMUS. Also add lockdep assertion.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:42 +01:00
Vasant Hegde
743a4bae9f iommu/amd: Remove protection_domain.dev_cnt variable
protection_domain->dev_list tracks list of attached devices to
domain. We can use list_* functions on dev_list to get device count.
Hence remove 'dev_cnt' variable.

No functional change intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:41 +01:00
Vasant Hegde
2fcab2deeb iommu/amd: Use ida interface to manage protection domain ID
Replace custom domain ID allocator with IDA interface.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:40 +01:00
Joerg Roedel
556af583d2 Merge branch 'core' into amd/amd-vi 2024-10-30 11:02:48 +01:00
Vasant Hegde
4402f2627d iommu/amd: Implement global identity domain
Implement global identity domain. All device groups in identity domain
will share this domain.

In attach device path, based on device capability it will allocate per
device domain ID and GCR3 table. So that it can support SVA.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20241028093810.5901-11-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29 10:08:22 +01:00
Vasant Hegde
ce2cd17546 iommu/amd: Enhance amd_iommu_domain_alloc_user()
Previous patch enhanced core layer to check device PASID capability and
pass right flags to ops->domain_alloc_user().

Enhance amd_iommu_domain_alloc_user() to allocate domain with
appropriate page table based on flags parameter.
  - If flags is empty then allocate domain with default page table type.
    This will eventually replace ops->domain_alloc().
    For UNMANAGED domain, core will call this interface with flags=0. So
    AMD driver will continue to allocate V1 page table.

  - If IOMMU_HWPT_ALLOC_PASID flags is passed then allocate domain with v2
    page table.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241028093810.5901-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29 10:08:22 +01:00
Vasant Hegde
a005ef62f9 iommu/amd: Pass page table type as param to pdom_setup_pgtable()
Current code forces v1 page table for UNMANAGED domain and global page
table type (amd_iommu_pgtable) for rest of paging domain.

Following patch series adds support for domain_alloc_paging() ops. Also
enhances domain_alloc_user() to allocate page table based on 'flags.

Hence pass page table type as parameter to pdomain_setup_pgtable(). So
that caller can decide right page table type. Also update
dma_max_address() to take pgtable as parameter.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jacob Pan <jacob.pan@linux.microsoft.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241028093810.5901-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29 10:08:21 +01:00
Vasant Hegde
b3c989083d iommu/amd: Separate page table setup from domain allocation
Currently protection_domain_alloc() allocates domain and also sets up
page table. Page table setup is required for PAGING domain only. Domain
type like SVA doesn't need page table. Hence move page table setup code
to separate function.

Also SVA domain allocation path does not call pdom_setup_pgtable().
Hence remove IOMMU_DOMAIN_SVA type check.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jacob Pan <jacob.pan@linux.microsoft.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241028093810.5901-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29 10:08:21 +01:00
Uros Bizjak
5ce73c524f iommu/amd: Use atomic64_inc_return() in iommu.c
Use atomic64_inc_return(&ref) instead of atomic64_add_return(1, &ref)
to use optimized implementation and ease register pressure around
the primitive for targets that implement optimized variant.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241007084356.47799-1-ubizjak@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-15 10:22:37 +02:00
Joerg Roedel
97162f6093 Merge branches 'fixes', 'arm/smmu', 'intel/vt-d', 'amd/amd-vi' and 'core' into next 2024-09-13 12:53:05 +02:00
Jason Gunthorpe
3ab9d8d1b5 iommu/amd: Test for PAGING domains before freeing a domain
This domain free function can be called for IDENTITY and SVA domains too,
and they don't have page tables. For now protect against this by checking
the type. Eventually the different types should have their own free
functions.

Fixes: 485534bfcc ("iommu/amd: Remove conditions from domain free paths")
Reported-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/0-v1-ad9884ee5f5b+da-amd_iopgtbl_fix_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-12 09:21:40 +02:00
Eliav Bar-ilan
8386207f37 iommu/amd: Fix argument order in amd_iommu_dev_flush_pasid_all()
An incorrect argument order calling amd_iommu_dev_flush_pasid_pages()
causes improper flushing of the IOMMU, leaving the old value of GCR3 from
a previous process attached to the same PASID.

The function has the signature:

void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
				     ioasid_t pasid, u64 address, size_t size)

Correct the argument order.

Cc: stable@vger.kernel.org
Fixes: 474bf01ed9 ("iommu/amd: Add support for device based TLB invalidation")
Signed-off-by: Eliav Bar-ilan <eliavb@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/0-v1-fc6bc37d8208+250b-amd_pasid_flush_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-12 09:20:18 +02:00
Jason Gunthorpe
485534bfcc iommu/amd: Remove conditions from domain free paths
Don't use tlb as some flag to indicate if protection_domain_alloc()
completed. Have protection_domain_alloc() unwind itself in the normal
kernel style and require protection_domain_free() only be called on
successful results of protection_domain_alloc().

Also, the amd_iommu_domain_free() op is never called by the core code with
a NULL argument, so remove all the NULL tests as well.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/10-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:39:01 +02:00
Jason Gunthorpe
47f218d108 iommu/amd: Store the nid in io_pgtable_cfg instead of the domain
We already have memory in the union here that is being wasted in AMD's
case, use it to store the nid.

Putting the nid here further isolates the io_pgtable code from the struct
protection_domain.

Fixup protection_domain_alloc so that the NID from the device is provided,
at this point dev is never NULL for AMD so this will now allocate the
first table pointer on the correct NUMA node.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/8-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:38:34 +02:00
Jason Gunthorpe
977fc27ca7 iommu/amd: Remove amd_io_pgtable::pgtbl_cfg
This struct is already in iop.cfg, we don't need two.

AMD is using this API sort of wrong, the cfg is supposed to be passed in
and then the allocation function will allocate ops memory and copy the
passed config into the new memory. Keep it kind of wrong and pass in the
cfg memory that is already part of the pagetable struct.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:38:33 +02:00
Jason Gunthorpe
670b57796c iommu/amd: Rename struct amd_io_pgtable iopt to pgtbl
There is struct protection_domain iopt and struct amd_io_pgtable iopt.
Next patches are going to want to write domain.iopt.iopt.xx which is quite
unnatural to read.

Give one of them a different name, amd_io_pgtable has fewer references so
call it pgtbl, to match pgtbl_cfg, instead.

Suggested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:38:32 +02:00
Jason Gunthorpe
322d889ae7 iommu/amd: Remove amd_iommu_domain_update() from page table freeing
It is a serious bug if the domain is still mapped to any DTEs when it is
freed as we immediately start freeing page table memory, so any remaining
HW touch will UAF.

If it is not mapped then dev_list is empty and amd_iommu_domain_update()
does nothing.

Remove it and add a WARN_ON() to catch this class of bug.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:37:43 +02:00
Jason Gunthorpe
7a41dcb52f iommu/amd: Set the pgsize_bitmap correctly
When using io_pgtable the correct pgsize_bitmap is stored in the cfg, both
v1_alloc_pgtable() and v2_alloc_pgtable() set it correctly.

This fixes a bug where the v2 pgtable had the wrong pgsize as
protection_domain_init_v2() would set it and then do_iommu_domain_alloc()
immediately resets it.

Remove the confusing ops.pgsize_bitmap since that is not used if the
driver sets domain.pgsize_bitmap.

Fixes: 134288158a ("iommu/amd: Add domain_alloc_user based domain allocation")
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:37:42 +02:00
Jason Gunthorpe
8d00b77a52 iommu/amd: Move allocation of the top table into v1_alloc_pgtable
All the page table memory should be allocated/free within the io_pgtable
struct. The v2 path is already doing this, make it consistent.

It is hard to see but the free of the root in protection_domain_free() is
a NOP on the success path because v1_free_pgtable() does
amd_iommu_domain_clr_pt_root().

The root memory is already freed because free_sub_pt() put it on the
freelist. The free path in protection_domain_free() is only used during
error unwind of protection_domain_alloc().

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:37:41 +02:00
Vasant Hegde
89ffb2c3c2 iommu/amd: Make amd_iommu_dev_update_dte() static
As its used inside iommu.c only. Also rename function to dev_update_dte()
as its static function.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:35:58 +02:00
Vasant Hegde
a3303762eb iommu/amd: Rework amd_iommu_update_and_flush_device_table()
Remove separate function to update and flush the device table as only
amd_iommu_update_and_flush_device_table() calls these functions.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:35:57 +02:00
Vasant Hegde
964877dc26 iommu/amd: Make amd_iommu_domain_flush_complete() static
AMD driver uses amd_iommu_domain_flush_complete() function to make sure
IOMMU processed invalidation commands before proceeding. Ideally this
should be called from functions which updates DTE/invalidates caches.
There is no need to call this function explicitly. This patches makes
below changes :

- Rename amd_iommu_domain_flush_complete() -> domain_flush_complete()
  and make it as static function.

- Rearrage domain_flush_complete() to avoid forward declaration.

- Update amd_iommu_update_and_flush_device_table() to call
  domain_flush_complete().

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:35:56 +02:00
Vasant Hegde
845bd6ac43 iommu/amd: Make amd_iommu_dev_flush_pasid_all() static
As its not used outside iommu.c. Also rename it as dev_flush_pasid_all().

No functional change intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-6-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:35:55 +02:00