linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-18 06:44:00 -04:00

Author	SHA1	Message	Date
Linus Torvalds	12bffaef28	Merge tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL (Compute Express Link) updates from Dave Jiang: "The significant change of interest is the handling of soft reserved memory conflict between CXL and HMEM. In essence CXL will be the first to claim the soft reserved memory ranges that belongs to CXL and attempt to enumerate them with best effort. If CXL is not able to enumerate the ranges it will punt them to HMEM. There are also MAINTAINERS email changes from Dan Williams and Jonathan Cameron" * tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (37 commits) MAINTAINERS: Update Jonathan Cameron's email address cxl/hdm: Add support for 32 switch decoders MAINTAINERS: Update address for Dan Williams tools/testing/cxl: Enable replay of user regions as auto regions cxl/region: Add a region sysfs interface for region lock status tools/testing/cxl: Test dax_hmem takeover of CXL regions tools/testing/cxl: Simulate auto-assembly failure dax/hmem: Parent dax_hmem devices dax/hmem: Fix singleton confusion between dax_hmem_work and hmem devices dax/hmem: Reduce visibility of dax_cxl coordination symbols cxl/region: Constify cxl_region_resource_contains() cxl/region: Limit visibility of cxl_region_contains_resource() dax/cxl: Fix HMEM dependencies cxl/region: Fix use-after-free from auto assembly failure cxl/core: Check existence of cxl_memdev_state in poison test cxl/core: use cleanup.h for devm_cxl_add_dax_region cxl/core/region: move dax region device logic into region_dax.c cxl/core/region: move pmem region driver logic into region_pmem.c dax/hmem, cxl: Defer and resolve Soft Reserved ownership cxl/region: Add helper to check Soft Reserved containment by CXL regions ...	2026-04-17 15:52:58 -07:00
Dave Jiang	3939dba00f	Merge branch 'for-7.1/cxl-misc' into cxl-for-next cxl/hdm: Add support for 32 switch decoders	2026-04-10 11:11:53 -07:00
Li Ming	3624a22783	cxl/hdm: Add support for 32 switch decoders Per CXL r4.0 section 8.2.4.20.1. CXL host bridge and switch ports can support 32 HDM decoders. Current implementation misses some decoders on CXL host bridge and switch in the case that the value of Decoder Count field in CXL HDM decoder Capability Register is greater than or equal to 9. Update calculation implementation to ensure the decoder count calculation is correct for CXL host bridge/switch ports. Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260321061459.1910205-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-04-10 08:32:14 -07:00
Dave Jiang	202432ae8a	Merge branch 'for-7.1/cxl-region-refactor' into cxl-for-next Refactor CXL core/region code to make region code more manageable by splitting out DAX and PMEM code from RAM handling code. cxl/core: use cleanup.h for devm_cxl_add_dax_region cxl/core/region: move dax region device logic into region_dax.c cxl/core/region: move pmem region driver logic into region_pmem.c	2026-04-03 12:30:57 -07:00
Dave Jiang	303d32843b	Merge branch 'for-7.1/dax-hmem' into cxl-for-next The series addresses conflicts between HMEM and CXL when handling Soft Reserved memory ranges. CXL will try best effort in claiming the Soft Reserved memory region that are CXL regions. If fails, it will punt back to HMEM. tools/testing/cxl: Test dax_hmem takeover of CXL regions tools/testing/cxl: Simulate auto-assembly failure dax/hmem: Parent dax_hmem devices dax/hmem: Fix singleton confusion between dax_hmem_work and hmem devices dax/hmem: Reduce visibility of dax_cxl coordination symbols cxl/region: Constify cxl_region_resource_contains() cxl/region: Limit visibility of cxl_region_contains_resource() dax/cxl: Fix HMEM dependencies cxl/region: Fix use-after-free from auto assembly failure dax/hmem, cxl: Defer and resolve Soft Reserved ownership cxl/region: Add helper to check Soft Reserved containment by CXL regions dax: Track all dax_region allocations under a global resource tree dax/cxl, hmem: Initialize hmem early and defer dax_cxl binding dax/hmem: Gate Soft Reserved deferral on DEV_DAX_CXL dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges dax/hmem: Factor HMEM registration into __hmem_register_device() dax/bus: Use dax_region_put() in alloc_dax_region() error path	2026-04-03 12:24:18 -07:00
Dave Jiang	7aacc62557	Merge branch 'for-7.1/cxl-type2-support' into cxl-for-next Prep patches for CXL type2 accelerator basic support cxl/region: Factor out interleave granularity setup cxl/region: Factor out interleave ways setup cxl: Make region type based on endpoint type cxl/pci: Remove redundant cxl_pci_find_port() call cxl: Move pci generic code from cxl_pci to core/cxl_pci cxl: export internal structs for external Type2 drivers cxl: support Type2 when initializing cxl_dev_state	2026-04-03 12:18:23 -07:00
Dave Jiang	2fb3bdeb00	Merge branch 'for-7.1/cxl-consolidate-endpoint' into cxl-for-next Add code to ensure the endpoint has completed initialization before usage. cxl/pci: Check memdev driver binding status in cxl_reset_done() cxl/pci: Hold memdev lock in cxl_event_trace_record()	2026-04-03 12:15:11 -07:00
Li Ming	d585bc86fb	cxl/region: Add a region sysfs interface for region lock status There are 3 scenarios that leads to a locked region: 1. A region is created on a root decoder with Fixed Device Confiuration attribute. 2. CXL_HDM_DECODER0_CTRL_LOCK. Both 1 & 1 are well described in: commit `2230c4bdc4` ("cxl: Add handling of locked CXL decoder") 3) Platform that has region creation with PRMT address translation always locks the region, regardless of the FIXED attribute or decoder ctrl bit. Region locked means region destroy operations are not permitted. CXL region driver returns -EPERM for region destroy operations. Although the locked status of the corresponding root decoder implies the region is also locked, exposing the region lock status directly to userspace improves usability for users who may not be aware of this relationship. [ dj: Amended commit log with additional locking scenarios. ] Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260401124951.1290041-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-04-01 09:59:58 -07:00
Dan Williams	471d88441e	cxl/region: Constify cxl_region_resource_contains() The call to cxl_region_resource_contains() in hmem_register_cxl_device() need not cast away 'const'. The problem is the usage of the bus_for_each_dev() API which does not mark its @data parameter as 'const'. Switch to bus_find_device() which does take 'const' @data, fixup cxl_region_resource_contains() and its caller. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260327052821.440749-5-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-04-01 08:12:17 -07:00
Dan Williams	b6a61d5baf	cxl/region: Limit visibility of cxl_region_contains_resource() The dax_hmem dependency on cxl_region_contains_resource() is a one-off special case. It is not suitable for other use cases. Move the definition to the other CONFIG_CXL_REGION guarded definitions in drivers/cxl/cxl.h and include that by a relative path include. This matches what drivers/dax/cxl.c does for its limited private usage of CXL core symbols. Reduce the symbol export visibility from global to just dax_hmem, to further clarify its applicability. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260327052821.440749-4-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-04-01 08:12:17 -07:00
Dan Williams	87805c32e6	cxl/region: Fix use-after-free from auto assembly failure The following crash signature results from region destruction while an endpoint decoder is staged, but not fully attached. [ dj: Moved bus_find_device( to next line. ] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260327052821.440749-2-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-04-01 08:12:12 -07:00
Alison Schofield	261a02b93d	cxl/core: Check existence of cxl_memdev_state in poison test Before now, all CXL memdevs were assumed to have a mailbox-backed cxl_memdev_state, so poison command checks could safely dereference the @mds. With the introduction of Type 2 devices, a memdev may not implement a mailbox interface, and so there is no associated cxl_memdev_state. Guard against this case by returning false when @mds is absent. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/20260331005047.2813980-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-31 15:18:06 -07:00
Gregory Price	29990ab5cb	cxl/core: use cleanup.h for devm_cxl_add_dax_region Cleanup the gotos in the function. No functional change. Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260327020203.876122-4-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-27 11:45:39 -07:00
Gregory Price	d747cf98f0	cxl/core/region: move dax region device logic into region_dax.c core/region.c is overloaded with per-region control logic (pmem, dax, sysram, etc). Move the CXL DAX region device infrastructure from region.c into a new region_dax.c file. This will also allow us to add additional dax-driver integration paths that don't further dirty the core region.c logic. No functional changes. Signed-off-by: Gregory Price <gourry@gourry.net> Co-developed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260327020203.876122-3-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-27 11:45:31 -07:00
Gregory Price	8a1ec5fb23	cxl/core/region: move pmem region driver logic into region_pmem.c core/region.c is overloaded with per-region control logic (pmem, dax, sysram, etc). Move the pmem region driver logic from region.c into region_pmem.c make it clear that this code only applies to pmem regions. No functional changes. [ dj: Fixed up some tabbing issues, may be from original code. ] Signed-off-by: Gregory Price <gourry@gourry.net> Co-developed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260327020203.876122-2-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-27 11:45:21 -07:00
Smita Koralahalli	8e65f99b52	cxl/region: Add helper to check Soft Reserved containment by CXL regions Add a helper to determine whether a given Soft Reserved memory range is fully contained within the committed CXL region. This helper provides a primitive for policy decisions in subsequent patches such as co-ordination with dax_hmem to determine whether CXL has fully claimed ownership of Soft Reserved memory ranges. Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260322195343.206900-8-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-27 10:22:19 -07:00
Dave Jiang	7974835aa9	cxl: Add endpoint decoder flags clear when PCI reset happens When a PCI reset happens, the lock and enable flags of the CXL device should be cleared to avoid stale state flags after reset. Add flag clearing during cxl_reset_done() to clear the relevant endpoint decoder flags for all decoders of the endpoint device. Reported-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260319152541.2739343-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-20 08:38:45 -07:00
Cui Chao	be5c5280cf	cxl: Adjust the startup priority of cxl_pmem to be higher than that of cxl_acpi During the cxl_acpi probe process, it checks whether the cxl_nvb device and driver have been attached. Currently, the startup priority of the cxl_pmem driver is lower than that of the cxl_acpi driver. At this point, the cxl_nvb driver has not yet been registered on the cxl_bus, causing the attachment check to fail. This results in a failure to add the root nvdimm bridge, leading to a cxl_acpi probe failure and ultimately affecting the subsequent loading of cxl drivers. As a consequence, only one mem device object exists on the cxl_bus, while the cxl_port device objects and decoder device objects are missing. The solution is to raise the startup priority of cxl_pmem to be higher than that of cxl_acpi, ensuring that the cxl_pmem driver is registered before the aforementioned attachment check occurs. Co-developed-by: Wang Yinfeng <wangyinfeng@phytium.com.cn> Signed-off-by: Wang Yinfeng <wangyinfeng@phytium.com.cn> Signed-off-by: Cui Chao <cuichao1753@phytium.com.cn> Fixes: `e7e222ad73` ("cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.ko") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260319074535.1709250-1-cuichao1753@phytium.com.cn Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-19 15:12:40 -07:00
Davidlohr Bueso	9a6a209132	cxl/mbox: Use proper endpoint validity check upon sanitize Fuzzying CXL triggered: BUG: KASAN: null-ptr-deref in cxl_num_decoders_committed+0x3e/0x80 drivers/cxl/core/port.c:49 Read of size 4 at addr 0000000000000642 by task syz.0.97/2282 CPU: 2 UID: 0 PID: 2282 Comm: syz.0.97 Not tainted 7.0.0-rc1-gebd11be59f74-dirty #494 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 kasan_report+0xe0/0x110 mm/kasan/report.c:595 cxl_num_decoders_committed+0x3e/0x80 drivers/cxl/core/port.c:49 cxl_mem_sanitize+0x141/0x170 drivers/cxl/core/mbox.c:1304 security_sanitize_store+0xb0/0x120 drivers/cxl/core/memdev.c:173 dev_attr_store+0x46/0x70 drivers/base/core.c:2437 sysfs_kf_write+0x95/0xb0 fs/sysfs/file.c:142 kernfs_fop_write_iter+0x276/0x330 fs/kernfs/file.c:352 new_sync_write fs/read_write.c:595 [inline] vfs_write+0x5df/0xaa0 fs/read_write.c:688 ksys_write+0x103/0x1f0 fs/read_write.c:740 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x111/0x680 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f60a584ba79 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f60a42a7038 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007f60a5ab5fa0 RCX: 00007f60a584ba79 RDX: 0000000000000002 RSI: 00002000000001c0 RDI: 0000000000000003 RBP: 00007f60a58a49df R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f60a5ab6038 R14: 00007f60a5ab5fa0 R15: 00007ffe58fad8b8 </TASK> This goes away using the correct check instead of abusing cxlmd->endpoint, which is unusable (ENXIO) until the driver has probed. During that window the memdev sysfs attributes are already visible, as soon as device_add() completes. Fixes: `29317f8dc6` ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation") Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20260301221739.1726722-1-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-18 08:49:29 -07:00
Li Ming	e8069c66d0	cxl/pci: Check memdev driver binding status in cxl_reset_done() cxl_reset_done() accesses the endpoint of the corresponding CXL memdev without endpoint validity checking. By default, cxlmd->endpoint is initialized to -ENXIO, if cxl_reset_done() is triggered after the corresponding CXL memdev probing failed, this results in access to an invalid endpoint. CXL subsystem can always check CXL memdev driver binding status to confirm its endpoint validity. So adding the CXL memdev driver checking inside cxl_reset_done() to avoid accessing an invalid endpoint. Fixes: `934edcd436` ("cxl: Add post-reset warning if reset results in loss of previously committed HDM decoders") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20260314-fix_access_endpoint_without_drv_check-v2-4-4c09edf2e1db@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-17 07:54:15 -07:00
Li Ming	dc372e5f42	cxl/pci: Hold memdev lock in cxl_event_trace_record() cxl_event_config() invokes cxl_mem_get_event_record() to get remain event logs from CXL device during cxl_pci_probe(). If CXL memdev probing failed before that, it is possible to access an invalid endpoint. So adding a cxlmd->driver binding status checking inside cxl_dpa_to_region() to ensure the corresponding endpoint is valid. Besides, cxl_event_trace_record() needs to hold memdev lock to invoke cxl_dpa_to_region() to ensure the memdev probing completed. It is possible that cxl_event_trace_record() is invoked during the CXL memdev probing, especially user or cxl_acpi triggers CXL memdev re-probing. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20260314-fix_access_endpoint_without_drv_check-v2-3-4c09edf2e1db@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-17 07:54:15 -07:00
Smita Koralahalli	75cea0776d	cxl/hdm: Avoid incorrect DVSEC fallback when HDM decoders are enabled Check the global CXL_HDM_DECODER_ENABLE bit instead of looping over per-decoder COMMITTED bits to determine whether to fall back to DVSEC range emulation. When the HDM decoder capability is globally enabled, ignore DVSEC range registers regardless of individual decoder commit state. should_emulate_decoders() currently loops over per-decoder COMMITTED bits, which leads to an incorrect DVSEC fallback when those bits are zero. One way to trigger this is to destroy a region and bounce the memdev: cxl disable-region region0 cxl destroy-region region0 cxl disable-memdev mem0 cxl enable-memdev mem0 Region teardown zeroes the HDM decoder registers including the committed bits. The subsequent memdev re-probe finds uncommitted decoders and falls back to DVSEC emulation, even though HDM remains globally enabled. Observed failures: should_emulate_decoders: cxl_port endpoint6: decoder6.0: committed: 0 base: 0x0_00000000 size: 0x0_00000000 devm_cxl_setup_hdm: cxl_port endpoint6: Fallback map 1 range register .. devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region0 __construct_region: cxl_pci 0000:e1:00.0: mem1:decoder6.0: __construct_region region0 res: [mem 0x850000000-0x284fffffff flags 0x200] iw: 1 ig: 4096 cxl region0: pci0000:e0:port1 cxl_port_setup_targets expected iw: 1 ig: 4096 .. cxl region0: pci0000:e0:port1 cxl_port_setup_targets got iw: 1 ig: 256 state: disabled .. cxl_port endpoint6: failed to attach decoder6.0 to region0: -6 .. devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region4 alloc_hpa: cxl region4: HPA allocation error (-34) .. Fixes: `52cc48ad2a` ("cxl/hdm: Limit emulation to the number of range registers") Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260316201950.224567-1-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:58:32 -07:00
Alejandro Lucero	64584273df	cxl/region: Factor out interleave granularity setup Region creation based on Type3 devices can be triggered from user space allowing memory combination through interleaving. In preparation for kernel driven region creation, that is Type2 drivers triggering region creation backed with its advertised CXL memory, factor out a common helper from the user-sysfs region setup for interleave granularity. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260228173603.1125109-4-alejandro.lucero-palau@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:32:21 -07:00
Alejandro Lucero	29f0724c45	cxl/region: Factor out interleave ways setup Region creation based on Type3 devices can be triggered from user space allowing memory combination through interleaving. In preparation for kernel driven region creation, that is Type2 drivers triggering region creation backed with its advertised CXL memory, factor out a common helper from the user-sysfs region setup for interleave ways. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Tested-by: Gregory Price <gourry@gourry.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260228173603.1125109-3-alejandro.lucero-palau@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:32:05 -07:00
Alejandro Lucero	09d065d256	cxl: Make region type based on endpoint type Current code is expecting Type3 or CXL_DECODER_HOSTONLYMEM devices only. Support for Type2 implies region type needs to be based on the endpoint type HDM-D[B] instead. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Zhi Wang <zhiw@nvidia.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Davidlohr Bueso <daves@stgolabs.net> Reviewed-by: Gregory Price <gourry@gourry.net> Tested-by: Gregory Price <gourry@gourry.net> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://patch.msgid.link/20260228173603.1125109-2-alejandro.lucero-palau@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:31:56 -07:00
Gregory Price	d537d953c4	cxl/pci: Remove redundant cxl_pci_find_port() call Remove the redundant port lookup from cxl_rcrb_get_comp_regs() and use the dport parameter directly. The caller has already validated the port is non-NULL before invoking this function, and dport is given as a param. This is simpler than getting dport in the callee and return the pointer to the caller what would require more changes. Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://patch.msgid.link/20260306164741.3796372-5-alejandro.lucero-palau@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:31:39 -07:00
Alejandro Lucero	58f28930c7	cxl: Move pci generic code from cxl_pci to core/cxl_pci Inside cxl/core/pci.c there are helpers for CXL PCIe initialization meanwhile cxl/pci_drv.c implements the functionality for a Type3 device initialization. In preparation for type2 support, move helper functions from cxl/pci.c to cxl/core/pci.c in order to be exported and used by type2 drivers. [ dj: Clarified subject. ] Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20260306164741.3796372-4-alejandro.lucero-palau@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:30:22 -07:00
Alejandro Lucero	005869886d	cxl: export internal structs for external Type2 drivers In preparation for type2 support, move structs and functions a type2 driver will need to access to into a new shared header file. Differentiate between public and private data to be preserved by type2 drivers. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260306164741.3796372-3-alejandro.lucero-palau@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:24:29 -07:00
Alejandro Lucero	9a775c07bb	cxl: support Type2 when initializing cxl_dev_state In preparation for type2 drivers add function and macro for differentiating CXL memory expanders (type 3) from CXL device accelerators (type 2) helping drivers built from public headers to embed struct cxl_dev_state inside a private struct. Update type3 driver for using this same initialization. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260306164741.3796372-2-alejandro.lucero-palau@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:24:29 -07:00
Keith Busch	93d0fcdddc	cxl/acpi: Fix CXL_ACPI and CXL_PMEM Kconfig tristate mismatch Commit `e7e222ad73` ("cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.ko") moves devm_cxl_add_nvdimm_bridge() into the cxl_pmem file, which has independent config compile options for built-in or module. The call from cxl_acpi_probe() is guarded by IS_ENABLED(CONFIG_CXL_PMEM), which evaluates to true for both =y and =m. When CONFIG_CXL_PMEM=m, a built-in cxl_acpi attempts to reference a symbol exported by a module, which fails to link. CXL_PMEM cannot simply be promoted to =y in this configuration because it depends on LIBNVDIMM, which may itself be =m. Add a Kconfig dependency to prevent CXL_ACPI from being built-in when CXL_PMEM is a module. This contrains CXL_ACPI to =m when CXL_PMEM=m, while still allowing CXL_ACPI to be freely configured when CXL_PMEM is either built-in or disabled. [ dj: Fix up commit reference formatting. ] Fixes: `e7e222ad73` ("cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.ko") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260305204057.1516948-1-kbusch@meta.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-06 08:08:53 -07:00
Davidlohr Bueso	77b310bb7b	cxl/region: Fix leakage in __construct_region() Failing the first sysfs_update_group() needs to explicitly kfree the resource as it is too early for cxl_region_iomem_release() to do so. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Fixes: `d6602e2581` (cxl/region: Add support to indicate region has extended linear cache) Link: https://patch.msgid.link/20260202191330.245608-1-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-04 10:26:39 -07:00
Alison Schofield	19d2f0b97a	cxl/port: Fix use after free of parent_port in cxl_detach_ep() cxl_detach_ep() is called during bottom-up removal when all CXL memory devices beneath a switch port have been removed. For each port in the hierarchy it locks both the port and its parent, removes the endpoint, and if the port is now empty, marks it dead and unregisters the port by calling delete_switch_port(). There are two places during this work where the parent_port may be used after freeing: First, a concurrent detach may have already processed a port by the time a second worker finds it via bus_find_device(). Without pinning parent_port, it may already be freed when we discover port->dead and attempt to unlock the parent_port. In a production kernel that's a silent memory corruption, with lock debug, it looks like this: []DEBUG_LOCKS_WARN_ON(__owner_task(owner) != get_current()) []WARNING: kernel/locking/mutex.c:949 at __mutex_unlock_slowpath+0x1ee/0x310 []Call Trace: []mutex_unlock+0xd/0x20 []cxl_detach_ep+0x180/0x400 [cxl_core] []devm_action_release+0x10/0x20 []devres_release_all+0xa8/0xe0 []device_unbind_cleanup+0xd/0xa0 []really_probe+0x1a6/0x3e0 Second, delete_switch_port() releases three devm actions registered against parent_port. The last of those is unregister_port() and it calls device_unregister() on the child port, which can cascade. If parent_port is now also empty the device core may unregister and free it too. So by the time delete_switch_port() returns, parent_port may be free, and the subsequent device_unlock(&parent_port->dev) operates on freed memory. The kernel log looks same as above, with a different offset in cxl_detach_ep(). Both of these issues stem from the absence of a lifetime guarantee between a child port and its parent port. Establish a lifetime rule for ports: child ports hold a reference to their parent device until release. Take the reference when the port is allocated and drop it when released. This ensures the parent is valid for the full lifetime of the child and eliminates the use after free window in cxl_detach_ep(). This is easily reproduced with a reload of cxl_acpi in QEMU with CXL devices present. Fixes: `2345df5424` ("cxl/memdev: Fix endpoint port removal") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260226184439.1732841-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-03 10:20:19 -07:00
Alison Schofield	e46f25f5a8	cxl/region: Test CXL_DECODER_F_NORMALIZED_ADDRESSING as a bitmask The CXL decoder flags are defined as bitmasks, not bit indices. Using test_bit() to check them interprets the mask value as a bit index, which is the wrong test. For CXL_DECODER_F_NORMALIZED_ADDRESSING the test reads beyond the flags word, making the flag sometimes appear set and blocking creation of CXL region debugfs attributes that support poison operations. Replace test_bit() with a bitmask check. Found with cxl-test. Fixes: `208f432406` ("cxl: Disable HPA/SPA translation handlers for Normalized Addressing") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Tested-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/63fe4a6203e40e404347f1cdc7a1c55cb4959b86.1771873256.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-24 08:33:30 -07:00
Alison Schofield	0a70b7cd39	cxl: Test CXL_DECODER_F_LOCK as a bitmask The CXL decoder flags are defined as bitmasks, not bit indices. Using test_bit() to check them interprets the mask value as a bit index, which is the wrong test. For CXL_DECODER_F_LOCK the test reads beyond the defined bits, causing the test to always return false and allowing resets that should have been blocked. Replace test_bit() with a bitmask check. Fixes: `2230c4bdc4` ("cxl: Add handling of locked CXL decoder") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Tested-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/98851c4770e4631753cf9f75b58a3a6daeca2ea2.1771873256.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-24 08:33:30 -07:00
Davidlohr Bueso	60b5d1f683	cxl/mbox: validate payload size before accessing contents in cxl_payload_from_user_allowed() cxl_payload_from_user_allowed() casts and dereferences the input payload without first verifying its size. When a raw mailbox command is sent with an undersized payload (ie: 1 byte for CXL_MBOX_OP_CLEAR_LOG, which expects a 16-byte UUID), uuid_equal() reads past the allocated buffer, triggering a KASAN splat: BUG: KASAN: slab-out-of-bounds in memcmp+0x176/0x1d0 lib/string.c:683 Read of size 8 at addr ffff88810130f5c0 by task syz.1.62/2258 CPU: 2 UID: 0 PID: 2258 Comm: syz.1.62 Not tainted 6.19.0-dirty #3 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xab/0xe0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xce/0x650 mm/kasan/report.c:482 kasan_report+0xce/0x100 mm/kasan/report.c:595 memcmp+0x176/0x1d0 lib/string.c:683 uuid_equal include/linux/uuid.h:73 [inline] cxl_payload_from_user_allowed drivers/cxl/core/mbox.c:345 [inline] cxl_mbox_cmd_ctor drivers/cxl/core/mbox.c:368 [inline] cxl_validate_cmd_from_user drivers/cxl/core/mbox.c:522 [inline] cxl_send_cmd+0x9c0/0xb50 drivers/cxl/core/mbox.c:643 __cxl_memdev_ioctl drivers/cxl/core/memdev.c:698 [inline] cxl_memdev_ioctl+0x14f/0x190 drivers/cxl/core/memdev.c:713 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xa8/0x330 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fdaf331ba79 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fdaf1d77038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fdaf3585fa0 RCX: 00007fdaf331ba79 RDX: 00002000000001c0 RSI: 00000000c030ce02 RDI: 0000000000000003 RBP: 00007fdaf33749df R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fdaf3586038 R14: 00007fdaf3585fa0 R15: 00007ffced2af768 </TASK> Add 'in_size' parameter to cxl_payload_from_user_allowed() and validate the payload is large enough. Fixes: `6179045ccc` ("cxl/mbox: Block immediate mode in SET_PARTITION_INFO command") Fixes: `206f9fa9d5` ("cxl/mbox: Add Clear Log mailbox command") Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260220001618.963490-2-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-24 08:33:29 -07:00
Dave Jiang	96a1fd0d84	cxl: Fix race of nvdimm_bus object when creating nvdimm objects Found issue during running of cxl-translate.sh unit test. Adding a 3s sleep right before the test seems to make the issue reproduce fairly consistently. The cxl_translate module has dependency on cxl_acpi and causes orphaned nvdimm objects to reprobe after cxl_acpi is removed. The nvdimm_bus object is registered by the cxl_nvb object when cxl_acpi_probe() is called. With the nvdimm_bus object missing, __nd_device_register() will trigger NULL pointer dereference when accessing the dev->parent that points to &nvdimm_bus->dev. [ 192.884510] BUG: kernel NULL pointer dereference, address: 000000000000006c [ 192.895383] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20250812-19.fc42 08/12/2025 [ 192.897721] Workqueue: cxl_port cxl_bus_rescan_queue [cxl_core] [ 192.899459] RIP: 0010:kobject_get+0xc/0x90 [ 192.924871] Call Trace: [ 192.925959] <TASK> [ 192.926976] ? pm_runtime_init+0xb9/0xe0 [ 192.929712] __nd_device_register.part.0+0x4d/0xc0 [libnvdimm] [ 192.933314] __nvdimm_create+0x206/0x290 [libnvdimm] [ 192.936662] cxl_nvdimm_probe+0x119/0x1d0 [cxl_pmem] [ 192.940245] cxl_bus_probe+0x1a/0x60 [cxl_core] [ 192.943349] really_probe+0xde/0x380 This patch also relies on the previous change where devm_cxl_add_nvdimm_bridge() is called from drivers/cxl/pmem.c instead of drivers/cxl/core.c to ensure the dependency of cxl_acpi on cxl_pmem. 1. Set probe_type of cxl_nvb to PROBE_FORCE_SYNCHRONOUS to ensure the driver is probed synchronously when add_device() is called. 2. Add a check in __devm_cxl_add_nvdimm_bridge() to ensure that the cxl_nvb driver is attached during cxl_acpi_probe(). 3. Take the cxl_root uport_dev lock and the cxl_nvb->dev lock in devm_cxl_add_nvdimm() before checking nvdimm_bus is valid. 4. Set cxl_nvdimm flag to CXL_NVD_F_INVALIDATED so cxl_nvdimm_probe() will exit with -EBUSY. The removal of cxl_nvdimm devices should prevent any orphaned devices from probing once the nvdimm_bus is gone. [ dj: Fixed 0-day reported kdoc issue. ] [ dj: Fix cxl_nvb reference leak on error. Gregory (kreview-0811365) ] Suggested-by: Dan Williams <dan.j.williams@intel.com> Fixes: `8fdcb1704f` ("cxl/pmem: Add initial infrastructure for pmem support") Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com?> Link: https://patch.msgid.link/20260205001633.1813643-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-24 08:33:21 -07:00
Dave Jiang	e7e222ad73	cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.ko Moving the symbol devm_cxl_add_nvdimm_bridge() to drivers/cxl/cxl_pmem.c, so that cxl_pmem can export a symbol that gives cxl_acpi a depedency on cxl_pmem kernel module. This is a prepatory patch to resolve the issue of a race for nvdimm_bus object that is created during cxl_acpi_probe(). No functional changes besides moving code. Suggested-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com?> Link: https://patch.msgid.link/20260205001633.1813643-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-23 11:29:05 -07:00
Li Ming	0066688dbc	cxl/port: Hold port host lock during dport adding. CXL testing environment can trigger following trace Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497] RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core] Call Trace: <TASK> cxl_event_trace_record+0xd1/0xa70 [cxl_core] __cxl_event_trace_record+0x12f/0x1e0 [cxl_core] cxl_mem_get_records_log+0x261/0x500 [cxl_core] cxl_mem_get_event_records+0x7c/0xc0 [cxl_core] cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem] platform_probe+0x9d/0x130 really_probe+0x1c8/0x960 __driver_probe_device+0x187/0x3e0 driver_probe_device+0x45/0x120 __device_attach_driver+0x15d/0x280 When CXL subsystem adds a cxl port to a hierarchy, there is a small window where the new port becomes visible before it is bound to a driver. This happens because device_add() adds a device to bus device list before bus_probe_device() binds it to a driver. So if two cxl memdevs are trying to add a dport to a same port via devm_cxl_enumerate_ports(), the second cxl memdev may observe the port and attempt to add a dport, but fails because the port has not yet been attached to cxl port driver. That causes the memdev->endpoint can not be updated. The sequence is like: CPU 0 CPU 1 devm_cxl_enumerate_ports() # port not found, add it add_port_attach_ep() # hold the parent port lock # to add the new port devm_cxl_create_port() device_add() # Add dev to bus devs list bus_add_device() devm_cxl_enumerate_ports() # found the port find_cxl_port_by_uport() # hold port lock to add a dport device_lock(the port) find_or_add_dport() cxl_port_add_dport() return -ENXIO because port->dev.driver is NULL device_unlock(the port) bus_probe_device() # hold the port lock # for attaching device_lock(the port) attaching the new port device_unlock(the port) To fix this race, require that dport addition holds the host lock of the target port(the host of CXL root and all cxl host bridge ports is the platform firmware device, the host of all other ports is their parent port). The CXL subsystem already requires holding the host lock while attaching a new port. Therefore, successfully acquiring the host lock guarantees that port attaching has completed. Fixes: `4f06d81e7c` ("cxl: Defer dport allocation for switch ports") Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260210-fix-port-enumeration-failure-v3-2-06acce0b9ead@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-23 09:31:22 -07:00
Li Ming	822655e675	cxl/port: Introduce port_to_host() helper In CXL subsystem, a port has its own host device for the port creation and removal. The host of CXL root and all the first level ports is the platform firmware device, the host of other ports is their parent port's device. Create this new helper to much easier to get the host of a cxl port. Introduce port_to_host() and use it to replace all places where using open coded to get the host of a port. Remove endpoint_host() as its functionality can be replaced by port_to_host(). [dj: Squashed commit 1 and 3 in the series to commit 1. ] Signed-off-by: Li Ming <ming.li@zohomail.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260210-fix-port-enumeration-failure-v3-1-06acce0b9ead@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-23 09:31:07 -07:00
Gregory Price	318c58852e	cxl/memdev: fix deadlock in cxl_memdev_autoremove() on attach failure cxl_memdev_autoremove() takes device_lock(&cxlmd->dev) via guard(device) and then calls cxl_memdev_unregister() when the attach callback was provided but cxl_mem_probe() failed to bind. cxl_memdev_unregister() calls cdev_device_del() device_del() bus_remove_device() device_release_driver() This path is reached when a driver uses the @attach parameter to devm_cxl_add_memdev() and the CXL topology fails to enumerate (e.g. DVSEC range registers decode outside platform-defined CXL ranges, causing the endpoint port probe to fail). Add cxl_memdev_attach_failed() to set the scope of the check correctly. Reported-by: kreview-c94b85d6d2 Fixes: `29317f8dc6` ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation") Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://patch.msgid.link/20260211192228.2148713-1-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-23 09:03:44 -07:00
Linus Torvalds	323bbfcf1e	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/\(alloc_objs(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Linus Torvalds	e812928be2	Merge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL updates from Dave Jiang: - Introduce cxl_memdev_attach and pave way for soft reserved handling, type2 accelerator enabling, and LSA 2.0 enabling. All these series require the endpoint driver to settle before continuing the memdev driver probe. - Address CXL port error protocol handling and reporting. The large patch series was split into three parts. The first two parts are included here with the final part coming later. The first part consists of a series of code refactoring to PCI AER sub-system that addresses CXL and also CXL RAS code to prepare for port error handling. The second part refactors the CXL code to move management of component registers to cxl_port objects to allow all CXL AER errors to be handled through the cxl_port hierarchy. - Provide AMD Zen5 platform address translation for CXL using ACPI PRMT. This includes a conventions document to explain why this is needed and how it's implemented. - Misc CXL patches of fixes, cleanups, and updates. Including CXL address translation for unaligned MOD3 regions. [ TLA service: CXL is "Compute Express Link" ] * tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits) cxl: Disable HPA/SPA translation handlers for Normalized Addressing cxl/region: Factor out code into cxl_region_setup_poison() cxl/atl: Lock decoders that need address translation cxl: Enable AMD Zen5 address translation using ACPI PRMT cxl/acpi: Prepare use of EFI runtime services cxl: Introduce callback for HPA address ranges translation cxl/region: Use region data to get the root decoder cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() cxl/region: Separate region parameter setup and region construction cxl: Simplify cxl_root_ops allocation and handling cxl/region: Store HPA range in struct cxl_region cxl/region: Store root decoder in struct cxl_region cxl/region: Rename misleading variable name @hpa to @hpa_range Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement cxl, doc: Moving conventions in separate files cxl, doc: Remove isonum.txt inclusion cxl/port: Unify endpoint and switch port lookup cxl/port: Move endpoint component register management to cxl_port cxl/port: Map Port RAS registers cxl/port: Move dport RAS setup to dport add time ...	2026-02-12 16:33:05 -08:00
Rafael J. Wysocki	dfa5dc3ad3	Merge branch 'acpi-apei' Merge ACPI APEI support updates for 6.20-rc1/7.0-rc1: - Make read-only array non_mmio_desc[] static const (Colin Ian King) - Prevent the APEI GHES support code on ARM from accessing memory out of bounds or going past the ARM processor CPER record buffer (Mauro Carvalho Chehab) - Prevent cper_print_fw_err() from dumping the entire memory on systems with defective firmware (Mauro Carvalho Chehab) - Improve ghes_notify_nmi() status check to avoid unnecessary overhead in the NMI handler by carrying out all of the requisite preparations and the NMI registration time (Tony Luck) - Refactor the GHES driver by extracting common functionality into reusable helper functions to reduce code duplication and improve the ghes_notify_sea() status check in analogy with the previous ghes_notify_nmi() status check improvement (Shuai Xue) - Make ELOG and GHES log and trace consistently and support the CPER CXL protocol analogously (Fabio De Francesco) - Disable KASAN instrumentation in the APEI GHES driver when compile testing with clang < 18 (Nathan Chancellor) - Let ghes_edac be the preferred driver to load on __ZX__ and _BYO_ systems by extending the platform detection list in the APEI GHES driver (Tony W Wang-oc) * acpi-apei: ACPI: APEI: GHES: Add ghes_edac support for __ZX__ and _BYO_ systems ACPI: APEI: GHES: Disable KASAN instrumentation when compile testing with clang < 18 ACPI: extlog: Trace CPER CXL Protocol Error Section ACPI: APEI: GHES: Add helper to copy CPER CXL protocol error info to work struct ACPI: APEI: GHES: Add helper for CPER CXL protocol errors checks ACPI: extlog: Trace CPER PCI Express Error Section ACPI: extlog: Trace CPER Non-standard Section Body ACPI: APEI: GHES: Improve ghes_notify_sea() status check ACPI: APEI: GHES: Extract helper functions for error status handling ACPI: APEI: GHES: Improve ghes_notify_nmi() status check EFI/CPER: don't dump the entire memory region APEI/GHES: ensure that won't go past CPER allocated record EFI/CPER: don't go past the ARM processor CPER record buffer APEI/GHES: ARM processor Error: don't go past allocated memory ACPI: APEI: EINJ: make read-only array non_mmio_desc static const	2026-02-05 15:17:54 +01:00
Dave Jiang	63fbf275fa	Merge branch 'for-7.0/cxl-prm-translation' into cxl-for-next Add support for normalized CXL address translation through ACPI PRM method to support AMD Zen5 platforms. Including a conventions doc that explains how the translation is implemented and for future implementations that need such setup to comply with the current implementation method. cxl: Disable HPA/SPA translation handlers for Normalized Addressing cxl/region: Factor out code into cxl_region_setup_poison() cxl/atl: Lock decoders that need address translation cxl: Enable AMD Zen5 address translation using ACPI PRMT cxl/acpi: Prepare use of EFI runtime services cxl: Introduce callback for HPA address ranges translation cxl/region: Use region data to get the root decoder cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() cxl/region: Separate region parameter setup and region construction cxl: Simplify cxl_root_ops allocation and handling cxl/region: Store HPA range in struct cxl_region cxl/region: Store root decoder in struct cxl_region cxl/region: Rename misleading variable name @hpa to @hpa_range Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement cxl, doc: Moving conventions in separate files cxl, doc: Remove isonum.txt inclusion	2026-02-04 10:53:33 -07:00
Robert Richter	208f432406	cxl: Disable HPA/SPA translation handlers for Normalized Addressing The root decoder provides the callbacks hpa_to_spa and spa_to_hpa to perform Host Physical Address (HPA) and System Physical Address translations, respectively. The callbacks are required to convert addresses when HPA != SPA. XOR interleaving depends on this mechanism, and the necessary handlers are implemented. The translation handlers are used for poison injection (trace_cxl_poison, cxl_poison_inject_fops) and error handling (cxl_event_trace_record). In AMD Zen5 systems with Normalized Addressing, endpoint addresses are not SPAs, and translation handlers are required for these features to function correctly. Now, as ACPI PRM translation could be expensive in tracing or error handling code paths, do not yet enable translations to avoid its intensive use. Instead, disable those features which are used only for debugging and enhanced logging. Introduce the flag CXL_REGION_F_NORMALIZED_ADDRESSING that indicates Normalized Addressing for a region and use it to disable poison injection and DPA to HPA conversion. Note: Dropped unused CXL_DECODER_F_MASK macro. [dj: Fix commit log CXL_REGION_F_NORM_ADDR to CXL_REGION_F_NORMALIZED_ADDRESSING ] Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260114164837.1076338-14-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-04 09:17:31 -07:00
Robert Richter	d1c9ba46d6	cxl/region: Factor out code into cxl_region_setup_poison() Poison injection setup code is embedded in cxl_region_probe(). For improved encapsulation, readability, and maintainability, factor out code into function cxl_region_setup_poison(). This patch is a prerequisite to disable poison by region offset for Normalized Addressing. No functional changes. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260114164837.1076338-13-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-04 09:17:31 -07:00
Robert Richter	a2e7948950	cxl/atl: Lock decoders that need address translation The current kernel implementation does not support endpoint setup with Normalized Addressing. It only translates an endpoint's DPA to the SPA range of the host bridge. Therefore, the endpoint address range cannot be determined, making a non-auto setup impossible. If a decoder requires address translation, reprogramming should be disabled and the decoder locked. The BIOS, however, provides all the necessary address translation data, which the kernel can use to reconfigure endpoint decoders with normalized addresses. Locking the decoders in the BIOS would prevent a capable kernel (or other operating systems) from shutting down auto-generated regions and managing resources dynamically. Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>> --- Link: https://patch.msgid.link/20260114164837.1076338-12-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-04 09:17:31 -07:00
Robert Richter	af74daf916	cxl: Enable AMD Zen5 address translation using ACPI PRMT Add AMD Zen5 support for address translation. Zen5 systems may be configured to use 'Normalized addresses'. Then, host physical addresses (HPA) are different from their system physical addresses (SPA). The endpoint has its own physical address space and an incoming HPA is already converted to the device's physical address (DPA). Thus it has interleaving disabled and CXL endpoints are programmed passthrough (DPA == HPA). Host Physical Addresses (HPAs) need to be translated from the endpoint to its CXL host bridge, esp. to identify the endpoint's root decoder and region's address range. ACPI Platform Runtime Mechanism (PRM) provides a handler to translate the DPA to its SPA. This is documented in: AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh ACPI v6.5 Porting Guide, Publication # 58088 https://www.amd.com/en/search/documentation/hub.html With Normalized Addressing this PRM handler must be used to translate an HPA of an endpoint to its SPA. Do the following to implement AMD Zen5 address translation: Introduce a new file core/atl.c to handle ACPI PRM specific address translation code. Naming is loosely related to the kernel's AMD Address Translation Library (CONFIG_AMD_ATL) but implementation does not depend on it, nor it is vendor specific. Use Kbuild and Kconfig options respectively to enable the code depending on architecture and platform options. AMD Zen5 systems support the ACPI PRM CXL Address Translation firmware call (see ACPI v6.5 Porting Guide, Address Translation - CXL DPA to System Physical Address). Firmware enables the PRM handler if the platform has address translation implemented. Check firmware and kernel support of ACPI PRM using the specific GUID. On success enable address translation by setting up the earlier introduced root port callback, see function cxl_prm_setup_translation(). Setup is done in cxl_setup_prm_address_translation(), it is the only function that needs to be exported. For low level PRM firmware calls, use the ACPI framework. Identify the region's interleaving ways by inspecting the address ranges. Also determine the interleaving granularity using the address translation callback. Note that the position of the chunk from one interleaving block to the next may vary and thus cannot be considered constant. Address offsets larger than the interleaving block size cannot be used to calculate the granularity. Thus, probe the granularity using address translation for various HPAs in the same interleaving block. [ dj: Add atl.o build to cxl_test ] Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-11-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-04 09:17:00 -07:00

1 2 3 4 5 ...

999 Commits