linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-18 06:44:00 -04:00

Author	SHA1	Message	Date
Dave Jiang	3939dba00f	Merge branch 'for-7.1/cxl-misc' into cxl-for-next cxl/hdm: Add support for 32 switch decoders	2026-04-10 11:11:53 -07:00
Li Ming	3624a22783	cxl/hdm: Add support for 32 switch decoders Per CXL r4.0 section 8.2.4.20.1. CXL host bridge and switch ports can support 32 HDM decoders. Current implementation misses some decoders on CXL host bridge and switch in the case that the value of Decoder Count field in CXL HDM decoder Capability Register is greater than or equal to 9. Update calculation implementation to ensure the decoder count calculation is correct for CXL host bridge/switch ports. Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260321061459.1910205-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-04-10 08:32:14 -07:00
Dave Jiang	7aacc62557	Merge branch 'for-7.1/cxl-type2-support' into cxl-for-next Prep patches for CXL type2 accelerator basic support cxl/region: Factor out interleave granularity setup cxl/region: Factor out interleave ways setup cxl: Make region type based on endpoint type cxl/pci: Remove redundant cxl_pci_find_port() call cxl: Move pci generic code from cxl_pci to core/cxl_pci cxl: export internal structs for external Type2 drivers cxl: support Type2 when initializing cxl_dev_state	2026-04-03 12:18:23 -07:00
Li Ming	dc372e5f42	cxl/pci: Hold memdev lock in cxl_event_trace_record() cxl_event_config() invokes cxl_mem_get_event_record() to get remain event logs from CXL device during cxl_pci_probe(). If CXL memdev probing failed before that, it is possible to access an invalid endpoint. So adding a cxlmd->driver binding status checking inside cxl_dpa_to_region() to ensure the corresponding endpoint is valid. Besides, cxl_event_trace_record() needs to hold memdev lock to invoke cxl_dpa_to_region() to ensure the memdev probing completed. It is possible that cxl_event_trace_record() is invoked during the CXL memdev probing, especially user or cxl_acpi triggers CXL memdev re-probing. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20260314-fix_access_endpoint_without_drv_check-v2-3-4c09edf2e1db@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-17 07:54:15 -07:00
Alejandro Lucero	005869886d	cxl: export internal structs for external Type2 drivers In preparation for type2 support, move structs and functions a type2 driver will need to access to into a new shared header file. Differentiate between public and private data to be preserved by type2 drivers. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260306164741.3796372-3-alejandro.lucero-palau@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:24:29 -07:00
Alejandro Lucero	9a775c07bb	cxl: support Type2 when initializing cxl_dev_state In preparation for type2 drivers add function and macro for differentiating CXL memory expanders (type 3) from CXL device accelerators (type 2) helping drivers built from public headers to embed struct cxl_dev_state inside a private struct. Update type3 driver for using this same initialization. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260306164741.3796372-2-alejandro.lucero-palau@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-03-16 16:24:29 -07:00
Dave Jiang	0da3050bdd	Merge branch 'for-7.0/cxl-aer-prep' into cxl-for-next Fixup and refactor downstream port enumeration to prepare for CXL port protocol error handling. Main motivation is to move endpoint component register mapping to a port object. cxl/port: Unify endpoint and switch port lookup cxl/port: Move endpoint component register management to cxl_port cxl/port: Map Port RAS registers cxl/port: Move dport RAS setup to dport add time cxl/port: Move dport probe operations to a driver event cxl/port: Move decoder setup before dport creation cxl/port: Cleanup dport removal with a devres group cxl/port: Reduce number of @dport variables in cxl_port_add_dport() cxl/port: Cleanup handling of the nr_dports 0 -> 1 transition	2026-02-02 09:39:41 -07:00
Dan Williams	dab7162d0a	cxl/port: Move endpoint component register management to cxl_port In preparation for generic protocol error handling across CXL endpoints, whether they be memory expander class devices or accelerators, drop the endpoint component management from cxl_dev_state. Organize all CXL port component management through the common cxl_port driver. Note that the end game is that drivers/cxl/core/ras.c loses all dependencies on a 'struct cxl_dev_state' parameter and operates only on port resources. The removal of component register mapping from cxl_pci is an incremental step towards that. Reviewed-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260131000403.2135324-9-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-02-02 08:46:17 -07:00
Dan Williams	29317f8dc6	cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation Unlike the cxl_pci class driver that opportunistically enables memory expansion with no other dependent functionality, CXL accelerator drivers have distinct PCIe-only and CXL-enhanced operation states. If CXL is available some additional coherent memory/cache operations can be enabled, otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback. This constitutes a new mode of operation where the caller of devm_cxl_add_memdev() wants to make a "go/no-go" decision about running in CXL accelerated mode or falling back to PCIe-only operation. Part of that decision making process likely also includes additional CXL-acceleration-specific resource setup. Encapsulate both of those requirements into 'struct cxl_memdev_attach' that provides a ->probe() callback. The probe callback runs in cxl_mem_probe() context, after the port topology is successfully attached for the given memdev. It supports a contract where, upon successful return from devm_cxl_add_memdev(), everything needed for CXL accelerated operation has been enabled. Additionally the presence of @cxlmd->attach indicates that the accelerator driver be detached when CXL operation ends. This conceptually makes a CXL link loss event mirror a PCIe link loss event which results in triggering the ->remove() callback of affected devices+drivers. A driver can re-attach to recover back to PCIe-only operation. Live recovery, i.e. without a ->remove()/->probe() cycle, is left as a future consideration. [ dj: Repalce with updated commit log from Dan ] Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251216005616.3090129-7-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-01-05 10:58:04 -07:00
Dan Williams	f2546eba53	cxl/mem: Drop @host argument to devm_cxl_add_memdev() In all cases the device that created the 'struct cxl_dev_state' instance is also the device to host the devm cleanup of devm_cxl_add_memdev(). This simplifies the function prototype, and limits a degree of freedom of the API. Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/20251216005616.3090129-6-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-01-05 10:14:53 -07:00
Dan Williams	1f1cb7f0c2	cxl/mem: Arrange for always-synchronous memdev attach In preparation for CXL accelerator drivers that have a hard dependency on CXL capability initialization, arrange for cxl_mem_probe() to always run synchronous with the device_add() of cxl_memdev instances. I.e. cxl_mem_driver registration is always complete before the first memdev creation event. At present, cxl_pci does not care about the attach state of the cxl_memdev because all generic memory expansion functionality can be handled by the cxl_core. For accelerators, however, that driver needs to perform driver specific initialization if CXL is available, or execute a fallback to PCIe only operation. This synchronous attach guarantee is also needed for Soft Reserve Recovery, which is an effort that needs to assert that devices have had a chance to attach before making a go / no-go decision on proceeding with CXL subsystem initialization. By moving devm_cxl_add_memdev() to cxl_mem.ko it removes async module loading as one reason that a memdev may not be attached upon return from devm_cxl_add_memdev(). Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Cc: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/20251216005616.3090129-3-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-01-05 10:13:53 -07:00
Dan Williams	10016118b6	cxl/mem: Fix devm_cxl_memdev_edac_release() confusion A device release method is only for undoing allocations on the path to preparing the device for device_add(). In contrast, devm allocations are post device_add(), are acquired during / after ->probe() and are released synchronous with ->remove(). So, a "devm" helper in a "release" method is a clear anti-pattern. Move this devm release action where it belongs, an action created at edac object creation time. Otherwise, this leaks resources until cxl_memdev_release() time which may be long after these xarray and error record caches have gone idle. Note, this also fixes up the type of @cxlmd->err_rec_array which needlessly dropped type-safety. Fixes: `0b5ccb0de1` ("cxl/edac: Support for finding memory operation attributes from the current boot") Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Shiju Jose <shiju.jose@huawei.com> Cc: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Shiju Jose <shiju.jose@huawei.com> Reviewed-by: Shiju Jose <shiju.jose@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/20251216005616.3090129-2-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-01-05 10:13:33 -07:00
Alison Schofield	25a0207828	cxl/core: Add locked variants of the poison inject and clear funcs The core functions that validate and send inject and clear commands to the memdev devices require holding both the dpa_rwsem and the region_rwsem. In preparation for another caller of these functions that must hold the locks upon entry, split the work into a locked and unlocked pair. Consideration was given to moving the locking to both callers, however, the existing caller is not in the core (mem.c) and cannot access the locks. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/1d601f586975195733984ca63d1b5789bbe8690f.1754290144.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-08-12 16:02:00 -07:00
Dave Jiang	3a32c5b3bb	Merge branch 'for-6.17/cxl-events-updates' into cxl-for-next Update Common Event Record to CXL r3.2 definition. Add additional validity check for event records. Add memory sparing event record tracing.	2025-07-18 16:15:44 -07:00
Shiju Jose	f10f46a0ee	cxl/events: Trace Memory Sparing Event Record CXL rev 3.2 section 8.2.10.2.1.4 Table 8-60 defines the Memory Sparing Event Record. Determine if the event read is memory sparing record and if so trace the record. Memory device shall produce a memory sparing event record 1. After completion of a PPR maintenance operation if the memory sparing event record enable bit is set (Field: sPPR/hPPR Operation Mode in Table 8-128/Table 8-131). 2. In response to a query request by the host (see section 8.2.10.7.1.4) to determine the availability of sparing resources. The device shall report the resource availability by producing the Memory Sparing Event Record (see Table 8-60) in which the channel, rank, nibble mask, bank group, bank, row, column, sub-channel fields are a copy of the values specified in the request. If the controller does not support reporting whether a resource is available, and a perform maintenance operation for memory sparing is issued with query resources set to 1, the controller shall return invalid input. Example trace log for produce memory sparing event record on completion of a soft PPR operation, cxl_memory_sparing: memdev=mem1 host=0000:0f:00.0 serial=3 log=Informational : time=55045163029 uuid=e71f3a40-2d29-4092-8a39-4d1c966c7c65 len=128 flags='0x1' handle=1 related_handle=0 maint_op_class=2 maint_op_sub_class=1 ld_id=0 head_id=0 : flags='' result=0 validity_flags='CHANNEL\|RANK\|NIBBLE\|BANK GROUP\|BANK\|ROW\|COLUMN' spare resource avail=1 channel=2 rank=5 nibble_mask=a59c bank_group=2 bank=4 row=13 column=23 sub_channel=0 comp_id=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 comp_id_pldm_valid_flags='' pldm_entity_id=0x00 pldm_resource_id=0x00 Note: For memory sparing event record, fields 'maintenance operation class' and 'maintenance operation subclass' are defined twice, first in the common event record (Table 8-55) and second in the memory sparing event record (Table 8-60). Thus those in the sparing event record coded as reserved, to be removed when the spec is updated. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250717101817.2104-5-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-07-18 08:19:56 -07:00
Dan Williams	683513084a	cxl/mbox: Convert poison list mutex to ACQUIRE() Towards removing all explicit unlock calls in the CXL subsystem, convert the conditional poison list mutex to use a conditional lock guard. Rename the lock to have the compiler validate that all existing call sites are converted. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250711234932.671292-3-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-07-16 11:34:36 -07:00
Shiju Jose	0b5ccb0de1	cxl/edac: Support for finding memory operation attributes from the current boot Certain operations on memory, such as memory repair, are permitted only when the address and other attributes for the operation are from the current boot. This is determined by checking whether the memory attributes for the operation match those in the CXL gen_media or CXL DRAM memory event records reported during the current boot. The CXL event records must be backed up because they are cleared in the hardware after being processed by the kernel. Support is added for storing CXL gen_media or CXL DRAM memory event records in xarrays. Old records are deleted when they expire or when there is an overflow and which depends on platform correctly report Event Record Timestamp field of CXL spec Table 8-55 Common Event Record Format. Additionally, helper functions are implemented to find a matching record in the xarray storage based on the memory attributes and repair type. Add validity check, when matching attributes for sparing, using the validity flag in the DRAM event record, to ensure that all required attributes for a requested repair operation are valid and set. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250521124749.817-7-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-23 13:24:38 -07:00
Shiju Jose	077ee5f7dd	cxl/edac: Add support for PERFORM_MAINTENANCE command Add support for PERFORM_MAINTENANCE command. CXL spec 3.2 section 8.2.10.7.1 describes the Perform Maintenance command. This command requests the device to execute the maintenance operation specified by the maintenance operation class and the maintenance operation subclass. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250521124749.817-6-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-23 13:24:28 -07:00
Shiju Jose	0c6e6f1357	cxl/edac: Add CXL memory device patrol scrub control feature CXL spec 3.2 section 8.2.10.9.11.1 describes the device patrol scrub control feature. The device patrol scrub proactively locates and makes corrections to errors in regular cycle. Allow specifying the number of hours within which the patrol scrub must be completed, subject to minimum and maximum limits reported by the device. Also allow disabling scrub allowing trade-off error rates against performance. Add support for patrol scrub control on CXL memory devices. Register with the EDAC device driver, which retrieves the scrub attribute descriptors from EDAC scrub and exposes the sysfs scrub control attributes to userspace. For example, scrub control for the CXL memory device "cxl_mem0" is exposed in /sys/bus/edac/devices/cxl_mem0/scrubX/. Additionally, add support for region-based CXL memory patrol scrub control. CXL memory regions may be interleaved across one or more CXL memory devices. For example, region-based scrub control for "cxl_region1" is exposed in /sys/bus/edac/devices/cxl_region1/scrubX/. [dj: A few formatting fixes from Jonathan] Reviewed-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250521124749.817-4-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-23 13:24:09 -07:00
Dave Jiang	3b5d43245f	Merge branch 'for-6.15/features' into cxl-for-next Add CXL Features support. Setup code for enabling in kernel usage of CXL Features. Expecting EDAC/RAS to utilize CXL Features in kernel for things such as memory sparing. Also prepartion for enabling of CXL FWCTL support to issue allowed Features from user space.	2025-03-17 09:22:59 -07:00
Dave Jiang	763e15d047	Merge branch 'for-6.15/extended-linear-cache' into cxl-for-next2 Add support for Extended Linear Cache for CXL. Add enumeration support of the cache. Add MCE notification of the aliased memory address.	2025-03-14 16:22:34 -07:00
Dave Jiang	d781a45270	Merge branch 'for-6.15/dirty-shutdown' into cxl-for-next2 Add support for Global Persistent Flush (GPF) and dirty shutdown accounting.	2025-03-14 16:11:42 -07:00
Davidlohr Bueso	7d0ecc0bd8	cxl/pmem: Export dirty shutdown count via sysfs Similar to how the acpi_nfit driver exports Optane dirty shutdown count, introduce: /sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown Under the conditions that 1) dirty shutdown can be set, 2) Device GPF DVSEC exists, and 3) the count itself can be retrieved. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-4-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:55:26 -07:00
Davidlohr Bueso	86349aaaea	cxl/pmem: Rename cxl_dirty_shutdown_state() ... to a better suited 'cxl_arm_dirty_shutdown()'. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-3-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:55:25 -07:00
Davidlohr Bueso	a52b6a2c1c	cxl/pci: Support Global Persistent Flush (GPF) Add support for GPF flows. It is found that the CXL specification around this to be a bit too involved from the driver side. And while this should really all handled by the hardware, this patch takes things with a grain of salt. Upon respective port enumeration, both phase timeouts are set to a max of 20 seconds, which is the NMI watchdog default for lockup detection. The premise is that the kernel does not have enough information to set anything better than a max across the board and hope devices finish their GPF flows within the platform energy budget. Timeout detection is based on dirty Shutdown semantics. The driver will mark it as dirty, expecting that the device clear it upon a successful GPF event. The admin may consult the device Health and check the dirty shutdown counter to see if there was a problem with data integrity. [ davej: Explicitly set return to 0 in update_gpf_port_dvsec() ] [ davej: Add spec reference for 'struct cxl_mbox_set_shutdown_state_in ] [ davej: Fix 0-day reported issue ] Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250124233533.910535-1-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:50:22 -07:00
Ira Weiny	16ca2f5431	cxl/memdev: Remove unused partition values The next volatile and next persistent values are unused and are cluttering the cxl_memdev_state. Remove these values. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250206-cxl-cleanup-v1-1-9ddf26dd8433@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:00:45 -07:00
Dave Jiang	516e5bd0b6	cxl: Add mce notifier to emit aliased address for extended linear cache Below is a setup with extended linear cache configuration with an example layout of memory region shown below presented as a single memory region consists of 256G memory where there's 128G of DRAM and 128G of CXL memory. The kernel sees a region of total 256G of system memory. 128G DRAM 128G CXL memory \|-----------------------------------\|-------------------------------------\| Data resides in either DRAM or far memory (FM) with no replication. Hot data is swapped into DRAM by the hardware behind the scenes. When error is detected in one location, it is possible that error also resides in the aliased location. Therefore when a memory location that is flagged by MCE is part of the special region, the aliased memory location needs to be offlined as well. Add an mce notify callback to identify if the MCE address location is part of an extended linear cache region and handle accordingly. Added symbol export to set_mce_nospec() in x86 code in order to call set_mce_nospec() from the CXL MCE notify callback. Link: https://lore.kernel.org/linux-cxl/668333b17e4b2_5639294fd@dwillia2-xfh.jf.intel.com.notmuch/ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 14:13:49 -07:00
Dave Jiang	f0e6a2329b	cxl: Add Get Supported Features command for kernel usage CXL spec r3.2 8.2.9.6.1 Get Supported Features (Opcode 0500h) The command retrieve the list of supported device-specific features (identified by UUID) and general information about each Feature. The driver will retrieve the Feature entries in order to make checks and provide information for the Get Feature and Set Feature command. One of the main piece of information retrieved are the effects a Set Feature command would have for a particular feature. The retrieved Feature entries are stored in the cxl_mailbox context. The setup of Features is initiated via devm_cxl_setup_features() during the pci probe function before the cxl_memdev is enumerated. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 08:51:27 -07:00
Dave Jiang	cbbca60a1e	cxl: Enumerate feature commands Add feature commands enumeration code in order to detect and enumerate the 3 feature related commands "get supported features", "get feature", and "set feature". The enumeration will help determine whether the driver can issue any of the 3 commands to the device. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 08:49:41 -07:00
Dave Jiang	5666a7e7da	cxl: Refactor user ioctl command path from mds to mailbox With 'struct cxl_mailbox' context introduced, the helper functions cxl_query_cmd() and cxl_send_cmd() can take a cxl_mailbox directly rather than a cxl_memdev parameter. Refactor to use cxl_mailbox directly. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250204220430.4146187-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-24 16:42:36 -07:00
Dan Williams	58d60bbe0a	cxl: Cleanup partition size and perf helpers Now that the 'struct cxl_dpa_partition' array contains both size and performance information, all paths that iterate over that information can use a loop rather than hard-code 'ram' and 'pmem' lookups. Remove, or reduce the scope of the temporary helpers that bridged the pre-'struct cxl_dpa_partition' state of the code to the post-'struct cxl_dpa_partition' state. - to_{ram,pmem}_perf(): scope reduced to just sysfs_emit + is_visible() helpers - to_{ram,pmem}_res(): fold into their only users cxl_{ram,pmem}_size() - cxl_ram_size(): scope reduced to ram_size_show() (Note, cxl_pmem_size() also used to gate nvdimm registration) In short, memdev sysfs ABI already made the promise that 0-sized partitions will show for memdevs, but that can be avoided for future partitions by using dynamic sysfs group visibility (new relative to when the partition ABI first shipped upstream). Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alejandro Lucero <alucerop@amd.com> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864307519.668823.10800104022426067621.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-04 13:48:19 -07:00
Dan Williams	be5cbd0840	cxl: Kill enum cxl_decoder_mode Now that the operational mode of DPA capacity (ram vs pmem... etc) is tracked in the partition, and no code paths have dependencies on the mode implying the partition index, the ambiguous 'enum cxl_decoder_mode' can be cleaned up, specifically this ambiguity on whether the operation mode implied anything about the partition order. Endpoint decoders simply reference their assigned partition where the operational mode can be retrieved as partition mode. With this in place PMEM can now be partition0 which happens today when the RAM capacity size is zero. Dynamic RAM can appear above PMEM when DCD arrives, etc. Code sequences that hard coded the "PMEM after RAM" assumption can now just iterate partitions and consult the partition mode after the fact. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864306972.668823.3327008645125276726.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-04 13:48:19 -07:00
Dan Williams	991d98f17d	cxl: Make cxl_dpa_alloc() DPA partition number agnostic cxl_dpa_alloc() is a hard coded nest of assumptions around PMEM allocations being distinct from RAM allocations in specific ways when in practice the allocation rules are only relative to DPA partition index. The rules for cxl_dpa_alloc() are: - allocations can only come from 1 partition - if allocating at partition-index-N, all free space in partitions less than partition-index-N must be skipped over Use the new 'struct cxl_dpa_partition' array to support allocation with an arbitrary number of DPA partitions on the device. A follow-on patch can go further to cleanup 'enum cxl_decoder_mode' concept and supersede it with looking up the memory properties from partition metadata. Until then cxl_part_mode() temporarily bridges code that looks up partitions by @cxled->mode. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864306400.668823.12143134425285426523.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-04 13:48:19 -07:00
Dan Williams	8e4c411c53	cxl: Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info' The pending efforts to add CXL Accelerator (type-2) device [1], and Dynamic Capacity (DCD) support [2], tripped on the no-longer-fit-for-purpose design in the CXL subsystem for tracking device-physical-address (DPA) metadata. Trip hazards include: - CXL Memory Devices need to consider a PMEM partition, but Accelerator devices with CXL.mem likely do not in the common case. - CXL Memory Devices enumerate DPA through Memory Device mailbox commands like Partition Info, Accelerators devices do not. - CXL Memory Devices that support DCD support more than 2 partitions. Some of the driver algorithms are awkward to expand to > 2 partition cases. - DPA performance data is a general capability that can be shared with accelerators, so tracking it in 'struct cxl_memdev_state' is no longer suitable. - Hardcoded assumptions around the PMEM partition always being index-1 if RAM is zero-sized or PMEM is zero sized. - 'enum cxl_decoder_mode' is sometimes a partition id and sometimes a memory property, it should be phased in favor of a partition id and the memory property comes from the partition info. Towards cleaning up those issues and allowing a smoother landing for the aforementioned pending efforts, introduce a 'struct cxl_dpa_partition' array to 'struct cxl_dev_state', and 'struct cxl_range_info' as a shared way for Memory Devices and Accelerators to initialize the DPA information in 'struct cxl_dev_state'. For now, split a new cxl_dpa_setup() from cxl_mem_create_range_info() to get the new data structure initialized, and cleanup some qos_class init. Follow on patches will go further to use the new data structure to cleanup algorithms that are better suited to loop over all possible partitions. cxl_dpa_setup() follows the locking expectations of mutating the device DPA map, and is suitable for Accelerator drivers to use. Accelerators likely only have one hardcoded 'ram' partition to convey to the cxl_core. Link: http://lore.kernel.org/20241230214445.27602-1-alejandro.lucero-palau@amd.com [1] Link: http://lore.kernel.org/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com [2] Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864305827.668823.13978794102080021276.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-04 13:48:19 -07:00
Dan Williams	d77ca6c2b5	cxl: Introduce to_{ram,pmem}_{res,perf}() helpers In preparation for consolidating all DPA partition information into an array of DPA metadata, introduce helpers that hide the layout of the current data. I.e. make the eventual replacement of ->ram_res, ->pmem_res, ->ram_perf, and ->pmem_perf with a new DPA metadata array a no-op for code paths that consume that information, and reduce the noise of follow-on patches. The end goal is to consolidate all DPA information in 'struct cxl_dev_state', but for now the helpers just make it appear that all DPA metadata is relative to @cxlds. As the conversion to generic partition metadata walking is completed, these helpers will naturally be eliminated, or reduced in scope. Cc: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864305238.668823.16553986866633608541.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-04 13:48:18 -07:00
Dave Jiang	e91be3ed30	cxl: Preserve the CDAT access_coordinate for an endpoint Keep the access_coordinate from the CDAT tables for region perf calculations. The region perf calculation requires all participating endpoints to have arrived in order to determine if there are limitations of bandwidth data due to shared uplink. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20240904001316.1688225-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-09-22 21:02:53 -07:00
Dave Jiang	b5209da36b	cxl: Convert cxl_internal_send_cmd() to use 'struct cxl_mailbox' as input With the CXL mailbox context split out, cxl_internal_send_cmd() can take 'struct cxl_mailbox' as an input parameter rather than 'struct memdev_dev_state'. Change input parameter for cxl_internal_send_cmd() and fixup all impacted call sites. Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20240905223711.1990186-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-09-12 08:39:10 -07:00
Dave Jiang	8d8081cecf	cxl: Move mailbox related bits to the same context Create a new 'struct cxl_mailbox' and move all mailbox related bits to it. This allows isolation of all CXL mailbox data in order to export some of the calls to external kernel callers and avoid exporting of CXL driver specific bits such has device states. The allocation of 'struct cxl_mailbox' is also split out with cxl_mailbox_init() so the mailbox can be created independently. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20240905223711.1990186-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-09-12 08:38:01 -07:00
Dave Jiang	40a895fd9a	cxl: move cxl headers to new include/cxl/ directory Group all cxl related kernel headers into include/cxl/ directory. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20240905223711.1990186-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-09-09 15:47:59 -07:00
Linus Torvalds	e62f81bbd2	Merge tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL updates from Dave Jiang: "Core: - A CXL maturity map has been added to the documentation to detail the current state of CXL enabling. It provides the status of the current state of various CXL features to inform current and future contributors of where things are and which areas need contribution. - A notifier handler has been added in order for a newly created CXL memory region to trigger the abstract distance metrics calculation. This should bring parity for CXL memory to the same level vs hotplugged DRAM for NUMA abstract distance calculation. The abstract distance reflects relative performance used for memory tiering handling. - An addition for XOR math has been added to address the CXL DPA to SPA translation. CXL address translation did not support address interleave math with XOR prior to this change. Fixes: - Fix to address race condition in the CXL memory hotplug notifier - Add missing MODULE_DESCRIPTION() for CXL modules - Fix incorrect vendor debug UUID define Misc: - A warning has been added to inform users of an unsupported configuration when mixing CXL VH and RCH/RCD hierarchies - The ENXIO error code has been replaced with EBUSY for inject poison limit reached via debugfs and cxl-test support - Moving the PCI config read in cxl_dvsec_rr_decode() to avoid unnecessary PCI config reads - A refactor to a common struct for DRAM and general media CXL events" * tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/core/pci: Move reading of control register to immediately before usage cxl: Remove defunct code calculating host bridge target positions cxl/region: Verify target positions using the ordered target list cxl: Restore XOR'd position bits during address translation cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa() cxl/test: Replace ENXIO with EBUSY for inject poison limit reached cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy cxl/core: Fix incorrect vendor debug UUID define Documentation: CXL Maturity Map cxl/region: Simplify cxl_region_nid() cxl/region: Support to calculate memory tier abstract distance cxl/region: Fix a race condition in memory hotplug notifier cxl: add missing MODULE_DESCRIPTION() macros cxl/events: Use a common struct for DRAM and General Media events	2024-07-28 09:33:28 -07:00
Alison Schofield	591209c798	cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached The CXL driver provides a debugfs interface offering users the ability to inject and clear poison to a memdev. Once a user has injected up to the devices limit further injection requests fail with ENXIO until a clear poison is issued. Users may not have device specs in hand or may want to intentionally hit the limit and then clear. Replace the usual ENXIO return status with EBUSY so users can recognize this failure. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Xingtao Yao <yaoxt.fnst@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://patch.msgid.link/825bd4c67fb55a4373c4182d999ad49d4e6b4fe7.1720316188.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-07-10 17:12:42 -07:00
peng guo	8ecef8e01a	cxl/core: Fix incorrect vendor debug UUID define When user send a mbox command whose opcode is CXL_MBOX_OP_CLEAR_LOG and the in_payload is normal vendor debug log UUID according to the CXL specification cxl_payload_from_user_allowed() will return false unexpectedly, Sending mbox cmd operation fails and the kernel log will print: Clear Log: input payload not allowed. All CXL devices that support a debug log shall support the Vendor Debug Log to allow the log to be accessed through a common host driver, for any device, all versions of the CXL specification define the same value with Log Identifier of: 5e1819d9-11a9-400c-811f-d60719403d86 Refer to CXL spec r3.1 Table 8-71 Fix the definition value of DEFINE_CXL_VENDOR_DEBUG_UUID to match the CXL specification. Fixes: `472b1ce6e9` ("cxl/mem: Enable commands via CEL") Signed-off-by: peng guo <engguopeng@buaa.edu.cn> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240710023112.8063-1-engguopeng@buaa.edu.cn Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-07-10 16:10:36 -07:00
Yao Xingtao	a0f39d51db	cxl: documentation: add missing files to cxl driver-api Add the missing files into cxl driver api and fix the compile warning. Suggested-by: Dan Williams <dan.j.williams@intel.com> Suggested-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240614084755.59503-3-yaoxt.fnst@fujitsu.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-06-25 14:45:27 -07:00
Yao Xingtao	84328c5ace	cxl/region: check interleave capability Since interleave capability is not verified, if the interleave capability of a target does not match the region need, committing decoder should have failed at the device end. In order to checkout this error as quickly as possible, driver needs to check the interleave capability of target during attaching it to region. Per CXL specification r3.1(8.2.4.20.1 CXL HDM Decoder Capability Register), bits 11 and 12 indicate the capability to establish interleaving in 3, 6, 12 and 16 ways. If these bits are not set, the target cannot be attached to a region utilizing such interleave ways. Additionally, bits 8 and 9 represent the capability of the bits used for interleaving in the address, Linux tracks this in the cxl_port interleave_mask. Per CXL specification r3.1(8.2.4.20.13 Decoder Protection): eIW means encoded Interleave Ways. eIG means encoded Interleave Granularity. in HPA: if eIW is 0 or 8 (interleave ways: 1, 3), all the bits of HPA are used, the interleave bits are none, the following check is ignored. if eIW is less than 8 (interleave ways: 2, 4, 8, 16), the interleave bits start at bit position eIG + 8 and end at eIG + eIW + 8 - 1. if eIW is greater than 8 (interleave ways: 6, 12), the interleave bits start at bit position eIG + 8 and end at eIG + eIW - 1. if the interleave mask is insufficient to cover the required interleave bits, the target cannot be attached to the region. Fixes: `384e624bb2` ("cxl/region: Attach endpoint decoders") Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240614084755.59503-2-yaoxt.fnst@fujitsu.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-06-25 14:45:27 -07:00
Srinivasulu Thanneeru	206f9fa9d5	cxl/mbox: Add Clear Log mailbox command Adding UAPI support for CXL r3.1 8.2.9.5.4 Clear Log command. This proposed patch will be useful for clearing and populating the Vendor debug log in certain scenarios, allowing for the aggregation of results over time. Signed-off-by: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240313071218.729-3-sthanneeru.opensrc@micron.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-04-30 08:48:10 -07:00
Srinivasulu Thanneeru	940325add1	cxl/mbox: Add Get Log Capabilities and Get Supported Logs Sub-List commands Adding UAPI support for 1. CXL r3.1 8.2.9.5.3 Get Log Capabilities. 2. CXL r3.1 8.2.9.5.6 Get Supported Logs Sub-List. Signed-off-by: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240313071218.729-2-sthanneeru.opensrc@micron.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-04-30 08:48:10 -07:00
Dave Jiang	001c5d1934	cxl: Consolidate dport access_coordinate ->hb_coord and ->sw_coord into ->coord The driver stores access_coordinate for host bridge in ->hb_coord and switch CDAT access_coordinate in ->sw_coord. Since neither of these access_coordinate clobber each other, the variable name can be consolidated into ->coord to simplify the code. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240403154844.3403859-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-04-08 08:25:21 -07:00
Dave Jiang	00413c1506	cxl: Change 'struct cxl_memdev_state' *_perf_list to single 'struct cxl_dpa_perf' In order to address the issue with being able to expose qos_class sysfs attributes under 'ram' and 'pmem' sub-directories, the attributes must be defined as static attributes rather than under driver->dev_groups. To avoid implementing locking for accessing the 'struct cxl_dpa_perf` lists, convert the list to a single 'struct cxl_dpa_perf' entry in preparation to move the attributes to statically defined. While theoretically a partition may have multiple qos_class via CDAT, this has not been encountered with testing on available hardware. The code is simplified for now to not support the complex case until a use case is needed to support that. Link: https://lore.kernel.org/linux-cxl/65b200ba228f_2d43c29468@dwillia2-mobl3.amr.corp.intel.com.notmuch/ Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240206190431.1810289-2-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-02-16 23:20:34 -08:00
Dan Williams	3601311593	Merge branch 'for-6.8/cxl-cper' into for-6.8/cxl Pick up the CPER to CXL driver integration work for v6.8. Some additional cleanup of cper_estatus_print() messages is needed, but that is to be handled incrementally.	2024-01-09 19:21:44 -08:00
Ira Weiny	dc97f6344f	cxl/pci: Register for and process CPER events If the firmware has configured CXL event support to be firmware first the OS can process those events through CPER records. The CXL layer has unique DPA to HPA knowledge and standard event trace parsing in place. CPER records contain Bus, Device, Function information which can be used to identify the PCI device which is sending the event. Change the PCI driver registration to include registration of a CXL CPER callback to process events through the trace subsystem. Use new scoped based management to simplify the handling of the PCI device object. Tested-by: Smita-Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Smita-Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-9-1bb8a4ca2c7a@intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> [djbw: use new pci_dev guard, flip init order] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-09 15:41:23 -08:00

1 2 3 4

153 Commits