linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-30 20:42:33 -04:00

Author	SHA1	Message	Date
Matthew Auld	1e32ffbc9d	drm/xe/sriov: support non-contig VRAM provisioning Currently we can run into issues with provisioning VRAM region, due to requiring contig VRAM BO underneath. We sometimes see that allocation (multiple GB) can fail even when there is enough free space. We don't need CPU access to the buffer in the first place, so can forgo pin_map and therefore also the contig requirement. Keep the same behavior with save and restore during suspend/resume (which can now be done with blitter). We also need the VRAM to occupy the same pages so we don't need to re-program the LMTT, so should still remain pinned (also we don't want something to try evict it). With that convert over to plain pinned kernel object. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250403102440.266113-16-matthew.auld@intel.com	2025-04-04 11:41:11 +01:00
Michal Wajdeczko	611160b02a	drm/xe/pf: Release all VFs configs on device removal If we try to manually provision VFs using debugfs and then we try to unload the driver, we will see complains like: [ ] Memory manager not clean during takedown. [ ] RIP: 0010:drm_mm_takedown+0x3f/0x100 [ ] [drm:drm_mm_takedown] ERROR node [fedff000 + 00001000]: inserted at drm_mm_insert_node_in_range+0x2bd/0x520 xe_ggtt_node_insert+0x52/0x90 [xe] pf_provision_vf_ggtt+0x1fa/0xac0 [xe] xe_gt_sriov_pf_config_set_ggtt+0x79/0x7a0 [xe] ggtt_set+0x53/0x80 [xe] simple_attr_write_xsigned.isra.0+0xd2/0x150 simple_attr_write+0x14/0x30 debugfs_attr_write+0x4e/0x80 [ ] xe 0000:00:02.0: [drm] ERROR GT0: GUC ID manager unclean (1/65535) [ ] xe 0000:00:02.0: [drm] GT0: total 65535 [ ] xe 0000:00:02.0: [drm] GT0: used 1 [ ] xe 0000:00:02.0: [drm] GT0: range 65534..65534 (1) [ ] xe 0000:00:02.0: [drm] ERROR GT0: GuC doorbells manager unclean (1/256) [ ] xe 0000:00:02.0: [drm] GT0: count: 256 [ ] xe 0000:00:02.0: [drm] GT0: available range: 1..255 (255) [ ] xe 0000:00:02.0: [drm] GT0: available total: 255 [ ] xe 0000:00:02.0: [drm] GT0: reserved range: 0..0 (1) [ ] xe 0000:00:02.0: [drm] GT0: reserved total: 1 This could be easily fixed by adding config release action. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250211155034.1028-1-michal.wajdeczko@intel.com	2025-02-16 21:36:38 +01:00
Piotr Piórkowski	71163271dc	drm/xe: Move VRAM manager to struct xe_vram_region VRAM manager is related directly to struct xe_vram_region so it should be inside this structure. Let's move the VRAM to struct xe_vram_region. v2: - remove xe_vram_region pointer from xe_ttm_vram_mgr - stop use dynamic allocation for xe_ttm_vram_mgr in xe_vram_region - rename struct xe_ttm_vram_mgr vram_mgr to ttm v3: - fix "'ttm' not described in 'xe_vram_region'" Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250210081511.906452-3-piotr.piorkowski@intel.com	2025-02-10 13:08:59 +01:00
Piotr Piórkowski	cbc0a0ee34	drm/xe/pf: Use an explicit check to see if the device has LMTT So far, the main condition for using LMTT has been to check that the device is a discrete gfx. Let's add a dedicated function to check if the device supports LMTT as not all future discrete GPU platforms will require LMTT. v2: - use xe_has_device_lmtt only when necessary - leave IS_DGFX for other things related to LMEM provisioning v3: - remove IS_SRIOV_PF condition from xe_device_has_lmtt (Michal Wajdeczko) - keep IS_SRIOV_PF asserts in LMTT-related code (Michal Wajdeczko) v4: - update commit description Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250207113111.853821-2-piotr.piorkowski@intel.com	2025-02-10 11:10:14 +01:00
Michal Wajdeczko	33f17e2cbd	drm/xe/pf: Reset GuC VF config when unprovisioning critical resource GuC firmware counts received VF configuration KLVs and may start validation of the complete VF config even if some resources were unprovisioned in the meantime, leading to unexpected errors like: $ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/contexts_quota $ echo 0 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/contexts_quota $ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/doorbells_quota $ echo 0 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/doorbells_quota $ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/ggtt_quota tee: '/sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/ggtt_quota': Input/output error To mitigate this problem trigger explicit VF config reset after unprovisioning any of the critical resources (GGTT, context or doorbell IDs) that GuC is monitoring. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250129195947.764-3-michal.wajdeczko@intel.com	2025-01-30 17:10:41 +01:00
Michal Wajdeczko	21ccac0e22	drm/xe/pf: Don't send BEGIN_ID if VF has no context/doorbells It turned out that GuC validates VF configuration immediately after receiving "some" set of configuration KLVs and complains if one of the critical, from GuC understanding, resource is left unprovisioned, even if PF should be still allowed to make late VF config adjustments, since VF was not yet started. This issue was discovered after we decided to asynchronously re-send configuration KLVs after GT reset/resume, as then fair VF auto-provisioning could already allocate some of the resources, which was a prerequisite for sending those config KLVs: # fair GGTT provisioning [] xe 0000:00:02.0: [drm] GT0: PF: pushed VF1 config with 2 KLVs: [] xe 0000:00:02.0: [drm] GT0: { key 0x0001 : 64b value 0x176a000 } # ggtt_start [] xe 0000:00:02.0: [drm] GT0: { key 0x0002 : 64b value 0xfd696000 } # ggtt_size [] xe 0000:00:02.0: [drm] GT0: PF: VF1 provisioned with 4251541504 (3.96 GiB) GGTT # re-provisioning worker [] xe 0000:00:02.0: [drm] ERROR GT0: H2G request 0x5503 failed: error 0x60 hint 0x0 [] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 14 config KLVs (-EIO) [] xe 0000:00:02.0: [drm] GT0: { key 0x0001 : 64b value 0x176a000 } # ggtt_start [] xe 0000:00:02.0: [drm] GT0: { key 0x0002 : 64b value 0xfd696000 } # ggtt_size [] xe 0000:00:02.0: [drm] GT0: { key 0x8a0b : 32b value 0 } # begin_ctx_id [] xe 0000:00:02.0: [drm] GT0: { key 0x0004 : 32b value 0 } # num_contexts [] xe 0000:00:02.0: [drm] GT0: { key 0x8a0a : 32b value 0 } # begin_db_id [] xe 0000:00:02.0: [drm] GT0: { key 0x0006 : 32b value 0 } # num_doorbells [] xe 0000:00:02.0: [drm] GT0: { key 0x8a01 : 32b value 0 } # exec_quantum [] xe 0000:00:02.0: [drm] GT0: { key 0x8a02 : 32b value 0 } # preempt_timeout [] xe 0000:00:02.0: [drm] GT0: { key 0x8a03 : 32b value 0 } # cat_error_count [] xe 0000:00:02.0: [drm] GT0: { key 0x8a04 : 32b value 0 } # engine_reset_count [] xe 0000:00:02.0: [drm] GT0: { key 0x8a05 : 32b value 0 } # page_fault_count [] xe 0000:00:02.0: [drm] GT0: { key 0x8a06 : 32b value 0 } # guc_time_us [] xe 0000:00:02.0: [drm] GT0: { key 0x8a07 : 32b value 0 } # irq_time_us [] xe 0000:00:02.0: [drm] GT0: { key 0x8a08 : 32b value 0 } # doorbell_time_us [] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 configuration (-EIO) To avoid such errors stop sending BEGIN_CONTEXT/DOORBELL_ID KLVs if no GuC context/doorbell IDs were provisioned to VF. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4176 Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250129195947.764-2-michal.wajdeczko@intel.com	2025-01-30 17:10:39 +01:00
Michal Wajdeczko	d8b2149ba8	drm/xe/pf: Use GuC Buffer Cache during VFs provisioning Start using GuC buffer cache for the VF's configuration actions. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-10-michal.wajdeczko@intel.com	2025-01-19 00:12:05 +01:00
Nitin Gote	75fd04f276	drm/xe: Fix all typos in xe Fix all typos in files of xe, reported by codespell tool. Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250106102646.1400146-2-nitin.r.gote@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>	2025-01-09 17:58:09 +01:00
Michal Wajdeczko	a8d0aa0e7f	drm/xe/pf: Use correct function to check LMEM provisioning There is a typo in function call and instead of VF LMEM we were looking at VF GGTT provisioning. Fix that. Fixes: `234670cea9` ("drm/xe/pf: Skip fair VFs provisioning if already provisioned") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241216223253.819-1-michal.wajdeczko@intel.com	2024-12-20 00:03:28 +01:00
Michal Wajdeczko	465d9057e5	drm/xe/pf: Drop 2GiB limit of fair LMEM allocation Since commit `678ccbf987` ("drm/xe/vram: drop 2G block restriction") we are able to provision VFs with more than 2GiB. Drop our temporary limit of maximum fair LMEM size that was added just to avoid hitting -EINVAL from auto-provisioning. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241121175754.302-1-michal.wajdeczko@intel.com	2024-11-22 17:30:09 +01:00
Michal Wajdeczko	7dbed0fdb1	drm/xe/pf: Add functions to configure VF scheduling priority Add functions to configure PF or VF scheduling priority using the VF_CFG_SCHED_PRIORITY KLV. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241106151301.2079-4-michal.wajdeczko@intel.com	2024-11-08 13:31:15 +01:00
Michal Wajdeczko	43b1dd2b55	drm/xe/pf: Fix potential GGTT allocation leak In unlikely event that we fail during sending the new VF GGTT configuration to the GuC, we will free only the GGTT node data struct but will miss to release the actual GGTT allocation. This will later lead to list corruption, GGTT space leak and finally risking crash when unloading the driver: [ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-EIO) [ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT [ ] list_add corruption. next->prev should be prev (ffff88813cfcd628), but was 0000000000000000. (next=ffff88813cfe2028). [ ] RIP: 0010:__list_add_valid_or_report+0x6b/0xb0 [ ] Call Trace: [ ] drm_mm_insert_node_in_range+0x2c0/0x4e0 [ ] xe_ggtt_node_insert+0x46/0x70 [xe] [ ] pf_provision_vf_ggtt+0x7f5/0xa70 [xe] [ ] xe_gt_sriov_pf_config_set_ggtt+0x5e/0x770 [xe] [ ] ggtt_set+0x4b/0x70 [xe] [ ] simple_attr_write_xsigned.constprop.0.isra.0+0xb0/0x110 [ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-ENOSPC) [ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT [ ] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b7b: 0000 [#1] PREEMPT SMP NOPTI [ ] RIP: 0010:drm_mm_remove_node+0x1b7/0x390 [ ] Call Trace: [ ] <TASK> [ ] ? die_addr+0x2e/0x80 [ ] ? exc_general_protection+0x1a1/0x3e0 [ ] ? asm_exc_general_protection+0x22/0x30 [ ] ? drm_mm_remove_node+0x1b7/0x390 [ ] ggtt_node_remove+0xa5/0xf0 [xe] [ ] xe_ggtt_node_remove+0x35/0x70 [xe] [ ] xe_ttm_bo_destroy+0x123/0x220 [xe] [ ] intel_user_framebuffer_destroy+0x44/0x70 [xe] [ ] intel_plane_destroy_state+0x3b/0xc0 [xe] [ ] drm_atomic_state_default_clear+0x1cd/0x2f0 [ ] intel_atomic_state_clear+0x9/0x20 [xe] [ ] __drm_atomic_state_free+0x1d/0xb0 Fix that by using pf_release_ggtt() on the error path, which now works regardless if the node has GGTT allocation or not. 
Fixes: `34e804220f` ("drm/xe: Make xe_ggtt_node struct independent") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104144901.1903-1-michal.wajdeczko@intel.com	2024-11-05 20:17:48 +01:00
Michal Wajdeczko	b982cba5ce	drm/xe/pf: Show VFs LMEM provisioning summary over debugfs While we can already view individual VF LMEM provisioning using lmem_quota debugfs attribute, we want to have unified way to show summary across all VFs, like we do for GGTT or GuC doorbells/IDs. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241021201506.1771-1-michal.wajdeczko@intel.com	2024-10-22 20:33:24 +02:00
Michal Wajdeczko	e9a14537fe	drm/xe/pf: Add functions to save and restore VF configuration blob We already have support to save and restore GuC VF state, but that will only work when the target VF configuration (provisioning) will be exactly the same as the source VF configuration. To help with assuring that both configurations match, allow to encode whole VF configuration that can be saved as blob and restored later. In the future we may want to use such captured configuration blobs as templates to make sure we provision VFs with exactly the same configuration that was previously tested or recommended, or when debugfs knobs are not available. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240919171528.1451-5-michal.wajdeczko@intel.com	2024-10-07 13:00:09 +02:00
Michal Wajdeczko	bdc2c4d575	drm/xe/pf: Allow to encode subset of VF configuration KLVs We want to reuse format of the GuC KLVs while saving and restoring VF configuration by the PF driver, but some of those KLVs (like doorbell begin index or GGTT starting offset) are not strictly needed to correctly restore VF configuration. Modify functions to omit encoding of some of the KLVs with GuC only details. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240919171528.1451-4-michal.wajdeczko@intel.com	2024-10-07 13:00:00 +02:00
Michal Wajdeczko	99ce45cc25	drm/xe/pf: Update success code of pf_validate_vf_config() This function may return negative error codes on invalid or incomplete VF configuration, but unlike other int functions, it was returning 1 instead of 0 on success, which might be a little inconvenient if we would like to use it directly in other functions. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240919171528.1451-3-michal.wajdeczko@intel.com	2024-10-07 12:59:46 +02:00
Matthew Auld	89076b5a8b	drm/xe: prevent potential UAF in pf_provision_vf_ggtt() The node ptr can point to an already freed ptr, if we hit the path with an already allocated node. We later dereference that pointer with: xe_gt_assert(gt, !xe_ggtt_node_allocated(node)); which is a potential UAF. Fix this by not stashing the ptr for node. Also since it is likely a bad idea to leave config->ggtt_region pointing to a stale ptr, also set that to NULL by calling pf_release_vf_config_ggtt() instead of pf_release_ggtt(). Fixes: `34e804220f` ("drm/xe: Make xe_ggtt_node struct independent") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240828104341.180111-2-matthew.auld@intel.com	2024-09-06 09:35:05 +01:00
Michal Wajdeczko	da6ec74339	drm/xe/pf: Reset thresholds when releasing a VF config As part of the VF config release, we should reset all parameters, including thresholds, to always start with the clean VF config. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240830132100.1704-3-michal.wajdeczko@intel.com	2024-09-02 20:51:03 +02:00
Michal Wajdeczko	a1498ab229	drm/xe/pf: Add thresholds to the VF KLV config We are pushing threshold KLV to the GuC immediately during the threshold provisioning, but those configs will be lost during a GT reset. Include threshold KLVs while encoding full VF config buffer to make sure the GuC receives all of the config KLVs. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240830132100.1704-2-michal.wajdeczko@intel.com	2024-09-02 20:51:02 +02:00
Michal Wajdeczko	16ba2b28df	drm/xe/pf: Add function to sanitize VF resources On current platforms it is a PF driver responsibility to clear some of the VF's resources during a VF FLR. Add simple function that will clear configured VF resources (GGTT, LMEM). We will start using this function soon. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240828210809.1528-2-michal.wajdeczko@intel.com	2024-08-30 10:51:06 +02:00
Nathan Chancellor	ff9c674d11	drm/xe: Fix total initialization in xe_ggtt_print_holes() Clang warns (or errors with CONFIG_DRM_WERROR or CONFIG_WERROR): drivers/gpu/drm/xe/xe_ggtt.c:810:3: error: variable 'total' is uninitialized when used here [-Werror,-Wuninitialized] 810 | total += hole_size; | ^~~~~ drivers/gpu/drm/xe/xe_ggtt.c:798:11: note: initialize the variable 'total' to silence this warning 798 | u64 total; | ^ | = 0 1 error generated. Move the zero initialization of total from xe_gt_sriov_pf_config_print_available_ggtt() to xe_ggtt_print_holes() to resolve the warning. Fixes: `136367290e` ("drm/xe: Introduce xe_ggtt_print_holes") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240823-drm-xe-fix-total-in-xe_ggtt_print_holes-v1-1-12b02d079327@kernel.org Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-08-24 06:11:26 -07:00
Rodrigo Vivi	34e804220f	drm/xe: Make xe_ggtt_node struct independent In some rare cases, the drm_mm node cannot be removed synchronously due to runtime PM conditions. In this situation, the node removal will be delegated to a workqueue that will be able to wake up the device before removing the node. However, in this situation, the lifetime of the xe_ggtt_node cannot be restricted to the lifetime of the parent object. So, this patch introduces the infrastructure so the xe_ggtt_node struct can be allocated in advance and freed when needed. By having the ggtt backpointer, it also ensures that the init function is always called before any attempt to insert or reserve the node in the GGTT. v2: s/xe_ggtt_node_force_fini/xe_ggtt_node_fini and use it internally (Brost) v3: - Use GFP_NOFS for node allocation (CI) - Avoid ggtt argument, now that we have it inside the node (Lucas) - Fix some missed fini cases (CI) v4: - Fix SRIOV critical case where config->ggtt_region was lost (Michal) - Avoid ggtt argument also on removal (missed case on v3) (Michal) - Remove useless checks (Michal) - Return 0 instead of negative errno on a u32 addr. (Michal) - s/xe_ggtt_assign/xe_ggtt_node_assign for coherence, while we are touching it (Michal) v5: - Fix VFs' ggtt_balloon Cc: Matthew Auld <matthew.auld@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-11-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:45 -04:00
Rodrigo Vivi	136367290e	drm/xe: Introduce xe_ggtt_print_holes Introduce a new xe_ggtt_print_holes helper that attends the SRIOV demand and finishes the goal of limiting drm_mm access to xe_ggtt. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-9-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:45 -04:00
Rodrigo Vivi	1144e0dff5	drm/xe: Introduce xe_ggtt_largest_hole Introduce a new xe_ggtt_largest_hole helper that attends the SRIOV demand and continue with the goal of limiting drm_mm access to xe_ggtt. v2: Fix a typo (Michal) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-8-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:45 -04:00
Rodrigo Vivi	8b5ccc9743	drm/xe: Limit drm_mm_node_allocated access to xe_ggtt_node Continue with the encapsulation of drm_mm_node inside xe_ggtt. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-7-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Rodrigo Vivi	0567f18e07	drm/xe: Rename xe_ggtt_node related functions Bring some consistency and prepare for more xe_ggtt_node related functions to be introduced. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-6-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Rodrigo Vivi	6062ea9398	drm/xe: Encapsulate drm_mm_node inside xe_ggtt_node The xe_ggtt component uses drm_mm to manage the GGTT. The drm_mm_node is just a node inside drm_mm, but in Xe we use that only in the GGTT context. So, this patch encapsulates the drm_mm_node into a xe_ggtt's new struct. This is the first step towards limiting all the drm_mm access through xe_ggtt. The ultimate goal is to have a better control of the node insertion and removal, so the removal can be delegated to a delayed workqueue. v2: Fix includes and typos (Michal and Brost) Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-5-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Michal Wajdeczko	5bdacb0907	drm/xe/pf: Fix VF config validation on multi-GT platforms When validating VF config on the media GT, we may wrongly report that VF is already partially configured on it, as we consider GGTT and LMEM provisioning done on the primary GT (since both GGTT and LMEM are tile-level resources, not a GT-level). This will cause skipping a VF auto-provisioning on the media-GT and in result will block a VF from successfully initialize that GT. Fix that by considering GGTT and LMEM configurations only when checking if a VF provisioning is complete, and omit GGTT and LMEM when reporting empty/partial provisioning. Fixes: `234670cea9` ("drm/xe/pf: Skip fair VFs provisioning if already provisioned") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240806180516.618-1-michal.wajdeczko@intel.com	2024-08-09 10:37:52 +02:00
Michal Wajdeczko	25ec7e809c	drm/xe: Add NEEDS_2M BO flag In addition of NEEDS_64K BO flag, add similar one to force 2 MiB alignment of the buffer objects. Explicitly use this flag during VF LMEM provisioning as LMTT uses 2 MiB pages and one day we may drop requirement of allocating pinned objects as contiguous. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240715180538.1418-3-michal.wajdeczko@intel.com	2024-07-22 12:53:06 +02:00
Michal Wajdeczko	4c3fe5eae4	drm/xe/pf: Limit fair VF LMEM provisioning Due to the current design of the BO and VRAM manager, any object with XE_BO_FLAG_PINNED flag, which the PF driver uses during VF LMEM provisioning, is created with the TTM_PL_FLAG_CONTIGUOUS flag, which may cause VRAM fragmentation that prevents subsequent allocations of larger objects, like fair VF LMEM provisioning. To avoid such failures, round down fair VF LMEM provisioning size to next power of two size, to compensate what xe_ttm_vram_mgr is doing to achieve contiguous allocations. Fixes: `ac6598aed1` ("drm/xe/pf: Add support to configure SR-IOV VFs") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711192320.1198-2-michal.wajdeczko@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-07-12 13:45:56 -07:00
Michal Wajdeczko	411220808c	drm/xe/pf: Restart VFs provisioning after GT reset Any prior configurations pushed to the GuC are lost when the GT is reset. Push again all non-empty VF configurations to the GuC as part of the GuC reset procedure. This will also help restore early manual provisioning, when the PF was in the meantime suspended and then resumed. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701102738.934-3-michal.wajdeczko@intel.com	2024-07-01 19:43:52 +02:00
Michal Wajdeczko	234670cea9	drm/xe/pf: Skip fair VFs provisioning if already provisioned Our debugfs allows to view and change VFs' provisioning configs. If we attempt to experiment with VFs provisioning before enabling them, this early config will affect fair provisioning calculations, and will also be overwritten, which is undesirable behavior. To improve this, check if the VFs configs are empty (unprovisioned) before starting the fair provisioning procedure. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701102738.934-2-michal.wajdeczko@intel.com	2024-07-01 19:43:50 +02:00
Michal Wajdeczko	b321cb83a3	drm/xe/pf: Assert LMEM provisioning is done only on DGFX The Local Memory (aka VRAM) is only available on DGFX platforms. We shouldn't attempt to provision VFs with LMEM or attempt to update the LMTT on non-DGFX platforms. Add missing asserts that would enforce that and fix release code that could crash on iGFX due to uninitialized LMTT. Fixes: `c063cce7df` ("drm/xe/pf: Update the LMTT when freeing VF GT config") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240607153155.1592-1-michal.wajdeczko@intel.com	2024-06-10 12:14:23 +02:00
Michal Wajdeczko	c063cce7df	drm/xe/pf: Update the LMTT when freeing VF GT config The LMTT must be updated whenever we change the VF LMEM configuration. We missed that step when freeing the whole VF GT config, which could result in stale PTE in LMTT or LMTT PT object leaks. Fix that. Fixes: `ac6598aed1` ("drm/xe/pf: Add support to configure SR-IOV VFs") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240527115408.1064-1-michal.wajdeczko@intel.com	2024-05-31 14:44:09 +02:00
Michal Wajdeczko	629df234bf	drm/xe/pf: Introduce functions to configure VF thresholds The GuC firmware monitors VF's activity and notifies the PF driver once any configured threshold related to such activity is exceeded. Add functions to allow configuration of these thresholds per VF. Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-5-michal.wajdeczko@intel.com	2024-05-16 18:04:42 +02:00
Michal Wajdeczko	49f853c78e	drm/xe/pf: Clamp maximum execution quantum to 100s GuC is silently clamping values of the execution quantum and preemption timeout KLVs to 100s. Perform explicit clamping on the driver side as later there is no way to read back values used by the firmware and we shouldn't mislead the user about actual values being used when we print them in dmesg or debugfs. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240419123543.270-3-michal.wajdeczko@intel.com	2024-04-24 15:32:26 +02:00
Michal Wajdeczko	d3b80dc7aa	drm/xe/pf: Fix xe_gt_sriov_pf_config_print_available_ggtt() This function is using internal helper pf_get_spare_ggtt() that expects PF's master mutex to be locked. Fix that. Fixes: `ac6598aed1` ("drm/xe/pf: Add support to configure SR-IOV VFs") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240419141000.314-1-michal.wajdeczko@intel.com	2024-04-22 19:50:47 +02:00
Michal Wajdeczko	ac6598aed1	drm/xe/pf: Add support to configure SR-IOV VFs To run correctly, each Virtual Function must be provisioned with some chunk of shared hardware or firmware resources (like GGTT, device memory, GuC doorbell IDs, GuC context IDs) and scheduling parameters (execution quantum or preemption timeout). All resources assigned to VFs must be excluded from the PF driver use and may require some additional preparation steps (like setup of the LMTT or update of the GGTT PTE). Those provisioning details must be then sent to the GuC firmware as most of those details will be shared later with the VF drivers during their boot. Add basic functions to provision VFs with all hardware resources or scheduling parameters. We will use them shortly in upcoming patches either in manual provisioning over debugfs, exposed to the advanced users, or automatic provisioning done by PF driver during VFs enabling. Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240415173937.1287-7-michal.wajdeczko@intel.com	2024-04-16 12:37:36 +02:00

38 Commits