linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-22 16:53:59 -04:00

Author	SHA1	Message	Date
Linus Torvalds	de848da12f	Merge tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel Pull drm updates from Dave Airlie: "This adds a couple of patches outside the drm core, all should be acked appropriately, the string and pstore ones are the main ones that come to mind. Otherwise it's the usual drivers, xe is getting enabled by default on some new hardware, we've changed the device number handling to allow more devices, and we added some optional rust code to create QR codes in the panic handler, an idea first suggested I think 10 years ago :-) string: - add mem_is_zero() core: - support more device numbers - use XArray for minor ids - add backlight constants - Split dma fence array creation into alloc and arm fbdev: - remove usage of old fbdev hooks kms: - Add might_fault() to drm_modeset_lock priming - Add dynamic per-crtc vblank configuration support dma-buf: - docs cleanup buddy: - Add start address support for trim function printk: - pass description to kmsg_dump scheduler: - Remove full_recover from drm_sched_start ttm: - Make LRU walk restartable after dropping locks - Allow direct reclaim to allocate local memory panic: - add display QR code (in rust) displayport: - mst: GUID improvements bridge: - Silence error message on -EPROBE_DEFER - analogix: Clean aup - bridge-connector: Fix double free - lt6505: Disable interrupt when powered off - tc358767: Make default DP port preemphasis configurable - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR - anx7625: simplify OF array handling - dw-hdmi: simplify clock handling - lontium-lt8912b: fix mode validation - nwl-dsi: fix mode vsync/hsync polarity xe: - Enable LunarLake and Battlemage support - Introducing Xe2 ccs modifiers for integrated and discrete graphics - rename xe perf to xe observation - use wb caching on DGFX for system memory - add fence timeouts - Lunar Lake graphics/media/display workarounds - Battlemage workarounds - Battlemage GSC support - GSC and HuC fw updates for LL/BM - use dma_fence_chain_free - refactor hw engine lookup and mmio access - enable priority mem read for Xe2 - Add first GuC BMG fw - fix dma-resv lock - Fix DGFX display suspend/resume - Use xe_managed for kernel BOs - Use reserved copy engine for user binds on faulting devices - Allow mixing dma-fence jobs and long-running faulting jobs - fix media TLB invalidation - fix rpm in TTM swapout path - track resources and VF state by PF i915: - Type-C programming fix for MTL+ - FBC cleanup - Calc vblank delay more accurately - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates - Fix DP LTTPR detection - limit relocations to INT_MAX - fix long hangs in buddy allocator on DG2/A380 amdgpu: - Per-queue reset support - SDMA devcoredump support - DCN 4.0.1 updates - GFX12/VCN4/JPEG4 updates - Convert vbios embedded EDID to drm_edid - GFX9.3/9.4 devcoredump support - process isolation framework for GFX 9.4.3/4 - take IOMMU mappings into account for P2P DMA amdkfd: - CRIU fixes - HMM fix - Enable process isolation support for GFX 9.4.3/4 - Allow users to target recommended SDMA engines - KFD support for targetting queues on recommended SDMA engines radeon: - remove .load and drm_dev_alloc - Fix vbios embedded EDID size handling - Convert vbios embedded EDID to drm_edid - Use GEM references instead of TTM - r100 cp init cleanup - Fix potential overflows in evergreen CS offset tracking msm: - DPU: - implement DP/PHY mapping on SC8180X - Enable writeback on SM8150, SC8180X, SM6125, SM6350 - DP: - Enable widebus on all relevant chipsets - MSM8998 HDMI support - GPU: - A642L speedbin support - A615/A306/A621 support - A7xx devcoredump support ast: - astdp: Support AST2600 with VGA - Clean up HPD - Fix timeout loop for DP link training - reorganize output code by type (VGA, DP, etc) - convert to struct drm_edid - fix BMC handling for all outputs exynos: - drop stale MAINTAINERS pattern - constify struct loongson: - use GEM refcount over TTM mgag200: - Improve BMC handling - Support VBLANK intterupts - transparently support BMC outputs nouveau: - Refactor and clean up internals - Use GEM refcount over TTM's gm12u320: - convert to struct drm_edid gma500: - update i2c terms lcdif: - pixel clock fix host1x: - fix syncpoint IRQ during resume - use iommu_paging_domain_alloc() imx: - ipuv3: convert to struct drm_edid omapdrm: - improve error handling - use common helper for_each_endpoint_of_node() panel: - add support for BOE TV101WUM-LL2 plus DT bindings - novatek-nt35950: improve error handling - nv3051d: improve error handling - panel-edp: - add support for BOE NE140WUM-N6G - revert support for SDC ATNA45AF01 - visionox-vtdr6130: - improve error handling - use devm_regulator_bulk_get_const() - boe-th101mb31ig002: - Support for starry-er88577 MIPI-DSI panel plus DT - Fix porch parameter - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2, CMN N116BCP-EA2, CSW MNB601LS1-4 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT - jd9365da: - Support Melfas lmfbx101117480 MIPI-DSI panel plus DT - Refactor for code sharing - panel-edp: fix name for HKC MB116AN01 - jd9365da: fix "exit sleep" commands - jdi-fhd-r63452: simplify error handling with DSI multi-style helpers - mantix-mlaf057we51: simplify error handling with DSI multi-style helpers - simple: - support Innolux G070ACE-LH3 plus DT bindings - support On Tat Industrial Company KD50G21-40NT-A1 plus DT bindings - st7701: - decouple DSI and DRM code - add SPI support - support Anbernic RG28XX plus DT bindings mediatek: - support alpha blending - remove cl in struct cmdq_pkt - ovl adaptor fix - add power domain binding for mediatek DPI controller renesas: - rz-du: add support for RZ/G2UL plus DT bindings rockchip: - Improve DP sink-capability reporting - dw_hdmi: Support 4k@60Hz - vop: - Support RGB display on Rockchip RK3066 - Support 4096px width sti: - convert to struct drm_edid stm: - Avoid UAF wih managed plane and CRTC helpers - Fix module owner - Fix error handling in probe - Depend on COMMON_CLK - ltdc: - Fix transparency after disabling plane - Remove unused interrupt tegra: - gr3d: improve PM domain handling - convert to struct drm_edid - Call drm_atomic_helper_shutdown() vc4: - fix PM during detect - replace DRM_ERROR() with drm_error() - v3d: simplify clock retrieval v3d: - Clean up perfmon virtio: - add DRM capset" * tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel: (1326 commits) drm/xe: Fix missing conversion to xe_display_pm_runtime_resume drm/xe/xe2hpg: Add Wa_15016589081 drm/xe: Don't keep stale pointer to bo->ggtt_node drm/xe: fix missing 'xe_vm_put' drm/xe: fix build warning with CONFIG_PM=n drm/xe: Suppress missing outer rpm protection warning drm/xe: prevent potential UAF in pf_provision_vf_ggtt() drm/amd/display: Add all planes on CRTC to state for overlay cursor drm/i915/bios: fix printk format width drm/i915/display: Fix BMG CCS modifiers drm/amdgpu: get rid of bogus includes of fdtable.h drm/amdkfd: CRIU fixes drm/amdgpu: fix a race in kfd_mem_export_dmabuf() drm: new helper: drm_gem_prime_handle_to_dmabuf() drm/amdgpu/atomfirmware: Silence UBSAN warning drm/amdgpu: Fix kdoc entry in 'amdgpu_vm_cpu_prepare' drm/amd/amdgpu: apply command submission parser for JPEG v1 drm/amd/amdgpu: apply command submission parser for JPEG v2+ drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3 drm/amd/pm: update the features set on smu v14.0.2/3 ...	2024-09-19 10:18:15 +02:00
Dave Airlie	32bd3eb5fb	Merge tag 'drm-intel-gt-next-2024-09-06' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Driver Changes: - Expose fan speed via hwmon (Raag) - Correction to Wa_14019159160 on ARL (John H) - Whitelist COMMON_SLICE_CHICKEN1 for UMD access on DG2/MTL/ARL (Dnyaneshwar) - Do not attempt to load the GSC multiple times to avoid hanging GSC HW (Daniele) - Populate /sys/class/drm/cardX/engines/ even if one engine fails (Andi) - Use kmemdup_array instead of kmemdup for multiple allocation (Yu) - Remove extra unlikely() (Hongbo) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Ztrfr_Wuurfa-3Rv@jlahtine-mobl.ger.corp.intel.com	2024-09-11 09:11:54 +10:00
Nikita Zhandarovich	d3d37f7468	drm/i915/guc: prevent a possible int overflow in wq offsets It may be possible for the sum of the values derived from i915_ggtt_offset() and __get_parent_scratch_offset()/ i915_ggtt_offset() to go over the u32 limit before being assigned to wq offsets of u64 type. Mitigate these issues by expanding one of the right operands to u64 to avoid any overflow issues just in case. Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE. Fixes: `c2aa552ff0` ("drm/i915/guc: Add multi-lrc context registration") Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Link: https://patchwork.freedesktop.org/patch/msgid/20240725155925.14707-1-n.zhandarovich@fintech.ru Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `1f1c1bd566`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-09-10 08:13:51 +01:00
Daniele Ceraolo Spurio	59d3cfdd7f	drm/i915: Do not attempt to load the GSC multiple times If the GSC FW fails to load the GSC HW hangs permanently; the only ways to recover it are FLR or D3cold entry, with the former only being supported on driver unload and the latter only on DGFX, for which we don't need to load the GSC. Therefore, if GSC fails to load there is no need to try again because the HW is stuck in the error state and the submission to load the FW would just hang the GSCCS. Note that, due to wa_14015076503, on MTL the GuC escalates all GSCCS hangs to full GT resets, which would trigger a new attempt to load the GSC FW in the post-reset HW re-init; this issue is also fixed by not attempting to load the GSC FW after an error. Fixes: `15bd4a67e9` ("drm/i915/gsc: GSC firmware loading") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.3+ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820215952.2290807-1-daniele.ceraolospurio@intel.com (cherry picked from commit `03ded4d432`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2024-09-02 13:25:13 +03:00
Rodrigo Vivi	04cf420bbc	Merge drm/drm-next into drm-intel-next Need to take some Xe bo definition in here before we can add the BMG display 64k aligned size restrictions. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-27 17:06:28 -04:00
Daniele Ceraolo Spurio	03ded4d432	drm/i915: Do not attempt to load the GSC multiple times If the GSC FW fails to load the GSC HW hangs permanently; the only ways to recover it are FLR or D3cold entry, with the former only being supported on driver unload and the latter only on DGFX, for which we don't need to load the GSC. Therefore, if GSC fails to load there is no need to try again because the HW is stuck in the error state and the submission to load the FW would just hang the GSCCS. Note that, due to wa_14015076503, on MTL the GuC escalates all GSCCS hangs to full GT resets, which would trigger a new attempt to load the GSC FW in the post-reset HW re-init; this issue is also fixed by not attempting to load the GSC FW after an error. Fixes: `15bd4a67e9` ("drm/i915/gsc: GSC firmware loading") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.3+ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820215952.2290807-1-daniele.ceraolospurio@intel.com	2024-08-27 08:08:09 -07:00
Daniel Vetter	3f53d7e442	Merge tag 'drm-intel-gt-next-2024-08-23' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next UAPI Changes: - Limit the number of relocations to INT_MAX (Tvrtko) Only impact should be synthetic tests. Driver Changes: - Fix for #11396: GPU Hang and rcs0 reset on Cherrytrail platform - Fix Virtual Memory mapping boundaries calculation (Andi) - Fix for #11255: Long hangs in buddy allocator with DG2/A380 without Resizable BAR since 6.9 (David) - Mark the GT as dead when mmio is unreliable (Chris, Andi) - Workaround additions / fixes for MTL, ARL and DG2 (John H, Nitin) - Enable partial memory mapping of GPU virtual memory (Andi, Chris) - Prevent NULL deref on intel_memory_regions_hw_probe (Jonathan, Dan) - Avoid UAF on intel_engines_release (Krzysztof) - Don't update PWR_CLK_STATE starting Gen12 (Umesh) - Code and dmesg cleanups (Andi, Jesus, Luca) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZshcfSqgfnl8Mh4P@jlahtine-mobl.ger.corp.intel.com	2024-08-27 12:55:17 +02:00
John Harrison	2955ae8186	drm/i915: ARL requires a newer GSC firmware ARL and MTL share a single GSC firmware blob. However, ARL requires a newer version of it. So add differentiate of the PCI ids for ARL from MTL and create ARL as a sub-platform of MTL. That way, all the existing workarounds and such still treat ARL as MTL exactly as before. However, now the GSC code can check for ARL and do an extra version check on the firmware before committing to it. Also, the version extraction code has various ways of failing but the return code was being ignore and so the firmware load would attempt to continue anyway. Fix that by propagating the return code to the next level out. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Fixes: `213c43676b` ("drm/i915/mtl: Remove the 'force_probe' requirement for Meteor Lake") Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240802031051.3816392-1-John.C.Harrison@Intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `67733d7a71`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2024-08-27 13:23:58 +03:00
John Harrison	54f90b0333	drm/i915/guc: Fix missing enable of Wa_14019159160 on ARL The previous update to enable the workaround on ARL only changed two out of three places where the w/a needs to be enabled. That meant the GuC side was operational but not the KMD side. And as the KMD side is the trigger, it meant the w/a was not actually active. So fix that. Fixes: `104bcfae57` ("drm/i915/arl: Enable Wa_14019159160 for ARL") Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809000646.1747507-1-John.C.Harrison@Intel.com	2024-08-26 15:24:49 -07:00
John Harrison	67733d7a71	drm/i915: ARL requires a newer GSC firmware ARL and MTL share a single GSC firmware blob. However, ARL requires a newer version of it. So add differentiate of the PCI ids for ARL from MTL and create ARL as a sub-platform of MTL. That way, all the existing workarounds and such still treat ARL as MTL exactly as before. However, now the GSC code can check for ARL and do an extra version check on the firmware before committing to it. Also, the version extraction code has various ways of failing but the return code was being ignore and so the firmware load would attempt to continue anyway. Fix that by propagating the return code to the next level out. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Fixes: `213c43676b` ("drm/i915/mtl: Remove the 'force_probe' requirement for Meteor Lake") Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240802031051.3816392-1-John.C.Harrison@Intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-26 10:21:11 -04:00
Jesus Narvaez	437ad4534a	drm/i915/guc: Change GEM_WARN_ON to guc_err to prevent taints in CI This warning was supposed to catch a harmless issue, but changing to guc_error should prevent kernel taints in CI runs. Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240808204943.911727-1-jesus.narvaez@intel.com	2024-08-16 11:04:53 -07:00
Andi Shyti	b7b930d104	drm/i915: Replace double blank with single blank after comma in gem/ and gt/ Do not use double blanks, ", " in function parameters where it's not required by any alignment purpose. Replase it with a single blank, ", ". Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240807130516.491053-2-andi.shyti@linux.intel.com	2024-08-08 11:40:41 +01:00
John Harrison	e4a0251d36	drm/i915/guc: Extend w/a 14019159160 There is a new part to an existing workaround, so enable that piece as well. v2: Extend even further. v3: Drop DG2 as there are CI failures still to resolve. Also re-order the parameters to a function to reduce excessive line wrapping. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240622004636.662081-3-John.C.Harrison@Intel.com	2024-07-18 12:51:56 -07:00
John Harrison	104bcfae57	drm/i915/arl: Enable Wa_14019159160 for ARL The context switch out workaround also applies to ARL. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240622004636.662081-2-John.C.Harrison@Intel.com	2024-07-18 12:51:55 -07:00
Daniel Vetter	bfc109361c	Merge tag 'drm-intel-gt-next-2024-07-04' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Driver Changes: Fixes/improvements/new stuff: - Downgrade stolen lmem setup warning [gem] (Jonathan Cavitt) - Evaluate GuC priority within locks [gt/uc] (Andi Shyti) - Fix potential UAF by revoke of fence registers [gt] (Janusz Krzysztofik) - Return NULL instead of '0' [gem] (Andi Shyti) - Use the correct format specifier for resource_size_t [gem] (Andi Shyti) - Suppress oom warning in favour of ENOMEM to userspace [gem] (Nirmoy Das) Miscellaneous: - Evaluate forcewake usage within locks [gt] (Andi Shyti) - Fix typo in comment [gt/uc] (Andi Shyti) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZoZP6mUSergfzFMh@linux	2024-07-05 12:14:59 +02:00
Dave Airlie	a78313bb20	Merge tag 'drm-intel-gt-next-2024-06-12' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next UAPI Changes: - Support replaying GPU hangs with captured context image (Tvrtko Ursulin) Driver Changes: Fixes/improvements/new stuff: - Automate CCS Mode setting during engine resets [gt] (Andi Shyti) - Revert "drm/i915: Remove extra multi-gt pm-references" (Janusz Krzysztofik) - Fix HAS_REGION() usage in intel_gt_probe_lmem() (Ville Syrjälä) - Disarm breadcrumbs if engines are already idle [gt] (Chris Wilson) - Shadow default engine context image in the context (Tvrtko Ursulin) - Support replaying GPU hangs with captured context image (Tvrtko Ursulin) - avoid FIELD_PREP warning [guc] (Arnd Bergmann) - Fix CCS id's calculation for CCS mode setting [gt] (Andi Shyti) - Increase FLR timeout from 3s to 9s (Andi Shyti) - Update workaround 14018575942 [mtl] (Angus Chen) Future platform enablement: - Enable w/a 16021333562 for DG2, MTL and ARL [guc] (John Harrison) Miscellaneous: - Pass the region ID rather than a bitmask to HAS_REGION() (Ville Syrjälä) - Remove counter productive REGION_* wrappers (Ville Syrjälä) - Fix typo [gem/i915_gem_ttm_move] (Deming Wang) - Delete the live_hearbeat_fast selftest [gt] (Krzysztof Niemiec) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/Zmmazub+U9ewH9ts@linux	2024-06-27 17:21:44 +10:00
Jani Nikula	d754ed2821	Merge drm/drm-next into drm-intel-next Sync to v6.10-rc3. Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-06-19 11:38:31 +03:00
Andi Shyti	ae45f07cad	drm/i915/gt/uc: Evaluate GuC priority within locks The ce->guc_state.lock was made to protect guc_prio, which indicates the GuC priority level. But at the begnning of the function we perform some sanity check of guc_prio outside its protected section. Move them within the locked region. Use this occasion to expand the if statement to make it clearer. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240613222402.551625-2-andi.shyti@linux.intel.com	2024-06-15 13:58:18 +02:00
Andi Shyti	45ebbbbeaa	drm/i915/gt/uc: Fix typo in comment Replace "dynmically" with "dynamically". Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240613222837.552277-1-andi.shyti@linux.intel.com	2024-06-15 12:38:59 +02:00
John Harrison	6ef078383a	drm/i915/guc: Enable w/a 16021333562 for DG2, MTL and ARL Enable another workaround that is implemented inside the GuC. v2: Use the correct Gen12 w/a id rather than the Xe version (review feedback from Matthew R) also extend to include ARL. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240528230515.479395-1-John.C.Harrison@Intel.com	2024-06-06 17:28:55 -07:00
Jani Nikula	1bb01bdab0	drm: move i915_component.h under include/drm/intel Clean up the top level include/drm directory by grouping all the Intel specific files under a common subdirectory. v2: Also change Documentation/gpu/i915.rst (Andi) Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Dave Airlie <airlied@gmail.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tomas Winkler <tomas.winkler@intel.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/a8c07233a8234858eb6711140482ef8db4c91cf4.1717075103.git.jani.nikula@intel.com	2024-05-31 16:11:00 +03:00
Jani Nikula	0706d57100	drm: move i915_gsc_proxy_mei_interface.h under include/drm/intel Clean up the top level include/drm directory by grouping all the Intel specific files under a common subdirectory. Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Dave Airlie <airlied@gmail.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tomas Winkler <tomas.winkler@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/461662d528c3f327c81b764b7c883cd4519d8729.1717075103.git.jani.nikula@intel.com	2024-05-31 16:10:56 +03:00
Arnd Bergmann	d4f36db623	drm/i915/guc: avoid FIELD_PREP warning With gcc-7 and earlier, there are lots of warnings like In file included from <command-line>:0:0: In function '__guc_context_policy_add_priority.isra.66', inlined from '__guc_context_set_prio.isra.67' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3292:3, inlined from 'guc_context_set_prio' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3320:2: include/linux/compiler_types.h:399:38: error: call to '__compiletime_assert_631' declared with attribute error: FIELD_PREP: mask is not constant _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ... drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2422:3: note: in expansion of macro 'FIELD_PREP' FIELD_PREP(GUC_KLV_0_KEY, GUC_CONTEXT_POLICIES_KLV_ID_##id) \| \ ^~~~~~~~~~ Make sure that GUC_KLV_0_KEY is an unsigned value to avoid the warning. Fixes: `77b6f79df6` ("drm/i915/guc: Update to GuC version 69.0.3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240430164809.482131-1-julia.filipchuk@intel.com (cherry picked from commit `364e039827`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-05-29 11:35:26 +03:00
Arnd Bergmann	364e039827	drm/i915/guc: avoid FIELD_PREP warning With gcc-7 and earlier, there are lots of warnings like In file included from <command-line>:0:0: In function '__guc_context_policy_add_priority.isra.66', inlined from '__guc_context_set_prio.isra.67' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3292:3, inlined from 'guc_context_set_prio' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3320:2: include/linux/compiler_types.h:399:38: error: call to '__compiletime_assert_631' declared with attribute error: FIELD_PREP: mask is not constant _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ... drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2422:3: note: in expansion of macro 'FIELD_PREP' FIELD_PREP(GUC_KLV_0_KEY, GUC_CONTEXT_POLICIES_KLV_ID_##id) \| \ ^~~~~~~~~~ Make sure that GUC_KLV_0_KEY is an unsigned value to avoid the warning. Fixes: `77b6f79df6` ("drm/i915/guc: Update to GuC version 69.0.3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240430164809.482131-1-julia.filipchuk@intel.com	2024-05-16 09:15:09 -07:00
Dave Airlie	68b89e23c2	Merge tag 'drm-intel-gt-next-2024-04-26' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-next UAPI Changes: - drm/i915/guc: Use context hints for GT frequency Allow user to provide a low latency context hint. When set, KMD sends a hint to GuC which results in special handling for this context. SLPC will ramp the GT frequency aggressively every time it switches to this context. The down freq threshold will also be lower so GuC will ramp down the GT freq for this context more slowly. We also disable waitboost for this context as that will interfere with the strategy. We need to enable the use of SLPC Compute strategy during init, but it will apply only to contexts that set this bit during context creation. Userland can check whether this feature is supported using a new param- I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission enabled platforms as they use SLPC for frequency management. The Mesa usage model for this flag is here - https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint - drm/i915/gt: Enable only one CCS for compute workload Enable only one CCS engine by default with all the compute sices allocated to it. While generating the list of UABI engines to be exposed to the user, exclude any additional CCS engines beyond the first instance *** NOTE: This W/A will make all DG2 SKUs appear like single CCS SKUs by default to mitigate a hardware bug. All the EUs will still remain usable, and all the userspace drivers have been confirmed to be able to dynamically detect the change in number of CCS engines and adjust. For the smaller percent of applications that get perf benefit from letting the userspace driver dispatch across all 4 CCS engines we will be introducing a sysfs control as a later patch to choose 4 CCS each with 25% EUs (or 50% if 2 CCS). NOTE: A regression has been reported at https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895 However Andi has been triaging the issue and we're closing in a fix to the gap in the W/A implementation: https://lists.freedesktop.org/archives/intel-gfx/2024-April/348747.html Driver Changes: - Add new and fix to existing workarounds: Wa_14018575942 (MTL), Wa_16019325821 (Gen12.70), Wa_14019159160 (MTL), Wa_16015675438, Wa_14020495402 (Gen12.70) (Tejas, John, Lucas) - Fix UAF on destroy against retire race and remove two earlier partial fixes (Janusz) - Limit the reserved VM space to only the platforms that need it (Andi) - Reset queue_priority_hint on parking for execlist platforms (Chris) - Fix gt reset with GuC submission is disabled (Nirmoy) - Correct capture of EIR register on hang (John) - Remove usage of the deprecated ida_simple_xx() API - Refactor confusing __intel_gt_reset() (Nirmoy) - Fix the fix for GuC reset lock confusion (John) - Simplify/extend platform check for Wa_14018913170 (John) - Replace dev_priv with i915 (Andi) - Add and use gt_to_guc() wrapper (Andi) - Remove bogus null check (Rodrigo, Dan) . Selftest improvements (Janusz, Nirmoy, Daniele) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZitVBTvZmityDi7D@jlahtine-mobl.ger.corp.intel.com	2024-04-30 14:40:43 +10:00
Dave Airlie	0208ca55aa	Backmerge tag 'v6.9-rc5' into drm-next Linux 6.9-rc5 I've had a persistent msm failure on clang, and the fix is in fixes so just pull it back to fix that. Signed-off-by: Dave Airlie <airlied@redhat.com>	2024-04-22 14:35:52 +10:00
Christophe JAILLET	2af231e1b8	drm/i915/guc: Remove usage of the deprecated ida_simple_xx() API ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). Note that the upper limit of ida_simple_get() is exclusive, but the one of ida_alloc_range() is inclusive. So a -1 has been added when needed. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/7108c1871c6cb08d403c4fa6534bc7e6de4cb23d.1705245316.git.christophe.jaillet@wanadoo.fr Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-14 15:09:18 -04:00
John Harrison	152191e5e9	drm/i915/guc: Fix the fix for reset lock confusion The previous fix for the circlular lock splat about the busyness worker wasn't quite complete. Even though the reset-in-progress flag is cleared at the start of intel_uc_reset_finish, the entire function is still inside the reset mutex lock. Not sure why the patch appeared to fix the issue both locally and in CI. However, it is now back again. There is a further complication that the wedge code path within intel_gt_reset() jumps around so much that it results in nested reset_prepare/_finish calls. That is, the call sequence is: intel_gt_reset \| reset_prepare \| __intel_gt_set_wedged \| \| reset_prepare \| \| reset_finish \| reset_finish The nested finish means that even if the clear of the in-progress flag was moved to the end of _finish, it would still be clear for the entire second call. Surprisingly, this does not seem to be causing any other problems at present. As an aside, a wedge on fini does not call the finish functions at all. The reset_in_progress flag is left set (twice). So instead of trying to cancel the worker anywhere at all in the reset path, just add a cancel to intel_guc_submission_fini instead. Note that it is not a problem if the worker is still active during a reset. Either it will run before the reset path starts locking things and will simply block the reset code for a tiny amount of time. Or it will run after the locks have been acquired and will early exit due to the try-lock. Also, do not use the reset-in-progress flag to decide whether a synchronous cancel is safe (from a lockdep perspective) or not. Instead, use the actual reset mutex state (both the genuine one and the custom rolled BACKOFF one). Fixes: `0e00a8814e` ("drm/i915/guc: Avoid circular locking issue on busyness flush") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Cc: Zhanjun Dong <zhanjun.dong@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240329235306.1559639-1-John.C.Harrison@Intel.com (cherry picked from commit `3563d85531`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-08 13:09:37 -04:00
John Harrison	3563d85531	drm/i915/guc: Fix the fix for reset lock confusion The previous fix for the circlular lock splat about the busyness worker wasn't quite complete. Even though the reset-in-progress flag is cleared at the start of intel_uc_reset_finish, the entire function is still inside the reset mutex lock. Not sure why the patch appeared to fix the issue both locally and in CI. However, it is now back again. There is a further complication that the wedge code path within intel_gt_reset() jumps around so much that it results in nested reset_prepare/_finish calls. That is, the call sequence is: intel_gt_reset \| reset_prepare \| __intel_gt_set_wedged \| \| reset_prepare \| \| reset_finish \| reset_finish The nested finish means that even if the clear of the in-progress flag was moved to the end of _finish, it would still be clear for the entire second call. Surprisingly, this does not seem to be causing any other problems at present. As an aside, a wedge on fini does not call the finish functions at all. The reset_in_progress flag is left set (twice). So instead of trying to cancel the worker anywhere at all in the reset path, just add a cancel to intel_guc_submission_fini instead. Note that it is not a problem if the worker is still active during a reset. Either it will run before the reset path starts locking things and will simply block the reset code for a tiny amount of time. Or it will run after the locks have been acquired and will early exit due to the try-lock. Also, do not use the reset-in-progress flag to decide whether a synchronous cancel is safe (from a lockdep perspective) or not. Instead, use the actual reset mutex state (both the genuine one and the custom rolled BACKOFF one). Fixes: `0e00a8814e` ("drm/i915/guc: Avoid circular locking issue on busyness flush") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Cc: Zhanjun Dong <zhanjun.dong@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240329235306.1559639-1-John.C.Harrison@Intel.com	2024-04-05 09:31:37 -07:00
Rodrigo Vivi	7406538860	drm/i915/guc: Remove bogus null check This null check is bogus because we are already using 'ce' stuff in many places before this function is called. Having this here is useless and confuses static analyzer tools that can see: struct intel_engine_cs *engine = ce->engine; before this check, in the same function. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202403101225.7AheJhZJ-lkp@intel.com/ Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328213107.90632-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-03 14:44:22 -04:00
Lucas De Marchi	1bfc03b137	drm/i915: Remove special handling for !RCS_MASK() With both XEHPSDV and PVC removed (as platforms, most of their code remain used by others), there's no need to handle !RCS_MASK() as other platforms don't ever have fused-off render. Remove those code paths and the special WA flag when initializing GuC. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://patchwork.freedesktop.org/patch/msgid/20240320060543.4034215-7-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-22 14:15:00 -07:00
Lucas De Marchi	326e30e462	drm/i915: Drop dead code for pvc PCI IDs for PVC were never added and platform always marked with force_probe. Drop what's not used and rename some places as needed. The registers not used anymore are also removed. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://patchwork.freedesktop.org/patch/msgid/20240320060543.4034215-6-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-22 14:14:56 -07:00
Lucas De Marchi	48ba4a6dc3	drm/i915: Update IP_VER(12, 50) With no platform using graphics/media IP_VER(12, 50), replace the checks throughout the code with IP_VER(12, 55) so the code makes sense by itself with no additional explanation of previous baggage. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://patchwork.freedesktop.org/patch/msgid/20240320060543.4034215-5-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-22 14:14:52 -07:00
Lucas De Marchi	cb4046d289	drm/i915: Drop dead code for xehpsdv PCI IDs for XEHPSDV were never added and platform always marked with force_probe. Drop what's not used and rename some places to either be xehp or dg2, depending on the platform/IP checks. The registers not used anymore are also removed. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://patchwork.freedesktop.org/patch/msgid/20240320060543.4034215-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-22 14:14:39 -07:00
Lucas De Marchi	af7c4a648e	drm/i915: Drop WA 16015675438 With dynamic load-balancing disabled on the compute side, there's no reason left to enable WA 16015675438. Drop it from both PVC and DG2. Note that this can be done because now the driver always set a fixed partition of EUs during initialization via the ccs_mode configuration. The flag to GuC is still needed because of 18020744125, so update the comment accordingly. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Mateusz Jablonski <mateusz.jablonski@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306144723.1826977-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-03-11 08:32:35 -07:00
John Harrison	7ad6a8fae5	drm/i915/guc: Enable Wa_14019159160 Use the new w/a KLV support to enable a MTL w/a. Note, this w/a is a super-set of Wa_16019325821, so requires turning that one as well as setting the new flag for Wa_14019159160 itself. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223205632.1621019-4-John.C.Harrison@Intel.com	2024-03-07 15:26:47 -08:00
John Harrison	6cc7a5c7dc	drm/i915/guc: Add support for w/a KLVs To prevent running out of bits, new w/a enable flags are being added via a KLV system instead of a 32 bit flags word. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223205632.1621019-3-John.C.Harrison@Intel.com	2024-03-07 15:26:45 -08:00
John Harrison	f673d59e31	drm/i915: Enable Wa_16019325821 Some platforms require holding RCS context switches until CCS is idle (the reverse w/a of Wa_14014475959). Some platforms require both versions. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223205632.1621019-2-John.C.Harrison@Intel.com	2024-03-07 15:26:42 -08:00
Vinay Belgaumkar	cec82816d0	drm/i915/guc: Use context hints for GT frequency Allow user to provide a low latency context hint. When set, KMD sends a hint to GuC which results in special handling for this context. SLPC will ramp the GT frequency aggressively every time it switches to this context. The down freq threshold will also be lower so GuC will ramp down the GT freq for this context more slowly. We also disable waitboost for this context as that will interfere with the strategy. We need to enable the use of SLPC Compute strategy during init, but it will apply only to contexts that set this bit during context creation. Userland can check whether this feature is supported using a new param- I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission enabled platforms as they use SLPC for frequency management. The Mesa usage model for this flag is here - https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint v2: Rename flags as per review suggestions (Rodrigo, Tvrtko). Also, use flag bits in intel_context as it allows finer control for toggling per engine if needed (Tvrtko). v3: Minor review comments (Tvrtko) v4: Update comment (Sushma) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306012759.204938-1-vinay.belgaumkar@intel.com	2024-03-07 10:25:06 -08:00
John Harrison	4ae86a7f8d	drm/i915/guc: Simplify/extend platform check for Wa_14018913170 The above w/a is required for every platform that the i915 driver supports. It is fixed on the latest platforms but they are only supported by Xe instead of i915. So just remove the platform check completely and keep the code simple. v2: Add extra comment (review feedback from Rodrigo). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223202846.1532176-1-John.C.Harrison@Intel.com	2024-03-05 17:19:50 -08:00
John Harrison	e45afbeb59	drm/i915/guc: Correct capture of EIR register on hang The EIR register (0x20B0) was being included in the engine class list for render and compute as the absolute register address. However, it is actually a ring register available on all engines at an offset of (base) + 0xB0. As it was included as an RCS engine but with the absolute address, GuC was adding on another 0x2000 and coming out at an invalid location. Thus it would reject the register and complain about only managing a partial capture. So update the list to use the RING_EIR version of the register and include it for all engines. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223203204.1533410-1-John.C.Harrison@Intel.com	2024-03-04 15:35:22 -08:00
Andi Shyti	3f2f20da79	drm/i915/guc: Use the new gt_to_guc() wrapper Get the guc reference from the gt using the gt_to_guc() helper. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231229102734.674362-3-andi.shyti@linux.intel.com	2024-03-01 13:19:43 +01:00
Dave Airlie	ca7a1d0d18	Merge tag 'drm-intel-next-2024-02-27-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next drm/i915 feature pull #2 for v6.9: Features and functionality: - DP tunneling and bandwidth allocation support (Imre) - Add more ADL-N PCI IDs (Gustavo) - Enable fastboot also on older platforms (Ville) - Bigjoiner force enable debugfs option for testing (Stan) Refactoring and cleanups: - Remove unused structs and struct members (Jiri Slaby) - Use per-device debug logging (Ville) - State check improvements (Ville) - Hardcoded cd2x divider cleanups (Ville) - CDCLK documentation updates (Ville, Rodrigo) Fixes: - HDCP MST Type1 fixes (Suraj) - Fix MTL C20 PHY PLL values (Ravi) - More hardware access prevention during init (Imre) - Always enable decompression with tile4 on Xe2 (Juha-Pekka) - Improve LNL package C residency (Suraj) drm core changes: - DP tunneling and bandwidth allocation helpers (Imre) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87sf1devbj.fsf@intel.com	2024-02-28 11:02:55 +10:00
Jiri Slaby (SUSE)	394a1376d8	drm/i915: remove intel_guc::ads_engine_usage_size intel_guc::ads_engine_usage_size was never used since its addition in commit `77cdd054dd` (drm/i915/pmu: Connect engine busyness stats from GuC to pmu). Drop it. Found by https://github.com/jirislaby/clang-struct. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240216065326.6910-9-jirislaby@kernel.org	2024-02-19 15:35:17 -05:00
Randy Dunlap	af3cfcad49	drm/i915/guc: reconcile Excess struct member kernel-doc warnings Document nested struct members with full names as described in Documentation/doc-guide/kernel-doc.rst. intel_guc.h:305: warning: Excess struct member 'lock' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'guc_ids' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'num_guc_ids' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'guc_ids_bitmap' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'guc_id_list' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'guc_ids_in_use' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'destroyed_contexts' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'destroyed_worker' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'reset_fail_worker' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'reset_fail_mask' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'sched_disable_delay_ms' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'sched_disable_gucid_threshold' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'lock' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'gt_stamp' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'ping_delay' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'work' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'shift' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'last_stat_jiffies' description in 'intel_guc' 18 warnings as Errors Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231226195432.10891-3-rdunlap@infradead.org (cherry picked from commit `e4cf1a70fa`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2024-01-10 11:56:38 +02:00
John Harrison	0e00a8814e	drm/i915/guc: Avoid circular locking issue on busyness flush Avoid the following lockdep complaint: <4> [298.856498] ====================================================== <4> [298.856500] WARNING: possible circular locking dependency detected <4> [298.856503] 6.7.0-rc5-CI_DRM_14017-g58ac4ffc75b6+ #1 Tainted: G N <4> [298.856505] ------------------------------------------------------ <4> [298.856507] kworker/4:1H/190 is trying to acquire lock: <4> [298.856509] ffff8881103e9978 (&gt->reset.backoff_srcu){++++}-{0:0}, at: _intel_gt_reset_lock+0x35/0x380 [i915] <4> [298.856661] but task is already holding lock: <4> [298.856663] ffffc900013f7e58 ((work_completion)(&(&guc->timestamp.work)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x264/0x530 <4> [298.856671] which lock already depends on the new lock. The complaint is not actually valid. The busyness worker thread does indeed hold the worker lock and then attempt to acquire the reset lock (which may have happened in reverse order elsewhere). However, it does so with a trylock that exits if the reset lock is not available (specifically to prevent this and other similar deadlocks). Unfortunately, lockdep does not understand the trylock semantics (the lock is an i915 specific custom implementation for resets). Not doing a synchronous flush of the worker thread when a reset is in progress resolves the lockdep splat by never even attempting to grab the lock in this particular scenario. There are situatons where a synchronous cancel is required, however. So, always do the synchronous cancel if not in reset. And add an extra synchronous cancel to the end of the reset flow to account for when a reset is occurring at driver shutdown and the cancel is no longer synchronous but could lead to unallocated memory accesses if the worker is not stopped. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20231219195957.212600-1-John.C.Harrison@Intel.com	2024-01-09 10:14:09 -08:00
Alan Previn	2f2cc53b5f	drm/i915/guc: Close deregister-context race against CT-loss If we are at the end of suspend or very early in resume its possible an async fence signal (via rcu_call) is triggered to free_engines which could lead us to the execution of the context destruction worker (after a prior worker flush). Thus, when suspending, insert rcu_barriers at the start of i915_gem_suspend (part of driver's suspend prepare) and again in i915_gem_suspend_late so that all such cases have completed and context destruction list isn't missing anything. In destroyed_worker_func, close the race against CT-loss by checking that CT is enabled before calling into deregister_destroyed_contexts. Based on testing, guc_lrc_desc_unpin may still race and fail as we traverse the GuC's context-destroy list because the CT could be disabled right before calling GuC's CT send function. We've witnessed this race condition once every ~6000-8000 suspend-resume cycles while ensuring workloads that render something onscreen is continuously started just before we suspend (and the workload is small enough to complete and trigger the queued engine/context free-up either very late in suspend or very early in resume). In such a case, we need to unroll the entire process because guc-lrc-unpin takes a gt wakeref which only gets released in the G2H IRQ reply that never comes through in this corner case. Without the unroll, the taken wakeref is leaked and will cascade into a kernel hang later at the tail end of suspend in this function: intel_wakeref_wait_for_idle(&gt->wakeref) (called by) - intel_gt_pm_wait_for_idle (called by) - wait_for_suspend Thus, do an unroll in guc_lrc_desc_unpin and deregister_destroyed_- contexts if guc_lrc_desc_unpin fails due to CT send falure. When unrolling, keep the context in the GuC's destroy-list so it can get picked up on the next destroy worker invocation (if suspend aborted) or get fully purged as part of a GuC sanitization (end of suspend) or a reset flow. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Tested-by: Mousumi Jana <mousumi.jana@intel.com> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231229215143.581619-1-alan.previn.teres.alexis@intel.com	2024-01-09 09:33:08 -08:00
Alan Previn	5e83c060e9	drm/i915/guc: Flush context destruction worker at suspend When suspending, flush the context-guc-id deregistration worker at the final stages of intel_gt_suspend_late when we finally call gt_sanitize that eventually leads down to __uc_sanitize so that the deregistration worker doesn't fire off later as we reset the GuC microcontroller. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Tested-by: Mousumi Jana <mousumi.jana@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231228045558.536585-2-alan.previn.teres.alexis@intel.com	2024-01-09 09:33:07 -08:00
John Harrison	a797099562	drm/i915/huc: Allow for very slow HuC loading A failure to load the HuC is occasionally observed where the cause is believed to be a low GT frequency leading to very long load times. So a) increase the timeout so that the user still gets a working system even in the case of slow load. And b) report the frequency during the load to see if that is the cause of the slow down. Also update the similar code on the GuC load to not use uncore->gt when there is a local gt available. The two should match, but no need for unnecessary de-referencing. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240102222202.310495-1-John.C.Harrison@Intel.com	2024-01-05 16:18:21 -08:00
Shuicheng Lin	835e4d9bb3	drm/i915/guc: Change wa and EU_PERF_CNTL registers to MCR type Some of the wa registers are MCR register, and EU_PERF_CNTL registers are MCR register. MCR register needs extra process for read/write. As normal MMIO register also could work with the MCR register process, change all wa registers to MCR type for code simplicity. Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240102010231.843778-1-shuicheng.lin@intel.com	2024-01-02 10:09:26 -08:00

1 2 3 4 5 ...

812 Commits