linux

mirror of https://github.com/torvalds/linux.git synced 2026-04-20 15:53:59 -04:00

Author	SHA1	Message	Date
Himal Prasad Ghimiray	9cca49021c	drm/xe/xe2: Updates on XY_CTRL_SURF_COPY_BLT - The XY_CTRL_SURF_COPY_BLT instruction operating on ccs data expects size in pages of main memory for which CCS data should be copied. - The bitfield representing copy size in XY_CTRL_SURF_COPY_BLT has shifted one bit higher in the instruction. v2: - Fix the num_pages for ccs size calculation. - Address nits (Thomas) v3: - Use FIELD_PREP and FIELD_FIT instead of shifts and numbers.(Matt) Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:46:15 -05:00
Himal Prasad Ghimiray	064686272b	drm/xe/xe2: Modify main memory to ccs memory ratio. On xe2 platforms each byte of CCS data now represents 512 bytes of main memory data. Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:46:09 -05:00
Tejas Upadhyay	0ac3d319cb	drm/xe/xe2: Add workaround 16020292621 Workaround applies to Graphics 20.04 as part of ring submission V4(MattR): - Rule for engine in oob WA not supported, add explicitly V3(MattR): - Pass hwe and rename API name to hint end of ring work - Use existing RING_NOPID API V2: - Marking this WA for 20.04 instead of 20.00 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:10 -05:00
Matt Roper	0134f130e7	drm/xe: Extract MI_* instructions to their own header Extracting the common MI_* instructions that can be used with any engine to their own header will make it easier as we add additional engine instructions in upcoming patches. Also, since the majority of GPU instructions (both MI and non-MI) have a "length" field in bits 7:0 of the instruction header, a common define is added for that. Instruction-specific length fields are still defined for special case instructions that have larger/smaller length fields. v2: - Use "instr" instead of "inst" as the short form of "instruction" everywhere. (Lucas) - Include xe_reg_defs.h instead of the i915 compat header. (Lucas) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-12-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:43:00 -05:00
Matt Roper	14a1e6a4a4	drm/xe: Clarify number of dwords/qwords stored by MI_STORE_DATA_IMM MI_STORE_DATA_IMM can store either dword values or qword values, and can store more than one value if the instruction's length field is large enough. Create explicit defines to specify the number of dwords/qwords to be stored, which will set the instruction length correctly and, if necessary, turn on the 'store qword' bit. While we're here, also replace an open-coded version of MI_STORE_DATA_IMM with the common macros. Bspec: 60246 Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-11-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:43:00 -05:00
Matt Roper	e12a64881e	drm/xe: Separate number of registers from MI_LRI opcode Keeping the number of registers to be loaded as a separate macro from the instruction opcode will simplify some upcoming LRC parsing code. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-10-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:43:00 -05:00
Matt Roper	de54bb81d9	drm/xe: Make MI_FLUSH_DW immediate size more explicit Despite its name, MI_FLUSH_DW instruction can write an immediate value of either dword size or qword size, depending on the 'length' field of the instruction. Since "length" excludes the first two dwords of the instruction, a value of 2 in the length field implies a dword write and a value of 3 implies a qword write. Even in cases where the flush instruction's post-sync operation is set to "no write" we're still expected to size the overall instruction as if we were doing a dword or qword write (i.e., a length of 1 shouldn't be used on modern platforms). Rather than baking a size of "1" into the #define and then adding another unexplained "+ 1" at all the spots where the definition gets used, lets just create MI_FLUSH_IMM_DW and MI_FLUSH_IMM_QW definitions that should be OR'd into the instruction header to make it more explicit what behavior we're requesting. Bspec: 60229 Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-9-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:43:00 -05:00
Haridhar Kalvala	30603b5b0f	drm/xe/xe2: Update MOCS fields in blitter instructions Xe2 changes or adds bits for mocs in a few BLT instructions: XY_CTRL_SURF_COPY_BLT, XY_FAST_COLOR_BLT, XY_FAST_COPY_BLT, and MEM_SET. Modify the code to deal with the new location. Unlike Xe1, the MOCS field in those instructions is only the MOCS index and not the Structure_MEMORY_OBJECT_CONTROL_STATE anymore. The pxp bit is now explicitly documented separately. Bspec: 57567,57566,57565,57562 Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929213640.3189912-5-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:42:08 -05:00
Haridhar Kalvala	4bdd8c2ed9	drm/xe/xe2: Set tile y type in XY_FAST_COPY_BLT to Tile4 Set bits 30 and 31 of XY_FAST_COPY_BLT's dword1 for XeHP and above. Destination or source being Y-Major is selected on dword0 and there's nothing to set on dword1. According to the bspec for Xe2, "Behavior is undefined when programmed the value 0". Also for XeHP, the only value allowed in those bits is 0b11, not being possible to select "Legacy Tile-Y" anymore, only the newer Tile4. So, unconditionally set those bits for graphics IP 12.50 and above. v2: Reword commit message and extend it to graphics version >= 12.50 (Matt Roper) Bspec: 57567 Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929213640.3189912-4-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:42:04 -05:00
Haridhar Kalvala	c690f0e6b7	drm/xe: Rename MEM_SET instruction PVC_MS_* doesn't reflect the real name of the instruction. Rename it to follow the name used in the bspec. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929213640.3189912-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:42:04 -05:00
Haridhar Kalvala	2c0ac321d9	drm/xe: Adjust mocs field mask definitions Instead of using xe_mocs_index_to_value(), simply define the bitmask with the shift left applied. This will make it easier to adapt to new platforms that simply use the index. This also fixes PVC bug in emit_clear_link_copy() where the MOCS was getting shifted both by PVC_MS_MOCS_INDEX_MASK definition and by the xe_mocs_index_to_value() function. Bspec: 44509 Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230929213640.3189912-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:42:03 -05:00
Thomas Hellström	9f8f93bee3	drm/xe: Emit a render cache flush after each rcs/ccs batch We need to flush render caches before fence signalling, where we might release the memory for reuse. We can't rely on userspace doing this, so flush render caches after the batch, but before user fence- and dma_fence signalling. Copy the cache flush from i915, but omit PIPE_CONTROL_FLUSH_L3, since it should be implied by the other flushes. Also omit PIPE_CONTROL_TLB_INVALIDATE since there should be no apparent need to invalidate TLB after batch completion. v2: - Update Makefile for OOB WA. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Tested-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> #1 Reported-by: José Roberto de Souza <jose.souza@intel.com> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-19 18:35:21 -05:00
Thomas Hellström	85dbfe47d0	drm/xe: Invalidate TLB also on bind if in scratch page mode For scratch table mode we need to cover the case where a scratch PTE might have been pre-fetched and cached and used instead of that of the newly bound vma. For compute vms, invalidate TLB globally using GuC before signalling bind complete. For !long-running vms, invalidate TLB at batch start. Also document how TLB invalidation works. v2: - Fix a pointer to the comment about TLB invalidation (Jose Souza). - Add a bool to the vm whether we want to invalidate TLB at batch start. - Invalidate TLB also on BCS- and video engines at batch start where needed. - Use BIT() macro instead of explicit shift. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Tested-by: José Roberto de Souza <jose.souza@intel.com> #v1 Reported-by: José Roberto de Souza <jose.souza@intel.com> #v1 Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291 Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-19 18:35:20 -05:00
Lucas De Marchi	d9b79ad275	drm/xe: Drop gen afixes from registers The defines for the registers were brought over from i915 while bootstrapping the driver. As xe supports TGL and later only, it doesn't make sense to keep the GEN* prefixes and suffixes in the registers: TGL is graphics version 12, previously called "GEN12". So drop the prefix everywhere. v2: - Also drop _TGL suffix and reword commit message as suggested by Matt Roper. While at it, rename VSUNIT_CLKGATE_DIS_TGL to VSUNIT_CLKGATE2_DIS with the additional "2", so it doesn't clash with the define for the other register Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230427223256.1432787-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-19 18:32:15 -05:00
Lucas De Marchi	56492dacee	drm/xe: Rename instruction field to avoid confusion There was both BLT_DEPTH_32 and XY_FAST_COLOR_BLT_DEPTH_32 - also add the prefix to the first to make it clear this is about the FAST_COPY operation. While at it, remove the GEN9_ prefix. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-19 18:31:46 -05:00
Matthew Brost	4f1411e2da	drm/xe: Reinstate render / compute cache invalidation in ring ops Render / compute engines have additional caches (not just TLBs) that need to be invalidated each batch, reinstate these invalidations in ring ops. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Suggested-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-19 18:30:21 -05:00
Balasubramani Vivekanandan	11a2407ed5	drm/xe: Stop accepting value in xe_migrate_clear Although xe_migrate_clear() has a value argument, currently the driver is only passing 0 at all the places this function is invoked with the exception the kunit tests are using the parameter to validate this function with different values. xe_migrate_clear() is failing on platforms with link copy engines because xe_migrate_clear() via emit_clear() is using the blitter instruction XY_FAST_COLOR_BLT to clear the memory. But this instruction is not supported by link copy engine. So the solution is to use the alternate instruction MEM_SET when platform contains link copy engine. But MEM_SET instruction accepts only 8-bit value for setting whereas the value argument of xe_migrate_clear() is 32-bit. So instead of spreading this limitation around all invocations of xe_migrate_clear() and causing more confusion, it was decided to not accept any value itself as driver does not really need this currently. All the kunit tests are adapted as per the new function prototype. This will be followed by a patch to add support for link copy engines. Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-19 18:30:20 -05:00
Lucas De Marchi	63955b3bfa	drm/xe: Remove dependency on intel_gpu_commands.h Copy the macros used by xe in intel_gpu_commands.h to regs/xe_gpu_commands.h. PIPE_CONTROL_3D_ENGINE_FLAGS and PIPE_CONTROL_3D_ARCH_FLAGS were already defined in drivers/gpu/drm/xe/xe_ring_ops.c and only used there. So let that define to be used instead of also adding to the new header. v2: Let PIPE_CONTROL_3D_ENGINE_FLAGS/PIPE_CONTROL_3D_ARCH_FLAGS in the only .c that uses it instead of redefining (Matt Roper) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-19 18:29:21 -05:00

18 Commits
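
Note on the CCS sizing described in commits 064686272b and 9cca49021c: the sketch below is a userspace illustration only, not the driver's code. It takes the 512:1 ratio from the commit message and assumes a 4 KiB page size; the macro and function names are hypothetical and do not come from the xe sources.

```c
#include <assert.h>
#include <stdint.h>

/* From the commit message: on Xe2, one CCS byte covers 512 bytes of main memory.
 * The macro name is hypothetical. */
#define XE2_MAIN_BYTES_PER_CCS_BYTE 512u

/* Assumption: 4 KiB pages. The commit log does not state the page size. */
#define ASSUMED_PAGE_SIZE 4096u

/* Round-up division, the same idea as the kernel's DIV_ROUND_UP(). */
static uint64_t div_round_up_u64(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* CCS bytes needed to describe main_bytes of main memory. */
static uint64_t ccs_bytes_for(uint64_t main_bytes)
{
	return div_round_up_u64(main_bytes, XE2_MAIN_BYTES_PER_CCS_BYTE);
}

/* Main-memory pages covered, i.e. the unit in which the commit says
 * XY_CTRL_SURF_COPY_BLT expects its size on Xe2. */
static uint64_t main_pages_for(uint64_t main_bytes)
{
	return div_round_up_u64(main_bytes, ASSUMED_PAGE_SIZE);
}
```

Under these assumptions, one 4 KiB page of main memory corresponds to 8 bytes of CCS data, and a copy covering it would be sized as 1 page rather than as a CCS byte count.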