Dave Airlie
b7cc4ff787
nouveau: lock the client object tree.
...
It appears the client object tree has no locking unless I've missed
something else. Fix races around adding/removing client objects,
mostly vram bar mappings.
4562.099306] general protection fault, probably for non-canonical address 0x6677ed422bceb80c: 0000 [#1 ] PREEMPT SMP PTI
[ 4562.099314] CPU: 2 PID: 23171 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
[ 4562.099324] Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
[ 4562.099330] RIP: 0010:nvkm_object_search+0x1d/0x70 [nouveau]
[ 4562.099503] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 48 89 f8 48 85 f6 74 39 48 8b 87 a0 00 00 00 48 85 c0 74 12 <48> 8b 48 f8 48 39 ce 73 15 48 8b 40 10 48 85 c0 75 ee 48 c7 c0 fe
[ 4562.099506] RSP: 0000:ffffa94cc420bbf8 EFLAGS: 00010206
[ 4562.099512] RAX: 6677ed422bceb814 RBX: ffff98108791f400 RCX: ffff9810f26b8f58
[ 4562.099517] RDX: 0000000000000000 RSI: ffff9810f26b9158 RDI: ffff98108791f400
[ 4562.099519] RBP: ffff9810f26b9158 R08: 0000000000000000 R09: 0000000000000000
[ 4562.099521] R10: ffffa94cc420bc48 R11: 0000000000000001 R12: ffff9810f02a7cc0
[ 4562.099526] R13: 0000000000000000 R14: 00000000000000ff R15: 0000000000000007
[ 4562.099528] FS: 00007f629c5017c0(0000) GS:ffff98142c700000(0000) knlGS:0000000000000000
[ 4562.099534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4562.099536] CR2: 00007f629a882000 CR3: 000000017019e004 CR4: 00000000003706f0
[ 4562.099541] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4562.099542] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4562.099544] Call Trace:
[ 4562.099555] <TASK>
[ 4562.099573] ? die_addr+0x36/0x90
[ 4562.099583] ? exc_general_protection+0x246/0x4a0
[ 4562.099593] ? asm_exc_general_protection+0x26/0x30
[ 4562.099600] ? nvkm_object_search+0x1d/0x70 [nouveau]
[ 4562.099730] nvkm_ioctl+0xa1/0x250 [nouveau]
[ 4562.099861] nvif_object_map_handle+0xc8/0x180 [nouveau]
[ 4562.099986] nouveau_ttm_io_mem_reserve+0x122/0x270 [nouveau]
[ 4562.100156] ? dma_resv_test_signaled+0x26/0xb0
[ 4562.100163] ttm_bo_vm_fault_reserved+0x97/0x3c0 [ttm]
[ 4562.100182] ? __mutex_unlock_slowpath+0x2a/0x270
[ 4562.100189] nouveau_ttm_fault+0x69/0xb0 [nouveau]
[ 4562.100356] __do_fault+0x32/0x150
[ 4562.100362] do_fault+0x7c/0x560
[ 4562.100369] __handle_mm_fault+0x800/0xc10
[ 4562.100382] handle_mm_fault+0x17c/0x3e0
[ 4562.100388] do_user_addr_fault+0x208/0x860
[ 4562.100395] exc_page_fault+0x7f/0x200
[ 4562.100402] asm_exc_page_fault+0x26/0x30
[ 4562.100412] RIP: 0033:0x9b9870
[ 4562.100419] Code: 85 a8 f7 ff ff 8b 8d 80 f7 ff ff 89 08 e9 18 f2 ff ff 0f 1f 84 00 00 00 00 00 44 89 32 e9 90 fa ff ff 0f 1f 84 00 00 00 00 00 <44> 89 32 e9 f8 f1 ff ff 0f 1f 84 00 00 00 00 00 66 44 89 32 e9 e7
[ 4562.100422] RSP: 002b:00007fff9ba2dc70 EFLAGS: 00010246
[ 4562.100426] RAX: 0000000000000004 RBX: 000000000dd65e10 RCX: 000000fff0000000
[ 4562.100428] RDX: 00007f629a882000 RSI: 00007f629a882000 RDI: 0000000000000066
[ 4562.100432] RBP: 00007fff9ba2e570 R08: 0000000000000000 R09: 0000000123ddf000
[ 4562.100434] R10: 0000000000000001 R11: 0000000000000246 R12: 000000007fffffff
[ 4562.100436] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 4562.100446] </TASK>
[ 4562.100448] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink cmac bnep sunrpc iwlmvm intel_rapl_msr intel_rapl_common snd_sof_pci_intel_cnl x86_pkg_temp_thermal intel_powerclamp snd_sof_intel_hda_common mac80211 coretemp snd_soc_acpi_intel_match kvm_intel snd_soc_acpi snd_soc_hdac_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof kvm snd_sof_utils snd_soc_core snd_hda_codec_realtek libarc4 snd_hda_codec_generic snd_compress snd_hda_ext_core vfat fat snd_hda_intel snd_intel_dspcfg irqbypass iwlwifi snd_hda_codec snd_hwdep snd_hda_core btusb btrtl mei_hdcp iTCO_wdt rapl mei_pxp btintel snd_seq iTCO_vendor_support btbcm snd_seq_device intel_cstate bluetooth snd_pcm cfg80211 intel_wmi_thunderbolt wmi_bmof intel_uncore snd_timer mei_me snd ecdh_generic i2c_i801
[ 4562.100541] ecc mei i2c_smbus soundcore rfkill intel_pch_thermal acpi_pad zram nouveau drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_gpuvm drm_exec mxm_wmi drm_display_helper drm_kms_helper drm crct10dif_pclmul crc32_pclmul nvme e1000e crc32c_intel nvme_core ghash_clmulni_intel video wmi pinctrl_cannonlake ip6_tables ip_tables fuse
[ 4562.100616] ---[ end trace 0000000000000000 ]---
Signed-off-by: Dave Airlie <airlied@redhat.com >
Cc: stable@vger.kernel.org
2024-03-08 13:40:56 +10:00
Dave Airlie
a2e36cd560
nouveau: use an rwlock for the event lock.
...
This allows it to break the following circular locking dependency.
Aug 10 07:01:29 dg1test kernel: ======================================================
Aug 10 07:01:29 dg1test kernel: WARNING: possible circular locking dependency detected
Aug 10 07:01:29 dg1test kernel: 6.4.0-rc7+ #10 Not tainted
Aug 10 07:01:29 dg1test kernel: ------------------------------------------------------
Aug 10 07:01:29 dg1test kernel: wireplumber/2236 is trying to acquire lock:
Aug 10 07:01:29 dg1test kernel: ffff8fca5320da18 (&fctx->lock){-...}-{2:2}, at: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau]
Aug 10 07:01:29 dg1test kernel:
but task is already holding lock:
Aug 10 07:01:29 dg1test kernel: ffff8fca41208610 (&event->list_lock#2){-...}-{2:2}, at: nvkm_event_ntfy+0x50/0xf0 [nouveau]
Aug 10 07:01:29 dg1test kernel:
which lock already depends on the new lock.
Aug 10 07:01:29 dg1test kernel:
the existing dependency chain (in reverse order) is:
Aug 10 07:01:29 dg1test kernel:
-> #3 (&event->list_lock#2){-...}-{2:2}:
Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70
Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x50/0xf0 [nouveau]
Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau]
Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240
Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80
Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240
Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160
Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0
Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40
Aug 10 07:01:29 dg1test kernel:
-> #2 (&device->intr.lock){-...}-{2:2}:
Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70
Aug 10 07:01:29 dg1test kernel: nvkm_inth_allow+0x2c/0x80 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_state+0x181/0x250 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_allow+0x63/0xd0 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_uevent_mthd+0x4d/0x70 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_ioctl+0x10b/0x250 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvif_object_mthd+0xa8/0x1f0 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvif_event_allow+0x2a/0xa0 [nouveau]
Aug 10 07:01:29 dg1test kernel: nouveau_fence_enable_signaling+0x78/0x80 [nouveau]
Aug 10 07:01:29 dg1test kernel: __dma_fence_enable_signaling+0x5e/0x100
Aug 10 07:01:29 dg1test kernel: dma_fence_add_callback+0x4b/0xd0
Aug 10 07:01:29 dg1test kernel: nouveau_cli_work_queue+0xae/0x110 [nouveau]
Aug 10 07:01:29 dg1test kernel: nouveau_gem_object_close+0x1d1/0x2a0 [nouveau]
Aug 10 07:01:29 dg1test kernel: drm_gem_handle_delete+0x70/0xe0 [drm]
Aug 10 07:01:29 dg1test kernel: drm_ioctl_kernel+0xa5/0x150 [drm]
Aug 10 07:01:29 dg1test kernel: drm_ioctl+0x256/0x490 [drm]
Aug 10 07:01:29 dg1test kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau]
Aug 10 07:01:29 dg1test kernel: __x64_sys_ioctl+0x91/0xd0
Aug 10 07:01:29 dg1test kernel: do_syscall_64+0x3c/0x90
Aug 10 07:01:29 dg1test kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
Aug 10 07:01:29 dg1test kernel:
-> #1 (&event->refs_lock#4){....}-{2:2}:
Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70
Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_state+0x37/0x250 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_allow+0x63/0xd0 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_uevent_mthd+0x4d/0x70 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_ioctl+0x10b/0x250 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvif_object_mthd+0xa8/0x1f0 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvif_event_allow+0x2a/0xa0 [nouveau]
Aug 10 07:01:29 dg1test kernel: nouveau_fence_enable_signaling+0x78/0x80 [nouveau]
Aug 10 07:01:29 dg1test kernel: __dma_fence_enable_signaling+0x5e/0x100
Aug 10 07:01:29 dg1test kernel: dma_fence_add_callback+0x4b/0xd0
Aug 10 07:01:29 dg1test kernel: nouveau_cli_work_queue+0xae/0x110 [nouveau]
Aug 10 07:01:29 dg1test kernel: nouveau_gem_object_close+0x1d1/0x2a0 [nouveau]
Aug 10 07:01:29 dg1test kernel: drm_gem_handle_delete+0x70/0xe0 [drm]
Aug 10 07:01:29 dg1test kernel: drm_ioctl_kernel+0xa5/0x150 [drm]
Aug 10 07:01:29 dg1test kernel: drm_ioctl+0x256/0x490 [drm]
Aug 10 07:01:29 dg1test kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau]
Aug 10 07:01:29 dg1test kernel: __x64_sys_ioctl+0x91/0xd0
Aug 10 07:01:29 dg1test kernel: do_syscall_64+0x3c/0x90
Aug 10 07:01:29 dg1test kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
Aug 10 07:01:29 dg1test kernel:
-> #0 (&fctx->lock){-...}-{2:2}:
Aug 10 07:01:29 dg1test kernel: __lock_acquire+0x14e3/0x2240
Aug 10 07:01:29 dg1test kernel: lock_acquire+0xc8/0x2a0
Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70
Aug 10 07:01:29 dg1test kernel: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_client_event+0xf/0x20 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x9b/0xf0 [nouveau]
Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau]
Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240
Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80
Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240
Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160
Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0
Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40
Aug 10 07:01:29 dg1test kernel:
other info that might help us debug this:
Aug 10 07:01:29 dg1test kernel: Chain exists of:
&fctx->lock --> &device->intr.lock --> &event->list_lock#2
Aug 10 07:01:29 dg1test kernel: Possible unsafe locking scenario:
Aug 10 07:01:29 dg1test kernel: CPU0 CPU1
Aug 10 07:01:29 dg1test kernel: ---- ----
Aug 10 07:01:29 dg1test kernel: lock(&event->list_lock#2);
Aug 10 07:01:29 dg1test kernel: lock(&device->intr.lock);
Aug 10 07:01:29 dg1test kernel: lock(&event->list_lock#2);
Aug 10 07:01:29 dg1test kernel: lock(&fctx->lock);
Aug 10 07:01:29 dg1test kernel:
*** DEADLOCK ***
Aug 10 07:01:29 dg1test kernel: 2 locks held by wireplumber/2236:
Aug 10 07:01:29 dg1test kernel: #0 : ffff8fca53177bf8 (&device->intr.lock){-...}-{2:2}, at: nvkm_intr+0x29/0x240 [nouveau]
Aug 10 07:01:29 dg1test kernel: #1 : ffff8fca41208610 (&event->list_lock#2){-...}-{2:2}, at: nvkm_event_ntfy+0x50/0xf0 [nouveau]
Aug 10 07:01:29 dg1test kernel:
stack backtrace:
Aug 10 07:01:29 dg1test kernel: CPU: 6 PID: 2236 Comm: wireplumber Not tainted 6.4.0-rc7+ #10
Aug 10 07:01:29 dg1test kernel: Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
Aug 10 07:01:29 dg1test kernel: Call Trace:
Aug 10 07:01:29 dg1test kernel: <TASK>
Aug 10 07:01:29 dg1test kernel: dump_stack_lvl+0x5b/0x90
Aug 10 07:01:29 dg1test kernel: check_noncircular+0xe2/0x110
Aug 10 07:01:29 dg1test kernel: __lock_acquire+0x14e3/0x2240
Aug 10 07:01:29 dg1test kernel: lock_acquire+0xc8/0x2a0
Aug 10 07:01:29 dg1test kernel: ? nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau]
Aug 10 07:01:29 dg1test kernel: ? lock_acquire+0xc8/0x2a0
Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70
Aug 10 07:01:29 dg1test kernel: ? nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau]
Aug 10 07:01:29 dg1test kernel: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_client_event+0xf/0x20 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x9b/0xf0 [nouveau]
Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau]
Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau]
Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240
Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80
Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240
Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160
Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0
Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40
Aug 10 07:01:29 dg1test kernel: RIP: 0033:0x7fb66174d700
Aug 10 07:01:29 dg1test kernel: Code: c1 e2 05 29 ca 8d 0c 10 0f be 07 84 c0 75 eb 89 c8 c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa e9 d7 0f fc ff 0f 1f 80 00 00 00 00 <f3> 0f 1e fa e9 c7 0f fc>
Aug 10 07:01:29 dg1test kernel: RSP: 002b:00007ffdd3c48438 EFLAGS: 00000206
Aug 10 07:01:29 dg1test kernel: RAX: 000055bb758763c0 RBX: 000055bb758752c0 RCX: 00000000000028b0
Aug 10 07:01:29 dg1test kernel: RDX: 000055bb758752c0 RSI: 000055bb75887490 RDI: 000055bb75862950
Aug 10 07:01:29 dg1test kernel: RBP: 00007ffdd3c48490 R08: 000055bb75873b10 R09: 0000000000000001
Aug 10 07:01:29 dg1test kernel: R10: 0000000000000004 R11: 000055bb7587f000 R12: 000055bb75887490
Aug 10 07:01:29 dg1test kernel: R13: 000055bb757f6280 R14: 000055bb758875c0 R15: 000055bb757f6280
Aug 10 07:01:29 dg1test kernel: </TASK>
Signed-off-by: Dave Airlie <airlied@redhat.com >
Tested-by: Danilo Krummrich <dakr@redhat.com >
Reviewed-by: Danilo Krummrich <dakr@redhat.com >
Signed-off-by: Danilo Krummrich <dakr@redhat.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20231107053255.2257079-1-airlied@gmail.com
2023-11-14 22:40:32 +01:00
Ben Skeggs
17a74021a3
drm/nouveau/nvkm: support loading fws into sg_table
...
- preparation for GSP-RM, which has massive FW images
- based on a patch by Dave Airlie
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Signed-off-by: Dave Airlie <airlied@redhat.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-32-skeggsb@gmail.com
2023-10-31 15:08:15 +10:00
Ben Skeggs
12c9b05da9
drm/nouveau/imem: support allocations not preserved across suspend
...
Will initially be used to tag some large grctx allocations which don't
need to be saved, to speedup suspend/resume.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
Acked-by: Danilo Krummrich <me@dakr.org >
Signed-off-by: Lyude Paul <lyude@redhat.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20230919220442.202488-3-lyude@redhat.com
2023-09-19 18:21:49 -04:00
Justin Stitt
00fb28fd16
drm/nouveau/core: refactor deprecated strncpy
...
`strncpy` is deprecated for use on NUL-terminated destination strings [1].
We should prefer more robust and less ambiguous string interfaces.
A suitable replacement is `strscpy` [2] due to the fact that it guarantees
NUL-termination on the destination buffer without unnecessarily NUL-padding.
There is likely no bug in the current implementation due to the safeguard:
| cname[sizeof(cname) - 1] = '\0';
... however we can provide simpler and easier to understand code using
the newer (and recommended) `strscpy` api.
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com >
Reviewed-by: Kees Cook <keescook@chromium.org >
Reviewed-by: Lyude Paul <lyude@redhat.com >
Signed-off-by: Lyude Paul <lyude@redhat.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20230914-strncpy-drivers-gpu-drm-nouveau-nvkm-core-firmware-c-v1-1-3aeae46c032f@google.com
2023-09-15 13:36:40 -04:00
Ben Skeggs
ba1efd8e33
drm/nouveau/nvkm: punt spurious irq messages to debug level
...
This can be completely normal in some situations (ie. non-stall intrs
when nothing is waiting on them).
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Karol Herbst <kherbst@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20230525003106.3853741-2-skeggsb@gmail.com
2023-07-06 17:22:33 +02:00
Ben Skeggs
83775e158a
drm/nouveau/nvkm: fini object children in reverse order
...
Turns out, we're currently tearing down the disp core channel *before*
the satellite channels (wndw, etc) during suspend.
This makes RM return NV_ERR_NOT_SUPPORTED on attempting to reallocate
the core channel on resume for some reason, but we probably shouldn't
be doing it on HW either.
Tear down children in the reverse of allocation order instead.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Karol Herbst <kherbst@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20230525003106.3853741-1-skeggsb@gmail.com
2023-07-06 17:22:32 +02:00
Ben Skeggs
9074109676
drm/nouveau/acr/gm20b: regression fixes
...
Missed some Tegra-specific quirks when reworking ACR to support Ampere.
Fixes: 2541626cfb ("drm/nouveau/acr: use common falcon HS FW code for ACR FWs")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt >
Tested-by: Nicolas Chauvet <kwizart@gmail.com >
Signed-off-by: Lyude Paul <lyude@redhat.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20230130223715.1831509-3-bskeggs@redhat.com
2023-01-30 18:50:09 -05:00
Ben Skeggs
1ed02c3f2d
drm/nouveau/engine: add HAL for engine-specific rc reset procedure
...
Will be used to improve gr reset on GF100 and newer.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:45:10 +10:00
Ben Skeggs
2541626cfb
drm/nouveau/acr: use common falcon HS FW code for ACR FWs
...
Adds context binding and support for FWs with a bootloader to the code
that was added to load VPR scrubber HS binaries, and ports ACR over to
using all of it.
- gv100 split from gp108 to handle FW exit status differences
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:58 +10:00
Ben Skeggs
0e44c21708
drm/nouveau/flcn: new code to load+boot simple HS FWs (VPR scrubber)
...
Adds the start of common interfaces to load and boot the HS binaries
provided by NVIDIA that enable the usage of GR.
ACR already handles most of this, but it's very much tied into ACR's
init process, and there's other code that could benefit from reusing
a lot of this stuff too (ie. VBIOS DEVINIT/PreOS, VPR scrubber).
The VPR scrubber code is fairly independent, and a good first target.
- adds better debug output to fw loading process, to ease bring-up/debug
v2:
- whitespace, 0->false
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:58 +10:00
Ben Skeggs
f48dd29361
drm/nouveau/fifo: add new engine context tracking
...
Channel groups have somewhat more complicated requirements than what we
currently support. An engine context is shared between all channels in
a channel group, VEID/subctx support (later) brings per-VEID components,
and we need to track an individual channel's engine context pointers.
This commit adds the structures and refcounting to support the above,
wrapping the prior implementation for the moment.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:48 +10:00
Ben Skeggs
6de125383a
drm/nouveau/fifo: expose runlist topology info on all chipsets
...
Previously only available from Kepler onwards.
- also fixes the info() queries causing fifo init()/fini() unnecessarily
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:47 +10:00
Ben Skeggs
8478cd5a74
drm/nouveau/nvkm: add locking to subdev/engine init paths
...
This wasn't really needed before; the main place this could race is with
channel recovery, but (through potentially fragile means) shouldn't have
been possible.
However, a number of upcoming patches benefit from having better control
over subdev init, necessitating some improvements here.
- allows subdev/engine oneinit() without init() (host/fifo patches)
- merges engine use locking/tracking into subdev, and extends it to fix
some issues that will arise with future usage patterns (acr patches)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:46 +10:00
Ben Skeggs
fe76fe497c
drm/nouveau/mc: implement intr handling on top of nvkm_intr
...
- new-style handlers can now be used here too
- decent clean-up
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:36 +10:00
Ben Skeggs
a7ab200aeb
drm/nouveau/intr: add nvkm_subdev_intr() compatibility
...
It's quite a lot of tedious and error-prone work to switch over all the
subdevs at once, so allow an nvkm_intr to request new-style handlers to
be created that wrap the existing interfaces.
This will allow a more gradual transition.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:35 +10:00
Ben Skeggs
3ebd64aa3c
drm/nouveau/intr: support multiple trees, and explicit interfaces
...
Turing adds a second top-level interrupt tree in HW, in addition to the
trees available via NV_PMC. Most of the interrupts we care about are
exposed in both trees, but not all of them, and we have some rather
nasty hacks to route the fault buffer interrupts.
Ampere removes the NV_PMC trees entirely.
Here we add some infrastructure to be able to handle all of this more
cleanly, as well as providing more explicit control over handlers.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:35 +10:00
Ben Skeggs
727fd72f24
drm/nouveau/intr: add shared interrupt plumbing between pci/tegra
...
Unifies the handling between PCI-based and Tegra GPUs, and makes more
explicit/obvious where device interrupts can be expected.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:35 +10:00
Ben Skeggs
ab4f75eb1c
drm/nouveau/nvkm: give each nvkm_event its own lockdep class
...
The vblank and nonstall events have some annoying interactions with DRM
locking, and aren't able to do certain things as a result.
However, other uses of event notifications don't have such requirements,
and upcoming patches take advantage of this for various improvements.
Having separate classes for each nvkm_event's spinlocks allows lockdep
to distinguish between them and avoid false-positives.
v2: __always_inline + comment
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:35 +10:00
Ben Skeggs
99d0701afd
drm/nouveau/nvkm: rip out old notify
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:27 +10:00
Ben Skeggs
f43e47c090
drm/nouveau/nvkm: add a replacement for nvkm_notify
...
This replaces the twisty, confusing, relationship between nvkm_event and
nvkm_notify with something much simpler, and less racey. It also places
events in the object tree hierarchy, which will allow a heap of the code
tracking events across allocation/teardown/suspend to be removed.
This commit just adds the new interfaces, and passes the owning subdev to
the event constructor to enable debug-tracing in the new code.
v2:
- use ?: (lyude)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2022-11-09 10:44:26 +10:00
Ben Skeggs
0196cc65f9
drm/nouveau/device: remove pwrsrc notify in favour of a direct call to clk
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
Signed-off-by: Dave Airlie <airlied@redhat.com >
2022-07-13 13:56:04 +10:00
Ben Skeggs
61c1f340bc
drm/nouveau/nvkm: use list_add_tail() when building object tree
...
Fixes resume from hibernate failing on (at least) TU102, where cursor
channel init failed due to being performed before the core channel.
Not solid idea why suspend-to-ram worked, but, presumably HW being in
an entirely clean state has something to do with it.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
Signed-off-by: Dave Airlie <airlied@redhat.com >
2022-07-13 13:55:43 +10:00
Zou Wei
5e18b97370
drm/nouveau/core/client: Mark nvkm_uclient_sclass with static keyword
...
Fix the following sparse warning:
drivers/gpu/drm/nouveau/nvkm/core/client.c:64:1: warning: symbol 'nvkm_uclient_sclass' was not declared. Should it be static?
Signed-off-by: Zou Wei <zou_wei@huawei.com >
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Karol Herbst <kherbst@redhat.com >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/10
2021-11-12 23:46:04 +01:00
Ben Skeggs
59f216cf04
drm/nouveau: rip out nvkm_client.super
...
No longer required now that userspace can't touch anything that might
need it, and should fix DRM MM operations racing with each other, and
the random hangs/crashes that come with that.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-08-18 19:00:15 +10:00
Ben Skeggs
5ef25f068c
drm/nouveau/nvkm: remove nvkm_subdev.index
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:50:04 +10:00
Ben Skeggs
0fa5680c28
drm/nouveau/vic: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:50:00 +10:00
Ben Skeggs
8d6461d832
drm/nouveau/sw: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:50:00 +10:00
Ben Skeggs
d1866250a2
drm/nouveau/sec2: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:59 +10:00
Ben Skeggs
400c2a456c
drm/nouveau/sec: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:59 +10:00
Ben Skeggs
e73d371a73
drm/nouveau/pm: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:59 +10:00
Ben Skeggs
ee532a8d0e
drm/nouveau/nvenc: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:59 +10:00
Ben Skeggs
f8aeb13303
drm/nouveau/nvdec: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:59 +10:00
Ben Skeggs
b15147bd71
drm/nouveau/msvld: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:59 +10:00
Ben Skeggs
07a356bbe7
drm/nouveau/msppp: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:58 +10:00
Ben Skeggs
963216061c
drm/nouveau/mspdec: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:58 +10:00
Ben Skeggs
e9e9a219e4
drm/nouveau/msenc: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:58 +10:00
Ben Skeggs
e5e95a7639
drm/nouveau/mpeg: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:58 +10:00
Ben Skeggs
aba5e97b89
drm/nouveau/me: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:58 +10:00
Ben Skeggs
ee307030e9
drm/nouveau/ifb: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:58 +10:00
Ben Skeggs
864d37c3d8
drm/nouveau/gr: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:58 +10:00
Ben Skeggs
ab0db2bd85
drm/nouveau/fifo: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:56 +10:00
Ben Skeggs
09f409d74d
drm/nouveau/dma: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:56 +10:00
Ben Skeggs
a7f000ec56
drm/nouveau/disp: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:56 +10:00
Ben Skeggs
0b26ca68c9
drm/nouveau/cipher: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:56 +10:00
Ben Skeggs
50551b15c7
drm/nouveau/ce: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:56 +10:00
Ben Skeggs
fcc08a7c0d
drm/nouveau/bsp,vp: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:55 +10:00
Ben Skeggs
d07be5d788
drm/nouveau/volt: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:55 +10:00
Ben Skeggs
601c2a06d2
drm/nouveau/top: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:55 +10:00
Ben Skeggs
9aad54d5c7
drm/nouveau/tmr: switch to instanced constructor
...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com >
Reviewed-by: Lyude Paul <lyude@redhat.com >
2021-02-11 11:49:55 +10:00