The device heartbeat work is currently initialized at
device_heartbeat_schedule() which is called at the end of
hl_device_init().
However hl_device_init() can fail at a previous step, and in such a
case, a subsequent call to hl_device_fini() will lead to calling
cleanup_resources() and accessing this work uninitialized.
As there is no real need to re-initialize this work every time it is
rescheduled, move this initialization to device_early_init() to be done
once and early enough.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
The test packet which is sent to FW for the PQ heartbeat is used also as
the trigger in FW to send the EQ heartbeat event.
Add the time of the last sent packet to the debug info which is printed
upon a EQ heartbeat failure.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Add a dump of the EQ entries headers upon a EQ heartbeat failure.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Don't print the "previous EQ index" value in case of a EQ heartbeat
failure, because it is incremented along with the EQ CI and therefore
redundant.
In addition, as the CPU-CP PI is zeroed when it reaches a value that is
twice the queue size, add a value of the CI with a similar wrap around,
to make it easier to compare the values.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
When device release triggers a hard reset, there is a printout of
the cause. Currently listed causes (that increment context refcount)
are active command submissions and exported DMA buffer objects. In
any other case, the printout emits "unknown reason". We identify and
print another reason - allocated command buffers.
Signed-off-by: Ilia Levi <illevi@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
When sending disable pci msg towards firmware, there is a
possibility that an EQ packet is already pending,
disabling EQ interrupt will prevent this from happening.
The interrupt will be re-enabled after reset.
Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Currently we schedule the heartbeat thread at late init, only then
we set the INTS_REGISTER packet which enables events to be received
from firmware.
Init may take some time and we want to give firmware 2 full cycles of
heartbeat thread after it received INTS_REGISTER.
The patch will move the heartbeat thread scheduling to be after driver
is done with all initializations.
Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
As the new dynamic EQ includes clock change events which are common and
not ASIC-specific, add a common handler for these events.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Gaudi2 with PCI revision ID with the value of '4' represents Gaudi2D
device and should be detected and initialized as Gaudi2.
Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
hl_eq_heartbeat_event_handle() doesn't have ASIC specific code, and
therefore can be moved from Gaudi2-only code to common code, and
possibly used for other ASICs.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
It is hard to debug the reason for heartbeat check failures.
As an attempt to ease this task, this patch will provide more
information when this failure happens.
Heartbeat checks the communication with FW, so printing
the CPU queue pi/ci and the counter of how many times that event
was received would help in debugging the issue.
Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Trace events might still be recorded after the accel device is released,
while the device name is no longer available.
Modify the trace functions to use the parent device instead, which is
available at that point and still informative as the device name.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
If we detected heartbet event while some daemon in the background send
(via driver interface) CPUCP messages the dmesg will be flooded.
Instead, a slight refactor in hl_fw_send_cpu_message() returns -EAGAIN
when CPU is disabled (i.e. heartbeat failure) and only then.
Later, all calling functions that may be invoked by user space can issue
prints only if the error code is not -EAGAIN.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
The function returned an error code which isn't propagated up the stack
(nor is it printed).
The return value is only checked for =0 or !=0 which implies bool return
value.
The function signature is updated accordingly, renamed, and slightly
refactored.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Today we read PCI VENDOR-ID in order to make sure PCI link is
healthy. Apparently the VENDOR-ID might be stored on host and
hence, when we read it we might not access the PCI bus.
In order to make sure PCI health check is reliable, we will start
checking the DEVICE-ID instead.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In newer kernel versions, irq_set_affinity_hint() is deprecated.
Instead, use the newer version which is irq_set_affinity_and_hint().
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The mechanism of aborting device reset for consecutive fatal errors is
currently only for fatal errors that are reported by FW.
A non-responsive FW and consecutive heartbeat failures is also
considered fatal, so add them as well to this mechanism to avoid
recurring device reset in such a case.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When the DRAM region size in the BAR is not a power of 2, calculating
the corresponding BAR base address should be done using the offset from
the DRAM start address, and not using directly the DRAM address.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
User interrupts are MSIx interrupts coming from Gaudi2, that have
specific range of IDs and are assigned to the sole use of the user
process that opened the Gaudi2 device (reminder: there can be only
a single user process running on Gaudi2 at any given time).
The interrupts are allocated and managed by the driver and therefore,
the user expects the driver to initialize them properly, which also
includes setting the affinity to the related CPU cores of the
device's NUMA node to get maximum performance.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Pull drm updates from Dave Airlie:
"This contains two major new drivers:
- imagination is a first driver for Imagination Technologies devices,
it only covers very specific devices, but there is hope to grow it
- xe is a reboot of the i915 GPU (shares display) side using a more
upstream focused development model, and trying to maximise code
sharing. It's not enabled for any hw by default, and will hopefully
get switched on for Intel's Lunarlake.
This also drops a bunch of the old UMS ioctls. It's been dead long
enough.
amdgpu has a bunch of new color management code that is being used in
the Steam Deck.
amdgpu also has a new ACPI WBRF interaction to help avoid radio
interference.
Otherwise it's the usual lots of changes in lots of places.
Detailed summary:
new drivers:
- imagination - new driver for Imagination Technologies GPU
- xe - new driver for Intel GPUs using core drm concepts
core:
- add CLOSE_FB ioctl
- remove old UMS ioctls
- increase max objects to accomodate AMD color mgmt
encoder:
- create per-encoder debugfs directory
edid:
- split out drm_eld
- SAD helpers
- drop edid_firmware module parameter
format-helper:
- cache format conversion buffers
sched:
- move from kthread to workqueue
- rename some internals
- implement dynamic job-flow control
gpuvm:
- provide more features to handle GEM objects
client:
- don't acquire module reference
displayport:
- add mst path property documentation
fdinfo:
- alignment fix
dma-buf:
- add fence timestamp helper
- add fence deadline support
bridge:
- transparent aux-bridge for DP/USB-C
- lt8912b: add suspend/resume support and power regulator support
panel:
- edp: AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49
- chromebook panel support
- elida-kd35t133: rework pm
- powkiddy RK2023 panel
- himax-hx8394: drop prepare/unprepare and shutdown logic
- BOE BP101WX1-100, Powkiddy X55, Ampire AM8001280G
- Evervision VGG644804, SDC ATNA45AF01
- nv3052c: register docs, init sequence fixes, fascontek FS035VG158
- st7701: Anbernic RG-ARC support
- r63353 panel controller
- Ilitek ILI9805 panel controller
- AUO G156HAN04.0
simplefb:
- support memory regions
- support power domains
amdgpu:
- add new 64-bit sequence number infrastructure
- add AMD specific color management
- ACPI WBRF support for RF interference handling
- GPUVM updates
- RAS updates
- DCN 3.5 updates
- Rework PCIe link speed handling
- Document GPU reset types
- DMUB fixes
- eDP fixes
- NBIO 7.9/7.11 updates
- SubVP updates
- XGMI PCIe state dumping for aqua vanjaram
- GFX11 golden register updates
- enable tunnelling on high pri compute
amdkfd:
- Migrate TLB flushing logic to amdgpu
- Trap handler fixes
- Fix restore workers handling on suspend/resume
- Fix possible memory leak in pqm_uninit()
- support import/export of dma-bufs using GEM handles
radeon:
- fix possible overflows in command buffer checking
- check for errors in ring_lock
i915:
- reorg display code for reuse in xe driver
- fdinfo memory stats printing
- DP MST bandwidth mgmt improvements
- DP panel replay enabling
- MTL C20 phy state verification
- MTL DP DSC fractional bpp support
- Audio fastset support
- use dma_fence interfaces instead of i915_sw_fence
- Separate gem and display code
- AUX register macro refactoring
- Separate display module/device parameters
- Move display capabilities debugfs under display
- Makefile cleanups
- Register cleanups
- Move display lock inits under display/
- VLV/CHV DPIO PHY register and interface refactoring
- DSI VBT sequence refactoring
- C10/C20 PHY PLL hardware readout
- DPLL code cleanups
- Cleanup PXP plane protection checks
- Improve display debug msgs
- PSR selective fetch fixes/improvements
- DP MST fixes
- Xe2LPD FBC restrictions removed
- DGFX uses direct VBT pin mapping
- more MTL WAs
- fix MTL eDP bug
- eliminate use of kmap_atomic
habanalabs:
- sysfs entry to identify a device minor id with debugfs path
- sysfs entry to expose device module id
- add signed device info retrieval through INFO ioctl
- add Gaudi2C device support
- pcie reset prepare/done hooks
msm:
- Add support for SDM670, SM8650
- Handle the CFG interconnect to fix the obscure hangs / timeouts
- Kconfig fix for QMP dependency
- use managed allocators
- DPU: SDM670, SM8650 support
- DPU: Enable SmartDMA on SM8350 and SM8450
- DP: enable runtime PM support
- GPU: add metadata UAPI
- GPU: move devcoredumps to GPU device
- GPU: convert to drm_exec
ivpu:
- update FW API
- new debugfs file
- a new NOP job submission test mode
- improve suspend/resume
- PM improvements
- MMU PT optimizations
- firmware profile frequency support
- support for uncached buffers
- switch to gem shmem helpers
- replace kthread with threaded irqs
rockchip:
- rk3066_hdmi: convert to atomic
- vop2: support nv20 and nv30
- rk3588 support
mediatek:
- use devm_platform_ioremap_resource
- stop using iommu_present
- MT8188 VDOSYS1 display support
panfrost:
- PM improvements
- improve interrupt handling as poweroff
qaic:
- allow to run with single MSI
- support host/device time sync
- switch to persistent DRM devices
exynos:
- fix potential error pointer dereference
- fix wrong error checking
- add missing call to drm_atomic_helper_shutdown
omapdrm:
- dma-fence lockdep annotation fix
tidss:
- dma-fence lockdep annotation fix
- support for AM62A7
v3d:
- BCM2712 - rpi5 support
- fdinfo + gputop support
- uapi for CPU job handling
virtio-gpu:
- add context debug name"
* tag 'drm-next-2024-01-10' of git://anongit.freedesktop.org/drm/drm: (2340 commits)
drm/amd/display: Allow z8/z10 from driver
drm/amd/display: fix bandwidth validation failure on DCN 2.1
drm/amdgpu: apply the RV2 system aperture fix to RN/CZN as well
drm/amd/display: Move fixpt_from_s3132 to amdgpu_dm
drm/amd/display: Fix recent checkpatch errors in amdgpu_dm
Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"
drm/amd/display: avoid stringop-overflow warnings for dp_decide_lane_settings()
drm/amd/display: Fix power_helpers.c codestyle
drm/amd/display: Fix hdcp_log.h codestyle
drm/amd/display: Fix hdcp2_execution.c codestyle
drm/amd/display: Fix hdcp_psp.h codestyle
drm/amd/display: Fix freesync.c codestyle
drm/amd/display: Fix hdcp_psp.c codestyle
drm/amd/display: Fix hdcp1_execution.c codestyle
drm/amd/pm/smu7: fix a memleak in smu7_hwmgr_backend_init
drm/amdkfd: Fix iterator used outside loop in 'kfd_add_peer_prop()'
drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'
drm/amdkfd: Confirm list is non-empty before utilizing list_first_entry in kfd_topology.c
drm/amdgpu: Fix '*fw' from request_firmware() not released in 'amdgpu_ucode_request()'
drm/amdgpu: Fix variable 'mca_funcs' dereferenced before NULL check in 'amdgpu_mca_smu_get_mca_entry()'
...
hl_device_cond_reset() might be called with the hard reset flag unset,
because a compute reset upon device release as part of a graceful reset
is valid.
If the conditions for graceful reset are not met, hl_device_reset() will
be called for an immediate reset. In this case a compute reset is not
valid, so it will be replaced with a hard reset together with a debug
message about it.
This message might be confusing, as it implies that a compute reset was
requested when it shouldn't. To prevent this confusion, set the hard
reset flag in hl_device_cond_reset() if going to an immediate reset.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Stop rescheduling another heartbeat check when EQ heartbeat check fails
as it generates confusing logs in dmesg that the heartbeat fails.
Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Gaudi2 with PCI revision ID with the value of '3' represents Gaudi2C
device and should be detected and initialized as Gaudi2.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add error log when no eq event is received from FW,
to cover a scenario when FW is stuck for some reason.
In such case driver will not receive neither the eq error interrupt
or the eq heartbeat event, and will just initiate a reset without
indication in the dmesg about the reason.
Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Notifies the user which device was removed. It is important in
a server with multiple devices.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Currently we use dynamic allocation inside the irq handler
in order to allocate free node to be used for the free jobs.
This operation is expensive, especially when we deal with large
burst of events records that get released at the same time.
The alternative is to have pre allocated pool of free nodes
and just fetch nodes from this pool at irq handling time instead
of allocating them.
In case the pool becomes full, then the driver will fallback to
dynamic allocations.
As part of the optimization also update the unregister flow
upon re-using a timestamp record, by making the operation much
simpler and quicker. We already have the record in the registration
flow and now we just seek to re-use with different interrupt.
Therefore, no need to look for buffer according to the user handle.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
After the heartbeat mechanism is now expanded to be used also
for EQ health check, we shouldn't send heartbeat messages
to FW before driver allow events to be received from FW.
Because if the driver will send two heartbeats before it enables
events to be received from FW, then the EQ health check
will fail and reset the device.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add mechanism for fw eq health check. this will be done using two flows:
using the heartbeat mechanism and raising a dedicated interrupt to
indicate an eq failure like EQ full.
This patch will add implementation for the eq heartbeat for gaudi2 asic.
More info about the heartbeat mechanism:
Expand the heartbeat mechanism to monitor a new event that
will be sent from FW upon receiving heartbeat message.
that way driver can know that the eq is working or not.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Because it is not used and also, for graceful reset to work
those ioctls should run on the compute device.
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Register the compute device as an accel device, and remove the creation
of the habanalabs compute char device.
The IOCTLs in this patch are still handled by the current driver
handler. Moving to DRM IOCTL handling requires moving the IOCTLs
numbers to a specific range, so it will be handled in subsequent
patches.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
User gets notification for every engine error report, but he still
lacks the exact engine information. Hence, we allow user to query
for the exact engine reported an error.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
After being notified about certain errors, user is expected to finish
his post-errors actions and to release the device within some timeout,
after which is deice is being reset.
The default timeout value is 5 sec, which in some case is not enough for
a user application to collect debug data.
Increase the default value to 30 sec.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
If scrubbing memory after user released device has failed it means
the device is in a bad state and should be reset.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Our simulator supports idle check so no need anymore to check if pdev
exists.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Every time an FD is returned to the user, the driver adds
a corresponding private structure to the list.
Yet, it's still a list of private structures rather than of FDs.
Remove, as well, an unnecessary comment.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Because we might still be using related resources, decrementing PID's
reference count should be done at later stages of the device release.
A good place is right after the representing private structure is
removed from LKD's list.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As part of driver teardown, we attempt to kill all user processes.
It shouldn't fail, but if it does we want to print the error code that
the kapi returned to us.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
hl_device_status() returns the status of an acquired device.
If a device is going down (following an rmmod cmd),
it should be marked as an unusable/malfunctioning device, and
hence should not be acquired.
However, since this was not the case so far (i.e., a device going
down would inaccurately return 'in reset' status allowing the user
to acquire the device) it introduced a bug where as part of a reset
flow, the driver could not kill processes that have not run yet, and
since those processes aren't blocked from reacquiring a device,
we get eventually a new flow of a driver attempting to kill all
processes in a list that can't be ever really empty.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
If hl_device_cond_reset() is called while a reset is already pending but
hasn't started, the reset request will be dropped.
If the flags of the new request are more severe, e.g. a hard reset while
the pending reset is a compute reset, the eventual reset won't be
suitable for the device status.
To prevent such cases, update the pending reset flags with the new
requests flags before the requests are dropped.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When a H/W event is received while a user is registered to events, no
immediate hard reset will happen, and instead the user will be notified
and will have some time to handle it and eventually release the
device, after which the reset will be done.
If a user, as part of the handling and as part of the cleanup steps
towards releasing the device, unregisters from receiving those events,
and at that time an adjacent H/W event is received, it will be assumed
that the user is not registered to events and thus an immediate hard
reset is required.
To prevent such an unwanted immediate reset, modify the driver to
perform it if the user is not registered to events AND we don't already
have a pending reset for a previous H/W event.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Moved error info reset code to single function for future use from
other places in the driver.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There were a few places where simulator only code got into the upstream.
Remove those places that can confuse other developers.
Fixes: 2a0a839b6a ("habanalabs: extend fatal messages to contain PCI info")
Cc: Moti Haimovski <mhaimovski@habana.ai>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently the debugfs root folder and files for a device are created at
an early step, before the device initialization and before the char
device and sysfs files are exposed to user.
As there is no real reason not to do it together with the device
creation, postpone it to be done right afterwards.
The initialization of the debugfs entry structure is left in its
current position because it is used before creating the files.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>