Commit Graph

225 Commits

Author SHA1 Message Date
Xin Li (Intel)
efef7f184f x86/msr: Add explicit includes of <asm/msr.h>
For historic reasons there are some TSC-related functions in the
<asm/msr.h> header, even though there's an <asm/tsc.h> header.

To facilitate the relocation of rdtsc{,_ordered}() from <asm/msr.h>
to <asm/tsc.h> and to eventually eliminate the inclusion of
<asm/msr.h> in <asm/tsc.h>, add an explicit <asm/msr.h> dependency
to the source files that reference definitions from <asm/msr.h>.

[ mingo: Clarified the changelog. ]

Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20250501054241.1245648-1-xin@zytor.com
2025-05-02 10:23:47 +02:00
Linus Torvalds
2cd5769fb0 Merge tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updatesk from Greg KH:
 "Here is the big set of driver core updates for 6.15-rc1. Lots of stuff
  happened this development cycle, including:

   - kernfs scaling changes to make it even faster thanks to rcu

   - bin_attribute constify work in many subsystems

   - faux bus minor tweaks for the rust bindings

   - rust binding updates for driver core, pci, and platform busses,
     making more functionaliy available to rust drivers. These are all
     due to people actually trying to use the bindings that were in
     6.14.

   - make Rafael and Danilo full co-maintainers of the driver core
     codebase

   - other minor fixes and updates"

* tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits)
  rust: platform: require Send for Driver trait implementers
  rust: pci: require Send for Driver trait implementers
  rust: platform: impl Send + Sync for platform::Device
  rust: pci: impl Send + Sync for pci::Device
  rust: platform: fix unrestricted &mut platform::Device
  rust: pci: fix unrestricted &mut pci::Device
  rust: device: implement device context marker
  rust: pci: use to_result() in enable_device_mem()
  MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers
  rust/kernel/faux: mark Registration methods inline
  driver core: faux: only create the device if probe() succeeds
  rust/faux: Add missing parent argument to Registration::new()
  rust/faux: Drop #[repr(transparent)] from faux::Registration
  rust: io: fix devres test with new io accessor functions
  rust: io: rename `io::Io` accessors
  kernfs: Move dput() outside of the RCU section.
  efi: rci2: mark bin_attribute as __ro_after_init
  rapidio: constify 'struct bin_attribute'
  firmware: qemu_fw_cfg: constify 'struct bin_attribute'
  powerpc/perf/hv-24x7: Constify 'struct bin_attribute'
  ...
2025-04-01 11:02:03 -07:00
Easwar Hariharan
78cf56f883 accel/habanalabs: convert timeouts to secs_to_jiffies()
Commit b35108a51c ("jiffies: Define secs_to_jiffies()") introduced
secs_to_jiffies().  As the value here is a multiple of 1000, use
secs_to_jiffies() instead of msecs_to_jiffies() to avoid the
multiplication

This is converted using scripts/coccinelle/misc/secs_to_jiffies.cocci with
the following Coccinelle rules:

@depends on patch@
expression E;
@@

-msecs_to_jiffies
+secs_to_jiffies
 (E
- * \( 1000 \| MSEC_PER_SEC \)
 )

Link: https://lkml.kernel.org/r/20250225-converge-secs-to-jiffies-part-two-v3-3-a43967e36c88@linux.microsoft.com
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Cc: Carlos Maiolino <cem@kernel.org>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Chris Mason <clm@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: David Sterba <dsterba@suse.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: Dongsheng Yang <dongsheng.yang@easystack.cn>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: Frank Li <frank.li@nxp.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Ilpo Jarvinen <ilpo.jarvinen@linux.intel.com>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: James Bottomley <james.bottomley@HansenPartnership.com>
Cc: James Smart <james.smart@broadcom.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Kalesh Anakkur Purayil <kalesh-anakkur.purayil@broadcom.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Mark Brown <broonie@kernel.org>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Nicolas Palix <nicolas.palix@imag.fr>
Cc: Niklas Cassel <cassel@kernel.org>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Selvin Thyparampil Xavier <selvin.xavier@broadcom.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Shyam-sundar S-k <Shyam-sundar.S-k@amd.com>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 23:24:16 -07:00
Thomas Weißschuh
f800cc5859 accel/habanalabs: constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20241216-sysfs-const-bin_attr-habanalabs-v1-1-b35463197efb@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:31 +01:00
Linus Torvalds
c159dfbdd4 Merge tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
 "Mainly individually changelogged singleton patches. The patch series
  in this pull are:

   - "lib min_heap: Improve min_heap safety, testing, and documentation"
     from Kuan-Wei Chiu provides various tightenings to the min_heap
     library code

   - "xarray: extract __xa_cmpxchg_raw" from Tamir Duberstein preforms
     some cleanup and Rust preparation in the xarray library code

   - "Update reference to include/asm-<arch>" from Geert Uytterhoeven
     fixes pathnames in some code comments

   - "Converge on using secs_to_jiffies()" from Easwar Hariharan uses
     the new secs_to_jiffies() in various places where that is
     appropriate

   - "ocfs2, dlmfs: convert to the new mount API" from Eric Sandeen
     switches two filesystems to the new mount API

   - "Convert ocfs2 to use folios" from Matthew Wilcox does that

   - "Remove get_task_comm() and print task comm directly" from Yafang
     Shao removes now-unneeded calls to get_task_comm() in various
     places

   - "squashfs: reduce memory usage and update docs" from Phillip
     Lougher implements some memory savings in squashfs and performs
     some maintainability work

   - "lib: clarify comparison function requirements" from Kuan-Wei Chiu
     tightens the sort code's behaviour and adds some maintenance work

   - "nilfs2: protect busy buffer heads from being force-cleared" from
     Ryusuke Konishi fixes an issues in nlifs when the fs is presented
     with a corrupted image

   - "nilfs2: fix kernel-doc comments for function return values" from
     Ryusuke Konishi fixes some nilfs kerneldoc

   - "nilfs2: fix issues with rename operations" from Ryusuke Konishi
     addresses some nilfs BUG_ONs which syzbot was able to trigger

   - "minmax.h: Cleanups and minor optimisations" from David Laight does
     some maintenance work on the min/max library code

   - "Fixes and cleanups to xarray" from Kemeng Shi does maintenance
     work on the xarray library code"

* tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (131 commits)
  ocfs2: use str_yes_no() and str_no_yes() helper functions
  include/linux/lz4.h: add some missing macros
  Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent
  Xarray: remove repeat check in xas_squash_marks()
  Xarray: distinguish large entries correctly in xas_split_alloc()
  Xarray: move forward index correctly in xas_pause()
  Xarray: do not return sibling entries from xas_find_marked()
  ipc/util.c: complete the kernel-doc function descriptions
  gcov: clang: use correct function param names
  latencytop: use correct kernel-doc format for func params
  minmax.h: remove some #defines that are only expanded once
  minmax.h: simplify the variants of clamp()
  minmax.h: move all the clamp() definitions after the min/max() ones
  minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
  minmax.h: reduce the #define expansion of min(), max() and clamp()
  minmax.h: update some comments
  minmax.h: add whitespace around operators and after commas
  nilfs2: do not update mtime of renamed directory that is not moved
  nilfs2: handle errors that nilfs_prepare_chunk() may return
  CREDITS: fix spelling mistake
  ...
2025-01-26 17:50:53 -08:00
Yafang Shao
7e70433c2b drivers: remove get_task_comm() and print task comm directly
Since task->comm is guaranteed to be NUL-terminated, we can print it
directly without the need to copy it into a separate buffer.  This
simplifies the code and avoids unnecessary operations.

Link: https://lkml.kernel.org/r/20241219023452.69907-6-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org> (For tty)
Reviewed-by: Lyude Paul <lyude@redhat.com> (For nouveau)
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: "André Almeida" <andrealmeid@igalia.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-12 20:21:16 -08:00
Easwar Hariharan
c2cd5986eb accel/habanalabs: convert timeouts to secs_to_jiffies()
Commit b35108a51c ("jiffies: Define secs_to_jiffies()") introduced
secs_to_jiffies().  As the value here is a multiple of 1000, use
secs_to_jiffies() instead of msecs_to_jiffies to avoid the multiplication.

This is converted using scripts/coccinelle/misc/secs_to_jiffies.cocci with
the following Coccinelle rules:

@@ constant C; @@

- msecs_to_jiffies(C * 1000)
+ secs_to_jiffies(C)

@@ constant C; @@

- msecs_to_jiffies(C * MSEC_PER_SEC)
+ secs_to_jiffies(C)

Link: https://lkml.kernel.org/r/20241210-converge-secs-to-jiffies-v3-7-ddfefd7e9f2a@linux.microsoft.com
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Daniel Mack <daniel@zonque.org>
Cc: David Airlie <airlied@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jack Wang <jinpu.wang@cloud.ionos.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: James Smart <james.smart@broadcom.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Jeff Johnson <jjohnson@kernel.org>
Cc: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jeroen de Borst <jeroendb@google.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Louis Peens <louis.peens@corigine.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nicolas Palix <nicolas.palix@imag.fr>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Ofir Bitton <obitton@habana.ai>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Praveen Kaligineedi <pkaligineedi@google.com>
Cc: Ray Jui <rjui@broadcom.com>
Cc: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Scott Branden <sbranden@broadcom.com>
Cc: Shailend Chand <shailend@google.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Simon Horman <horms@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-12 20:21:02 -08:00
Dave Airlie
c7d6cb4c43 Merge tag 'drm-misc-next-2024-12-05' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
[airlied: handle module ns conflict]

drm-misc-next for 6.14:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - Remove driver date from drm_driver

Driver Changes:
  - amdxdna: New driver!
  - ivpu: Fix qemu crash when using passthrough
  - nouveau: expose GSP-RM logging buffers via debugfs
  - panfrost: Add MT8188 Mali-G57 MC3 support
  - panthor: misc improvements,
  - rockchip: Gamma LUT support
  - tidss: Misc improvements
  - virtio: convert to helpers, add prime support for scanout buffers
  - v3d: Add DRM_IOCTL_V3D_PERFMON_SET_GLOBAL
  - vc4: Add support for BCM2712
  - vkms: Improvements all across the board

  - panels:
    - Introduce backlight quirks infrastructure
    - New panels: KDB KD116N2130B12

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241205-agile-straight-pegasus-aca7f4@houat
2024-12-13 08:48:09 +10:00
Jani Nikula
cb2e1c2136 drm: remove driver date from struct drm_driver and all drivers
We stopped using the driver initialized date in commit 7fb8af6798
("drm: deprecate driver date") and (eventually) started returning "0"
for drm_version ioctl instead.

Finish the job, and remove the unused date member from struct
drm_driver, its initialization from drivers, along with the common
DRIVER_DATE macros.

v2: Also update drivers/accel (kernel test robot)

Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Simon Ser <contact@emersion.fr>
Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # msm
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/1f2bf2543aed270a06f6c707fd6ed1b78bf16712.1733322525.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-12-05 12:35:42 +02:00
Peter Zijlstra
cdd30ebb1b module: Convert symbol namespace to string literal
Clean up the existing export namespace code along the same lines of
commit 33def8498f ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done

Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02 11:34:44 -08:00
Didi Freiman
9dec27bb8a accel/habanalabs: gradual sleep in polling memory macro
It’s better to avoid long sleeps right from the beginning of the polling
since the data may be available much sooner than the sleep period.
Because polling host memory is inexpensive, this change gradually
increases the sleep time up to the user-requested period.

Signed-off-by: Didi Freiman <dfreiman@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:33 +03:00
Tomer Tayar
0199e6392e accel/habanalabs: move heartbeat work initialization to early init
The device heartbeat work is currently initialized at
device_heartbeat_schedule() which is called at the end of
hl_device_init().
However hl_device_init() can fail at a previous step, and in such a
case, a subsequent call to hl_device_fini() will lead to calling
cleanup_resources() and accessing this work uninitialized.

As there is no real need to re-initialize this work every time it is
rescheduled, move this initialization to device_early_init() to be done
once and early enough.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:33 +03:00
Tomer Tayar
5cb97d74c3 accel/habanalabs: print timestamp of last PQ heartbeat on EQ heartbeat failure
The test packet which is sent to FW for the PQ heartbeat is used also as
the trigger in FW to send the EQ heartbeat event.
Add the time of the last sent packet to the debug info which is printed
upon a EQ heartbeat failure.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:33 +03:00
Tomer Tayar
c4548eee53 accel/habanalabs: dump the EQ entries headers on EQ heartbeat failure
Add a dump of the EQ entries headers upon a EQ heartbeat failure.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:32 +03:00
Tomer Tayar
795f93e650 accel/habanalabs: revise print on EQ heartbeat failure
Don't print the "previous EQ index" value in case of a EQ heartbeat
failure, because it is incremented along with the EQ CI and therefore
redundant.

In addition, as the CPU-CP PI is zeroed when it reaches a value that is
twice the queue size, add a value of the CI with a similar wrap around,
to make it easier to compare the values.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:32 +03:00
Farah Kassabri
9ee446f9b5 accel/habanalabs: add more info upon cpu pkt timeout
In order to have better debuggability upon encountering FW issues,
We are adding additional info once CPU packet timeout expires.

Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:32 +03:00
Ilia Levi
fda92282b0 accel/habanalabs: additional print in device-in-use info
When device release triggers a hard reset, there is a printout of
the cause. Currently listed causes (that increment context refcount)
are active command submissions and exported DMA buffer objects. In
any other case, the printout emits "unknown reason". We identify and
print another reason - allocated command buffers.

Signed-off-by: Ilia Levi <illevi@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:32 +03:00
Tal Cohen
61f4f624ea accel/habanalabs: disable EQ interrupt after disabling pci
When sending disable pci msg towards firmware, there is a
possibility that an EQ packet is already pending,
disabling EQ interrupt will prevent this from happening.
The interrupt will be re-enabled after reset.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:05 +03:00
Farah Kassabri
52fbab90ed accel/habanalabs: change the heartbeat scheduling point
Currently we schedule the heartbeat thread at late init, only then
we set the INTS_REGISTER packet which enables events to be received
from firmware.

Init may take some time and we want to give firmware 2 full cycles of
heartbeat thread after it received INTS_REGISTER.

The patch will move the heartbeat thread scheduling to be after driver
is done with all initializations.

Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:04 +03:00
Ofir Bitton
2214cafd10 accel/habanalabs: remove timestamp registration debug prints
There are several timestamp registration debug prints which
spams the kernel log whenever dyn debug is enabled.
Remove those prints.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:04 +03:00
Vitaly Margolin
cebb64f933 accel/habanalabs: add cpld ts cpld_timestamp cpucp
Add cpld_timestamp field to cpucp_info structure and return cpld
timestamp as part of cpld version

Signed-off-by: Vitaly Margolin <vmargolin@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:04 +03:00
Tomer Tayar
3d613b0cb5 accel/habanalabs: add a common handler for clock change events
As the new dynamic EQ includes clock change events which are common and
not ASIC-specific, add a common handler for these events.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:04 +03:00
Farah Kassabri
cbac6f4088 accel/habanalabs/gaudi2: add GAUDI2D revision support
Gaudi2 with PCI revision ID with the value of '4' represents Gaudi2D
device and should be detected and initialized as Gaudi2.

Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:04 +03:00
Tomer Tayar
93a296dde1 accel/habanalabs: move hl_eq_heartbeat_event_handle() to common code
hl_eq_heartbeat_event_handle() doesn't have ASIC specific code, and
therefore can be moved from Gaudi2-only code to common code, and
possibly used for other ASICs.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:04 +03:00
Tomer Tayar
5f6ad3c64a accel/habanalabs: add an EQ size ASIC property
Future supported ASICs might use the dynamic EQ mechanism with the
firmware, and in that case the EQ size won't be equal to the default
HL_EQ_SIZE_IN_BYTES value.
Add an ASIC property to enable overriding this value.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:04 +03:00
Farah Kassabri
b94488be08 accel/habanalabs: check for errors after preboot is ready
Driver should check and report any fatal errors detected by preboot,
before it attempts to load the boot fit.
Some errors may cause the driver to stop the boot process and mark
the device as unusable.
This check will allow the driver to fail and print the error reported
by preboot and skip the time wasting attempt of trying to load the
boot fit, which will fail due to the error.

Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:03 +03:00
Igal Zeltser
25abbe7ac1 accel/habanalabs: use msg_header instead of desc_header
Struct comms_desc_header is deprecated and replaced by struct
comms_msg_header. As a preparation for removing comms_desc_header
from FW, all it's usage in code is replaced by comms_msg_header.

Signed-off-by: Igal Zeltser <izeltser@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:53:03 +03:00
Farah Kassabri
31bd26931d accel/habanalabs: add heartbeat debug info
It is hard to debug the reason for heartbeat check failures.
As an attempt to ease this task, this patch will provide more
information when this failure happens.
Heartbeat checks the communication with FW, so printing
the CPU queue pi/ci and the counter of how many times that event
was received would help in debugging the issue.

Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:52:53 +03:00
Ohad Sharabi
42f04ca65c accel/habanalabs: add device name to invalidation failure msg
This addition helps log parsers better define the error without the need
to go back and search the device name on former log lines.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:46:14 +03:00
Tal Risin
26250af92f accel/habanalabs: expose server type in debugfs
Exposing server type through debugfs to enable easier access via
scripts.

Signed-off-by: Tal Risin <trisin@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:46:07 +03:00
Tomer Tayar
3d4bd5e292 accel/habanalabs: use parent device for trace events
Trace events might still be recorded after the accel device is released,
while the device name is no longer available.
Modify the trace functions to use the parent device instead, which is
available at that point and still informative as the device name.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:45:59 +03:00
Ohad Sharabi
ecda35d461 accel/habanalabs: no CPUCP prints on heartbeat failure
If we detected heartbet event while some daemon in the background send
(via driver interface) CPUCP messages the dmesg will be flooded.

Instead, a slight refactor in hl_fw_send_cpu_message() returns -EAGAIN
when CPU is disabled (i.e. heartbeat failure) and only then.

Later, all calling functions that may be invoked by user space can issue
prints only if the error code is not -EAGAIN.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:45:51 +03:00
Ohad Sharabi
705e520dc5 accel/habanalabs: restructure function that checks heartbeat received
The function returned an error code which isn't propagated up the stack
(nor is it printed).

The return value is only checked for =0 or !=0 which implies bool return
value.

The function signature is updated accordingly, renamed, and slightly
refactored.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:44:25 +03:00
Dani Liberman
241fa91e7c accel/habanalabs: add device name to error print
The extra info will help in better traceability and debug.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:44:08 +03:00
Ohad Sharabi
892bc64827 accel/habanalabs/gaudi2: use single function to compare FW versions
Currently, the code contains 2 types of FW version comparison functions:
- hl_is_fw_sw_ver_[below/equal_or_greater]()
- gaudi2 specific function of the type
  gaudi2_is_fw_ver_[below/above]x_y_z()

Moreover, some functions use the inner FW version which shuold be only
stage during development but not version dependencies.

Finally, some tests are done to deprecated FW version to which LKD
should hold no compatibility.

This commit aligns all APIs to a single function that just compares the
version and return an integers indicator (similar in some way to
strcmp()).

In addition, this generic function now considers also the sub-minor FW
version and also remove dead code resulting in deprecated FW versions
compatibility.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>
2024-06-23 09:42:24 +03:00
Ofir Bitton
fa58b59493 accel/habanalabs: modify pci health check
Today we read PCI VENDOR-ID in order to make sure PCI link is
healthy. Apparently the VENDOR-ID might be stored on host and
hence, when we read it we might not access the PCI bus.
In order to make sure PCI health check is reliable, we will start
checking the DEVICE-ID instead.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:47:32 +02:00
Tomer Tayar
c517068349 accel/habanalabs: keep explicit size of reserved memory for FW
The reserved memory for FW is currently saved in an ASIC property in
units of MB, just like the value that comes from FW.
Except the fact that it is not clear from the property's name, it means
also that a calculation to actual size is required everywhere that it is
used.
Modify the property to hold the size in bytes.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:47:27 +02:00
Tomer Tayar
db45bbdd02 accel/habanalabs: handle reserved memory request when working with full FW
Currently the reserved memory request from FW is handled when running
with preboot only, but this request is relevant also when running with
full FW.
Modify to always handle this reservation request.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:47:24 +02:00
Ofir Bitton
5b6658eb7c accel/habanalabs/hwmon: rate limit errors user can generate
Fetching sensor data can fail due to various reasons. In order
not to pollute the kernel log, those error prints must be
rate limited.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:47:19 +02:00
Dani Liberman
fd8d2fa066 accel/habanalabs: fix error print
The unmasking is for event and it can be other event than RAZWI.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:47:12 +02:00
Tal Risin
c8c062e967 accel/habanalabs: initialize maybe-uninitialized variables
Prevent static analysis warning.

Signed-off-by: Tal Risin <trisin@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:47:08 +02:00
Avri Kehat
0b105a2a72 accel/habanalabs: fix debugfs files permissions
debugfs files are created with permissions that don't align
with the access requirements.

Signed-off-by: Avri Kehat <akehat@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:47:03 +02:00
Tomer Tayar
e855869bec accel/habanalabs: fix glbl error cause handling
The glbl error cause handling has a wrong assumption that all error
bits are consecutive.
Fix the handling to check all relevant error bits per ASIC.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:47:00 +02:00
Tomer Tayar
7159813c91 accel/habanalabs: modify print for skip loading linux FW to debug log
Skip loading a linux FW image into the device with the current supported
ASICs is done for test purposes only.
Moreover, for future supported ASICs it is possible that there won't be
a need to load such an image.
The print in such a case is therefore not needed in most cases, so
replace the used dev_info() with dev_dbg().

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:46:52 +02:00
Farah Kassabri
c14e5cd3ed accel/habanalabs: remove hop size from asic properties
The hop size related properties is a MMU properties and not
asic properties.
As for PMMU and HMMU we could have different sizes.

Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:46:40 +02:00
Erick Archer
9e263c5042 accel/habanalabs: use kcalloc() instead of kzalloc()
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.

So, use the purpose specific kcalloc() function instead of the argument
size * count in the kzalloc() function.

Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1]
Link: https://github.com/KSPP/linux/issues/162

Signed-off-by: Erick Archer <erick.archer@gmx.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:30:41 +02:00
Dani Liberman
731d320e68 accel/habanalabs: remove call to deprecated function
In newer kernel versions, irq_set_affinity_hint() is deprecated.
Instead, use the newer version which is irq_set_affinity_and_hint().

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:30:40 +02:00
Malkoot Khan
8a5be2b62b accel/habanalabs: Remove unnecessary braces from if statement
The coding style in the Linux kernel prefers not to use
braces for single-statement if conditions.
This patch removes the unnecessary braces from an if statement
in the file drivers/accel/habanalabs/common/command_submission.c,
which also resolves a coding style warning.

Signed-off-by: Malkoot Khan <engr.mkhan1990@gmail.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:30:40 +02:00
Farah Kassabri
f728c17fc9 accel/habanalabs/gaudi2: move HMMU page tables to device memory
Currently the HMMU page tables reside in the host memory,
which will cause host access from the device for every page walk.
This can affect PCIe bandwidth in certain scenarios.

To prevent that problem, HMMU page tables will be moved to the device
memory so the miss transaction will read the hops from there instead of
going to the host.

Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:30:40 +02:00
Tomer Tayar
246d8b6cfb accel/habanalabs: abort device reset for consecutive heartbeat failures
The mechanism of aborting device reset for consecutive fatal errors is
currently only for fatal errors that are reported by FW.
A non-responsive FW and consecutive heartbeat failures is also
considered fatal, so add them as well to this mechanism to avoid
recurring device reset in such a case.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2024-02-26 09:30:40 +02:00