ollama

mirror of https://github.com/ollama/ollama.git synced 2026-04-18 09:03:35 -04:00

Files

Daniel Hiltgen e823bff873 gemma4: enable flash attention (#15378)

Backport GGML kernels so we can enable flash attention for the gemma 4 model on
Metal and CUDA.

2026-04-07 08:12:36 -07:00

ggml_test.go   2025-04-27 11:38:06 -07:00
ggml.go        2026-04-07 08:12:36 -07:00
gguf_test.go   2026-02-24 21:52:44 -04:00
gguf.go        2026-02-24 21:52:44 -04:00
type.go        2025-10-15 21:53:38 -07:00