ollama

mirror of https://github.com/ollama/ollama.git synced 2025-11-10 22:20:14 +01:00

Jesse Gross fdb109469f llm: Allow overriding flash attention setting

As we automatically enable flash attention for more models, there
are likely some cases where we get it wrong. This allows setting
OLLAMA_FLASH_ATTENTION=0 to disable it, even for models that usually
have flash attention.
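
The override described in this commit message can be sketched as a small tri-state lookup: an explicit environment value wins, otherwise the per-model default applies. This is an illustration only, not the code in config.go. The helper name flashAttentionEnabled, its signature, and the fallback to the model default on an unparsable value are all assumptions; only the OLLAMA_FLASH_ATTENTION variable name comes from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// flashAttentionEnabled is a hypothetical helper, not ollama's actual API.
// modelDefault is whatever the runtime would pick for the model on its own.
// lookup has the same shape as os.LookupEnv so it can be stubbed in tests.
func flashAttentionEnabled(modelDefault bool, lookup func(string) (string, bool)) bool {
	raw, ok := lookup("OLLAMA_FLASH_ATTENTION")
	if !ok || raw == "" {
		// Variable not set: keep the automatic per-model decision.
		return modelDefault
	}
	v, err := strconv.ParseBool(raw) // accepts "0", "1", "true", "false", ...
	if err != nil {
		// Assumption: an unparsable value is ignored rather than treated as on or off.
		return modelDefault
	}
	// An explicit setting overrides the default in either direction.
	return v
}

func main() {
	// Pretend the model would normally run with flash attention enabled.
	fmt.Println(flashAttentionEnabled(true, os.LookupEnv))
}
```

The point of the three states (unset, true, false) is that a plain boolean cannot distinguish "the user said nothing" from "the user said off", which is what makes OLLAMA_FLASH_ATTENTION=0 usable as an opt-out for models that would otherwise have flash attention enabled.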

2025-10-02 12:07:20 -07:00

config_test.go

…

config.go

llm: Allow overriding flash attention setting

2025-10-02 12:07:20 -07:00