ollama/llm/server.go at ad6f6a1d29f45a5c7266bcd7edb5671621e86810

mirror of https://github.com/ollama/ollama.git synced 2025-11-10 18:08:32 +01:00

Jesse Gross ad6f6a1d29 llm: Change memory allocation backoff from exponential to incremental

If we create a memory layout that should fit based on reported free VRAM
but allocation still fails, we start applying a backoff. This reduces
free VRAM by an exponential percentage (1%, 2%, 4%...). However, the
points chosen tend to be too dense at the beginning and too sparse at
the end. Therefore, this switches to an incremental backoff (10%, 20%,
30%...).
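
The difference between the two schedules can be sketched as below. This is a minimal illustration under stated assumptions, not the code in server.go: the function names (exponentialBackoffPercent, incrementalBackoffPercent, usableVRAM) are hypothetical, attempts are assumed to be numbered from 1, and the percentage is assumed to be subtracted directly from the reported free VRAM.

```go
package main

import "fmt"

// exponentialBackoffPercent is a hypothetical helper modelling the old
// schedule: 1%, 2%, 4%, ... for attempts 1, 2, 3, ...
func exponentialBackoffPercent(attempt int) int {
	if attempt < 1 {
		return 0
	}
	return 1 << (attempt - 1)
}

// incrementalBackoffPercent is a hypothetical helper modelling the new
// schedule: 10%, 20%, 30%, ... for attempts 1, 2, 3, ...
func incrementalBackoffPercent(attempt int) int {
	if attempt < 1 {
		return 0
	}
	return 10 * attempt
}

// usableVRAM reduces the reported free VRAM (in bytes) by the given
// percentage. Applying the backoff this way is an assumption made for
// illustration only.
func usableVRAM(free uint64, percent int) uint64 {
	if percent <= 0 {
		return free
	}
	if percent >= 100 {
		return 0
	}
	return free - free*uint64(percent)/100
}

func main() {
	const free = 8 << 30 // assume 8 GiB of reported free VRAM

	fmt.Println("attempt  exponential  incremental")
	for attempt := 1; attempt <= 7; attempt++ {
		e := exponentialBackoffPercent(attempt)
		i := incrementalBackoffPercent(attempt)
		fmt.Printf("%7d  %3d%% (%d MiB)  %3d%% (%d MiB)\n",
			attempt, e, usableVRAM(free, e)>>20, i, usableVRAM(free, i)>>20)
	}
}
```

In this sketch the first four exponential steps all fall at or below 8%, while the jump from the sixth to the seventh step is 32 percentage points. The incremental steps are spaced evenly at 10 points each, which is the density problem the commit message describes.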

2025-10-23 12:58:31 -07:00

53 KiB
