mirror of
https://github.com/ollama/ollama.git
synced 2025-11-10 18:08:32 +01:00
If we create a memory layout that should fit based on reported free VRAM but allocation still fails, we start applying a backoff. This reduces free VRAM by an exponentially increasing percentage (1%, 2%, 4%...). However, the points chosen tend to be too dense at the beginning and too sparse at the end. Therefore, this switches to an incremental backoff (10%, 20%, 30%...).
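
To illustrate the difference between the two schedules, here is a minimal sketch in Go. It is not the code from this change: the function names (exponentialPercent, incrementalPercent, usableVRAM) and the 1-based attempt counter are hypothetical, and it assumes the percentage is simply subtracted from the reported free VRAM before the layout is recomputed.

```go
package main

import "fmt"

// exponentialPercent is a hypothetical model of the old schedule:
// the reduction doubles on each retry (1%, 2%, 4%, ...).
func exponentialPercent(attempt int) int {
	if attempt < 1 {
		return 0
	}
	return 1 << (attempt - 1)
}

// incrementalPercent is a hypothetical model of the new schedule:
// the reduction grows by a fixed 10% step (10%, 20%, 30%, ...).
func incrementalPercent(attempt int) int {
	if attempt < 1 {
		return 0
	}
	return 10 * attempt
}

// usableVRAM returns the free VRAM left after withholding the given
// percentage. Assumption: the backoff is applied to the reported free value.
func usableVRAM(free uint64, percent int) uint64 {
	if percent <= 0 {
		return free
	}
	if percent >= 100 {
		return 0
	}
	return free - free*uint64(percent)/100
}

func main() {
	const free = 24 << 30 // assume 24 GiB reported free, for illustration only
	fmt.Println("attempt  exponential  incremental")
	for attempt := 1; attempt <= 7; attempt++ {
		e := exponentialPercent(attempt)
		i := incrementalPercent(attempt)
		fmt.Printf("%7d  %3d%% (%5d MiB)  %3d%% (%5d MiB)\n",
			attempt, e, usableVRAM(free, e)>>20, i, usableVRAM(free, i)>>20)
	}
}
```

In this model the exponential schedule spends its first four retries between 1% and 8%, then jumps from 32% to 64%, while the incremental schedule covers the range in even 10% steps.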