mirror of https://github.com/ollama/ollama.git
synced 2025-11-11 08:17:22 +01:00
The context must always be able to hold the current batch, so if the user requests a context smaller than the batch size, we should shrink the batch to match. This also fixes the TestLongInputContext test on the new engine. (The old engine already behaves this way.)
6.7 KiB