ollama

mirror of https://github.com/ollama/ollama.git synced 2025-03-18 05:41:43 +01:00

Jesse Gross bf24498b1e ollamarunner: Check for minBatch of context space when shifting

Models can specify that a group of inputs needs to be handled in a single
batch. However, context shifting didn't respect this and could trigger
a break anyway. In this case, we should instead trigger a context
shift earlier so that it occurs before the grouped batch.
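
As a rough illustration of the rule described above, the check can be thought of as reserving room for the whole group before any of it is added. This is a minimal sketch, not the code in runner.go: the function name `needsShift` and the parameters `used`, `numCtx`, and `minBatch` are hypothetical names chosen for this example.

```go
package main

import "fmt"

// needsShift reports whether a context shift should happen before the next
// inputs are added. All names here are hypothetical:
//   used     - number of inputs already held in the context
//   numCtx   - size of the context window
//   minBatch - number of inputs that must be processed together
func needsShift(used, numCtx, minBatch int) bool {
	if minBatch < 1 {
		// An ungrouped input still needs one free slot.
		minBatch = 1
	}
	// Shift early if the whole group would not fit in the remaining space.
	return used+minBatch > numCtx
}

func main() {
	fmt.Println(needsShift(2040, 2048, 1))  // a single input still fits
	fmt.Println(needsShift(2040, 2048, 16)) // a group of 16 would be split
}
```

With a plain "is the context full" check, the second call would not shift until partway through the group, which is the break this commit avoids.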

Note that there are still some corner cases:
 - A long prompt that exceeds the context window can get truncated
   in the middle of an image. With the current models, this will
   result in the model not recognizing the image at all, which is
   pretty much the expected result with truncation.
 - The context window is set to less than the minimum batch size. The
   only solution to this is to refuse to load the model with these
   settings. However, this can never occur with current models and
   default settings.

Since users are unlikely to run into these scenarios, fixing them is
left as a follow-up.
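
For the second corner case, a load-time guard might look like the sketch below. This commit does not implement it; `validateContext`, `numCtx`, and `minBatch` are hypothetical names used only to show the shape of such a check.

```go
package main

import "fmt"

// validateContext is a hypothetical load-time check: it rejects settings
// where the context window cannot hold even one grouped batch.
func validateContext(numCtx, minBatch int) error {
	if numCtx < minBatch {
		return fmt.Errorf("context window (%d) is smaller than the minimum batch size (%d)", numCtx, minBatch)
	}
	return nil
}

func main() {
	fmt.Println(validateContext(2048, 16)) // settings accepted
	fmt.Println(validateContext(8, 16))    // settings rejected with an error
}
```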

2025-03-17 15:33:16 -07:00

cache_test.go

runner: remove cache prompt flag from ollama runner (#9826)

2025-03-17 15:11:15 -07:00

cache.go

runner: remove cache prompt flag from ollama runner (#9826)

2025-03-17 15:11:15 -07:00

runner.go

ollamarunner: Check for minBatch of context space when shifting

2025-03-17 15:33:16 -07:00