Mirror of https://github.com/ollama/ollama.git (synced 2025-03-20 14:52:59 +01:00)
Models may require that a set of inputs all be processed as part of the same batch. For example, if an image has multiple patches with fully connected attention between them, we should not split the batch in the middle of an image.

Fixes #9697
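
The constraint described above can be illustrated with a minimal sketch. It is not the code from this change: the `Input` type, its `Group` field, `SplitBatches`, and `ErrGroupTooLarge` are hypothetical names chosen for illustration. The sketch assumes that inputs which must stay together are contiguous and share a nonzero group ID.

```go
package main

import (
	"errors"
	"fmt"
)

// Input is a hypothetical stand-in for one model input (a text token or an
// image patch). Group == 0 means the input may be batched freely. Inputs that
// share a nonzero Group are assumed to be contiguous and must be processed in
// the same batch.
type Input struct {
	ID    int
	Group int
}

// ErrGroupTooLarge is returned when a single group cannot fit in any batch.
var ErrGroupTooLarge = errors.New("input group exceeds maximum batch size")

// SplitBatches packs inputs into batches of at most maxBatch inputs and never
// splits a group across two batches.
func SplitBatches(inputs []Input, maxBatch int) ([][]Input, error) {
	if maxBatch <= 0 {
		return nil, errors.New("maxBatch must be positive")
	}

	var batches [][]Input
	var current []Input

	for i := 0; i < len(inputs); {
		// Find the end of the indivisible unit that starts at index i.
		j := i + 1
		if inputs[i].Group != 0 {
			for j < len(inputs) && inputs[j].Group == inputs[i].Group {
				j++
			}
		}
		unit := inputs[i:j]

		if len(unit) > maxBatch {
			return nil, ErrGroupTooLarge
		}

		// If the whole unit does not fit, close the current batch early
		// instead of cutting the unit in half.
		if len(current)+len(unit) > maxBatch {
			batches = append(batches, current)
			current = nil
		}
		current = append(current, unit...)
		i = j
	}

	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches, nil
}

func main() {
	// Two text tokens, one image made of three patches (group 1), one token.
	inputs := []Input{{1, 0}, {2, 0}, {3, 1}, {4, 1}, {5, 1}, {6, 0}}

	batches, err := SplitBatches(inputs, 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for n, b := range batches {
		fmt.Printf("batch %d has %d inputs\n", n, len(b))
	}
}
```

With a limit of 4, the first batch is closed after the two text tokens, even though it has room for two more inputs, so that all three patches of the image land together in the second batch. A group larger than the batch limit is reported as an error here; a real implementation might handle that case differently.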