ollama

mirror of https://github.com/ollama/ollama.git synced 2025-11-12 17:47:39 +01:00

Files

Jesse Gross 1c093e97af kvcache: Remove special case for reservation mask

We currently short-circuit generation of the cache mask during
reservation and just generate an empty tensor of the correct size.
However, in some cases this also skips a cast operation, so the
worst-case graph is not actually worst case.

We don't actually need the fast path for mask generation, so it's
better to just use the normal code path.
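
The effect described above can be illustrated with a toy model. This is a minimal sketch, not the kvcache code: the `graph` type, `buildMask`, `opCount`, and the `needsCast` flag are all hypothetical names, and operations are recorded as strings rather than tensors. It only shows why a reservation pass that takes a shortcut can record fewer operations than a real batch.

```go
package main

import "fmt"

// graph is a hypothetical stand-in for a compute graph: it only
// records the names of the operations that were added to it.
type graph struct {
	ops []string
}

func (g *graph) add(op string) { g.ops = append(g.ops, op) }

// buildMask records the operations needed to produce a cache mask.
// reserve marks a worst-case reservation pass, fastPath enables the
// shortcut for that pass, and needsCast models a configuration in
// which the mask must be converted to another type (an assumption
// made for this sketch).
func buildMask(g *graph, reserve, fastPath, needsCast bool) {
	if reserve && fastPath {
		// Shortcut: an empty tensor of the right size and nothing
		// else, so the cast below is never recorded.
		g.add("empty")
		return
	}
	g.add("fill")
	if needsCast {
		g.add("cast")
	}
}

// opCount builds a fresh graph and returns how many operations
// buildMask recorded for the given settings.
func opCount(reserve, fastPath, needsCast bool) int {
	g := &graph{}
	buildMask(g, reserve, fastPath, needsCast)
	return len(g.ops)
}

func main() {
	fmt.Println("reservation, fast path:", opCount(true, true, true))    // 1
	fmt.Println("reservation, normal path:", opCount(true, false, true)) // 2
	fmt.Println("real batch:", opCount(false, false, true))              // 2
}
```

In this model the fast path records one operation while a real batch records two, so a reservation sized from the fast path would underestimate the graph. Sending the reservation through the normal path makes the two counts match.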

2025-10-22 17:38:04 -07:00

cache.go

…

causal_test.go

kvcache: Clean up sliding window state with independent batches

2025-10-08 16:43:14 -07:00

causal.go

kvcache: Remove special case for reservation mask

2025-10-22 17:38:04 -07:00

encoder.go

…

wrapper.go

…