Mirror of https://github.com/ollama/ollama.git (synced 2025-11-11 02:17:45 +01:00)
Fall back to alternative quantization types when a tensor's dimensions aren't divisible by the block size required by the originally requested quantization type. If the retried quantization types also fail, the system ultimately falls back to F16 (half-precision floating point), which has a block size of 1 and can therefore handle any tensor dimension.
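A minimal sketch of this fallback logic in Go (ollama's implementation language). The type names, block sizes, and the `pickQuantType` helper are illustrative assumptions, not the actual ollama API; the block sizes loosely mirror GGML-style layouts, where F16 has a block size of 1 and so accepts any shape.

```go
package main

import "fmt"

// Hypothetical block sizes for a few quantization types. Real values
// depend on the GGML block layout; F16 packs one element per block.
var blockSize = map[string]int{
	"Q4_K": 256,
	"Q8_0": 32,
	"F16":  1,
}

// pickQuantType returns the first preferred type whose block size evenly
// divides the tensor's innermost dimension, falling back to F16, which
// has block size 1 and accepts any dimension.
func pickQuantType(dim int, preferred []string) string {
	for _, t := range preferred {
		if dim%blockSize[t] == 0 {
			return t
		}
	}
	return "F16"
}

func main() {
	// 4096 is divisible by 256, so the first preference is usable.
	fmt.Println(pickQuantType(4096, []string{"Q4_K", "Q8_0"}))
	// 100 is divisible by neither 256 nor 32, so we fall back to F16.
	fmt.Println(pickQuantType(100, []string{"Q4_K", "Q8_0"}))
}
```

The key property is that the fallback chain always terminates: because F16's block size is 1, the loop's failure case still yields a valid type for any tensor shape.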
8.1 KiB