mirror of
https://github.com/ollama/ollama.git
synced 2025-06-30 21:50:56 +02:00
Merge pull request #10468 from ollama/drifkin/num-parallel-1
@@ -58,7 +58,7 @@ var defaultModelsPerGPU = 3
 // Default automatic value for parallel setting
 // Model will still need to fit in VRAM. If this setting won't fit
 // we'll back off down to 1 to try to get it to fit
-var defaultParallel = 4
+var defaultParallel = 2
 
 var ErrMaxQueue = errors.New("server busy, please try again. maximum pending requests exceeded")
 
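The comment in this hunk describes a back-off: start from the default parallel value and reduce it toward 1 until the model fits in VRAM. The following is a minimal sketch of that idea only, not the scheduler code from this repository. The helper `pickParallel`, the `fits` callback, the step size of one, and the memory figures are all hypothetical and chosen for illustration.

```go
package main

import "fmt"

// pickParallel is a hypothetical helper, not Ollama's scheduler code.
// It starts at the requested parallel value and steps down by one until
// fits reports that the model can be loaded, stopping at 1.
func pickParallel(start int, fits func(parallel int) bool) (int, bool) {
	for p := start; p >= 1; p-- {
		if fits(p) {
			return p, true
		}
	}
	// Even a single slot does not fit.
	return 0, false
}

func main() {
	const defaultParallel = 2 // new default from this commit

	// Assumed memory model for illustration only: a fixed cost for the
	// weights plus a per-slot cost for each parallel request.
	const vramMiB, weightsMiB, perSlotMiB = 8192, 6000, 1500
	fits := func(parallel int) bool {
		return weightsMiB+parallel*perSlotMiB <= vramMiB
	}

	p, ok := pickParallel(defaultParallel, fits)
	fmt.Println(p, ok) // prints "1 true"
}
```

With these assumed numbers, two slots would need 9000 MiB and do not fit, so the sketch falls back to one slot. Under this model, a lower default means fewer back-off attempts before a fitting value is found.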