993cf8bf55
llm: limit generation to 10x context size to avoid run-on generations ( #3918 )
...
* llm: limit generation to 10x context size to avoid run-on generations
* add comment
* simplify condition statement
2024-04-25 19:02:30 -04:00
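The guard described in that commit can be sketched as follows; `cap_predictions` and its parameter names are hypothetical stand-ins, not the repo's actual code:

```python
def cap_predictions(num_predict, num_ctx):
    """Sketch of the run-on guard: when the caller requests unbounded
    generation (num_predict < 0), cap output at 10x the context size;
    otherwise honor the explicit limit."""
    if num_predict < 0:
        return 10 * num_ctx
    return num_predict
```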
34b9db5afc
Request and model concurrency
...
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. The default
settings are currently set at 1 concurrent request per model and only 1
loaded model at a time, but these can be adjusted by setting
OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
2024-04-22 19:29:12 -07:00
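A minimal sketch of how those two knobs might be read, assuming the defaults of 1 stated above (`scheduler_limits` is a hypothetical helper, not the scheduler's real code):

```python
def scheduler_limits(env):
    """Return (concurrent requests per model, max loaded models),
    defaulting both to 1 as described in the commit message."""
    return (int(env.get("OLLAMA_NUM_PARALLEL", "1")),
            int(env.get("OLLAMA_MAX_LOADED_MODELS", "1")))
```

Passing the environment as a dict keeps the lookup testable without mutating the process environment.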
62be2050dd
chore: use errors.New instead of fmt.Errorf where no formatting is needed ( #3789 )
2024-04-20 22:11:06 -04:00
ad90b9ab3d
api: start adding documentation to package api ( #2878 )
...
* api: start adding documentation to package api
Updates #2840
* Fix lint typo report
2024-04-10 13:31:55 -04:00
01114b4526
fix: rope
2024-04-09 16:15:24 -07:00
9502e5661f
cgo quantize
2024-04-08 15:31:08 -07:00
e1c9a2a00f
no blob create if already exists
2024-04-08 15:09:48 -07:00
be517e491c
no rope parameters
2024-04-05 18:05:27 -07:00
1b272d5bcd
change github.com/jmorganca/ollama to github.com/ollama/ollama ( #3347 )
2024-03-26 13:04:17 -07:00
47cfe58af5
Default Keep Alive environment variable ( #3094 )
...
---------
Co-authored-by: Chris-AS1 <8493773+Chris-AS1@users.noreply.github.com>
2024-03-13 13:29:40 -07:00
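A sketch of the server-wide default this commit introduces: when a request omits keep_alive, fall back to the OLLAMA_KEEP_ALIVE environment variable. The "5m" fallback shown here is an assumption, not necessarily the repo's exact default:

```python
def default_keep_alive(env):
    """Resolve the server-wide keep-alive default: honor
    OLLAMA_KEEP_ALIVE when set, otherwise assume five minutes."""
    return env.get("OLLAMA_KEEP_ALIVE", "5m")
```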
3b4bab3dc5
Fix embeddings load model behavior ( #2848 )
2024-02-29 17:40:56 -08:00
e95b896790
Update types.go ( #2744 )
...
specfied -> specified
2024-02-25 13:41:25 -05:00
897b213468
use http.DefaultClient ( #2530 )
...
default client already handles proxy
2024-02-20 18:34:47 -05:00
caf2b13c10
Fix infinite keep_alive ( #2480 )
2024-02-13 15:40:32 -08:00
b5cf31b460
add keep_alive to generate/chat/embedding api endpoints ( #2146 )
2024-01-26 14:28:02 -08:00
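A sketch of a generate request body using the keep_alive field added here; `generate_request` is a hypothetical helper. A duration string like "5m" keeps the model loaded that long after the request, while 0 asks for an immediate unload:

```python
import json

def generate_request(model, prompt, keep_alive="5m"):
    """Build a /api/generate body carrying the keep_alive field
    introduced by this commit."""
    return json.dumps({"model": model, "prompt": prompt,
                       "keep_alive": keep_alive})
```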
7c40a67841
Save and load sessions ( #2063 )
2024-01-25 12:12:36 -08:00
745b5934fa
add model to ModelResponse
2024-01-18 14:32:55 -08:00
a38d88d828
api: add model for all requests
...
prefer req.Model, falling back to req.Name
2024-01-18 14:31:37 -08:00
5ffbbea1d7
remove client.py
2024-01-11 15:53:10 -08:00
22e93efa41
add show info command and fix the modelfile
2024-01-05 12:20:05 -08:00
0d6e3565ae
Add embeddings to API ( #1773 )
2024-01-04 15:00:52 -05:00
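A minimal sketch of an embeddings request body, assuming the endpoint takes a model name and the text to embed (`embeddings_request` is a hypothetical helper):

```python
import json

def embeddings_request(model, prompt):
    """Build a body for the embeddings API added in #1773; the server
    responds with a vector for the given prompt."""
    return json.dumps({"model": model, "prompt": prompt})
```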
55978c1dc9
clean up cache api option
2023-12-27 14:27:45 -05:00
d4ebdadbe7
enable cache_prompt by default
2023-12-27 14:23:42 -05:00
10da41d677
Add Cache flag to api ( #1642 )
2023-12-22 17:16:20 -05:00
d99fa6ce0a
send empty messages on last chat response ( #1530 )
2023-12-18 14:23:38 -05:00
d9e60f634b
add image support to the chat api ( #1490 )
2023-12-12 13:28:58 -08:00
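A sketch of a chat message carrying an image, assuming each message may include an "images" list of base64-encoded data alongside its text (`chat_with_image` is a hypothetical helper):

```python
import base64
import json

def chat_with_image(model, text, image_bytes):
    """Build a /api/chat body whose user message attaches one image,
    base64-encoded, as enabled by #1490."""
    msg = {"role": "user", "content": text,
           "images": [base64.b64encode(image_bytes).decode("ascii")]}
    return json.dumps({"model": model, "messages": [msg]})
```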
910e9401d0
Multimodal support ( #1216 )
...
---------
Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
9e1406e4ed
Don't expose model information in /api/generate
2023-12-09 02:05:43 -08:00
c3ff36088b
Merge pull request #774 from jmorganca/mxyng/server-version
...
add version api and show server version in cli
2023-12-06 13:22:55 -08:00
5d75505ebd
return model configuration in generate
2023-12-05 14:39:02 -08:00
195e3d9dbd
chat api endpoint ( #1392 )
2023-12-05 14:57:33 -05:00
0db4706ec2
api: add version api handler
2023-12-05 09:36:01 -08:00
00d06619a1
Revert "chat api ( #991 )" while context variable is fixed
...
This reverts commit 7a0899d62d.
2023-12-04 21:16:27 -08:00
7a0899d62d
chat api ( #991 )
...
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
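The messages flow described in those bullets can be sketched as follows; `chat_request` is a hypothetical helper showing how a caller appends a new user turn to prior history instead of threading a context array:

```python
import json

def chat_request(model, history, user_text):
    """Build a messages-style /api/chat body: prior turns (dicts with
    "role" and "content") plus the new user message."""
    messages = history + [{"role": "user", "content": user_text}]
    return json.dumps({"model": model, "messages": messages})
```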
cde31cb220
Allow setting parameters in the REPL ( #1294 )
2023-11-29 09:56:42 -08:00
928950fcc6
update python client create example ( #1227 )
...
* add remote create to python example client
2023-11-27 15:36:19 -05:00
bc22d5a38b
no blob response
2023-11-15 15:16:23 -08:00
1901044b07
use checksum reference
2023-11-15 15:16:23 -08:00
1552cee59f
client create modelfile
2023-11-15 15:16:23 -08:00
3ca56b5ada
add create modelfile field
2023-11-15 15:16:23 -08:00
cdddd3df65
add format to example python client
2023-11-10 10:22:21 -08:00
5cba29b9d6
JSON mode: add "format": "json" as an API parameter ( #1051 )
...
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-09 16:44:02 -08:00
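A sketch of a request using the parameter named above; `json_mode_request` is a hypothetical helper, and the effect described in the comment is what the commit advertises:

```python
import json

def json_mode_request(model, prompt):
    """Build a generate body with "format": "json", which constrains
    the model's output to valid JSON per #1051."""
    return json.dumps({"model": model, "prompt": prompt,
                       "format": "json"})
```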
a49d6acc1e
add a complete /generate options example ( #1035 )
2023-11-08 16:44:36 -08:00
ec2a31e9b3
support raw generation requests ( #952 )
...
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
- add raw request to docs
2023-11-08 14:05:02 -08:00
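A sketch of the raw mode described above, where the caller supplies a fully formatted prompt and templating is bypassed (`raw_generate` is a hypothetical helper):

```python
import json

def raw_generate(model, formatted_prompt):
    """Build a generate body with the optional "raw" flag from #952:
    the prompt is sent as-is and no response context is returned."""
    return json.dumps({"model": model, "prompt": formatted_prompt,
                       "raw": True})
```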
17678b7225
Restore system prompt on requests and default num_keep to 0
2023-11-03 13:25:25 -07:00
06589a3b30
Set NumKeep to 4 by default ( #982 )
2023-11-02 17:26:11 -07:00
1fd511e661
Merge pull request #975 from jmorganca/mxyng/downloads
...
update downloads to use retry wrapper
2023-11-02 16:12:48 -07:00
6db3691b8f
update default NumKeep
2023-11-02 15:47:35 -07:00
60bb3c03a1
use http.Method
2023-11-02 13:12:45 -07:00
5c3491f425
allow for a configurable ollama model storage directory ( #897 )
...
* allow for a configurable ollama models directory
- set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored
- update docs
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com>
Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com>
Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>
2023-10-27 10:19:59 -04:00
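The lookup described in that commit can be sketched as below; the per-user fallback path is an assumption for illustration, not necessarily the repo's exact default (`models_dir` is a hypothetical helper):

```python
import os

def models_dir(env, home):
    """Resolve the model storage directory per #897: honor
    OLLAMA_MODELS when set, else fall back to a per-user default
    (the ~/.ollama/models path here is an assumed placeholder)."""
    return env.get("OLLAMA_MODELS") or os.path.join(home, ".ollama", "models")
```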