Commit Graph

145 Commits

SHA1 Message Date
993cf8bf55 llm: limit generation to 10x context size to avoid run-on generations (#3918)
* llm: limit generation to 10x context size to avoid run-on generations

* add comment

* simplify condition statement
2024-04-25 19:02:30 -04:00
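The cap described in commit 993cf8bf55 can be sketched roughly as follows. This is a hypothetical illustration of the guard, not the actual runner code; the function and variable names are assumptions.

```python
def generate(prompt_tokens, num_ctx, predict_token):
    """Illustrative sketch: stop once output reaches 10x the context size.

    predict_token returns the next token, or None when the model
    emits a stop token on its own.
    """
    out = []
    limit = 10 * num_ctx  # hard cap to avoid run-on generations
    while True:
        tok = predict_token(prompt_tokens + out)
        if tok is None:  # model stopped on its own
            break
        out.append(tok)
        if len(out) >= limit:
            break  # give up rather than generate forever
    return out
```

With a model stub that never stops, the loop exits at the cap instead of running indefinitely.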
34b9db5afc Request and model concurrency
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. The defaults are
1 concurrent request per model and only 1 loaded model at a time; both
can be adjusted by setting OLLAMA_NUM_PARALLEL and
OLLAMA_MAX_LOADED_MODELS.
2024-04-22 19:29:12 -07:00
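The two knobs from commit 34b9db5afc are plain environment variables. A minimal sketch of reading them, with the defaults the commit message states; the parsing details here are illustrative, not the server's actual code:

```python
import os

def load_scheduler_limits():
    """Read the concurrency settings introduced by this commit.

    Defaults mirror the commit message: 1 parallel request per
    model and 1 loaded model at a time.
    """
    num_parallel = int(os.environ.get("OLLAMA_NUM_PARALLEL", "1"))
    max_loaded = int(os.environ.get("OLLAMA_MAX_LOADED_MODELS", "1"))
    return num_parallel, max_loaded
```

The variables are read from the environment the server runs in, so they must be set before the server starts.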
62be2050dd chore: replace fmt.Errorf with errors.New where no formatting is needed (#3789) 2024-04-20 22:11:06 -04:00
ad90b9ab3d api: start adding documentation to package api (#2878)
* api: start adding documentation to package api

Updates #2840

* Fix lint typo report
2024-04-10 13:31:55 -04:00
01114b4526 fix: rope 2024-04-09 16:15:24 -07:00
9502e5661f cgo quantize 2024-04-08 15:31:08 -07:00
e1c9a2a00f no blob create if already exists 2024-04-08 15:09:48 -07:00
be517e491c no rope parameters 2024-04-05 18:05:27 -07:00
1b272d5bcd change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347) 2024-03-26 13:04:17 -07:00
47cfe58af5 Default Keep Alive environment variable (#3094)
---------

Co-authored-by: Chris-AS1 <8493773+Chris-AS1@users.noreply.github.com>
2024-03-13 13:29:40 -07:00
3b4bab3dc5 Fix embeddings load model behavior (#2848) 2024-02-29 17:40:56 -08:00
e95b896790 Update types.go (#2744)
specfied -> specified
2024-02-25 13:41:25 -05:00
897b213468 use http.DefaultClient (#2530)
the default client already handles proxies
2024-02-20 18:34:47 -05:00
caf2b13c10 Fix infinite keep_alive (#2480) 2024-02-13 15:40:32 -08:00
b5cf31b460 add keep_alive to generate/chat/embedding api endpoints (#2146) 2024-01-26 14:28:02 -08:00
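The `keep_alive` parameter added in b5cf31b460 controls how long a model stays in memory after a request. A sketch of a request body carrying it; the model name and duration value are examples, not taken from the commit:

```python
import json

# keep_alive accepts a duration string such as "5m"; commit
# caf2b13c10 above fixes the "infinite" case, where a negative
# value keeps the model loaded indefinitely.
request = {
    "model": "llama2",
    "prompt": "Why is the sky blue?",
    "keep_alive": "5m",
}
payload = json.dumps(request)
```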
7c40a67841 Save and load sessions (#2063) 2024-01-25 12:12:36 -08:00
745b5934fa add model to ModelResponse 2024-01-18 14:32:55 -08:00
a38d88d828 api: add model for all requests
prefer req.Model and fall back to req.Name
2024-01-18 14:31:37 -08:00
5ffbbea1d7 remove client.py 2024-01-11 15:53:10 -08:00
22e93efa41 add show info command and fix the modelfile 2024-01-05 12:20:05 -08:00
0d6e3565ae Add embeddings to API (#1773) 2024-01-04 15:00:52 -05:00
55978c1dc9 clean up cache api option 2023-12-27 14:27:45 -05:00
d4ebdadbe7 enable cache_prompt by default 2023-12-27 14:23:42 -05:00
10da41d677 Add Cache flag to api (#1642) 2023-12-22 17:16:20 -05:00
d99fa6ce0a send empty messages on last chat response (#1530) 2023-12-18 14:23:38 -05:00
d9e60f634b add image support to the chat api (#1490) 2023-12-12 13:28:58 -08:00
910e9401d0 Multimodal support (#1216)
---------

Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
9e1406e4ed Don't expose model information in /api/generate 2023-12-09 02:05:43 -08:00
c3ff36088b Merge pull request #774 from jmorganca/mxyng/server-version
add version api and show server version in cli
2023-12-06 13:22:55 -08:00
5d75505ebd return model configuration in generate 2023-12-05 14:39:02 -08:00
195e3d9dbd chat api endpoint (#1392) 2023-12-05 14:57:33 -05:00
0db4706ec2 api: add version api handler 2023-12-05 09:36:01 -08:00
00d06619a1 Revert "chat api (#991)" while context variable is fixed
This reverts commit 7a0899d62d.
2023-12-04 21:16:27 -08:00
7a0899d62d chat api (#991)
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
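The messages endpoint described in 7a0899d62d replaces the single prompt-plus-context flow with a list of role-tagged messages. A sketch of such a request body, assuming the usual system/user/assistant role convention; the model name and content are placeholders:

```python
import json

# A chat request carries the full conversation as messages rather
# than a prompt with an opaque context array.
chat_request = {
    "model": "llama2",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Why is the sky blue?"},
    ],
}
body = json.dumps(chat_request)
```

Per the commit body, partial responses are appended to this history, and the older context/template parameters keep working for the time being.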
cde31cb220 Allow setting parameters in the REPL (#1294) 2023-11-29 09:56:42 -08:00
928950fcc6 update python client create example (#1227)
* add remote create to python example client
2023-11-27 15:36:19 -05:00
bc22d5a38b no blob response 2023-11-15 15:16:23 -08:00
1901044b07 use checksum reference 2023-11-15 15:16:23 -08:00
1552cee59f client create modelfile 2023-11-15 15:16:23 -08:00
3ca56b5ada add create modelfile field 2023-11-15 15:16:23 -08:00
cdddd3df65 add format to example python client 2023-11-10 10:22:21 -08:00
5cba29b9d6 JSON mode: add `"format"` as an API parameter (#1051)
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-09 16:44:02 -08:00
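The `"format": "json"` parameter from 5cba29b9d6 constrains the model to emit valid JSON. A minimal request sketch; the model name and prompt are placeholders:

```python
import json

request = {
    "model": "llama2",
    "prompt": "List three primary colors as a JSON array.",
    "format": "json",  # constrain output to valid JSON
}
wire = json.dumps(request)
```

Prompting the model to describe the desired JSON shape alongside this flag generally gives more predictable output.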
a49d6acc1e add a complete /generate options example (#1035) 2023-11-08 16:44:36 -08:00
ec2a31e9b3 support raw generation requests (#952)
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
- add raw request to docs
2023-11-08 14:05:02 -08:00
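With the `raw` parameter from ec2a31e9b3, the caller supplies the fully templated prompt and the server skips its own prompt formatting and context handling. A sketch; the instruction-style template shown is an example, not tied to any particular model:

```python
import json

request = {
    "model": "llama2",
    # Caller-provided template: with raw=True the server applies
    # no prompt formatting of its own.
    "prompt": "[INST] Why is the sky blue? [/INST]",
    "raw": True,
}
wire = json.dumps(request)
```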
17678b7225 Restore system prompt on requests and default num_keep to 0 2023-11-03 13:25:25 -07:00
06589a3b30 Set NumKeep to 4 by default (#982) 2023-11-02 17:26:11 -07:00
1fd511e661 Merge pull request #975 from jmorganca/mxyng/downloads
update downloads to use retry wrapper
2023-11-02 16:12:48 -07:00
6db3691b8f update default NumKeep 2023-11-02 15:47:35 -07:00
60bb3c03a1 use http.Method 2023-11-02 13:12:45 -07:00
5c3491f425 allow for a configurable ollama model storage directory (#897)
* allow for a configurable ollama models directory

- set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored
- update docs

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com>
Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com>
Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>
2023-10-27 10:19:59 -04:00
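Resolution of the storage directory from 5c3491f425 can be sketched as an environment override with a fallback. The `~/.ollama/models` default shown here is the conventional location and is an assumption, not stated in the commit:

```python
import os
from pathlib import Path

def models_dir():
    """Resolve where model files are stored.

    OLLAMA_MODELS, set in the environment ollama runs in,
    overrides the default location.
    """
    override = os.environ.get("OLLAMA_MODELS")
    if override:
        return Path(override)
    return Path.home() / ".ollama" / "models"
```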