Commit Graph

106 Commits

SHA1 Message Date
3ca56b5ada add create modelfile field 2023-11-15 15:16:23 -08:00
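A minimal sketch of using the new `modelfile` field on the create request, assuming the `/api/create` endpoint and the default local server address; the model name and Modelfile text are placeholders:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Send the Modelfile contents inline via the `modelfile` field,
	// rather than referencing a file on disk. Model name, Modelfile text,
	// and the server address are placeholders for illustration only.
	body, _ := json.Marshal(map[string]string{
		"name":      "mario",
		"modelfile": "FROM llama2\nSYSTEM You are Mario from Super Mario Bros.",
	})
	resp, err := http.Post("http://localhost:11434/api/create", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```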
cdddd3df65 add format to example python client 2023-11-10 10:22:21 -08:00
5cba29b9d6 JSON mode: add `"format"` as an api parameter (#1051)
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-09 16:44:02 -08:00
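A minimal sketch of requesting JSON mode from Go, assuming the `/api/generate` endpoint and the default local address; the model name and prompt are placeholders:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Ask for JSON mode by setting `"format": "json"` on the generate request.
	// Streaming is disabled so the reply arrives as a single JSON object.
	body, _ := json.Marshal(map[string]any{
		"model":  "llama2",
		"prompt": "List three primary colors as a JSON array.",
		"format": "json",
		"stream": false,
	})
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```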
a49d6acc1e add a complete /generate options example (#1035) 2023-11-08 16:44:36 -08:00
ec2a31e9b3 support raw generation requests (#952)
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
- add raw request to docs
2023-11-08 14:05:02 -08:00
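A minimal sketch of a raw generation request, assuming the `/api/generate` endpoint and the default local address; the pre-formatted prompt is only an illustration for a llama-style chat model:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// With "raw": true the prompt is passed to the model verbatim, bypassing
	// the prompt template, and no response context is returned.
	body, _ := json.Marshal(map[string]any{
		"model":  "llama2",
		"prompt": "[INST] Why is the sky blue? [/INST]",
		"raw":    true,
		"stream": false,
	})
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```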
17678b7225 Restore system prompt on requests and default num_keep to 0 2023-11-03 13:25:25 -07:00
06589a3b30 Set NumKeep to 4 by default (#982) 2023-11-02 17:26:11 -07:00
1fd511e661 Merge pull request #975 from jmorganca/mxyng/downloads
update downloads to use retry wrapper
2023-11-02 16:12:48 -07:00
6db3691b8f update default NumKeep 2023-11-02 15:47:35 -07:00
60bb3c03a1 use http.Method 2023-11-02 13:12:45 -07:00
5c3491f425 allow for a configurable ollama model storage directory (#897)
* allow for a configurable ollama models directory

- set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored
- update docs

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com>
Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com>
Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>
2023-10-27 10:19:59 -04:00
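A hypothetical launcher sketch showing the `OLLAMA_MODELS` variable being set for the server process, assuming the `ollama` binary is on PATH; the directory path is a placeholder:

```go
package main

import (
	"os"
	"os/exec"
)

func main() {
	// Start `ollama serve` with OLLAMA_MODELS pointing at a custom model
	// storage directory, as this commit describes.
	cmd := exec.Command("ollama", "serve")
	cmd.Env = append(os.Environ(), "OLLAMA_MODELS=/data/ollama/models")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```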
28c3f288e2 client: fix trailing slash 2023-10-26 11:09:38 -07:00
459f4a7889 fix: ollama host for hostname 2023-10-20 11:32:41 -07:00
fe6f3b48f7 do not reload the running llm when runtime params change (#840)
- only reload the running llm if the model has changed, or the options for loading the running model have changed
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context
2023-10-19 10:39:58 -04:00
92189a5855 fix memory check 2023-10-13 14:47:29 -07:00
6fe178134d improve api error handling (#781)
- remove new lines from llama.cpp error messages relayed to client
- check api option types and return error on wrong type
- change num layers from 95% VRAM to 92% VRAM
2023-10-13 16:57:10 -04:00
7804b8fab9 validate api options fields from map (#711) 2023-10-12 11:18:11 -04:00
b599946b74 add format bytes 2023-10-11 14:08:23 -07:00
274d5a5fdf optional parameter to not stream response (#639)
* update streaming request accept header
* add optional stream param to request bodies
2023-10-11 12:54:27 -04:00
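A minimal sketch of a non-streaming request, assuming the `/api/generate` endpoint and the default local address; the model name and prompt are placeholders:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// With "stream": false the server returns the whole completion as one
	// JSON object instead of newline-delimited chunks.
	body, _ := json.Marshal(map[string]any{
		"model":  "llama2",
		"prompt": "Why is the sky blue?",
		"stream": false,
	})
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	var result struct {
		Response string `json:"response"`
		Done     bool   `json:"done"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		panic(err)
	}
	fmt.Println(result.Response)
}
```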
2cfffea02e handle client proxy 2023-10-09 12:33:47 -07:00
2130c0708b output type parsed from modelfile (#678) 2023-10-05 14:58:04 -04:00
9e2de1bd2c increase streaming buffer size (#692) 2023-10-04 14:09:00 -04:00
1fbf3585d6 Relay default values to llama runner (#672)
* include seed in params for llama.cpp server and remove empty filter for temp

* relay default predict options to llama.cpp

- reorganize options to match predict request for readability

* omit empty stop

---------

Co-authored-by: hallh <hallh@users.noreply.github.com>
2023-10-02 14:53:16 -04:00
a1b2d95f96 remove unused push/pull params (#650) 2023-09-29 17:27:19 -04:00
f40b3de758 use int64 consistently 2023-09-28 11:07:24 -07:00
8efbc5df55 DRAFT: add a simple python client to access ollama (#522) 2023-09-14 16:37:38 -07:00
f221637053 first pass at linux gpu support (#454)
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-12 11:04:35 -04:00
790d24eb7b add show command (#474) 2023-09-06 11:04:17 -07:00
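A minimal sketch of querying the show endpoint added alongside this command; the `/api/show` path and the `name` field are assumptions based on the commit, and the model name is a placeholder:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Ask the server for a model's details (Modelfile, parameters, template).
	body, _ := json.Marshal(map[string]string{"name": "llama2"})
	resp, err := http.Post("http://localhost:11434/api/show", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```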
0f541a0367 s/ListResponseModel/ModelResponse/ 2023-08-31 09:47:10 -04:00
42998d797d subprocess llama.cpp server (#401)
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
982c535428 Merge pull request #428 from jmorganca/mxyng/upload-chunks
update upload chunks
2023-08-30 07:47:17 -07:00
8bbff2df98 add model IDs (#439) 2023-08-28 20:50:24 -07:00
246dc65417 loosen http status code checks 2023-08-28 18:34:53 -04:00
22ab7f5f88 default host to 127.0.0.1, fixes #424 2023-08-26 11:59:28 -07:00
2c7f956b38 add version 2023-08-22 09:40:58 -07:00
f723bf0879 ignore nil map values 2023-08-17 15:50:46 -07:00
54bb49a502 parse protocol for OLLAMA_HOST 2023-08-17 18:20:44 -04:00
5ee6116420 set default OLLAMA_HOST to http://localhost:11434 2023-08-16 12:22:59 -04:00
67e593e355 cmd: support OLLAMA_CLIENT_HOST environment variable (#262)
* cmd: support OLLAMA_HOST environment variable

This commit adds support for the OLLAMA_HOST environment
variable. This variable can be used to specify the host to which
the client should connect. This is useful when the client is
running somewhere other than the host where the server is running.

The new api.FromEnv function is used to configure clients from the
environment. Clients that want to stay consistent with the Ollama CLI's
use of this environment variable can use this new function.

* Update api/client.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update api/client.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-16 11:03:48 -04:00
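A sketch that mirrors the behavior the commit describes for api.FromEnv, not the package's actual code: read OLLAMA_HOST and fall back to a default local address when it is unset. Hosts given without a scheme would need extra handling (see the "parse protocol for OLLAMA_HOST" commit).

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// baseURL returns the server address from OLLAMA_HOST, or a default
// local address when the variable is unset (illustrative only).
func baseURL() (*url.URL, error) {
	host := os.Getenv("OLLAMA_HOST")
	if host == "" {
		host = "http://127.0.0.1:11434"
	}
	return url.Parse(host)
}

func main() {
	u, err := baseURL()
	if err != nil {
		panic(err)
	}
	fmt.Println("connecting to", u)
}
```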
f27bc261cf s/parmeter/parameter/ 2023-08-10 16:26:06 -07:00
81d8d7b73f fix could not convert int 2023-08-10 16:24:17 -07:00
be989d89d1 Token auth (#314) 2023-08-10 11:34:25 -07:00
4b3507f036 embeddings endpoint
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-10 11:45:57 -04:00
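A minimal sketch of calling the embeddings endpoint, assuming the `/api/embeddings` path, the `model`/`prompt` request fields, and an `embedding` array in the response; the model name and prompt are placeholders:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Request an embedding vector for a prompt.
	body, _ := json.Marshal(map[string]string{
		"model":  "llama2",
		"prompt": "Here is an article about llamas...",
	})
	resp, err := http.Post("http://localhost:11434/api/embeddings", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	var result struct {
		Embedding []float64 `json:"embedding"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		panic(err)
	}
	fmt.Println("dimensions:", len(result.Embedding))
}
```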
7a5f3616fd embed text document in modelfile 2023-08-09 10:26:19 -04:00
21ddcaa1f1 pr comments
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
2023-08-08 13:49:37 -04:00
f2074ed4c0 Merge pull request #306 from jmorganca/default-keep-system
automatically set num_keep if num_keep < 0
2023-08-08 09:25:34 -07:00
8713ac23a8 allow overriding template and system in /api/generate
Fixes #297
Fixes #296
2023-08-08 00:55:34 -04:00
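A minimal sketch of overriding `system` and `template` per request, assuming the `/api/generate` endpoint; the model name, prompt, and template text are placeholders, and the template only illustrates the Go template placeholders of this era:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Per-request "system" and "template" fields override the Modelfile
	// values for this generation only.
	body, _ := json.Marshal(map[string]any{
		"model":    "llama2",
		"prompt":   "Why is the sky blue?",
		"system":   "Answer in exactly one sentence.",
		"template": "{{ .System }}\n\n{{ .Prompt }}",
		"stream":   false,
	})
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```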
4dc5b117dd automatically set num_keep if num_keep < 0
num_keep defines how many tokens to keep in the context when truncating
inputs. if left to its default value of -1, the server will calculate
num_keep to be the length of the system instructions
2023-08-07 16:19:12 -07:00
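A minimal sketch of setting num_keep explicitly in the request options, assuming the `/api/generate` endpoint; leaving it at -1 lets the server derive it as described in the commit, and the value 24 is an arbitrary illustration:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Pass num_keep in the per-request options map; it controls how many
	// tokens are kept in the context when inputs are truncated.
	body, _ := json.Marshal(map[string]any{
		"model":  "llama2",
		"prompt": "Why is the sky blue?",
		"stream": false,
		"options": map[string]any{
			"num_keep": 24,
		},
	})
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```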
b9f4d67554 configurable rope frequency parameters 2023-08-03 22:11:58 -07:00
8b1e791820 allow specifying zero values in modelfile 2023-08-02 17:07:53 -04:00