Commit Graph

267 Commits

SHA1 Message Date
ad7e641815 add batch embeddings 2024-04-26 20:13:33 -04:00
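A minimal sketch of what a batched embeddings request could look like. The `/api/embed` path and the `input` array follow the shape Ollama's API docs later settled on; treat the endpoint, field names, and the `all-minilm` model choice as assumptions for illustration, not a quote from this commit.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// embedRequest mirrors the assumed batch shape: one model, many inputs.
type embedRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

type embedResponse struct {
	Embeddings [][]float64 `json:"embeddings"`
}

func main() {
	body, _ := json.Marshal(embedRequest{
		Model: "all-minilm", // illustrative embedding model
		Input: []string{"first document", "second document"},
	})
	resp, err := http.Post("http://localhost:11434/api/embed", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Printf("got %d embeddings\n", len(out.Embeddings))
}
```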
2a80f55e2a Update windows.md (#3855)
Fixed a typo
2024-04-26 16:04:15 -04:00
74d2a9ef9a add OLLAMA_KEEP_ALIVE env variable to FAQ (#3865) 2024-04-23 21:06:51 -07:00
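The FAQ entry covers the server-side default for how long models stay loaded. A small sketch of launching the server with that variable set from Go; the `24h` value is just an illustrative choice.

```go
package main

import (
	"os"
	"os/exec"
)

func main() {
	// Keep models loaded for 24 hours by default instead of the
	// built-in 5 minutes; any duration string (or -1 for "forever")
	// should work the same way here.
	cmd := exec.Command("ollama", "serve")
	cmd.Env = append(os.Environ(), "OLLAMA_KEEP_ALIVE=24h")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```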
e6f9bfc0e8 Update api.md (#3705) 2024-04-20 15:17:03 -04:00
85bdf14b56 update jetson tutorial 2024-04-17 16:17:42 -04:00
a27e419b47 Update langchainjs.md (#2030)
Changed ollama.call() to ollama.invoke(), as call() is deprecated per langchain's documentation
2024-04-15 18:37:30 -04:00
e54a3c7fcd Update modelfile.md
Remove Modelfile parameters that are decided at runtime
2024-04-15 15:35:44 -04:00
1524f323a3 Revert "build.go: introduce a friendlier way to build Ollama (#3548)" (#3564) 2024-04-09 15:57:45 -07:00
fccf3eecaa build.go: introduce a friendlier way to build Ollama (#3548)
This commit introduces a friendlier way to build the Ollama dependencies
and binary without abusing `go generate`, removing the unnecessary extra
steps it brings with it.

This script also provides nicer feedback to the user about what is
happening during the build process.

At the end, it prints a helpful message to the user about what to do
next (e.g. run the new local Ollama).
2024-04-09 14:18:47 -07:00
cb03fc9571 Docs: Remove wrong parameter for Chat Completion (#3515)
Fixes gh-3514

Signed-off-by: Thomas Vitale <ThomasVitale@users.noreply.github.com>
2024-04-06 09:08:35 -07:00
0a74cb31d5 Safeguard for noexec
We may have users who run into problems with our current
payload model, so this gives us an escape valve.
2024-04-01 16:48:33 -07:00
856b8ec131 remove need for $VSINSTALLDIR since build will fail if ninja cannot be found (#3350) 2024-03-26 16:23:16 -04:00
1b272d5bcd change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347) 2024-03-26 13:04:17 -07:00
f38b705dc7 Fix ROCm link in development.md 2024-03-25 16:32:44 -04:00
22921a3969 doc: specify ADAPTER is optional (#3333) 2024-03-25 09:43:19 -07:00
d8fdbfd8da Add docs for GPU selection and nvidia uvm workaround 2024-03-21 11:52:54 +01:00
a5ba0fcf78 doc: faq gpu compatibility (#3142) 2024-03-21 05:21:34 -04:00
3a30bf56dc Update faq.md 2024-03-20 17:48:39 +01:00
7ed3e94105 Update faq.md 2024-03-18 10:24:39 +01:00
2297ad39da update faq.md 2024-03-18 10:17:59 +01:00
6459377ae0 Add ROCm support to linux install script (#2966) 2024-03-14 18:00:16 -07:00
5ce997a7b9 Update README.md 2024-03-13 21:12:17 -07:00
ba7cf7fb66 add more docs for the modelfile message command (#3087) 2024-03-12 16:41:41 -07:00
b53229a2ed Add docs explaining GPU selection env vars 2024-03-12 11:33:06 -07:00
6d3adfbea2 Update troubleshooting.md 2024-03-11 13:22:28 -07:00
0fdebb34a9 Doc how to set up ROCm builds on windows 2024-03-09 11:29:45 -08:00
4a5c9b8035 Finish unwinding idempotent payload logic
The recent ROCm change partially removed idempotent
payloads, but the ggml-metal.metal file for mac was still
idempotent.  This finishes switching to always extract
the payloads, and now that idempotency is gone, the
version directory is no longer useful.
2024-03-09 08:34:39 -08:00
6c0af2599e Update docs README.md and table of contents 2024-03-08 22:45:11 -08:00
280da44522 Merge pull request #2988 from dhiltgen/rocm_docs
Refined ROCm troubleshooting docs
2024-03-08 13:33:30 -08:00
b886bec3f9 Update api.md 2024-03-07 23:27:51 -08:00
69f0227813 Refined ROCm troubleshooting docs 2024-03-07 11:22:37 -08:00
6c5ccb11f9 Revamp ROCm support
This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, which defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed.  It also cleans up after itself.

We now build only a single ROCm version (latest major) on both windows
and linux.  Given the large size of ROCm's tensor files, we split the
dependency out.  It's bundled into the installer on windows, and a
separate download on linux.  The linux install script is now smarter:
it detects the presence of AMD GPUs, checks whether rocm v6 is already
present, and if not, downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us.  For Windows, we now use go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
2024-03-07 10:36:50 -08:00
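The Windows side described above can be sketched with Go's lazy DLL loading. `hipGetDeviceCount` is a standard HIP entry point, but the exact set of APIs the commit queries isn't spelled out here, so take this as an illustrative reduction rather than the shipped discovery code.

```go
//go:build windows

package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

func main() {
	// Lazily bind amdhip64.dll; the load is only attempted on first use,
	// so machines without a ROCm runtime fail gracefully here rather
	// than at process start.
	hip := syscall.NewLazyDLL("amdhip64.dll")
	getCount := hip.NewProc("hipGetDeviceCount")
	if err := getCount.Find(); err != nil {
		fmt.Println("no usable ROCm runtime:", err)
		return
	}

	var count int32
	// hipGetDeviceCount returns hipSuccess (0) when it can enumerate GPUs.
	if status, _, _ := getCount.Call(uintptr(unsafe.Pointer(&count))); status != 0 {
		fmt.Println("hipGetDeviceCount failed with status", status)
		return
	}
	fmt.Printf("found %d AMD GPU(s)\n", count)
}
```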
d481fb3cc8 update go to 1.22 in other places (#2975) 2024-03-07 07:39:49 -08:00
23ebe8fe11 fix some typos (#2973)
Signed-off-by: hishope <csqiye@126.com>
2024-03-06 22:50:11 -08:00
ce9f7c4674 Update api.md 2024-03-05 13:13:23 -08:00
3b4bab3dc5 Fix embeddings load model behavior (#2848) 2024-02-29 17:40:56 -08:00
1f087c4d26 Update langchain python tutorial (#2737)
Remove unused GPT4all
Use nomic-embed-text as the embedding model
Fix a deprecation warning (__call__)
2024-02-25 00:31:36 -05:00
bdc0ea1ba5 Update import.md 2024-02-22 02:08:03 -05:00
7fab7918cc Update import.md 2024-02-22 02:06:24 -05:00
f0425d3de9 Update faq.md 2024-02-20 20:44:45 -05:00
8125ce4cb6 Update import.md
Add instructions to get public key on windows
2024-02-19 22:48:24 -05:00
df56f1ee5e Update faq.md 2024-02-19 22:16:42 -05:00
41aca5c2d0 Update faq.md 2024-02-19 21:11:01 -05:00
753724d867 Update api.md to include examples for reproducible outputs 2024-02-19 20:36:16 -05:00
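The reproducible-outputs examples hinge on pinning `seed` (and zeroing `temperature`) in the request options. A minimal Go version of that kind of request; the model name and seed value are illustrative.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Identical requests with a fixed seed and temperature 0 should
	// yield identical completions from the same model version.
	body, _ := json.Marshal(map[string]any{
		"model":  "mistral",
		"prompt": "Why is the sky blue?",
		"stream": false,
		"options": map[string]any{
			"seed":        42,
			"temperature": 0,
		},
	})
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Response string `json:"response"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Response)
}
```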
9a7a4b9533 add faqs for memory pre-loading and the keep_alive setting (#2601) 2024-02-19 14:45:25 -08:00
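Both FAQ entries come down to the same request shape: sending just a model name pre-loads it, and a per-request `keep_alive` overrides how long it stays resident. A small sketch, with `mistral` as an illustrative model:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// A generate request with no prompt pre-loads the model into memory;
	// keep_alive -1 asks the server to keep it resident indefinitely
	// (a duration string like "10m" works too).
	body, _ := json.Marshal(map[string]any{
		"model":      "mistral",
		"keep_alive": -1,
	})
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("preload status:", resp.Status)
}
```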
b338c0635f Document setting server vars for windows 2024-02-19 13:30:46 -08:00
9774663013 Update faq.md with the location of models on Windows (#2545) 2024-02-16 11:04:19 -08:00
1ba734de67 typo 2024-02-15 14:56:55 -08:00
29e90cc13b Implement new Go based Desktop app
This focuses on Windows first, but could be used for Mac
and possibly linux in the future.
2024-02-15 05:56:45 +00:00
48a273f80b Fix issues with templating prompt in chat mode (#2460) 2024-02-12 15:06:57 -08:00