Commit Graph

1850 Commits

SHA1 Message Date
fac9060da5 Init submodule with new path 2024-01-04 13:00:13 -08:00
a554616f8e remove old llama.cpp submodule path 2024-01-04 12:12:21 -08:00
77d96da94b Code shuffle to clean up the llm dir 2024-01-04 12:12:05 -08:00
0d6e3565ae Add embeddings to API (#1773) 2024-01-04 15:00:52 -05:00
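For context, the embeddings endpoint this adds is POST /api/embeddings, taking a model name and a prompt and returning a vector. A small Go client sketch against a local server; field names follow the API docs of this era, so treat them as a snapshot rather than the current API:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Request/response shapes for the embeddings endpoint as added around this
// commit; field names are based on the API docs of the time.
type embedRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

type embedResponse struct {
	Embedding []float64 `json:"embedding"`
}

func main() {
	body, _ := json.Marshal(embedRequest{Model: "llama2", Prompt: "hello"})
	resp, err := http.Post("http://localhost:11434/api/embeddings",
		"application/json", bytes.NewReader(body))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer resp.Body.Close()
	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("embedding of %d dims\n", len(out.Embedding))
}
```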
b5939008a1 Merge pull request #1785 from dhiltgen/win_native_cli
Load dynamic cpu lib on windows
2024-01-04 08:55:01 -08:00
e9ce91e9a6 Load dynamic cpu lib on windows
On Linux, we link the CPU library into the Go app and fall back to it
when no GPU match is found. On Windows we do not link in the CPU library
so that we can better control our dependencies for the CLI.  This fixes
the logic so we correctly fall back to the dynamic CPU library
on Windows.
2024-01-04 08:41:41 -08:00
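A rough sketch of the fallback order this describes, with hypothetical names (ollama's actual code differs):

```go
package llm

import (
	"fmt"
	"os"
)

// pickServerLib chooses which dynamic llama.cpp "server" library to load:
// the best GPU match first, then the dynamic CPU library as a fallback.
// On Linux the CPU variant is linked into the binary instead, so only the
// GPU list applies there. Names and structure here are a hypothetical
// sketch of the fallback logic, not ollama's actual code.
func pickServerLib(gpuLibs []string, cpuLib string) (string, error) {
	for _, lib := range gpuLibs {
		if libraryPresent(lib) {
			return lib, nil
		}
	}
	if cpuLib != "" && libraryPresent(cpuLib) {
		return cpuLib, nil
	}
	return "", fmt.Errorf("no usable server library found")
}

// libraryPresent is a stand-in probe; a real implementation would attempt
// LoadLibrary/dlopen rather than just checking that the file exists.
func libraryPresent(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}
```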
4ad6c9b11f fix: pull either original model or from model on create (#1774) 2024-01-04 01:34:38 -05:00
c0285158a9 tweak memory requirements error text 2024-01-03 19:47:18 -05:00
77a66df72c add macOS memory check for 47B models 2024-01-03 19:46:16 -05:00
5b4837f881 remove unused filetype check 2024-01-03 19:45:39 -05:00
29340c2e62 update cmake flags for amd64 macOS (#1780)
* update cmake flags for intel macOS

* remove `LLAMA_K_QUANTS`

* put back `CMAKE_OSX_DEPLOYMENT_TARGET` and disable `LLAMA_F16C`
2024-01-03 19:22:15 -05:00
d5ec730354 Merge pull request #1779 from dhiltgen/refined_amd_gpu_list
Improve maintainability of Radeon card list
2024-01-03 16:18:57 -08:00
8bed487aba Merge pull request #1778 from dhiltgen/wsl1
Fail fast on WSL1 while allowing on WSL2
2024-01-03 16:18:41 -08:00
c1a10a6e9b Merge pull request #1781 from dhiltgen/cpu_only_build
Fix CPU only builds
2024-01-03 16:18:25 -08:00
ddbfa6fe31 Fix CPU only builds
Go embed fails to compile when a pattern matches no files, so put
a dummy placeholder in to allow building without any GPU support.
If no "server" library is found, it's safely ignored at runtime.
2024-01-03 16:08:34 -08:00
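The constraint behind this fix is a real Go rule: an //go:embed pattern that matches no files is a compile-time error. A minimal sketch of the placeholder approach, with illustrative paths rather than the actual ollama layout:

```go
package llm

import (
	"embed"
	"io/fs"
)

// A //go:embed pattern that matches no files is a compile-time error, so a
// CPU-only build drops an empty placeholder file into the embedded
// directory to keep the directive valid. Paths and names are illustrative.
//
//go:embed payloads/*
var payloadFS embed.FS

// availableServers returns the embedded dynamic "server" libraries, if any;
// an empty result is safely ignored at runtime, as the commit describes.
func availableServers() ([]string, error) {
	entries, err := fs.ReadDir(payloadFS, "payloads")
	if err != nil {
		return nil, err
	}
	var names []string
	for _, e := range entries {
		if e.Name() == ".placeholder" { // skip the dummy file
			continue
		}
		names = append(names, e.Name())
	}
	return names, nil
}
```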
2fcd41ef81 Fail fast on WSL1 while allowing on WSL2
This prevents users from accidentally installing on WSL1, with instructions
guiding them on how to upgrade their WSL instance to version 2.  Once running
WSL2, if you have an NVIDIA card, you can follow their instructions to set up
GPU passthrough and run models on the GPU.  This is not possible on WSL1.
2024-01-03 16:02:32 -08:00
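One common way to tell WSL1 from WSL2, and plausibly what a check like this relies on (the exact mechanism is an assumption), is the kernel version string: WSL2's real Linux kernel reports "microsoft-standard", while WSL1's emulated kernel reports "Microsoft". A Go sketch:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// detectWSL reports whether we appear to be on WSL1 or WSL2 by inspecting
// the kernel version string. This heuristic is an assumption about the
// check, not ollama's exact code.
func detectWSL() string {
	data, err := os.ReadFile("/proc/version")
	if err != nil {
		return "unknown"
	}
	version := string(data)
	switch {
	case strings.Contains(version, "microsoft-standard"):
		return "WSL2" // GPU passthrough possible with NVIDIA's driver setup
	case strings.Contains(strings.ToLower(version), "microsoft"):
		return "WSL1" // fail fast: GPU passthrough is not possible here
	default:
		return "native"
	}
}

func main() {
	fmt.Println(detectWSL())
}
```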
16f4603b67 Improve maintainability of Radeon card list
This moves the supported AMD GPUs into a single, easier-to-maintain list,
which should make it simpler to update over time.
2024-01-03 15:16:56 -08:00
1184686649 Merge pull request #1776 from dhiltgen/render_group
Add ollama user to render group for Radeon support
2024-01-03 13:07:54 -08:00
2588cb2daa Add ollama user to render group for Radeon support
For the ROCm libraries to access the driver, we need to add the ollama user
to the render group.
2024-01-03 12:56:31 -08:00
c7ea8f237e set num_gpu to 1 only by default on darwin arm64 (#1771) 2024-01-03 14:10:29 -05:00
0b3118e0af fix: relay request opts to loaded llm prediction (#1761) 2024-01-03 12:01:42 -05:00
05face44ef Merge pull request #1683 from dhiltgen/fix_windows_test
Fix windows system memory lookup
2024-01-03 09:00:39 -08:00
a2ad952440 Fix windows system memory lookup
This refines the gpu package error handling and fixes a bug with the
system memory lookup on windows.
2024-01-03 08:50:01 -08:00
5fea4410be Merge pull request #1680 from dhiltgen/better_patching
Refactor how we augment llama.cpp and refine windows native build
2024-01-03 08:10:17 -08:00
b846eb64d0 Fix template api doc description (#1661) 2024-01-03 11:00:59 -05:00
3c5dd9ed1d Update README.md (#1766) 2024-01-03 10:44:22 -05:00
b17ccd0542 Update import.md 2024-01-02 22:28:18 -05:00
d0409f772f keyboard shortcut help (#1764) 2024-01-02 18:04:12 -08:00
ec261422af use docker build in build scripts 2024-01-02 19:32:54 -05:00
0498f7ce56 Get rid of one-line llama.log
This one log line was triggering a single-line llama.log to be generated
in the working directory of the server.
2024-01-02 15:36:16 -08:00
738a8d12eb Rename the ollama cmakefile 2024-01-02 15:36:16 -08:00
d966b730ac Switch windows build to fully dynamic
Refactor where we store build outputs, and support a fully dynamic loading
model on Windows so the base executable has no special dependencies and
thus doesn't require a special PATH.
2024-01-02 15:36:16 -08:00
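On Windows, "fully dynamic" typically means resolving the library at runtime via LoadLibrary rather than linking against an import library. A hedged Go illustration using golang.org/x/sys/windows; names here are hypothetical:

```go
//go:build windows

package llm

import (
	"fmt"

	"golang.org/x/sys/windows"
)

// loadServer resolves a llama.cpp server DLL at runtime by absolute path,
// so the base executable has no hard dependency on it and needs no special
// PATH entry. Names are hypothetical, not ollama's actual API.
func loadServer(dllPath string) (*windows.DLL, error) {
	dll, err := windows.LoadDLL(dllPath)
	if err != nil {
		return nil, fmt.Errorf("unable to load %s: %w", dllPath, err)
	}
	return dll, nil
}
```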
9a70aecccb Refactor how we augment llama.cpp
This changes the model for llama.cpp inclusion so we're no longer applying a
patch; instead, the C++ code lives directly in the ollama tree, which should
make it easier to refine and update over time.
2024-01-02 15:35:55 -08:00
22cd5eaab6 Added Ollama-SwiftUI to integrations (#1747) 2024-01-02 09:47:50 -05:00
304a8799ca Update README.md (#1757) 2024-01-02 09:47:08 -05:00
2a2fa3c329 api.md cleanup & formatting 2023-12-27 14:32:35 -05:00
55978c1dc9 clean up cache api option 2023-12-27 14:27:45 -05:00
d4ebdadbe7 enable cache_prompt by default 2023-12-27 14:23:42 -05:00
e201efa14b Add windows native build instructions 2023-12-25 08:31:34 -08:00
c5f21f73a4 follow best practices by adding resp.Body.Close() (#1708) 2023-12-25 09:01:37 -05:00
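The best practice in question: every http.Response body in Go must be closed, usually with defer, or the underlying connection can't be returned to the pool and file descriptors leak. A minimal illustration of the pattern:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func fetch(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	// Closing the body releases the connection for keep-alive reuse;
	// skipping this leaks file descriptors over time.
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

func main() {
	body, err := fetch("http://localhost:11434/api/tags")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d bytes\n", len(body))
}
```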
371bc73531 Update README.md 2023-12-24 11:54:08 -05:00
c651d8b824 Update README.md 2023-12-23 11:18:12 -05:00
cf50ef5b51 Merge pull request #1684 from dhiltgen/tag_integration_tests
Guard integration tests with a tag
2023-12-22 16:43:41 -08:00
697bea6939 Guard integration tests with a tag
This should help CI avoid running the integration test logic in a
container where it's not currently possible.
2023-12-22 16:33:27 -08:00
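In Go, guarding tests behind a tag is done with a build constraint at the top of the test file, so the file is only compiled when the tag is passed explicitly; the tag and test names below are an assumption:

```go
//go:build integration

package server

import "testing"

// This file only compiles when the tag is supplied, so a plain
// `go test ./...` in a CI container skips it entirely. To opt in:
//
//	go test -tags=integration ./...
func TestServerIntegration(t *testing.T) {
	t.Skip("placeholder: real integration logic would run here")
}
```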
10da41d677 Add Cache flag to api (#1642) 2023-12-22 17:16:20 -05:00
db356c8519 post-response templating (#1427) 2023-12-22 17:07:05 -05:00
b80081022f cache docker builds in build_linux.sh 2023-12-22 16:01:20 -05:00
790457398a Merge pull request #1677 from jmorganca/mattw/docrunupdate
update the "where are models stored" question
2023-12-22 09:56:27 -08:00
511069a2a5 update where are models stored q
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-12-22 09:48:44 -08:00
5a85070c22 Update readmes, requirements, packagejsons, etc for all examples (#1452)
Most of the examples needed README updates to show how to run them. Some of the requirements.txt files had extra content that wasn't needed, or were missing
altogether. Apparently some folks like to run npm start to run TypeScript,
so a start script was added to all TypeScript examples that didn't have one
before.

Basically just a lot of cleanup.

Signed-off-by: Matt Williams <m@technovangelist.com>
2023-12-22 09:10:41 -08:00