Revamp ROCm support

This refines where we extract the LLM libraries by adding a new
OLLAMA_HOME env var, which defaults to `~/.ollama`. The extraction logic was
already idempotent, so this should speed up startups after the first time a
new release is deployed. It also cleans up after itself.
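A minimal sketch of how the override is meant to be used (the exact layout of the extracted libraries under OLLAMA_HOME is an implementation detail):

```sh
# Point Ollama's home (and the extracted LLM libraries) somewhere other than ~/.ollama
export OLLAMA_HOME=/data/ollama
ollama serve
```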

We now build only a single ROCm version (the latest major) on both Windows
and Linux.  Given the large size of ROCm's tensor files, we split the
dependency out.  It's bundled into the installer on Windows, and is a
separate download on Linux.  The Linux install script now detects the
presence of AMD GPUs, checks whether ROCm v6 is already present, and if
not, downloads our dependency tar file.
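The detection amounts to roughly the following (a simplified sketch rather than the actual install.sh; the tarball URL and extraction prefix below are placeholders):

```sh
# Sketch: only fetch the ROCm dependency archive when an AMD GPU is present
# and no existing ROCm v6 installation can be found.
if lspci -d '1002:' | grep -q .; then                        # PCI vendor 1002 = AMD
  if ! ls /opt/rocm-6*/lib/librocblas.so* >/dev/null 2>&1; then
    echo "AMD GPU detected and no ROCm v6 found; fetching Ollama's ROCm dependency"
    curl -fsSL "$ROCM_TARBALL_URL" | sudo tar -xzf - -C /usr/share/ollama   # placeholder URL and prefix
  fi
fi
```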

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us.  For Windows, we now use Go's Windows
dynamic library loading logic to access the amdhip64.dll APIs and query
GPU information.
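On the Linux side, the kind of information this discovery relies on can be inspected by hand through sysfs; for example (assuming a ROCm/KFD-style topology under /sys/class/kfd, whose exact paths and fields can vary with driver and kernel versions):

```sh
# Each numbered node in the KFD topology describes a CPU or GPU; the gfx target
# version reported here is what gets compared against the versions ROCm supports.
grep -H gfx_target_version /sys/class/kfd/kfd/topology/nodes/*/properties
```

The Windows side queries the equivalent information through the amdhip64.dll HIP runtime instead of sysfs.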
Daniel Hiltgen
2024-02-15 17:15:09 -08:00
parent 2e20110e50
commit 6c5ccb11f9
27 changed files with 1091 additions and 588 deletions

View File

@@ -116,7 +116,7 @@ Note: The windows build for Ollama is still under development.
Install required tools:
- MSVC toolchain - C/C++ and cmake as minimal requirements
- MSVC toolchain - C/C++ and cmake as minimal requirements - You must build from a "Developer Shell" with the environment variables set
- go version 1.22 or higher
- MinGW (pick one variant) with GCC.
- <https://www.mingw-w64.org/>
@@ -132,6 +132,6 @@ go build .
#### Windows CUDA (NVIDIA)
In addition to the common Windows development tools described above, install:
In addition to the common Windows development tools described above, install CUDA **AFTER** you install MSVC.
- [NVIDIA CUDA](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html)

View File

@@ -10,6 +10,14 @@ Install Ollama by running this one-liner:
curl -fsSL https://ollama.com/install.sh | sh
```
## AMD Radeon GPU support
While AMD has contributed the `amdgpu` driver upstream to the official Linux
kernel source, the version is older and may not support all ROCm features. We
recommend you install the latest driver from
https://www.amd.com/en/support/linux-drivers for best support of your Radeon
GPU.
## Manual install
### Download the `ollama` binary

View File

@@ -67,6 +67,43 @@ You can see what features your CPU has with the following.
cat /proc/cpuinfo | grep flags | head -1
```
## AMD Radeon GPU Support
Ollama leverages the AMD ROCm library, which does not support all AMD GPUs. In
some cases you can force the system to try to use a similar, supported GPU type.
For example, the Radeon RX 5400 is `gfx1034` (also known as 10.3.4); however,
ROCm does not currently support this patch level, and the closest supported
version is `gfx1030`. You can use the environment variable
`HSA_OVERRIDE_GFX_VERSION` with `x.y.z` syntax. For example, to force the
system to run on the RX 5400, you would set
`HSA_OVERRIDE_GFX_VERSION="10.3.0"` as an environment variable for the server.
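For instance, when launching the server by hand, the override could be supplied like this (systemd or container setups would set the variable in their own way):

```sh
HSA_OVERRIDE_GFX_VERSION="10.3.0" ollama serve
```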
At this time, the known supported GPU types are the following (this list may
change from release to release):
- gfx900
- gfx906
- gfx908
- gfx90a
- gfx940
- gfx941
- gfx942
- gfx1030
- gfx1100
- gfx1101
- gfx1102
This will not work for all unsupported GPUs. Reach out on [Discord](https://discord.gg/ollama)
or file an [issue](https://github.com/ollama/ollama/issues) for additional help.
## Installing older versions on Linux
If you run into problems on Linux and want to install an older version, you can tell the install script
which version to install.
```sh
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION="0.1.27" sh
```
## Known issues
* N/A

View File

@@ -4,7 +4,7 @@ Welcome to the Ollama Windows preview.
No more WSL required!
Ollama now runs as a native Windows application, including NVIDIA GPU support.
Ollama now runs as a native Windows application, including NVIDIA and AMD Radeon GPU support.
After installing Ollama Windows Preview, Ollama will run in the background and
the `ollama` command line is available in `cmd`, `powershell` or your favorite
terminal application. As usual the Ollama [api](./api.md) will be served on
@@ -21,6 +21,7 @@ Logs will often be helpful in diagnosing the problem (see
* Windows 10 or newer, Home or Pro
* NVIDIA 452.39 or newer Drivers if you have an NVIDIA card
* AMD Radeon Driver https://www.amd.com/en/support if you have a Radeon card
## API Access