63 Commits

Author SHA1 Message Date
Daniel Hiltgen
e91ae3d47d
Update ROCm (6.3 linux, 6.2 windows) and CUDA v12.8 (#9304)
* Bump cuda and rocm versions

Update ROCm to linux:6.3 win:6.2 and CUDA v12 to 12.8.
Yum has some silent failure modes, so largely switch to dnf.

* Fix windows build script
2025-02-25 13:47:36 -08:00
Michael Yang
ba9ec3d05e ci: use clang for windows cpu builds
clang outputs are faster. we were previously building with clang via gcc
wrapper in cgo but this was missed during the build updates so there was
a drop in performance
2025-02-20 20:22:36 +00:00
Michael Yang
7b5d916a9a ci: set owner/group in tarball
set owner and group when building the linux tarball so extracted files
are consistent. this is the behaviour of release tarballs in version
0.5.7 and lower
2025-02-18 20:11:09 +00:00
Michael Yang
1f766c36fb
ci: use windows-2022 to sign and bundle (#8941)
ollama requires vcruntime140_1.dll which isn't found on 2019. previously
the job used the windows runner (2019) but it explicitly installs
2022 to build the app. since the sign job doesn't actually build
anything, it can use the windows-2022 runner instead.
2025-02-08 13:07:00 -08:00
Michael Yang
1c198977ec
ci: fix linux archive (#8862)
the find returns intermediate directories which pulls the parent
directories. it also omits files under lib/ollama.

switch back to globbing
2025-02-05 19:45:58 -08:00
Michael Yang
070ad913ac ci: fix linux archive 2025-02-05 15:08:02 -08:00
Michael Yang
63f0269f7f ci: split docker build by platform
this improves build reliability and concurrency
2025-02-04 17:04:27 -08:00
Michael Yang
65b7ecac7b fix extra quote 2025-02-04 08:35:30 -08:00
Michael Yang
f9d2d89135 fix linux archive 2025-02-03 16:12:33 -08:00
Michael Yang
669dc31cf3 fix build 2025-02-03 15:10:51 -08:00
Michael Yang
e806184023 fix release workflow 2025-02-03 13:19:57 -08:00
Michael Yang
475333d533 fix docker build-args
env context is not accessible from job.*.strategy. since it's in the
environment, just tell docker to use the environment variable[1]

[1]: https://docs.docker.com/reference/cli/docker/buildx/build/#build-arg
2025-01-31 14:56:02 -08:00
Michael Yang
39fd89308c build: set CFLAGS=-O3 specifically for cpu.go 2025-01-31 10:25:39 -08:00
Michael Yang
3f0cb36bdb build: set goflags in linux release 2025-01-30 13:07:32 -08:00
Michael Yang
dcfb7a105c
next build (#8539)
* add build to .dockerignore

* test: only build one arch

* add build to .gitignore

* fix ccache path

* filter amdgpu targets

* only filter if autodetecting

* Don't clobber gpu list for default runner

This ensures the GPU specific environment variables are set properly

* explicitly set CXX compiler for HIP

* Update build_windows.ps1

This isn't complete, but is close.  Dependencies are missing, and it only builds the "default" preset.

* build: add ollama subdir

* add .git to .dockerignore

* docs: update development.md

* update build_darwin.sh

* remove unused scripts

* llm: add cwd and build/lib/ollama to library paths

* default DYLD_LIBRARY_PATH to LD_LIBRARY_PATH in runner on macOS

* add additional cmake output vars for msvc

* interim edits to make server detection logic work with dll directories like lib/ollama/cuda_v12

* remove unncessary filepath.Dir, cleanup

* add hardware-specific directory to path

* use absolute server path

* build: linux arm

* cmake install targets

* remove unused files

* ml: visit each library path once

* build: skip cpu variants on arm

* build: install cpu targets

* build: fix workflow

* shorter names

* fix rocblas install

* docs: clean up development.md

* consistent build dir removal in development.md

* silence -Wimplicit-function-declaration build warnings in ggml-cpu

* update readme

* update development readme

* llm: update library lookup logic now that there is one runner (#8587)

* tweak development.md

* update docs

* add windows cuda/rocm tests

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2025-01-29 15:03:38 -08:00
Daniel Hiltgen
581a4a5553
ci: fix artifact path prefix for missing windows payloads (#8052)
upload-artifacts strips off leading common paths so when
the ./build/ artifacts were removed, the ./dist/windows-amd64
prefix became common and was stripped, making the
later download-artifacts place them in the wrong location
2024-12-11 10:59:32 -08:00
Daniel Hiltgen
6a6328a5e9
ci: build dir changed (#8037)
Remove no longer relevant build log dir
2024-12-10 20:33:34 -08:00
Daniel Hiltgen
4879a234c4
build: Make target improvements (#7499)
* llama: wire up builtin runner

This adds a new entrypoint into the ollama CLI to run the cgo built runner.
On Mac arm64, this will have GPU support, but on all other platforms it will
be the lowest common denominator CPU build.  After we fully transition
to the new Go runners more tech-debt can be removed and we can stop building
the "default" runner via make and rely on the builtin always.

* build: Make target improvements

Add a few new targets and help for building locally.
This also adjusts the runner lookup to favor local builds, then
runners relative to the executable, and finally payloads.

* Support customized CPU flags for runners

This implements a simplified custom CPU flags pattern for the runners.
When built without overrides, the runner name contains the vector flag
we check for (AVX) to ensure we don't try to run on unsupported systems
and crash.  If the user builds a customized set, we omit the naming
scheme and don't check for compatibility.  This avoids checking
requirements at runtime, so that logic has been removed as well.  This
can be used to build GPU runners with no vector flags, or CPU/GPU
runners with additional flags (e.g. AVX512) enabled.

* Use relative paths

If the user checks out the repo in a path that contains spaces, make gets
really confused so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

This removes support for v0.3.6 and older versions (before the tar bundle)
and ensures we clean up prior libraries before extracting the bundle(s).
Without this change, runners and dependent libraries could leak when we
update and lead to subtle runtime errors.
2024-12-10 09:47:19 -08:00
Daniel Hiltgen
046054fa3b
CI: Switch to v13 macos runner (#7498) 2024-11-04 13:02:07 -08:00
Daniel Hiltgen
95483f348b
CI: matrix strategy fix (#7496)
Github actions matrix strategy can't access env settings
2024-11-04 10:48:35 -08:00
Daniel Hiltgen
44bd9e5994
Sign windows arm64 official binaries (#7493) 2024-11-04 09:15:14 -08:00
Daniel Hiltgen
b8d5036e33
CI: omit unused tools for faster release builds (#7432)
This leverages caching, and some reduced installer scope to try
to speed up builds. It also tidies up some windows build logic
that was only relevant for the older generate/cmake builds.
2024-11-02 13:56:54 -07:00
Daniel Hiltgen
712e99d477
Soften windows clang requirement (#7428)
This will no longer error if built with regular gcc on windows.  To help
triage issues that may come in related to different compilers, the runner now
reports the compier used by cgo.
2024-10-30 12:28:36 -07:00
Daniel Hiltgen
b754f5a6a3
Remove submodule and shift to Go server - 0.4.0 (#7157)
* Remove llama.cpp submodule and shift new build to top

* CI: install msys and clang gcc on win

Needed for deepseek to work properly on windows
2024-10-30 10:34:28 -07:00
Daniel Hiltgen
e9e9bdb8d9
CI: Fix win arm version defect (#6940)
write-host in powershell writes directly to the console and will not be picked
up by a pipe.  Echo, or write-output will.
2024-09-24 15:18:10 -07:00
Daniel Hiltgen
2a038c1d7e
CI: win arm artifact dist dir (#6900)
The upload artifact is missing the dist prefix since all
payloads are in the same directory, so restore the prefix
on download.
2024-09-20 19:16:18 -07:00
Daniel Hiltgen
616c5eafee
CI: win arm adjustments (#6898) 2024-09-20 16:58:56 -07:00
Daniel Hiltgen
f5ff917b1d
CI: adjust step ordering for win arm to match x64 (#6895) 2024-09-20 14:20:57 -07:00
Daniel Hiltgen
d632e23fba
Add Windows arm64 support to official builds (#5712)
* Unified arm/x86 windows installer

This adjusts the installer payloads to be architecture aware so we can cary
both amd64 and arm64 binaries in the installer, and install only the applicable
architecture at install time.

* Include arm64 in official windows build

* Harden schedule test for slow windows timers

This test seems to be a bit flaky on windows, so give it more time to converge
2024-09-20 13:09:38 -07:00
Daniel Hiltgen
8f9ab5e14d
CI: dist directories no longer present (#6834)
The new buildx based build no longer leaves the dist/linux-* directories
around, so we don't have to clean them up before uploading.
2024-09-16 17:31:37 -07:00
Daniel Hiltgen
7717bb6a84
CI: clean up naming, fix tagging latest (#6832)
The rocm CI step for RCs was incorrectly tagging them as the latest rocm build.
The multiarch manifest was incorrectly tagged twice (with and without the
prefix "v").  Static windows artifacts weren't being carried between build
jobs.  This also fixes the latest tagging script.
2024-09-16 16:18:41 -07:00
Daniel Hiltgen
0ec2915ea7
CI: set platform build build_linux script to keep buildx happy (#6829)
The runners don't have emulation set up so the default multi-platform build
wont work.
2024-09-16 14:07:29 -07:00
Daniel Hiltgen
cd5c8f6471
Optimize container images for startup (#6547)
* Optimize container images for startup

This change adjusts how to handle runner payloads to support
container builds where we keep them extracted in the filesystem.
This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
size, and should result in faster startup times for container images.

* Refactor payload logic and add buildx support for faster builds

* Move payloads around

* Review comments

* Converge to buildx based helper scripts

* Use docker buildx action for release
2024-09-12 12:10:30 -07:00
Daniel Hiltgen
a017cf2fea
Split rocm back out of bundle (#6432)
We're over budget for github's maximum release artifact size with rocm + 2 cuda
versions.  This splits rocm back out as a discrete artifact, but keeps the layout so it can
be extracted into the same location as the main bundle.
2024-08-20 07:26:38 -07:00
Daniel Hiltgen
19e5a890f7
CI: remove directories from dist dir before upload step (#6429) 2024-08-19 15:19:21 -07:00
Daniel Hiltgen
f91c9e3709
CI: handle directories during checksum (#6427) 2024-08-19 13:48:45 -07:00
Daniel Hiltgen
d8be22e47d Fix overlapping artifact name on CI 2024-08-19 12:07:18 -07:00
Daniel Hiltgen
f9e31da946 Review comments 2024-08-19 10:36:15 -07:00
Daniel Hiltgen
927d98a6cd Add windows cuda v12 + v11 support 2024-08-19 09:38:53 -07:00
Daniel Hiltgen
74d45f0102 Refactor linux packaging
This adjusts linux to follow a similar model to windows with a discrete archive
(zip/tgz) to cary the primary executable, and dependent libraries. Runners are
still carried as payloads inside the main binary

Darwin retain the payload model where the go binary is fully self contained.
2024-08-19 09:38:53 -07:00
Daniel Hiltgen
feedf49c71 Go back to a pinned Go version
Go version 1.22.6 is triggering AV false positives, so go back to 1.22.5
2024-08-13 11:45:44 -07:00
Daniel Hiltgen
5d604eec5b Bump Go patch version 2024-07-22 16:16:28 -07:00
Daniel Hiltgen
1f50356e8e Bump ROCm on windows to 6.1.2
This also adjusts our algorithm to favor our bundled ROCm.
I've confirmed VRAM reporting still doesn't work properly so we
can't yet enable concurrency by default.
2024-07-10 11:01:22 -07:00
Daniel Hiltgen
b51e3b63ac Statically link c++ and thread lib
This makes sure we statically link the c++ and thread library on windows
to avoid unnecessary runtime dependencies on non-standard DLLs
2024-07-09 11:34:30 -07:00
jmorganca
c12f1c5b99 release: move mingw library cleanup to correct job 2024-07-06 16:12:29 -04:00
jmorganca
a08f20d910 release: remove unwanted mingw dll.a files 2024-07-06 15:21:15 -04:00
jmorganca
6cea036027 Revert "llm: only statically link libstdc++"
This reverts commit 5796bfc4013f4ebe26cdbf13554332a25c405027.
2024-07-06 15:10:48 -04:00
jmorganca
5796bfc401 llm: only statically link libstdc++ 2024-07-06 14:06:20 -04:00
Daniel Hiltgen
a12283e2ff Implement custom github release action
This implements the release logic we want via gh cli
to support updating releases with rc tags in place and retain
release notes and other community reactions.
2024-06-15 11:36:56 -07:00
Jeffrey Morgan
afd2b058b4
set codesign timeout to longer (#4605) 2024-05-23 22:46:23 -07:00