2024-10-17 15:03:09 -07:00
# Helpers for managing our vendored llama.cpp repo and patch set
build: Make target improvements (#7499)
* llama: wire up builtin runner
This adds a new entrypoint into the ollama CLI to run the cgo built runner.
On Mac arm64, this will have GPU support, but on all other platforms it will
be the lowest common denominator CPU build. After we fully transition
to the new Go runners more tech-debt can be removed and we can stop building
the "default" runner via make and rely on the builtin always.
* build: Make target improvements
Add a few new targets and help for building locally.
This also adjusts the runner lookup to favor local builds, then
runners relative to the executable, and finally payloads.
* Support customized CPU flags for runners
This implements a simplified custom CPU flags pattern for the runners.
When built without overrides, the runner name contains the vector flag
we check for (AVX) to ensure we don't try to run on unsupported systems
and crash. If the user builds a customized set, we omit the naming
scheme and don't check for compatibility. This avoids checking
requirements at runtime, so that logic has been removed as well. This
can be used to build GPU runners with no vector flags, or CPU/GPU
runners with additional flags (e.g. AVX512) enabled.
* Use relative paths
If the user checks out the repo in a path that contains spaces, make gets
really confused so use relative paths for everything in-repo to avoid breakage.
* Remove payloads from main binary
* install: clean up prior libraries
This removes support for v0.3.6 and older versions (before the tar bundle)
and ensures we clean up prior libraries before extracting the bundle(s).
Without this change, runners and dependent libraries could leak when we
update and lead to subtle runtime errors.
2024-12-10 09:47:19 -08:00
REPO_ROOT:=./
DEST_DIR:=./llama/

include $(DEST_DIR)vendoring
LLAMACPP_REPO := ./llama/vendor/
# Relative to the vendor dir
VENDOR_RELATIVE_PATCH_DIR := ../patches/
help-sync:
	@echo "The following make targets will help you update llama.cpp to a new base commit, or work on new features/fixes"
	@echo ""
	@echo " make apply-patches # Establish the tracking repo if not already present, reset to the base commit, and apply our patch set"
	@echo " make sync # Vendor llama.cpp and ggml from the tracking repo working tree"
	@echo " make sync-clean # Remove all vendored files"
	@echo " make create-patches # Generate the patch set based on the current commits in the tracking repo since the base commit"
	@echo ""
	@echo "For more details on the workflow, see the Vendoring section in 'docs/development.md'"
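
# Typical usage, sketched from the help text above. This is an illustration,
# not an additional target, and the exact ordering of the last two steps is an
# assumption; docs/development.md remains the authoritative description.
#
#   make apply-patches     # clone or reset the tracking repo, apply our patch set
#   (edit and commit inside the tracking repo, $(LLAMACPP_REPO))
#   make create-patches    # regenerate the patch set from the new commits
#   make sync              # vendor the resulting tree into llama/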
apply-patches: $(LLAMACPP_REPO)
	@if ! git -C $(LLAMACPP_REPO) --no-pager diff --exit-code ; then \
		echo "ERROR: Your llama.cpp repo is dirty. The apply-patches target requires a clean working tree"; \
		echo "To clobber: git -C $(LLAMACPP_REPO) reset --hard HEAD" ; \
		exit 1; \
	fi
	@echo "Checking out $(LLAMACPP_BASE_COMMIT)"
	@git -C $(LLAMACPP_REPO) checkout -q $(LLAMACPP_BASE_COMMIT) || \
		git -C $(LLAMACPP_REPO) fetch --all && git -C $(LLAMACPP_REPO) checkout -q $(LLAMACPP_BASE_COMMIT)
	@echo "Applying ollama patches..."
	@cd $(LLAMACPP_REPO) && git -c 'user.name=nobody' -c 'user.email=<>' am -3 $(VENDOR_RELATIVE_PATCH_DIR)*.patch || \
		echo "Please resolve the conflicts in $(LLAMACPP_REPO), and run 'git am --continue' to continue applying subsequent patches"
	@echo ""
	@echo "The tracking repo $(LLAMACPP_REPO) is now in a detached state with all patches applied."
	@echo "Don't forget to commit any changes you make and run 'make create-patches' "

$(LLAMACPP_REPO):
	@echo "Cloning llama.cpp to $(LLAMACPP_REPO)"
	git clone https://github.com/ggerganov/llama.cpp.git $@
create-patches: $(LLAMACPP_REPO)
	@if ! git -C $(LLAMACPP_REPO) --no-pager diff --exit-code ; then \
		echo "ERROR: Your llama.cpp repo is dirty. You must commit any pending changes for format-patch to generate patches"; \
		exit 1; \
	fi
	@cd $(LLAMACPP_REPO) && git format-patch --no-signature --no-numbered --zero-commit -o $(VENDOR_RELATIVE_PATCH_DIR) $(LLAMACPP_BASE_COMMIT)
# Vendoring template logic
EXCLUDED_FILES=sgemm.cpp sgemm.h sampling_ext.cpp sampling_ext.h stb_image.h json.hpp llama_darwin.c base64.hpp
OLLAMA_NATIVE_FILES=mllama.cpp mllama.h llama_darwin.c sampling_ext.cpp sampling_ext.h
define vendor_file
$(strip $(addprefix $(2),$(notdir $1))) : $(addprefix $(LLAMACPP_REPO),$(1))
ifneq ($$(filter-out $(EXCLUDED_FILES),$(notdir $1)),)
	@echo "vendoring $1"; \
		mkdir -p $$(dir $$@) && \
		echo "/**" > $$@ && \
		echo " * llama.cpp - commit $$(LLAMACPP_BASE_COMMIT) - do not edit this file" >> $$@ && \
		echo " *" >> $$@ && \
		sed 's/^/ * /' <$(LLAMACPP_REPO)/LICENSE | sed 's/ *$$$$//' >> $$@ && \
		echo " */" >> $$@ && \
		echo "" >> $$@ && \
		cat $$< >> $$@
else
	@echo "vendoring $1"; \
		mkdir -p $$(dir $$@) && \
		cat $$< > $$@
endif
VENDORED_FILES += $(strip $(addprefix $(2),$(notdir $1)))
endef
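
# Illustration only: a sketch of what the template generates, with the recipe
# abbreviated. Assuming DEST_DIR and LLAMACPP_REPO keep the values set at the
# top of this file, $(eval $(call vendor_file,src/llama.cpp,$(DEST_DIR)))
# defines roughly:
#
#   ./llama/llama.cpp : ./llama/vendor/src/llama.cpp
#   	@echo "vendoring src/llama.cpp"; mkdir -p $(dir $@) && ... cat $< >> $@
#   VENDORED_FILES += ./llama/llama.cpp
#
# llama.cpp is not listed in EXCLUDED_FILES, so the ifneq branch that prepends
# the license header is selected. An excluded file such as sgemm.h takes the
# else branch and is copied verbatim.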
# llama.cpp files -> llama/
LLAMACPP_FILES=\
	src/unicode.cpp \
	src/unicode.h \
	src/unicode-data.cpp \
	src/unicode-data.h \
	src/llama.cpp \
	src/llama-adapter.cpp \
	src/llama-adapter.h \
	src/llama-arch.cpp \
	src/llama-arch.h \
	src/llama-batch.cpp \
	src/llama-batch.h \
	src/llama-chat.cpp \
	src/llama-chat.h \
	src/llama-context.cpp \
	src/llama-context.h \
	src/llama-cparams.cpp \
	src/llama-cparams.h \
	src/llama-grammar.cpp \
	src/llama-grammar.h \
	src/llama-hparams.cpp \
	src/llama-hparams.h \
	src/llama-impl.cpp \
	src/llama-impl.h \
	src/llama-kv-cache.cpp \
	src/llama-kv-cache.h \
	src/llama-mmap.cpp \
	src/llama-mmap.h \
	src/llama-model-loader.cpp \
	src/llama-model-loader.h \
	src/llama-model.cpp \
	src/llama-model.h \
	src/llama-quant.cpp \
	src/llama-quant.h \
	src/llama-sampling.cpp \
	src/llama-sampling.h \
	src/llama-vocab.cpp \
	src/llama-vocab.h \
	include/llama.h \
	include/llama-cpp.h \
	ggml/include/ggml-cpu.h \
	ggml/src/ggml-cpu/llamafile/sgemm.cpp \
	ggml/src/ggml-cpu/llamafile/sgemm.h
$(foreach name,$(LLAMACPP_FILES),$(eval $(call vendor_file,$(name),$(DEST_DIR))))
# llama.cpp files -> llama/llamafile
LLAMAFILE_FILES= \
	ggml/src/ggml-cpu/llamafile/sgemm.h
$(foreach name,$(LLAMAFILE_FILES),$(eval $(call vendor_file,$(name),$(DEST_DIR)llamafile/)))
# ggml files -> llama/
GGML_FILES= \
	ggml/src/ggml.c \
	ggml/include/ggml.h \
	ggml/src/ggml-quants.c \
	ggml/src/ggml-quants.h \
	ggml/src/ggml-metal/ggml-metal.metal \
	ggml/include/ggml-metal.h \
	ggml/src/ggml-impl.h \
	ggml/src/ggml-threading.h \
	ggml/include/ggml-cuda.h \
	ggml/src/ggml-backend-reg.cpp \
	ggml/src/ggml-metal/ggml-metal-impl.h \
	ggml/src/ggml-common.h \
	ggml/include/ggml-backend.h \
	ggml/src/ggml-backend.cpp \
	ggml/src/ggml-backend-impl.h \
	ggml/include/ggml-alloc.h \
	ggml/src/ggml-alloc.c \
	ggml/include/ggml-blas.h \
	ggml/include/ggml-cpp.h \
	ggml/src/ggml-threading.cpp \
	ggml/src/ggml-blas/ggml-blas.cpp \
	ggml/src/ggml-cpu/ggml-cpu.c \
	ggml/src/ggml-cpu/ggml-cpu.cpp \
	ggml/src/ggml-cpu/ggml-cpu-aarch64.h \
	ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp \
	ggml/src/ggml-cpu/ggml-cpu-quants.h \
	ggml/src/ggml-cpu/ggml-cpu-quants.c \
	ggml/src/ggml-cpu/ggml-cpu-impl.h \
	ggml/src/ggml-cpu/ggml-cpu-traits.h \
	ggml/src/ggml-cpu/ggml-cpu-traits.cpp \
	ggml/src/ggml-cpu/amx/amx.h \
	ggml/src/ggml-cpu/amx/amx.cpp \
	ggml/src/ggml-cpu/amx/mmq.cpp \
	ggml/src/ggml-cpu/amx/mmq.h
$(foreach name,$(GGML_FILES),$(eval $(call vendor_file,$(name),$(DEST_DIR))))
$(DEST_DIR)ggml-metal-embed.metal: $(DEST_DIR)ggml-common.h $(DEST_DIR)ggml-metal-impl.h
	@sed -e '/__embed_ggml-common.h__/r $(DEST_DIR)/ggml-common.h' \
		-e '/__embed_ggml-common.h__/d' \
		< $(DEST_DIR)/ggml-metal.metal \
		> $(DEST_DIR)/ggml-metal-embed.metal.tmp
	@sed -e '/#include "ggml-metal-impl.h"/r $(DEST_DIR)/ggml-metal-impl.h' \
		-e '/#include "ggml-metal-impl.h"/d' \
		< $(DEST_DIR)/ggml-metal-embed.metal.tmp \
		> $(DEST_DIR)/ggml-metal-embed.metal
	@rm $(DEST_DIR)/ggml-metal-embed.metal.tmp

VENDORED_FILES += $(DEST_DIR)ggml-metal-embed.metal
# TODO generalize renaming pattern if we have more of these
$(DEST_DIR)ggml-metal_darwin_arm64.m : $(LLAMACPP_REPO)ggml/src/ggml-metal/ggml-metal.m
	@echo "vendoring $(subst $(LLAMACPP_REPO),,$<)"; \
		mkdir -p $(dir $@) && \
		echo "/**" > $@ && \
		echo " * llama.cpp - commit $(LLAMACPP_BASE_COMMIT) - do not edit this file" >> $@ && \
		echo " *" >> $@ && \
		sed 's/^/ * /' <$(LLAMACPP_REPO)/LICENSE | sed 's/ *$$//' >> $@ && \
		echo " */" >> $@ && \
		echo "" >> $@ && \
		cat $< >> $@
VENDORED_FILES += $(DEST_DIR)ggml-metal_darwin_arm64.m
# ggml-cuda -> llama/ggml-cuda/
GGML_CUDA_FILES= ggml/src/ggml-cuda/*.cu ggml/src/ggml-cuda/*.cuh
GGML_CUDA_FILES_EXPANDED = $(addprefix ggml/src/ggml-cuda/,$(notdir $(wildcard $(addprefix $(LLAMACPP_REPO),$(GGML_CUDA_FILES)))))
$(foreach name,$(GGML_CUDA_FILES_EXPANDED),$(eval $(call vendor_file,$(name),$(DEST_DIR)ggml-cuda/)))
|
2024-10-17 15:03:09 -07:00
|
|
|
|
|
|
|
GGML_TEMPLATE_FILES= ggml/src/ggml-cuda/template-instances/*.cu
|
|
|
|
GGML_TEMPLATE_FILES_EXPANDED = $(addprefix ggml/src/ggml-cuda/template-instances/,$(notdir $(wildcard $(addprefix $(LLAMACPP_REPO),$(GGML_TEMPLATE_FILES)))))
|
$(foreach name,$(GGML_TEMPLATE_FILES_EXPANDED),$(eval $(call vendor_file,$(name),$(DEST_DIR)ggml-cuda/template-instances/)))

GGML_VENDOR_FILES= ggml/src/ggml-cuda/vendors/*.h
GGML_VENDOR_FILES_EXPANDED=$(addprefix ggml/src/ggml-cuda/vendors/,$(notdir $(wildcard $(addprefix $(LLAMACPP_REPO),$(GGML_VENDOR_FILES)))))
$(foreach name,$(GGML_VENDOR_FILES_EXPANDED),$(eval $(call vendor_file,$(name),$(DEST_DIR)ggml-cuda/vendors/)))

# llava -> llama/
LAVA_FILES= \
	examples/llava/clip.cpp \
	examples/llava/clip.h \
	examples/llava/llava.cpp \
	examples/llava/llava.h \
	common/log.h \
	common/log.cpp \
	common/stb_image.h
# These files are mostly used by the llava code
# and shouldn't be necessary once we use clip.cpp directly
LAVA_FILES+= \
	common/common.cpp \
	common/common.h \
	common/sampling.cpp \
	common/sampling.h \
	common/json.hpp \
	common/json-schema-to-grammar.cpp \
	common/json-schema-to-grammar.h \
	common/base64.hpp
$(foreach name,$(LAVA_FILES),$(eval $(call vendor_file,$(name),$(DEST_DIR))))

$(DEST_DIR)build-info.cpp:
	@echo "Generating $@"
	@echo "int LLAMA_BUILD_NUMBER = 0;" > $@
	@echo "char const *LLAMA_COMMIT = \"$(LLAMACPP_BASE_COMMIT)\";" >> $@
	@echo "char const *LLAMA_COMPILER = \"\";" >> $@
	@echo "char const *LLAMA_BUILD_TARGET = \"\";" >> $@
VENDORED_FILES += $(DEST_DIR)build-info.cpp

sync: $(LLAMACPP_REPO) .WAIT $(VENDORED_FILES) .WAIT remove-stale-files

sync-clean:
	rm -f $(VENDORED_FILES) $(EXTRA_NATIVE_FILES)

PATS=*.c *.h *.cpp *.m *.metal *.cu *.cuh
NATIVE_DIRS=$(DEST_DIR) $(DEST_DIR)llamafile/ $(DEST_DIR)ggml-cuda/ $(DEST_DIR)ggml-cuda/template-instances/ $(DEST_DIR)ggml-cuda/vendors/
ALL_NATIVE_FILES=$(foreach dir,$(NATIVE_DIRS),$(wildcard $(addprefix $(dir),$(PATS))))
# Anything in NATIVE_DIRS that is neither vendored nor listed in OLLAMA_NATIVE_FILES is considered stale
EXTRA_NATIVE_FILES=$(filter-out $(VENDORED_FILES) $(addprefix $(DEST_DIR),$(OLLAMA_NATIVE_FILES)), $(ALL_NATIVE_FILES))
remove-stale-files:
	@rm -f $(EXTRA_NATIVE_FILES)

.PHONY: help-sync apply-patches sync create-patches remove-stale-files .WAIT

# Handy debugging for make variables
print-%:
	@echo '$*=$($*)'