Mirror of https://github.com/ollama/ollama.git, synced 2025-03-27 18:22:14 +01:00.
# Development

- Install cmake (and, optionally, the required tools for GPU support)
- Run `go generate ./...`
- Run `go build .`
Install required tools:

```shell
brew install go cmake gcc
```
Get the required libraries:

```shell
go generate ./...
```
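As background on this step: `go generate ./...` scans the package sources for `//go:generate` directives and runs the command each one names. In this repository those directives drive the llama.cpp library build; the `echo` directive below is a stand-in for illustration only, and `buildSteps` is a hypothetical helper, not part of ollama.

```go
package main

import "fmt"

// go generate scans Go source files for //go:generate comments and runs the
// command named in each directive. The directive below is a placeholder;
// ollama's real directives build the packed llama.cpp libraries.

//go:generate echo "would build llama.cpp here"

// buildSteps lists the development workflow in order (illustrative only).
func buildSteps() []string {
	return []string{"go generate ./...", "go build .", "./ollama"}
}

func main() {
	for i, s := range buildSteps() {
		fmt.Printf("%d. %s\n", i+1, s)
	}
}
```

Running `go generate ./...` from the repository root walks every package, so the directives fire without having to know where they live.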
Then build ollama:

```shell
go build .
```
Now you can run ollama:

```shell
./ollama
```
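Once running, the server exposes an HTTP API on localhost port 11434; a minimal client sketch follows. The request shape matches ollama's `POST /api/generate` endpoint; the model name is just an example, and `requestBody` is a hypothetical helper introduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// generateRequest mirrors the body the ollama server accepts at
// POST http://localhost:11434/api/generate. Only the fields used in this
// sketch are listed; the real API accepts more options.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"` // false: return one complete response
}

// requestBody builds the JSON payload for a single prompt.
func requestBody(model, prompt string) ([]byte, error) {
	return json.Marshal(generateRequest{Model: model, Prompt: prompt})
}

func main() {
	body, err := requestBody("llama3.2", "Why is the sky blue?")
	if err != nil {
		panic(err)
	}
	// Send with e.g.:
	//   curl http://localhost:11434/api/generate -d '<body>'
	fmt.Println(string(body))
}
```

Setting `stream` to false keeps the sketch simple; by default the endpoint streams the response as a sequence of JSON objects.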