5 Commits

Author SHA1 Message Date
Bruce MacDonald
bebb6823c0
server: validate local path on safetensor create ()
More validation during the safetensor creation process.
Properly handle relative paths (like ./model.safetensors) while rejecting absolute paths
Add comprehensive test coverage for various paths
No functionality changes for valid inputs - existing workflows remain unaffected
Leverages Go 1.24's new os.Root functionality for secure containment
2025-02-28 16:10:43 -08:00
Michael Yang
58245413f4
next ollama runner ()
feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It introduces 3 high level interfaces and a bunch of smaller helper interfaces to facilitate this.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go`
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`
- `ml.Tensor` defines the interface for a tensor and tensor operations

This is the first implementation of the new engine. Follow up PRs will implement more features:

- non-greedy sampling ()
- integration with Ollama and KV caching ()
- more model support () with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2025-02-13 16:31:21 -08:00
Patrick Devine
2539f2dbf9
Fix absolute path names + gguf detection () 2025-01-14 19:01:24 -08:00
Patrick Devine
8bccae4f92
show a more descriptive error in the client if it is newer than the server () 2025-01-09 10:12:30 -08:00
Patrick Devine
86a622cbdc
Update the /api/create endpoint to use JSON ()
Replaces `POST /api/create` to use JSON instead of a Modelfile.

This is a breaking change.
2024-12-31 18:02:30 -08:00