ollama

mirror of https://github.com/ollama/ollama.git synced 2025-04-08 03:48:21 +02:00

Author	SHA1	Message	Date
Michael Yang	58245413f4	next ollama runner (#7913) feat: add new Ollama engine using ggml through cgo This change introduces a new way to run pretrained models. It introduces 3 high level interfaces and a bunch of smaller helper interfaces to facilitate this. - `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go` - `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go` - `ml.Tensor` defines the interface for a tensor and tensor operations This is the first implementation of the new engine. Follow up PRs will implement more features: - non-greedy sampling (#8410) - integration with Ollama and KV caching (#8301) - more model support (#9080) with more coming soon Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2025-02-13 16:31:21 -08:00
Patrick Devine	c7cb0f0602	image processing for llama3.2 (#6963) Co-authored-by: jmorganca <jmorganca@gmail.com> Co-authored-by: Michael Yang <mxyng@pm.me> Co-authored-by: Jesse Gross <jesse@ollama.com>	2024-10-18 16:12:35 -07:00
Michael Yang	b732beba6a	lint	2024-08-01 17:06:06 -07:00
Jeffrey Morgan	20090f3172	preserve last assistant message (#5802)	2024-07-19 20:19:26 -07:00
Michael Yang	d290e87513	add suffix support to generate endpoint this change is triggered by the presence of "suffix", particularly useful for code completion tasks	2024-07-16 14:31:35 -07:00
Michael Yang	36c87c433b	template: preprocess message and collect system	2024-07-12 12:26:43 -07:00
Michael Yang	5056bb9c01	rename aggregate to contents	2024-07-11 17:00:26 -07:00
Michael Yang	57ec6901eb	revert embedded templates to use prompt/response This reverts commit 19753c18c01183b4c974e36e89b0c7cbdcc3c38a. for compat. messages will be added at a later date	2024-07-11 14:49:35 -07:00
Michael Yang	e64f9ebb44	do no automatically aggregate system messages	2024-07-11 14:49:35 -07:00
Michael Yang	41be28096a	add system prompt to first legacy template	2024-07-10 17:03:08 -07:00
Michael Yang	fb6cbc02fb	update named templates	2024-07-05 16:29:32 -07:00
Michael Yang	326363b3a7	no funcs	2024-07-05 13:17:25 -07:00
Michael Yang	2c3fe1fd97	comments	2024-07-05 13:17:24 -07:00
Michael Yang	269ed6e6a2	update message processing	2024-07-05 13:16:58 -07:00
Michael Yang	a30915bde1	add capabilities	2024-07-01 10:47:43 -07:00
Michael Yang	58e3fff311	rename templates to template	2024-07-01 10:40:54 -07:00

16 Commits
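
The message of commit 58245413f4 names three interfaces (`model.Model`, `ml.Backend`, `ml.Tensor`) and a `Forward` method, but the listing does not show their definitions. The Go sketch below only illustrates how three such layers might fit together. Every method set, every concrete type (`vec`, `mapBackend`, `biasModel`), the `Forward` signature, and the `RunToy` helper are hypothetical and are not taken from `model/model.go` or `ml/backend.go`; the real interfaces are likely larger and shaped differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Tensor is a hypothetical, heavily reduced stand-in for ml.Tensor:
// a value plus the operations a model may apply to it.
type Tensor interface {
	Shape() []int
	Floats() []float32
	Add(other Tensor) (Tensor, error)
}

// Backend is a hypothetical stand-in for ml.Backend: it owns the
// loaded weights and hands them out by name.
type Backend interface {
	Get(name string) (Tensor, bool)
}

// Model is a hypothetical stand-in for model.Model: an architecture
// expresses its forward pass in Forward. This signature is assumed.
type Model interface {
	Forward(b Backend, input Tensor) (Tensor, error)
}

// vec is a toy one-dimensional tensor backed by a Go slice.
type vec []float32

func (v vec) Shape() []int      { return []int{len(v)} }
func (v vec) Floats() []float32 { return v }

// Add returns the element-wise sum, or an error on a length mismatch.
func (v vec) Add(other Tensor) (Tensor, error) {
	o := other.Floats()
	if len(o) != len(v) {
		return nil, fmt.Errorf("shape mismatch: %d vs %d", len(v), len(o))
	}
	out := make(vec, len(v))
	for i := range v {
		out[i] = v[i] + o[i]
	}
	return out, nil
}

// mapBackend is a toy in-memory backend keyed by tensor name.
type mapBackend map[string]Tensor

func (m mapBackend) Get(name string) (Tensor, bool) {
	t, ok := m[name]
	return t, ok
}

// biasModel is a toy architecture whose forward pass adds one
// weight tensor, looked up as "bias", to the input.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) (Tensor, error) {
	bias, ok := b.Get("bias")
	if !ok {
		return nil, errors.New("missing tensor: bias")
	}
	return input.Add(bias)
}

// RunToy wires the three layers together: the backend holds the
// weights, the model reads them inside Forward.
func RunToy() ([]float32, error) {
	var b Backend = mapBackend{"bias": vec{0.5, 0.5, 0.5}}
	var m Model = biasModel{}
	out, err := m.Forward(b, vec{1, 2, 3})
	if err != nil {
		return nil, err
	}
	return out.Floats(), nil
}
```

Calling `RunToy()` from a `main` function exercises the whole path. The point of the split is that a model only depends on the `Backend` and `Tensor` interfaces, so in principle the tensor library underneath could change without touching the model code.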