ollama

mirror of https://github.com/ollama/ollama.git synced 2025-03-29 03:01:45 +01:00

Author	SHA1	Message	Date
Michael Yang	58245413f4	next ollama runner (#7913) feat: add new Ollama engine using ggml through cgo. This change introduces a new way to run pretrained models. It introduces 3 high level interfaces and a bunch of smaller helper interfaces to facilitate this. - `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go`. - `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`. - `ml.Tensor` defines the interface for a tensor and tensor operations. This is the first implementation of the new engine. Follow up PRs will implement more features: - non-greedy sampling (#8410) - integration with Ollama and KV caching (#8301) - more model support (#9080) with more coming soon. Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2025-02-13 16:31:21 -08:00
Josh	93a8daf285	convert: import support for command-r models from safetensors (#6063) --------- Co-authored-by: Patrick Devine <patrick@infrahq.com>	2025-01-15 16:31:22 -08:00
Bruce MacDonald	f6f3713001	convert: qwen2 from safetensors (#8408) Add native support for converting Qwen2 family models (including Qwen2.5) from safetensors to gguf format so we can run it.	2025-01-14 10:34:37 -08:00
Patrick Devine	c7cb0f0602	image processing for llama3.2 (#6963) Co-authored-by: jmorganca <jmorganca@gmail.com> Co-authored-by: Michael Yang <mxyng@pm.me> Co-authored-by: Jesse Gross <jesse@ollama.com>	2024-10-18 16:12:35 -07:00
Patrick Devine	608e87bf87	Fix gemma2 2b conversion (#6645)	2024-09-05 17:02:28 -07:00
Michael Yang	9cfd2dd3e3	Merge pull request #6522 from ollama/mxyng/detect-chat detect chat template from configs that contain lists	2024-08-28 11:04:18 -07:00
Patrick Devine	6c1c1ad6a9	throw an error when encountering unsupport tensor sizes (#6538)	2024-08-27 17:54:04 -07:00
Michael Yang	eae3af6807	clean up convert tokenizer	2024-08-27 11:11:43 -07:00
Patrick Devine	0c819e167b	convert safetensor adapters into GGUF (#6327)	2024-08-23 11:29:56 -07:00
Michael Yang	77903ab8b4	llama3.1	2024-08-21 11:49:31 -07:00
Michael Yang	3546bbd08c	convert gemma2	2024-08-20 17:27:51 -07:00
Michael Yang	5a28b9cf5f	bert	2024-08-20 17:27:34 -07:00
Michael Yang	6ffb5cb017	add conversion for microsoft phi 3 mini/medium 4k, 128	2024-08-12 15:13:29 -07:00
Michael Yang	b732beba6a	lint	2024-08-01 17:06:06 -07:00
Michael Yang	eafc607abb	convert: only extract large files	2024-07-31 15:58:55 -07:00
Michael Yang	5e9db9fb0b	refactor convert	2024-07-31 15:58:33 -07:00
Michael Yang	6b252918fb	update convert test to check result data	2024-07-31 10:59:38 -07:00
Michael Yang	3591bbe56f	add test	2024-05-21 11:28:22 -07:00

18 Commits
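
The message of commit 58245413f4 describes three layered interfaces: a model architecture that implements `Forward`, a backend that loads tensors and hands them to models, and a tensor type with operations. The Go sketch below illustrates that layering only. Every type, method signature, and name in it (`Tensor`, `Backend`, `Model`, `sliceTensor`, `mapBackend`, `biasModel`, the `"bias"` tensor) is a hypothetical simplification chosen for this example; none of it is the actual content of `model/model.go` or `ml/backend.go`, and no ggml or cgo is involved.

```go
package main

import (
	"errors"
	"fmt"
)

// Tensor is a hypothetical, heavily simplified stand-in for the role the
// commit message assigns to `ml.Tensor`: a tensor plus tensor operations.
type Tensor interface {
	Floats() []float32
	Add(other Tensor) (Tensor, error)
}

// Backend is a hypothetical stand-in for the role of `ml.Backend`:
// it owns the loaded tensors and lets models look them up by name.
type Backend interface {
	Get(name string) (Tensor, bool)
}

// Model is a hypothetical stand-in for the role of `model.Model`:
// an architecture implements its forward pass in Forward.
type Model interface {
	Forward(b Backend, input Tensor) (Tensor, error)
}

// sliceTensor is a toy one-dimensional tensor backed by a Go slice.
type sliceTensor struct{ data []float32 }

func (t sliceTensor) Floats() []float32 { return t.data }

// Add returns the element-wise sum, or an error if the lengths differ.
func (t sliceTensor) Add(other Tensor) (Tensor, error) {
	o := other.Floats()
	if len(o) != len(t.data) {
		return nil, errors.New("shape mismatch")
	}
	out := make([]float32, len(t.data))
	for i := range out {
		out[i] = t.data[i] + o[i]
	}
	return sliceTensor{out}, nil
}

// mapBackend is a toy backend that keeps "loaded" tensors in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) (Tensor, bool) {
	t, ok := b[name]
	return t, ok
}

// biasModel is a toy architecture: its forward pass adds a tensor
// named "bias" (an assumed name) to the input.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) (Tensor, error) {
	bias, ok := b.Get("bias")
	if !ok {
		return nil, errors.New(`tensor "bias" not loaded`)
	}
	return input.Add(bias)
}

func main() {
	backend := mapBackend{"bias": sliceTensor{[]float32{0.5, -1, 2}}}
	var m Model = biasModel{}

	out, err := m.Forward(backend, sliceTensor{[]float32{1, 2, 3}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Floats()) // prints [1.5 1 5]
}
```

In this sketch the model never touches storage directly: it asks the backend for tensors by name and composes tensor operations, which is the separation of concerns the commit message outlines.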