ollama

mirror of https://github.com/ollama/ollama.git synced 2025-08-28 11:40:52 +02:00

Author	SHA1	Message	Date
Michael Yang	8d76fa23ef	count non-repeating vision layers	2025-03-13 16:53:29 -07:00
Michael Yang	65b88c544f	fix divide by zero	2025-03-13 16:35:00 -07:00
Michael Yang	a422ba39c9	roughly count gemma3 graph. The largest operation is by far (q @ k), so just count that for simplicity	2025-03-13 16:35:00 -07:00
Michael Yang	d2ec22371e	count all vision tensors	2025-03-13 16:35:00 -07:00
Michael Yang	033cec232a	count gemma3 vision tensors	2025-03-13 16:34:42 -07:00
Shane-XB-Qian	6b45b1d6b4	cli: adding support ctrl-n/p like general cli (#9136) Signed-off-by: shane.xb.qian <shane.qian@foxmail.com>	2025-03-12 08:51:56 -07:00
frob	b3af953a55	cli: don't exit for invalid model during /load. (#9576) Co-authored-by: Richard Lyons <frob@cloudstaff.com>	2025-03-11 23:42:53 -07:00
Michael	ad4e0bf3be	Adding Gemma 3 to readme (#9671)	2025-03-12 07:39:25 +01:00
Michael Yang	aee28501b5	Merge pull request #9661 from ollama/gemma. engine: add gemma support (tags: v0.6.0-rc0, v0.6.0)	2025-03-11 15:07:50 -07:00
jmorganca	83f0ec8269	all: address linter errors	2025-03-11 14:49:20 -07:00
jmorganca	c6b6938b3a	kvcache: fix tests by adding AvgPool2D stub	2025-03-11 14:49:20 -07:00
jmorganca	fb4664fcec	model: add more spm tokenizer tests	2025-03-11 14:49:20 -07:00
jmorganca	20e3593863	model: validate left and right pairs before merging them	2025-03-11 14:49:20 -07:00
Michael Yang	63a394068c	use 2d pooling	2025-03-11 14:49:20 -07:00
Daniel Hiltgen	ab39e08eb9	llm: auto detect models that require Ollama Engine (#1)	2025-03-11 14:49:20 -07:00
jmorganca	11bfa62796	add trailing \n\n after <end_of_image> to match reference implementation	2025-03-11 14:49:20 -07:00
jmorganca	f63e62e546	reduce kernel size, add TODO for loading from config	2025-03-11 14:49:20 -07:00
jmorganca	65b0f329d1	Revert "Allow models to force a new batch". This reverts commit c7eae586b899083acebcd9b3847b89ea78c2850c.	2025-03-11 14:49:20 -07:00
Jesse Gross	06007c0a18	Allow models to force a new batch. This is useful for a few things: (1) work around bugs, such as having 2 images in one batch; (2) keep the image in a single batch for fully connected attention; (3) improve performance by not evaluating embeddings multiple times.	2025-03-11 14:49:20 -07:00
Jesse Gross	a8e83a7654	Disable causal attention based on batch index. Currently we are using positions, which are relative to a sequence and may not be unique.	2025-03-11 14:49:20 -07:00
Jesse Gross	475005504e	Restrict Gemma to a single image per request	2025-03-11 14:49:20 -07:00
Jesse Gross	2c40c4d35e	Fix follow up images and images split across batches	2025-03-11 14:49:19 -07:00
Michael Yang	e95278932b	use non-causal mask only for image positions	2025-03-11 14:49:19 -07:00
Michael Yang	9d2a20a763	use non-causal mask for inputs with images	2025-03-11 14:49:19 -07:00
Patrick Devine	2e54d72fc3	fix gemma3 1b conversion	2025-03-11 14:49:19 -07:00
Michael Yang	6b32a2d549	compat with upstream gguf	2025-03-11 14:49:19 -07:00
Michael Yang	c5cbe4fc2a	fallback to cpu	2025-03-11 14:49:19 -07:00
Michael Yang	f888912870	fix vision encoder	2025-03-11 14:49:19 -07:00
Michael Yang	9e4642e9b3	ollama debug tensor	2025-03-11 14:49:19 -07:00
Michael Yang	6b0486c216	duplicate token_embd to output	2025-03-11 14:49:19 -07:00
Michael Yang	d368c039f0	skip repacking vision tensors	2025-03-11 14:49:19 -07:00
Patrick Devine	9b54267e69	fix configs	2025-03-11 14:49:19 -07:00
Michael Yang	46bb0169c4	update model	2025-03-11 14:49:19 -07:00
Michael Yang	8934324b72	use fast attention	2025-03-11 14:49:18 -07:00
Jesse Gross	0e886595bf	Fix tests and drift from main	2025-03-11 14:49:18 -07:00
Patrick Devine	c62861f4fa	fix conversion	2025-03-11 14:49:18 -07:00
Michael Yang	0df1800436	set non-causal attention	2025-03-11 14:49:18 -07:00
Patrick Devine	631fecc6d9	temporary work around for converting spm	2025-03-11 14:49:18 -07:00
Jesse Gross	4346c2409d	fix drift from main	2025-03-11 14:49:18 -07:00
Michael Yang	4b037a97dc	add gemma vision encoder	2025-03-11 14:49:17 -07:00
Patrick Devine	5f74d1fd47	gemma2 impl	2025-03-11 14:35:08 -07:00
Daniel Hiltgen	4dcf80167a	Build release for windows with local script (#9636)	2025-03-11 08:34:20 -07:00
Michael Yang	26a26998fb	Merge pull request #9590 from ollama/mxyng/dump-pad. fix: pad tensor item if ge zero	2025-03-10 16:34:55 -07:00
Michael Yang	9926eae015	fix: pad tensor item if ge zero. This produces a nicer output since both positive and negative values produce the same width.	2025-03-10 16:18:12 -07:00
Vincent Koc	8585b7b151	docs: add opik to observability integrations (#9626)	2025-03-10 16:15:10 -07:00
Parth Sareen	7e34f4fbfa	sample: add numerical stability to temperature/softmax transform (#9631)	2025-03-10 14:43:53 -07:00
Michael Yang	fe776293f7	Merge pull request #9569 from dwt/patch-1. Better WantedBy declaration	2025-03-10 14:09:37 -07:00
frob	d8a5d96b98	docs: Add OLLAMA_CONTEXT_LENGTH to FAQ. (#9545)	2025-03-10 11:02:54 -07:00
Xiaowei Zhu	757668c42f	docs: add SwiftChat (#9540)	2025-03-10 11:01:09 -07:00
Sam	96ec8afd09	docs(tool): add mcp-llm (#9537)	2025-03-10 09:52:02 -07:00

4045 Commits