ollama

mirror of https://github.com/ollama/ollama.git synced 2025-08-25 13:01:31 +02:00

Author	SHA1	Message	Date
Daniel Hiltgen	34088dbcfb	API/CLI context enhancements (#11331 ) * API: expose context size of loaded models * CLI: add context UX This adds a column in the ps output to show the model's context size.	2025-07-08 11:59:06 -07:00
Parth Sareen	43107b15b9	add `tool_name` to api.md (#11326 )	2025-07-07 16:53:13 -07:00
Parth Sareen	1f91cb0c8c	template: add tool result compatibility (#11294 )	2025-07-07 15:53:42 -07:00
Daniel Hiltgen	12d8ad0d38	ci: modularization (#11324 ) switch a few constants to variables	2025-07-07 14:07:43 -07:00
Jesse Gross	592d21e7db	Revert "ggml: Temporarily disable reporting UUIDs" The root cause was an unclean upgrade - this code is fine. This reverts commit `45f216a9c7`.	2025-07-07 11:31:02 -07:00
Jeffrey Morgan	5a08b01f5b	readme: update Ollama icon size	2025-07-05 17:20:42 -07:00
Daniel Hiltgen	4f473e224c	int: add performance integration tests (#11173 ) usage example: go test --tags=integration,perf -count 1 ./integration -v -timeout 1h -run TestModelsPerf 2>&1 | tee int.log cat int.log | grep MODEL_PERF_HEADER | cut -f2- -d: > perf.csv cat int.log | grep MODEL_PERF_DATA | cut -f2- -d: >> perf.csv	2025-07-05 16:07:09 -07:00
Daniel Hiltgen	9d60bb44cf	doc: add NVIDIA blackwell to supported list (#11307 )	2025-07-05 16:06:30 -07:00
Vincent RAMPAL	f371260e75	Update base image to Ubuntu 24.04 LTS (#9681 )	2025-07-05 16:02:33 -07:00
Daniel Hiltgen	c9e6d7719e	doc: Update link for mac install (#11288 ) Favor the dmg now.	2025-07-03 09:48:45 -07:00
Daniel Hiltgen	2c4ce40334	mimic logs for layers on new engine (#11278 ) This adds some extra logs to make the new engine a bit more consistent with the llama engine.	2025-07-02 16:38:36 -07:00
XuKecheng	5d8c173529	readme: add NativeMind to community integrations (#11242 )	2025-07-01 09:46:15 -07:00
Jeffrey Morgan	44b17d2bfa	tools: fix parsing tool calls with empty arguments, missing required fields (#11233 )	2025-06-30 08:59:03 -07:00
Attogram Project	3b8b692218	readme: add ollama-bash-toolshed to community integrations (#11224 )	2025-06-29 14:59:54 -07:00
Michael Yang	4129af9205	chore: cleanup comments + unused vars (#11225 )	2025-06-27 11:45:33 -07:00
Jesse Gross	45f216a9c7	ggml: Temporarily disable reporting UUIDs This is causing segfaults, so disable it. Currently UUIDs are only used for debugging purposes, although they are planned to be used in additional ways in the future. Bug #11211	2025-06-27 11:27:22 -07:00
Michael Yang	d0b32def60	skip quantizing per_layer_token_embd (#11207 ) this tensor isn't compatible with cuda when quantized to q4_K so skip it	2025-06-26 21:49:35 -07:00
Daniel Hiltgen	11ffc36157	ci: multi-stage release process (#11001 )	2025-06-26 10:32:48 -07:00
Jeffrey Morgan	ba04902670	fs/ggml: add multiplier in graph estimates (#11208 )	2025-06-26 00:19:44 -07:00
Jeffrey Morgan	3944602f51	fs/ggml: add missing architecture to OllamaEngineRequired() (#11206 )	2025-06-26 00:11:23 -07:00
Michael Yang	73b642e6f3	add new gemma model (#11204 ) * update patches * cherry pick metal mean kernel * cherry pick cuda mean kernel * gemma3n	2025-06-25 21:47:09 -07:00
Daniel Hiltgen	ad118d8b13	ci: arm sbsa fixes (#11194 )	2025-06-24 21:00:15 -07:00
Daniel Hiltgen	f08534137b	ci: include dependencies	2025-06-24 20:27:43 -07:00
Daniel Hiltgen	4b4a90f233	ci: pick up arm sbsa cuda libs (#11192 )	2025-06-24 18:59:22 -07:00
Daniel Hiltgen	03274a6b2f	ci: recombine linux amd64 binaries (#11188 ) Glue the rocm and archive builds back together.	2025-06-24 18:45:01 -07:00
Devon Rifkin	cc6463ebca	Merge pull request #10238 from ollama/drifkin/array-head-count-simple ggml: fix crash for array head counts	2025-06-24 17:50:02 -07:00
Daniel Hiltgen	405d2f628f	ci: rocm parallel builds on windows (#11187 ) The preset CMAKE_HIP_FLAGS isn't getting used on Windows. This passes the parallel flag in through the C/CXX flags, along with suppression for some log spew warnings to quiet down the build.	2025-06-24 15:27:09 -07:00
Devon Rifkin	a3f7dd3e98	Merge branch 'main' into drifkin/array-head-count-simple	2025-06-24 14:20:05 -07:00
Daniel Hiltgen	c85c0ebf89	CI: switch windows to vs 2022 (#11184 ) * CI: switch windows to vs 2022 * ci: fix regex match	2025-06-24 13:26:55 -07:00
Daniel Hiltgen	10a8e04a8d	avoid context overflow (#11175 ) For smaller context models, make sure we do not exceed the training size.	2025-06-23 15:52:50 -07:00
Daniel Hiltgen	1c6669e64c	Re-remove cuda v11 (#10694 ) * Re-remove cuda v11 Revert the revert - drop v11 support requiring drivers newer than Feb 23 This reverts commit `c6bcdc4223`. * Simplify layout With only one version of the GPU libraries, we can simplify things down somewhat. (Jetsons still require special handling) * distinct sbsa variant for linux arm64 This avoids accidentally trying to load the sbsa cuda libraries on a jetson system which results in crashes. * temporarily prevent rocm+cuda mixed loading	2025-06-23 14:07:00 -07:00
Devon Rifkin	b2b270ad5d	Merge branch 'main' into drifkin/array-head-count-simple	2025-06-23 10:37:31 -07:00
AJ	2bb69b40c7	readme: add ai-hub to community integrations (#11169 )	2025-06-23 09:21:12 -07:00
Daniel Hiltgen	65bff664cb	build speedups (#11142 ) Enable parallel building of the GPU architectures.	2025-06-20 12:32:51 -07:00
Michael Yang	c088ac0e79	convert: utility for merging tensors (#11069 )	2025-06-20 11:12:01 -07:00
Michael Yang	0a066cfd91	Reapply "feat: incremental gguf parser (#10822 )" (#11114 ) (#11119 ) * Reapply "feat: incremental gguf parser (#10822)" (#11114) This reverts commit `a6e64fbdf2`. * fix older ggufs	2025-06-20 11:11:40 -07:00
Jesse Gross	87b7af6cee	ggml: Check return status for computation. We don't check the return status after computing the graph, which can silently lead to bad outputs if we try to keep going and future computation succeeds. This appears to happen in certain cases on Apple M2 devices. Fixes #11070	2025-06-19 17:12:49 -07:00
Daniel Hiltgen	f2527b08fb	int: add coverage for older models (#11137 ) Verified these fail on 0.9.1 and pass on HEAD.	2025-06-19 12:10:19 -07:00
Jeffrey Morgan	8bcb3125c1	benchmark: remove unused benchmark test (#11120 ) Removes a test under benchmark/ that is unused	2025-06-18 12:58:50 -07:00
Jeffrey Morgan	6baf1e31e2	Revert "Revert "ggml: Export GPU UUIDs" (#11115 )" (#11117 ) Reverts PR #11115. The original change was mistakenly reverted instead of #10822	2025-06-18 07:30:49 -07:00
Jeffrey Morgan	ed567ef43b	Revert "ggml: Export GPU UUIDs" (#11115 ) This reverts commit `aaa7818000`.	2025-06-18 05:45:00 -07:00
Jeffrey Morgan	a6e64fbdf2	Revert "feat: incremental gguf parser (#10822 )" (#11114 ) This reverts commit `6b04cad7e8`.	2025-06-18 05:42:44 -07:00
曹家巧	60cfa2a203	cache: fix comment function name in cache.go (#11110 )	2025-06-18 05:21:45 -07:00
Jeffrey Morgan	55bbf3b4a1	tools: return empty arguments object instead of null (#11113 )	2025-06-18 05:20:43 -07:00
Jeffrey Morgan	6bda1d2479	tools: fix parsing tool calls without any parameters (#11101 ) Fixes issue where tool calls that don't expect any parameters were not being parsed. This also fixes two additional issues: one where 2+ tool calls would not be correctly parsed, and cases where tool calls with invalid parameters would still get parsed	2025-06-17 10:51:43 -07:00
Jeffrey Morgan	9e125d884c	model: treat 'user defined' tokens as special tokens (#11077 )	2025-06-16 16:03:16 -07:00
Michael Yang	a6fbfc880c	gguf: fix write order (#11068 ) * ggml: test write gguf order * ggml: fix write tensor order	2025-06-16 10:42:32 -07:00
NGC13009	502028968d	readme: add ollama-launcher to community integrations (#11080 )	2025-06-15 21:27:49 -07:00
Phil	5a8eb0e151	readme: add GPTranslate to community integrations (#11071 )	2025-06-14 08:54:03 -07:00
Jeffrey Morgan	9f8a18ec05	tools: loosen tool parsing to allow for more formats (#11030 )	2025-06-12 14:18:54 -07:00


4493 Commits