3725 Commits

Author SHA1 Message Date
Daniel Hiltgen
6a6328a5e9
ci: build dir changed (#8037)
Remove no longer relevant build log dir
v0.5.2-rc1
2024-12-10 20:33:34 -08:00
Jeffrey Morgan
527cc97899
llama: update vendored code to commit 40c6d79f (#7875) v0.5.2-rc0 2024-12-10 19:21:34 -08:00
Blake Mizerany
a37f4a86a7
go.mod: go 1.22.8 -> 1.23.4 (#8036) 2024-12-10 18:16:16 -08:00
湛露先生
46f74e0cb5
Return err when NewHipLib() detect error. (#8012)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-10 16:32:29 -08:00
Phil Wornath
7622ea21af
readme: add AI summary helper plugin to community-integrations (#7202) 2024-12-10 16:13:06 -08:00
Tao Zuhong
c5d3947084
readme: add Kangaroo, an AI-powered SQL admin tool to community integrations (#7948) 2024-12-10 13:48:32 -08:00
frob
757eeacc1b
server: lowercase hostname for Host header check (#5851) 2024-12-10 13:43:22 -08:00
Dr. Daniel Bender
dd42acf737
readme: add aidful-ollama-model-delete to community integrations (#8024) 2024-12-10 13:03:19 -08:00
Daniel Hiltgen
b9ccb3741e
Remove unused runner CpuFeatures (#8032)
The final implementation of #7499 removed dynamic vector requirements
in favor of a simpler filename based model, and this was left over logic that
is no longer needed.
2024-12-10 12:59:39 -08:00
Stefan Weil
abfdc4710f
all: fix typos in documentation, code, and comments (#7021) 2024-12-10 12:58:06 -08:00
Daniel Hiltgen
82a02e18d9
build: fix typo in override variable (#8031)
The "F" was missing.
2024-12-10 10:51:16 -08:00
Daniel Hiltgen
4879a234c4
build: Make target improvements (#7499)
* llama: wire up builtin runner

This adds a new entrypoint into the ollama CLI to run the cgo built runner.
On Mac arm64, this will have GPU support, but on all other platforms it will
be the lowest common denominator CPU build.  After we fully transition
to the new Go runners more tech-debt can be removed and we can stop building
the "default" runner via make and rely on the builtin always.

* build: Make target improvements

Add a few new targets and help for building locally.
This also adjusts the runner lookup to favor local builds, then
runners relative to the executable, and finally payloads.

* Support customized CPU flags for runners

This implements a simplified custom CPU flags pattern for the runners.
When built without overrides, the runner name contains the vector flag
we check for (AVX) to ensure we don't try to run on unsupported systems
and crash.  If the user builds a customized set, we omit the naming
scheme and don't check for compatibility.  This avoids checking
requirements at runtime, so that logic has been removed as well.  This
can be used to build GPU runners with no vector flags, or CPU/GPU
runners with additional flags (e.g. AVX512) enabled.

* Use relative paths

If the user checks out the repo in a path that contains spaces, make gets
really confused so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

This removes support for v0.3.6 and older versions (before the tar bundle)
and ensures we clean up prior libraries before extracting the bundle(s).
Without this change, runners and dependent libraries could leak when we
update and lead to subtle runtime errors.
2024-12-10 09:47:19 -08:00
frob
63269668c0
Prevent underflow when FreeMemory < overhead (#8014)
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
2024-12-10 09:10:40 -08:00
Jesse Gross
900f64e6be prompt: Don't trim whitespace from prompts
New lines can be an important part of a user's prompt and trimming
it can alter the results. We previously only trimmed prompts with
images but refactoring brought this behavior to all prompts, where
it became more noticable.

The /generate endpoint adds less whitespace and therefore doesn't
need to trim it out - this brings the same behavior to /chat.

Thanks to @gabe-l-hart for spotting the issue!

Fixes #7795
2024-12-09 11:02:55 -08:00
Yannick Gloster
da09488fbf
docs: remove comment regarding tool streaming in openai.md (#7960) 2024-12-07 22:16:21 -08:00
湛露先生
7f0ccc8a9d
docs: fix syntax error in openai.md (#7986) 2024-12-07 22:14:36 -08:00
Parth Sareen
de52b6c2f9
bugfix: "null" value json mode (#7979) v0.5.1 2024-12-06 14:13:15 -08:00
Michael
acd7d03266
readme: add llama3.3 to readme (#7975)
readme: add llama3.3 to readme
2024-12-06 14:05:11 -05:00
Parth Sareen
f6e87fd628
docs: update readmes for structured outputs (#7962) 2024-12-06 10:35:37 -08:00
Jeffrey Morgan
aed1419c64
ci: skip go build for tests (#7899) v0.5.0-rc1 v0.5.0 2024-12-04 21:22:36 -08:00
Parth Sareen
c6c526275d
api: add generate endpoint for structured outputs (#7939) 2024-12-04 17:37:12 -08:00
Parth Sareen
630e7dc6ff
api: structured outputs - chat endpoint (#7900)
Adds structured outputs to chat endpoint
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
2024-12-04 16:31:19 -08:00
Michael Yang
eb8366d658
Merge pull request #7932 from ollama/mxyng/fix-merges v0.4.8-rc0 2024-12-04 10:04:52 -08:00
Michael Yang
4456012956 fix unmarshaling merges 2024-12-04 09:21:56 -08:00
Sam
539be43640
llm: normalise kvct parameter handling (#7926) 2024-12-03 16:30:40 -08:00
Sam
1bdab9fdb1
llm: introduce k/v context quantization (vRAM improvements) (#6279) 2024-12-03 15:57:19 -08:00
owboson
2b82c5a8a1
docs: correct default num_predict value in modelfile.md (#7693) 2024-12-03 15:00:05 -08:00
Tigran
55c3efa900
docs: remove extra quote in modelfile.md (#7908) 2024-12-02 09:28:56 -08:00
David Mayboroda
1aedffad93
readme: add minima to community integrations (#7906) 2024-12-02 01:14:47 -08:00
Jeffrey Morgan
ff6c2d6dc8
cmd: don't rely on reading repo file for test (#7898) 2024-11-30 14:12:53 -08:00
Jeffrey Morgan
d543b282a7
server: add warning message for deprecated context field (#7878) 2024-11-30 14:05:50 -08:00
Parth Sareen
5f8051180e
Enable index tracking for tools - openai api support (#7888) v0.4.7 2024-11-29 20:00:09 -08:00
Jeffrey Morgan
39e29ae5dd
llama: fix typo and formatting in readme (#7876) 2024-11-28 17:27:11 -08:00
TheCookingSenpai
30a9f063c9
readme: add SpaceLlama, YouLama, and DualMind to community integrations (#7216) 2024-11-28 15:16:27 -08:00
Parth Sareen
ce7455a8e1
api: enable tool streaming (#7836) v0.4.6 2024-11-27 13:40:57 -08:00
ItzCrazyKns
e3936d4fb3
Support Multiple LoRa Adapters (#7667)
Closes #7627
2024-11-27 11:00:04 -08:00
Bruce MacDonald
940e62772e
openai: remove unused error code (#7850)
The writeError takes a code argument which is no longer used. Remove it for clarity.
2024-11-26 16:08:09 -08:00
Jesse Gross
71e6a0d0d1 runner.go: Don't try to extract image tags for text models
When processing a prompt, we look for image tags of the form
[img-0], which are inserted by the Ollama server process.
However, this can cause errors if the original prompt has these
tags - typically an image not found error is returned.

This changes tag searching behavior to be similar to the 0.3.x
series, which will largely avoid these problems. However,they can
still happen when input text with these tags is used with image
models. The correct solution is to escape the tags but this is a
larger issue with special sequences in general so this is an
incremental fix that should avoid the problem for the majority
of cases.
2024-11-26 13:23:24 -08:00
Jesse Gross
2cd11ae365 runner.go: Add unit tests for context shifting
This also makes it easier to truncate long inputs the same as
shifting but does not actually implement it. This type of
truncation has a trade off between quality and time to first
token.
2024-11-26 11:21:35 -08:00
jake83741
52bbad12f9
readme: update description for vnc-lm community integration (#7832) 2024-11-25 17:56:30 -08:00
frob
30e88d7f31
cmd: don't submit svg files as images for now (#7830) 2024-11-25 16:43:29 -08:00
Blake Mizerany
2b7ed61ca2
server: fix Transport override (#7834)
This changes makeRequest to update the http client Transport if and only
if testMakeRequestDialContext is set. This is to avoid overriding the
default Transport when testMakeRequestDialContext is nil, which broke
existing behavior, included proxies, timeouts, and other behaviors.

Fixes #7829
Fixes #7788
v0.4.5
2024-11-25 15:08:34 -08:00
Shikhar Bakhda
647513a7d4
readme: add HoneyHive to community integrations (#7831) 2024-11-25 09:55:33 -08:00
Bruce MacDonald
a210ec74d2
cmd: print location of model after pushing (#7695)
After a user pushes their model it is not clear what to do next. Add a link
to the output of `ollama push` that tells the user where their model can now
be found.
2024-11-25 09:40:16 -08:00
Simon Schampijer
cfb1ddd6fc
examples: update langchain-python-simple (#3591)
- better formatting of input prompt
- use invoke instead of predict
2024-11-24 16:06:22 -08:00
reid41
3987acd7ec
readme: add descriptions for QA-Pilot and shell-pilot community integrations (#4303) 2024-11-24 15:55:09 -08:00
frob
fda1e6b563
llm: bring fileTypes into alignment with llama.cpp (#7819) 2024-11-24 10:33:33 -08:00
Adarsh Mishra
3440ffb37b
readme: add description for OpenTalkGpt in community integrations (#7818) 2024-11-24 10:32:23 -08:00
Patcher
a820d2b267
readme: add observability section with OpenLIT to community-integrations 2024-11-23 18:03:12 -08:00
Meng Zhuo
2ebdb54fb3
all: update math32 go mod to v1.11.0 (#6627) 2024-11-23 15:21:54 -08:00