3844 Commits

Author SHA1 Message Date
Blake Mizerany
2ddc32d5c5
llm: do not error on "null" format (#8139)
This fixes another regression in the previous commit that fixed other
known bugs.
v0.5.4
2024-12-17 09:49:37 -08:00
Jascha Beste
2cde4b8817
readme: change getting started guide link for pgai (#8119) 2024-12-16 22:13:23 -08:00
Blake Mizerany
87f0a49fe6
llm: do not silently fail for supplied, but invalid formats (#8130)
Changes in #8002 introduced fixes for bugs with mangling JSON Schemas.
It also fixed a bug where the server would silently fail when clients
requested invalid formats. It also, unfortunately, introduced a bug
where the server would reject requests with an empty format, which
should be allowed.

The change in #8127 updated the code to allow the empty format, but also
reintroduced the regression where the server would silently fail when
the format was set, but invalid.

This commit fixes both regressions. The server does not reject the empty
format, but it does reject invalid formats. It also adds tests to help
us catch regressions in the future.

Also, the updated code provides a more detailed error message when a
client sends a non-empty, but invalid format, echoing the invalid format
in the response.

This commits also takes the opportunity to remove superfluous linter
checks.
v0.5.3
2024-12-16 21:57:49 -08:00
Jeffrey Morgan
0f06a6daa7
llm: loosen format check to default to no format (#8127) 2024-12-16 18:45:46 -08:00
Daniel Hiltgen
8f805dd74b
darwin: restore multiple runners for x86 (#8125)
In 0.5.2 we simplified packaging to have avx only for macos x86.  It looks like
there may still be some non-AVX systems out there, so this puts back the prior
logic of building no-AVX for the primary binary, and now 2 runners for avx and avx2.
These will be packaged in the App bundle only, so the stand-alone binary will now be
without AVX support on macos.  On arm, we'll also see these runners reported
as available in the log, but they're dormant and will never be used at runtime.
2024-12-16 18:45:02 -08:00
Michael
89d5e2f2fd
readme: example/get started guide for pgai with Ollama (#8115)
readme: example/get started guide for pgai with Ollama
2024-12-16 17:14:37 +08:00
Jascha Beste
297ada6c87
readme: add pgai to readme for semantic search (#8028)
* docs: switch around database integrations order and link to quickstart

* docs: link to blog post in example readme

* chore: link to main readme

* readme: removing example to link externally

readme: removing example to link externally so we don't have to keep this example up-to-date

---------
2024-12-16 17:02:28 +08:00
Patrick Devine
8c9fb8eb73
imageproc mllama refactor (#7537)
Refactor mllama image processing code, and add pixtral and qwen2vl
2024-12-14 19:50:15 -08:00
Daniel Hiltgen
b75ccfc5ec
ci: be more aggressive on parallelism in build (#8102) v0.5.3-rc0 2024-12-14 14:56:05 -08:00
Jeffrey Morgan
7a81daf026
llama: update vendor code to commit ba1cb19c (#8101) 2024-12-14 14:55:51 -08:00
Daniel Hiltgen
60f75560a2
runner: switch logging back to stderr (#8091)
This puts the low-level runner logging back on stderr for consistency with prior releases
v0.5.2
2024-12-13 14:36:50 -08:00
Anuraag (Rag) Agrawal
e28f2d4900
openai: return usage as final chunk for streams (#6784)
* openai: return usage as final chunk for streams

---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
2024-12-12 17:09:30 -08:00
Pascal Patry
c216850523
llama: parse JSON schema using nlohmann::ordered_json to maintain ordering (#8071) 2024-12-12 09:57:28 -08:00
Parth Sareen
18f6a98bd6
llama: enable JSON schema key ordering for generating grammars (#8055) 2024-12-11 17:17:36 -08:00
Blake Mizerany
b1fd7fef86
server: more support for mixed-case model names (#8017)
Fixes #7944
2024-12-11 15:29:59 -08:00
Daniel Hiltgen
36d111e788
ci: fix linux version (#8054)
Pass through the version override so the makefiles use it
2024-12-11 14:09:57 -08:00
Blake Mizerany
9039c821a2
llama: preserve field order in user-defined JSON schemas (#8002)
Previously we decoded and re-encoded JSON schemas during validation,
which served no purpose since json.RawMessage already validates JSON
syntax. Worse, the re-encoding lost field ordering from the original
schema, which affects inference quality during step-by-step reasoning.

While fixing this ordering issue by using json.RawMessage directly,
testing revealed that schema_to_grammar (from llama.cpp) also fails to
preserve field order during grammar generation. This appears to be the
root cause of inference degradation.

This change prevents us from mangling the user's original schema order,
but we still need to address the ordering issue in schema_to_grammar.
That will be a separate change.

Updates #7978
2024-12-11 14:07:30 -08:00
Daniel Hiltgen
581a4a5553
ci: fix artifact path prefix for missing windows payloads (#8052)
upload-artifacts strips off leading common paths so when
the ./build/ artifacts were removed, the ./dist/windows-amd64
prefix became common and was stripped, making the
later download-artifacts place them in the wrong location
v0.5.2-rc3
2024-12-11 10:59:32 -08:00
Daniel Hiltgen
cf4d7c52c4
win: builtin arm runner (#8039)
The new build embeds the arm runner in the
main binary, so there is no longer a lib/ollama
v0.5.2-rc2
2024-12-11 08:32:13 -08:00
Daniel Hiltgen
6a6328a5e9
ci: build dir changed (#8037)
Remove no longer relevant build log dir
v0.5.2-rc1
2024-12-10 20:33:34 -08:00
Jeffrey Morgan
527cc97899
llama: update vendored code to commit 40c6d79f (#7875) v0.5.2-rc0 2024-12-10 19:21:34 -08:00
Blake Mizerany
a37f4a86a7
go.mod: go 1.22.8 -> 1.23.4 (#8036) 2024-12-10 18:16:16 -08:00
湛露先生
46f74e0cb5
Return err when NewHipLib() detect error. (#8012)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-10 16:32:29 -08:00
Phil Wornath
7622ea21af
readme: add AI summary helper plugin to community-integrations (#7202) 2024-12-10 16:13:06 -08:00
Tao Zuhong
c5d3947084
readme: add Kangaroo, an AI-powered SQL admin tool to community integrations (#7948) 2024-12-10 13:48:32 -08:00
frob
757eeacc1b
server: lowercase hostname for Host header check (#5851) 2024-12-10 13:43:22 -08:00
Dr. Daniel Bender
dd42acf737
readme: add aidful-ollama-model-delete to community integrations (#8024) 2024-12-10 13:03:19 -08:00
Daniel Hiltgen
b9ccb3741e
Remove unused runner CpuFeatures (#8032)
The final implementation of #7499 removed dynamic vector requirements
in favor of a simpler filename based model, and this was left over logic that
is no longer needed.
2024-12-10 12:59:39 -08:00
Stefan Weil
abfdc4710f
all: fix typos in documentation, code, and comments (#7021) 2024-12-10 12:58:06 -08:00
Daniel Hiltgen
82a02e18d9
build: fix typo in override variable (#8031)
The "F" was missing.
2024-12-10 10:51:16 -08:00
Daniel Hiltgen
4879a234c4
build: Make target improvements (#7499)
* llama: wire up builtin runner

This adds a new entrypoint into the ollama CLI to run the cgo built runner.
On Mac arm64, this will have GPU support, but on all other platforms it will
be the lowest common denominator CPU build.  After we fully transition
to the new Go runners more tech-debt can be removed and we can stop building
the "default" runner via make and rely on the builtin always.

* build: Make target improvements

Add a few new targets and help for building locally.
This also adjusts the runner lookup to favor local builds, then
runners relative to the executable, and finally payloads.

* Support customized CPU flags for runners

This implements a simplified custom CPU flags pattern for the runners.
When built without overrides, the runner name contains the vector flag
we check for (AVX) to ensure we don't try to run on unsupported systems
and crash.  If the user builds a customized set, we omit the naming
scheme and don't check for compatibility.  This avoids checking
requirements at runtime, so that logic has been removed as well.  This
can be used to build GPU runners with no vector flags, or CPU/GPU
runners with additional flags (e.g. AVX512) enabled.

* Use relative paths

If the user checks out the repo in a path that contains spaces, make gets
really confused so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

This removes support for v0.3.6 and older versions (before the tar bundle)
and ensures we clean up prior libraries before extracting the bundle(s).
Without this change, runners and dependent libraries could leak when we
update and lead to subtle runtime errors.
2024-12-10 09:47:19 -08:00
frob
63269668c0
Prevent underflow when FreeMemory < overhead (#8014)
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
2024-12-10 09:10:40 -08:00
Jesse Gross
900f64e6be prompt: Don't trim whitespace from prompts
New lines can be an important part of a user's prompt and trimming
it can alter the results. We previously only trimmed prompts with
images but refactoring brought this behavior to all prompts, where
it became more noticable.

The /generate endpoint adds less whitespace and therefore doesn't
need to trim it out - this brings the same behavior to /chat.

Thanks to @gabe-l-hart for spotting the issue!

Fixes #7795
2024-12-09 11:02:55 -08:00
Yannick Gloster
da09488fbf
docs: remove comment regarding tool streaming in openai.md (#7960) 2024-12-07 22:16:21 -08:00
湛露先生
7f0ccc8a9d
docs: fix syntax error in openai.md (#7986) 2024-12-07 22:14:36 -08:00
Parth Sareen
de52b6c2f9
bugfix: "null" value json mode (#7979) v0.5.1 2024-12-06 14:13:15 -08:00
Michael
acd7d03266
readme: add llama3.3 to readme (#7975)
readme: add llama3.3 to readme
2024-12-06 14:05:11 -05:00
Parth Sareen
f6e87fd628
docs: update readmes for structured outputs (#7962) 2024-12-06 10:35:37 -08:00
Jeffrey Morgan
aed1419c64
ci: skip go build for tests (#7899) v0.5.0-rc1 v0.5.0 2024-12-04 21:22:36 -08:00
Parth Sareen
c6c526275d
api: add generate endpoint for structured outputs (#7939) 2024-12-04 17:37:12 -08:00
Parth Sareen
630e7dc6ff
api: structured outputs - chat endpoint (#7900)
Adds structured outputs to chat endpoint
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
2024-12-04 16:31:19 -08:00
Michael Yang
eb8366d658
Merge pull request #7932 from ollama/mxyng/fix-merges v0.4.8-rc0 2024-12-04 10:04:52 -08:00
Michael Yang
4456012956 fix unmarshaling merges 2024-12-04 09:21:56 -08:00
Sam
539be43640
llm: normalise kvct parameter handling (#7926) 2024-12-03 16:30:40 -08:00
Sam
1bdab9fdb1
llm: introduce k/v context quantization (vRAM improvements) (#6279) 2024-12-03 15:57:19 -08:00
owboson
2b82c5a8a1
docs: correct default num_predict value in modelfile.md (#7693) 2024-12-03 15:00:05 -08:00
Tigran
55c3efa900
docs: remove extra quote in modelfile.md (#7908) 2024-12-02 09:28:56 -08:00
David Mayboroda
1aedffad93
readme: add minima to community integrations (#7906) 2024-12-02 01:14:47 -08:00
Jeffrey Morgan
ff6c2d6dc8
cmd: don't rely on reading repo file for test (#7898) 2024-11-30 14:12:53 -08:00
Jeffrey Morgan
d543b282a7
server: add warning message for deprecated context field (#7878) 2024-11-30 14:05:50 -08:00