Commit Graph

53 Commits

Author SHA1 Message Date
Devon Rifkin
47991940d4 add qwen3-coder tool support
The format qwen3-coder uses is relatively unique, both in rendering and
in parsing. To implement parsing, I wrote a custom parser in similar
style to harmony. For the rendering, I found that the logic would be
much more difficult to follow in a template, so I introduced the concept
of a built-in renderer that uses go code, rather than a template to
generate prompts.

I set us up for future built-in parsers and renderers by making it so
they can be specified in a Modelfile like so:

```
RENDERER "qwen3-coder"
PARSER "qwen3-coder"
```

These need to be provided explicitly because the architecture alone is
not enough to understand what format the model expects to receive, and
what format we expect it to output (e.g., qwen3-coder is `qwen3moe`,
which includes other qwen3-family models as well)

I haven't converted harmony to be one of these "built-ins" yet, since
some of it is in flux with the changes @ParthSareen has been making to
move harmony to the runner. It is likely that many other built-ins will
need to move to the runner as well, but I'm able to slightly defer that
decision since qwen3-coder doesn't have thinking (and therefore doesn't
need to be in the runner to make structured outputs work). I expect to
unify harmony with this approach very soon.

Whether a particular model supports tools or thinking was previously
inferred from templates, but without a template we now also use the
parser itself to declare what it supports. If we have future models that
re-use the same parsing format, but have different capabilities, we'll
want to parameterize them and give them different names to be specified
as a `PARSER`.

Misc changes:

- I worked on the renderer by diffing outputs from the reference
  implementation and ours. To make it easier to do this, I extended
  <https://github.com/ollama/ollama/pull/11875> to also support
  returning the prompt via the openai compat layer
2025-09-15 11:33:47 -07:00
frob
4378ae4ffa parser: don't check the file type of safetensors to prevent false negatives. (#12176)
* Don't check the file type of safetensor to prevent false negatives.

---------

Co-authored-by: Patrick Devine <patrick@infrahq.com>
2025-09-05 16:27:40 -07:00
Michael Yang
0dabb4ef6a skip tokenizer.model if possible (#11050)
if tokenizer.json is already copied, skip tokenizer.model
2025-06-11 12:10:35 -07:00
Jeffrey Morgan
fa9973cd7f api: remove unused sampling parameters (#10581) 2025-05-08 08:31:08 -07:00
Jeffrey Morgan
3b2d2c8326 api: remove unused or unsupported api options (#10574)
Some options listed in api/types.go are not supported in
newer models, or have been deprecated in the past. This is
the first of a series of PRs to clean up the API options
2025-05-05 14:54:40 -07:00
Michael Yang
d931ee8f22 create blobs in parallel (#10135)
* default max term height
* error on out of tree files
2025-05-05 11:59:26 -07:00
Michael Yang
16fca86c4a digest files in parallel 2025-04-07 09:46:31 -07:00
Bruce MacDonald
6bd0a983cd model: support for mistral-small in the ollama runner
Mistral is a popular research lab making open source models. This updates
the forward pass of llama architecture models to support both llama models
and mistral models by accounting for additional metadata present in mistral
models, and finding the correct dimensions for the output projection.
2025-04-03 16:57:36 -07:00
Parth Sareen
00ebda8cc4 Revert "parser: remove role validation from Modelfile parser" (#9917)
This reverts commit ffbfe833da.
2025-03-21 12:38:09 -07:00
rylativity
ffbfe833da parser: remove role validation from Modelfile parser (#9874)
* updates parser/parser.go to allow arbitrary roles in Modelfile MESSAGE blocks
2025-03-20 13:11:17 -07:00
Jeffrey Morgan
42cf4db601 parser: fix parsing Modelfiles with multiple FROM commands (#8449) 2025-01-16 00:14:04 -08:00
Patrick Devine
2539f2dbf9 Fix absolute path names + gguf detection (#8428) 2025-01-14 19:01:24 -08:00
Patrick Devine
32bd37adf8 make the modelfile path relative for ollama create (#8380) 2025-01-10 16:14:08 -08:00
Jeffrey Morgan
1deafd8254 llama: update vendored code to commit 46e3556 (#8308) 2025-01-08 11:22:01 -08:00
Patrick Devine
86a622cbdc Update the /api/create endpoint to use JSON (#7935)
Replaces `POST /api/create` to use JSON instead of a Modelfile.

This is a breaking change.
2024-12-31 18:02:30 -08:00
Patrick Devine
4efb98cb4f add line numbers for parser errors (#7326) 2024-11-14 13:59:44 -08:00
Josh Yan
9bd00041fa trim all params 2024-06-27 11:18:38 -07:00
Josh Yan
4e986a823c unquote, trimp space 2024-06-27 10:59:15 -07:00
Michael Yang
d528e1af75 fix utf16 for multibyte runes 2024-06-13 13:07:42 -07:00
Michael Yang
20b9f8e6f4 Revert "proper utf16 support"
This reverts commit 66ab48772f.

this change broke utf-8 scanning of multi-byte runes
2024-06-13 10:22:16 -07:00
Michael Yang
66ab48772f proper utf16 support 2024-06-05 13:11:50 -07:00
Patrick Devine
ccdf0b2a44 Move the parser back + handle utf16 files (#4533) 2024-05-20 11:26:45 -07:00
Michael Yang
119589fcb3 rename parser to model/file 2024-05-01 09:53:50 -07:00
Michael Yang
bd8eed57fc fix parser name 2024-05-01 09:52:54 -07:00
Michael Yang
9cf0f2e973 use parser.Format instead of templating modelfile 2024-05-01 09:52:54 -07:00
Michael Yang
176ad3aa6e parser: add commands format 2024-05-01 09:52:54 -07:00
Michael Yang
4d08363580 comments 2024-05-01 09:52:54 -07:00
Michael Yang
8907bf51d2 fix multiline 2024-05-01 09:52:54 -07:00
Michael Yang
abe614c705 tests 2024-05-01 09:52:54 -07:00
Michael Yang
238715037d linting 2024-05-01 09:52:54 -07:00
Michael Yang
c0a00f68ae refactor modelfile parser 2024-05-01 09:52:54 -07:00
Patrick Devine
7c40a67841 Save and load sessions (#2063) 2024-01-25 12:12:36 -08:00
Daniel Hiltgen
fedd705aea Mechanical switch from log to slog
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Michael Yang
38fe1a368b fix: trim space in modelfile fields 2023-12-05 11:57:29 -08:00
Bruce MacDonald
a0c3e989de deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
Michael Yang
6517bcc53c Merge pull request #290 from jmorganca/add-adapter-layers
implement loading ggml lora adapters through the modelfile
2023-08-10 17:23:01 -07:00
Michael Yang
21e6197c0b Merge pull request #322 from jmorganca/no-comment-warning
no warning on comments
2023-08-10 16:24:41 -07:00
Michael Yang
20bf000e55 no warning on comments 2023-08-10 16:22:38 -07:00
Michael Yang
40d0c4a1dc length check for parameters 2023-08-10 16:09:02 -07:00
Michael Yang
6de5d032e1 implement loading ggml lora adapters through the modelfile 2023-08-10 09:23:39 -07:00
Bruce MacDonald
a6f6d18f83 embed text document in modelfile 2023-08-08 11:27:17 -04:00
Michael Yang
9c7f30d31c use max scan token size to hold large objects 2023-07-28 11:43:31 -07:00
Michael Yang
f5ac8ddfb4 refactor scan multiline for reuse 2023-07-27 11:30:51 -07:00
Michael Yang
24c2c77057 fix multiline string
the data needs to remove the multiline quotes but include the command:

e.g.

TEMPLATE """
my template values
"""

should be

TEMPLATE
my template values

after scanning
2023-07-25 11:51:43 -07:00
Mohit Gaur
f5f79049c2 Incorporate code review improvements 2023-07-25 22:52:23 +05:30
Mohit Gaur
ed89da92b4 Improve command parsing and multiline string handling 2023-07-24 18:11:13 +05:30
Jeffrey Morgan
d59b164fa2 add prompt back to parser 2023-07-20 01:13:30 -07:00
Jeffrey Morgan
3b135ac963 parser: fix case where multi line string termination error wouldnt show 2023-07-20 00:43:22 -07:00
Jeffrey Morgan
e6bae8d916 parser: keep seeking until eof 2023-07-20 00:37:52 -07:00
Michael Yang
df146c41e2 separate prompt into template and system 2023-07-19 23:24:31 -07:00