17 Commits

Author SHA1 Message Date
0a066cfd91 Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)
* Reapply "feat: incremental gguf parser (#10822)" (#11114)

This reverts commit a6e64fbdf2.

* fix older ggufs
2025-06-20 11:11:40 -07:00
a6e64fbdf2 Revert "feat: incremental gguf parser (#10822)" (#11114)
This reverts commit 6b04cad7e8.
2025-06-18 05:42:44 -07:00
6b04cad7e8 feat: incremental gguf parser (#10822)
* incremental gguf parser
* gguf: update test to not rely on gguf on disc
* re-use existing create gguf
* read capabilities from gguf kv
* kv exists
* update tests
* s/doneFunc/successFunc/g
* new buffered reader

---------

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2025-06-12 11:04:11 -07:00
6e9a7a2568 lint: enable usetesting, disable tenv (#10594) 2025-05-08 11:42:14 -07:00
e172f095ba api: return model capabilities from the show endpoint (#10066)
With support for multimodal models becoming more varied and common it is important for clients to be able to easily see what capabilities a model has. Retuning these from the show endpoint will allow clients to easily see what a model can do.
2025-04-01 15:21:46 -07:00
48a273f80b Fix issues with templating prompt in chat mode (#2460) 2024-02-12 15:06:57 -08:00
e49dc9f3d8 fix tests 2024-02-01 11:48:11 -08:00
a896079705 preserve last system message from modelfile (#2289) 2024-01-31 21:45:01 -05:00
0632dff3f8 trim chat prompt based on llm context size (#1963) 2024-01-30 15:59:29 -05:00
db356c8519 post-response templating (#1427) 2023-12-22 17:07:05 -05:00
4a1abfe4fa fix tests 2023-12-13 14:42:30 -05:00
3b0b8930d4 fix: only flush template in chat when current role encountered (#1426) 2023-12-08 16:44:24 -05:00
195e3d9dbd chat api endpoint (#1392) 2023-12-05 14:57:33 -05:00
00d06619a1 Revert "chat api (#991)" while context variable is fixed
This reverts commit 7a0899d62d.
2023-12-04 21:16:27 -08:00
7a0899d62d chat api (#991)
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
a0c3e989de deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
62d29b2157 do not HTML-escape prompt
The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `&lt;h1&gt;hello` before being passed to the LLM.

The included test case passes, but before the code change, it failed:

```
--- FAIL: TestModelPrompt
    images_test.go:21: got "a&lt;h1&gt;b", want "a<h1>b"
```
2023-09-01 17:16:38 -05:00