Commit Graph

3534 Commits

Daniel Hiltgen
0ec2915ea7 CI: set platform in build_linux script to keep buildx happy (#6829)
The runners don't have emulation set up, so the default multi-platform build
won't work.
v0.3.11-rc2
2024-09-16 14:07:29 -07:00
Michael Yang
c9a7541b9c readme: add Agents-Flex to community integrations (#6788) v0.3.11-rc1 2024-09-16 13:42:52 -07:00
Patrick Devine
d81cfd7d6f fix typo in import docs (#6828) 2024-09-16 11:48:14 -07:00
Pepo
b330c830d3 readme: add vim-intelligence-bridge to Terminal section (#6818) 2024-09-15 21:20:36 -04:00
Edward Cui
d889c6fd07 readme: add Obsidian Quiz Generator plugin to community integrations (#6789) 2024-09-14 23:52:37 -04:00
Daniel Hiltgen
56b9af336a Fix incremental builds on Linux (#6780)
scripts: fix incremental builds on Linux and similar platforms
2024-09-13 08:24:08 -07:00
Daniel Hiltgen
fda0d3be52 Use GOARCH for build dirs (#6779)
Corrects the x86_64 vs. amd64 naming discrepancy
2024-09-12 16:38:05 -07:00
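The discrepancy arises because uname -m reports x86_64 while the Go toolchain names the same architecture amd64. A minimal sketch of keying build directories off Go's own naming (illustrative, not the actual build-script logic):

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

func main() {
	// runtime.GOARCH reports "amd64", "arm64", etc., regardless of what
	// uname -m prints ("x86_64", "aarch64"), so build directories named
	// this way stay consistent with the Go toolchain's terminology.
	dir := filepath.Join("dist", runtime.GOOS+"-"+runtime.GOARCH)
	fmt.Println(dir) // e.g. dist/linux-amd64
}
```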
Daniel Hiltgen
cd5c8f6471 Optimize container images for startup (#6547)
* Optimize container images for startup

This change adjusts how runner payloads are handled to support
container builds where they are kept extracted in the filesystem.
This makes it easier to optimize the cpu/cuda and cpu/rocm images for
size, and should result in faster startup times for container images.

* Refactor payload logic and add buildx support for faster builds

* Move payloads around

* Review comments

* Converge to buildx based helper scripts

* Use docker buildx action for release
2024-09-12 12:10:30 -07:00
dcasota
fef257c5c5 examples: updated requirements.txt for privategpt example 2024-09-11 18:56:56 -07:00
Adrian Cole
d066d9b8e0 examples: polish loganalyzer example (#6744) 2024-09-11 18:37:37 -07:00
RAPID ARCHITECT
5a00dc9fc9 readme: add ollama_moe to community integrations (#6752) 2024-09-11 18:36:26 -07:00
Jesse Gross
c354e87809 Merge pull request #6767 from ollama/jessegross/bug_6707
runner: Flush pending responses before returning
2024-09-11 17:20:22 -07:00
Jesse Gross
93ac3760cb runner: Flush pending responses before returning
If there are any pending responses (such as from potential stop
tokens), we should send them back before ending the sequence.
Otherwise, tokens at the end of a response can be lost.

Fixes #6707
2024-09-11 16:39:32 -07:00
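A hypothetical Go sketch of the flush pattern this fix describes, with illustrative types and names rather than the actual runner code: tokens held back as a possible stop-sequence prefix must be emitted before the stream is closed.

```go
package main

import "fmt"

// sequence is an illustrative stand-in for the runner's per-request state.
type sequence struct {
	pending []string // tokens held back as a possible stop-sequence prefix
}

// finish emits any buffered tokens before ending the stream. Without this
// flush, tokens withheld while checking for a stop sequence are silently
// dropped from the end of the response.
func (s *sequence) finish(out chan<- string) {
	for _, tok := range s.pending {
		out <- tok
	}
	s.pending = nil
	close(out)
}

func main() {
	out := make(chan string, 4)
	s := &sequence{pending: []string{"!", "\n"}}
	s.finish(out)
	for tok := range out {
		fmt.Print(tok)
	}
}
```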
Patrick Devine
abed273de3 add "stop" command (#6739) 2024-09-11 16:36:21 -07:00
Michael Yang
034392624c Merge pull request #6762 from ollama/mxyng/show-output
refactor show output
2024-09-11 14:58:40 -07:00
Michael Yang
ecab6f1cc5 refactor show output
fixes line wrapping on long text
2024-09-11 14:23:09 -07:00
Petr Mironychev
7d6900827d readme: add QodeAssist to community integrations (#6754) 2024-09-11 13:19:49 -07:00
Daniel Hiltgen
9246e6dd15 Verify permissions for AMD GPU (#6736)
This adds back a check, lost many releases ago, that verifies /dev/kfd permissions.
When these are lacking, it can lead to the confusing failure mode:
  "rocBLAS error: Could not initialize Tensile host: No devices found"

This implementation does not hard-fail the serve command; instead it falls back to CPU
with an error log. In the future we can include this in the GPU discovery UX to show
detected but unsupported devices.
2024-09-11 11:38:25 -07:00
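A minimal sketch of such a permissions probe, assuming a simple open-for-read-write check on /dev/kfd; the function name and fallback logging are illustrative, not the actual discovery code.

```go
package main

import (
	"log"
	"os"
)

// usable reports whether the current process can open the ROCm compute
// device. A permission error here is what otherwise surfaces later as the
// confusing "rocBLAS error: Could not initialize Tensile host" failure.
func usable(device string) bool {
	f, err := os.OpenFile(device, os.O_RDWR, 0)
	if err != nil {
		log.Printf("amdgpu: %s not accessible (%v); falling back to CPU", device, err)
		return false
	}
	f.Close()
	return true
}

func main() {
	usable("/dev/kfd") // illustrative entry point, not the real GPU discovery flow
}
```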
Michael Yang
735a0ca2e4 Merge pull request #6732 from ollama/mxyng/debug-proxy
add *_proxy to env map for debugging
2024-09-10 16:13:25 -07:00
Michael Yang
dddb72e084 add *_proxy for debugging 2024-09-10 09:43:35 -07:00
Jeffrey Morgan
83a9b5271a docs: update examples to use llama3.1 (#6718) 2024-09-09 22:47:16 -07:00
Daniel Hiltgen
4a8069f9c4 Quiet down Docker's new lint warnings (#6716)
* Quiet down Docker's new lint warnings

Docker has recently added lint warnings to builds. This cleans up those warnings.

* Fix Go lint regression
2024-09-09 17:22:20 -07:00
Patrick Devine
84b84ce2db catch when model vocab size is set correctly (#6714) 2024-09-09 17:18:54 -07:00
Jeffrey Morgan
bb6a086d63 readme: add crewAI to community integrations (#6699) 2024-09-08 00:36:24 -07:00
RAPID ARCHITECT
30c8f201cc readme: add crewAI with mesop to community integrations 2024-09-08 00:35:59 -07:00
frob
06d4fba851 openai: align chat temperature and frequency_penalty options with completion (#6688) v0.3.10 2024-09-07 09:08:08 -07:00
Jeffrey Morgan
108fb6c1d1 docs: improve linux install documentation (#6683)
Includes small improvements to document layout and code blocks
2024-09-06 22:05:37 -07:00
Yaroslav
da915345d1 openai: don't scale temperature or frequency_penalty (#6514) 2024-09-06 17:45:45 -07:00
nickthecook
8a027bc401 readme: add Archyve to community integrations (#6680) 2024-09-06 14:06:01 -07:00
imoize
5446903fbd readme: add Plasmoid Ollama Control to community integrations (#6681) 2024-09-06 14:04:12 -07:00
Daniel Hiltgen
56318fb365 Improve logging on GPU too small (#6666)
When we determine a GPU is too small for any layers, it's not always clear why.
This will help troubleshoot those scenarios.
2024-09-06 08:29:36 -07:00
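A hypothetical example of the kind of diagnostic logging meant here, using log/slog with made-up field names and values: the point is to record why no layers fit, not just that the GPU was skipped.

```go
package main

import "log/slog"

func main() {
	// Hypothetical numbers: 2 GiB free VRAM vs. a 3 GiB minimum layer size.
	var available, perLayer uint64 = 2 << 30, 3 << 30
	if available < perLayer {
		slog.Info("gpu too small for any layers",
			"available_vram", available,
			"min_layer_size", perLayer)
	}
}
```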
frob
fe91d7fff1 openai: fix "presence_penalty" typo and add test (#6665) 2024-09-06 01:16:28 -07:00
Patrick Devine
608e87bf87 Fix gemma2 2b conversion (#6645) v0.3.10-rc1 2024-09-05 17:02:28 -07:00
Daniel Hiltgen
48685c6ed0 Document uninstall on windows (#6663) 2024-09-05 15:57:38 -07:00
Daniel Hiltgen
9565fa64a8 Revert "Detect running in a container (#6495)" (#6662)
This reverts commit a60d9b89ce.
2024-09-05 14:26:00 -07:00
Daniel Hiltgen
6719097649 llm: make load-time stall duration configurable via OLLAMA_LOAD_TIMEOUT
With the new very large models, some users are willing to wait a very
long time for them to load.
2024-09-05 14:00:08 -07:00
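A minimal sketch of reading such a timeout from the environment, assuming Go duration syntax (e.g. "10m") and a 5-minute default; both are assumptions for illustration, so consult the docs for the variable's actual format and default.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// loadTimeout returns the model-load stall timeout, keeping a default when
// the variable is unset or unparsable. The duration syntax and default
// shown here are assumptions, not the confirmed behavior.
func loadTimeout() time.Duration {
	def := 5 * time.Minute
	v := os.Getenv("OLLAMA_LOAD_TIMEOUT")
	if v == "" {
		return def
	}
	d, err := time.ParseDuration(v)
	if err != nil || d <= 0 {
		return def
	}
	return d
}

func main() {
	fmt.Println(loadTimeout())
}
```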
Daniel Hiltgen
b05c9e83d9 Introduce GPU Overhead env var (#5922)
Provide a mechanism for users to set aside an amount of VRAM on each GPU
to make room for other applications they want to start after Ollama, or to
work around memory prediction bugs
2024-09-05 13:46:35 -07:00
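A minimal sketch of the idea, assuming the variable holds a raw byte count; the parsing and the clamping behavior are illustrative assumptions, not the actual scheduler code.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// reserve subtracts a user-configured number of bytes from a GPU's free
// VRAM before scheduling, leaving headroom for other applications.
func reserve(freeVRAM uint64) uint64 {
	overhead, err := strconv.ParseUint(os.Getenv("OLLAMA_GPU_OVERHEAD"), 10, 64)
	if err != nil || overhead >= freeVRAM {
		return freeVRAM // unset, invalid, or larger than available: ignore
	}
	return freeVRAM - overhead
}

func main() {
	// e.g. 8 GiB free, minus whatever overhead the user reserved.
	fmt.Println(reserve(8 << 30))
}
```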
Daniel Hiltgen
a60d9b89ce Detect running in a container (#6495) 2024-09-05 13:24:51 -07:00
Michael Yang
bf612cd608 Merge pull request #6260 from ollama/mxyng/mem
llama3.1 memory
2024-09-05 13:22:08 -07:00
Zeyo
ef98e56122 readme: add AiLama to the list of community integrations (#4957) 2024-09-05 13:10:44 -07:00
Michael
5f944baac7 Update gpu.md: Add RTX 3050 and RTX 3050 Ti (#5888)
* Update gpu.md

Seems strange that the laptop versions of the 3050 and 3050 Ti would be supported but not the non-notebook versions, but this is what the page (https://developer.nvidia.com/cuda-gpus) says.

Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>

* Update gpu.md

Remove notebook reference

---------

Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>
2024-09-05 11:24:26 -07:00
Tobias Heinze
6fc9d22707 server: fix blob download when receiving a 200 response (#6656) 2024-09-05 10:48:26 -07:00
Vitaly Zdanevich
f27c00d8c5 readme: add Gentoo package manager entry to community integrations (#5714) 2024-09-05 09:58:14 -07:00
王卿
c7c845ec52 Update install.sh: Replace "command -v" with encapsulated functionality (#6035)
2024-09-05 09:49:48 -07:00
Augustinas Malinauskas
cf48603943 readme: include Enchanted for Apple Vision Pro (#4949)
Added Enchanted with Apple Vision Pro support
2024-09-05 01:30:19 -04:00
Silas Marvin
6e67be09b6 readme: add lsp-ai to community integrations (#5063) 2024-09-05 01:17:34 -04:00
Arda Günsüren
0f5f060d2b readme: add ollama-php library to community integrations (#6361) 2024-09-05 01:01:14 -04:00
jk011ru
b3554778bd readme: add vnc-lm discord bot community integration (#6644) 2024-09-04 19:46:02 -04:00
Pascal Patry
bbe7b96ded llm: use json.hpp from common (#6642) 2024-09-04 19:34:42 -04:00
Rune Berg
c18ff18b2c readme: add confichat to community integrations (#6378) 2024-09-04 17:26:02 -04:00