highperfocused/ollama

mirror of https://github.com/ollama/ollama.git synced 2025-07-08 17:20:23 +02:00

Patrick Devine 3f1b7177f2 pass model and predict options

2023-07-07 09:34:05 -07:00

api

pass model and predict options

2023-07-07 09:34:05 -07:00

app

fix env var loading

2023-07-07 10:27:33 -04:00

cmd

do not pull when local is available

2023-07-07 10:22:37 -04:00

docs

Move python docs to separate file

2023-07-01 17:54:29 -04:00

llama

pass model and predict options

2023-07-07 09:34:05 -07:00

server

pass model and predict options

2023-07-07 09:34:05 -07:00

web

set version at build time

2023-07-06 16:34:44 -04:00

.dockerignore

update Dockerfile

2023-07-06 16:34:44 -04:00

.gitignore

use Makefile for dependency building instead of go generate

2023-07-06 16:34:44 -04:00

.prettierrc.json

move .prettierrc.json to root

2023-07-02 17:34:46 -04:00

Dockerfile

fix dockerfile

2023-07-06 16:34:44 -04:00

go.mod

progress

2023-07-06 17:07:40 -07:00

go.sum

progress

2023-07-06 17:07:40 -07:00

LICENSE

proto -> ollama

2023-06-26 15:57:13 -04:00

main.go

add llama.cpp go bindings

2023-07-06 16:34:44 -04:00

Makefile

update app to use go binary

2023-07-06 16:34:44 -04:00

models.json

Update models.json

2023-07-06 16:34:44 -04:00

README.md

update api documentation

2023-07-06 16:46:05 -04:00

README.md

Ollama

Run large language models with llama.cpp.

Note: certain models that can be run with this project are intended for research and/or non-commercial use only.

Features

Download and run popular large language models
Switch between multiple models on the fly
Hardware acceleration where available (Metal, CUDA)
Fast inference server written in Go, powered by llama.cpp
REST API to use with your application (Python and TypeScript SDKs coming soon)

Install

Download for macOS
Download for Windows (coming soon)
Docker: docker run -p 11434:11434 ollama/ollama

You can also build the binary from source.

Quickstart

Run the model that started it all.

ollama run llama

Example models

💬 Chat

Have a conversation.

ollama run vicuna "Why is the sky blue?"

🗺️ Instructions

Ask questions. Get answers.

ollama run orca "Write an email to my boss."

👩‍💻 Code completion

Sometimes you just need a little help writing code.

ollama run replit "Give me React code to render a button"

📖 Storytelling

Venture into the unknown.

ollama run nous-hermes "Once upon a time"

Advanced usage

Run a local model

ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin

Building

make

To run it, start the server:

./ollama server &

Finally, run a model!

./ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin

API Reference

`POST /api/pull`

Download a model

curl -X POST http://localhost:11434/api/pull -d '{"model": "orca"}'

`POST /api/generate`

Complete a prompt

curl -X POST http://localhost:11434/api/generate -d '{"model": "orca", "prompt": "hello!", "stream": true}'
