
Turbo

⚠️ Turbo is in preview

Ollama's Turbo is a new way to run open-source models with acceleration from datacenter-grade hardware.

Currently, the following models are available in Turbo:

  • gpt-oss:20b
  • gpt-oss:120b

Get started

Ollama for macOS & Windows

Download Ollama

  • Select a model such as gpt-oss:20b or gpt-oss:120b
  • Click on Turbo. You'll be prompted to create an account or sign in

Ollama's CLI

  • Sign up for an Ollama account

  • Add your Ollama public key to your account on ollama.com. To print the key:

    On macOS and Linux:

    cat ~/.ollama/id_ed25519.pub
    

    On Windows:

    type "%USERPROFILE%\.ollama\id_ed25519.pub"
    
  • Then run a model, setting OLLAMA_HOST to ollama.com:

    OLLAMA_HOST=ollama.com ollama run gpt-oss:120b
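
    The two key-printing commands above can also be expressed as one cross-platform snippet. This is just a convenience sketch; it reads the same `~/.ollama/id_ed25519.pub` file the `cat`/`type` commands do:

    ```python
    from pathlib import Path

    # Cross-platform equivalent of the cat/type commands above.
    # The public key lives at ~/.ollama/id_ed25519.pub on macOS, Linux, and Windows.
    key_path = Path.home() / ".ollama" / "id_ed25519.pub"
    if key_path.exists():
        print(key_path.read_text().strip())
    else:
        print(f"No Ollama key found at {key_path}")
    ```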
    

Ollama's Python library

from ollama import Client

client = Client(
    host="https://ollama.com",
    headers={'Authorization': 'Bearer <api key>'}
)

messages = [
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
]

for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
  print(part['message']['content'], end='', flush=True)
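
Each streamed part in the loop above carries one chunk of the reply in `part['message']['content']`. If you also want the complete text afterwards, you can accumulate the chunks; a small sketch based on the response shape used above (the `collect` helper is ours, not part of the library):

```python
def collect(parts) -> str:
    # Join streamed chunks into the complete reply; each part is a dict
    # shaped like {'message': {'content': '...'}}, matching the loop above.
    return ''.join(part['message']['content'] for part in parts)

# e.g. full_text = collect(client.chat('gpt-oss:120b', messages=messages, stream=True))
```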

Ollama's JavaScript library

import { Ollama } from 'ollama';

const ollama = new Ollama({
  host: 'https://ollama.com',
  headers: {
    Authorization: 'Bearer <api key>'
  }
});

const response = await ollama.chat({
  model: 'gpt-oss:120b',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
  stream: true
});

for await (const part of response) {
  process.stdout.write(part.message.content)
}

Community integrations

Turbo mode is also compatible with several community integrations.

Open WebUI

  • Go to Settings → Admin Settings → Connections
  • Under Ollama API, click +
  • For the URL put https://ollama.com
  • For the API key, create one at https://ollama.com/settings/keys and add it
  • Click Save

Now, if you navigate to the model selector, Turbo models should be available under External.