# Ollama Model File
> [!NOTE]
> `Modelfile` syntax is in development
A model file is the blueprint to create and share models with Ollama.
## Table of Contents

- [Format](#format)
- [Examples](#examples)
- [Instructions](#instructions)
  - [FROM (Required)](#from-required)
    - [Build from existing model](#build-from-existing-model)
    - [Build from a Safetensors model](#build-from-a-safetensors-model)
    - [Build from a GGUF file](#build-from-a-gguf-file)
  - [PARAMETER](#parameter)
    - [Valid Parameters and Values](#valid-parameters-and-values)
  - [TEMPLATE](#template)
    - [Template Variables](#template-variables)
  - [SYSTEM](#system)
  - [ADAPTER](#adapter)
  - [LICENSE](#license)
  - [MESSAGE](#message)
- [Notes](#notes)

## Format

The format of the `Modelfile`:

```
# comment
INSTRUCTION arguments
```

| Instruction                         | Description                                                    |
| ----------------------------------- | -------------------------------------------------------------- |
| [`FROM`](#from-required) (required) | Defines the base model to use.                                 |
| [`PARAMETER`](#parameter)           | Sets the parameters for how Ollama will run the model.         |
| [`TEMPLATE`](#template)             | The full prompt template to be sent to the model.              |
| [`SYSTEM`](#system)                 | Specifies the system message that will be set in the template. |
| [`ADAPTER`](#adapter)               | Defines the (Q)LoRA adapters to apply to the model.            |
| [`LICENSE`](#license)               | Specifies the legal license.                                   |
| [`MESSAGE`](#message)               | Specifies the message history.                                 |

## Examples
### Basic `Modelfile`

An example of a `Modelfile` creating a Mario blueprint:

```
FROM llama3.2

# sets the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# sets the context window size to 4096, which controls how many tokens the LLM can use as context to generate the next token
PARAMETER num_ctx 4096

# sets a custom system message to specify the behavior of the chat assistant
SYSTEM You are Mario from Super Mario Bros., acting as an assistant.
```

To use this:

1. Save it as a file (e.g. `Modelfile`)
2. `ollama create choose-a-model-name -f <location of the file e.g. ./Modelfile>`
3. `ollama run choose-a-model-name`
4. Start using the model!
To view the Modelfile of a given model, use the `ollama show --modelfile` command.
```shell
ollama show --modelfile llama3.2
```
> **Output**:
>
> ```
> # Modelfile generated by "ollama show"
> # To build a new Modelfile based on this one, replace the FROM line with:
> # FROM llama3.2:latest
> FROM /Users/pdevine/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
> TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
>
> {{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
>
> {{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
>
> {{ .Response }}<|eot_id|>"""
> PARAMETER stop "<|start_header_id|>"
> PARAMETER stop "<|end_header_id|>"
> PARAMETER stop "<|eot_id|>"
> PARAMETER stop "<|reserved_special_token"
> ```
## Instructions
### FROM (Required)
The `FROM` instruction defines the base model to use when creating a model.

```
FROM <model name>:<tag>
```
#### Build from existing model

```
FROM llama3.2
```
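
A tag can be appended to pin a specific variant, matching the `FROM <model name>:<tag>` form above (the `1b` tag here is an assumption; check the library for the tags a model actually offers):

```
FROM llama3.2:1b
```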
A list of available base models:
<https://github.com/ollama/ollama#model-library>

Additional models can be found at:

<https://ollama.com/library>
#### Build from a Safetensors model

```
FROM <model directory>
```
The model directory should contain the Safetensors weights for a supported architecture.
Currently supported model architectures:
* Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2)
* Mistral (including Mistral 1, Mistral 2, and Mixtral)
* Gemma (including Gemma 1 and Gemma 2)
* Phi3
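
A minimal sketch of such a `Modelfile` (the directory path is hypothetical and should contain the `.safetensors` weight files along with the model's config and tokenizer files):

```
FROM /path/to/my-safetensors-model
```

Building it then works the same as any other `Modelfile`, e.g. `ollama create my-model -f Modelfile`.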
#### Build from a GGUF file

```
FROM ./ollama-model.gguf
```
The GGUF file location should be specified as an absolute path or relative to the `Modelfile` location.
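
For example, to build and run a model from this `Modelfile` (the name `my-gguf-model` is just an illustration):

```shell
ollama create my-gguf-model -f Modelfile
ollama run my-gguf-model
```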
### PARAMETER
The `PARAMETER` instruction defines a parameter that can be set when the model is run.

```
PARAMETER <parameter> <parametervalue>
```
#### Valid Parameters and Values
| Parameter      | Description                                                                                                                                                                                                                                              | Value Type | Example Usage        |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | -------------------- |
| mirostat       | Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)                                                                                                                                            | int        | mirostat 0           |
| mirostat_eta   | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1)                           | float      | mirostat_eta 0.1     |
| mirostat_tau   | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)                                                                                                            | float      | mirostat_tau 5.0     |
| num_ctx        | Sets the size of the context window used to generate the next token. (Default: 2048)                                                                                                                                                                       | int        | num_ctx 4096         |
| repeat_last_n  | Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)                                                                                                                                                         | int        | repeat_last_n 64     |
| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)                                                                        | float      | repeat_penalty 1.1   |
| temperature    | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)                                                                                                                                        | float      | temperature 0.7      |
| seed           | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0)                                                                                          | int        | seed 42              |
| stop           | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile.                                         | string     | stop "AI assistant:" |
| num_predict    | Maximum number of tokens to predict when generating text. (Default: -1, infinite generation)                                                                                                                                                               | int        | num_predict 42       |
| top_k          | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)                                                                           | int        | top_k 40             |
| top_p          | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)                                                                    | float      | top_p 0.9            |
| min_p          | Alternative to top_p, and aims to ensure a balance of quality and variety. The parameter *p* represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with *p*=0.05 and the most likely token having a probability of 0.9, logits with a value less than 0.045 are filtered out. (Default: 0.0) | float | min_p 0.05 |
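
As a sketch of how these combine, the following `Modelfile` sets several parameters at once, including multiple stop sequences (all values and stop strings here are illustrative, not recommendations):

```
FROM llama3.2

# sampling behavior
PARAMETER temperature 0.7
PARAMETER top_p 0.9

# context window and repetition control
PARAMETER num_ctx 4096
PARAMETER repeat_penalty 1.1

# multiple stop sequences: repeat the stop parameter once per sequence
PARAMETER stop "User:"
PARAMETER stop "Assistant:"
```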
### TEMPLATE
The `TEMPLATE` instruction defines the full prompt template to be passed into the model. It may optionally include a system message, a user's message, and the response from the model. Note: syntax may be model specific. Templates use Go [template syntax](https://pkg.go.dev/text/template).
#### Template Variables
| Variable | Description |
| ----------------- | --------------------------------------------------------------------------------------------- |
| `{{ .System }}` | The system message used to specify custom behavior. |
| `{{ .Prompt }}` | The user prompt message. |
| `{{ .Response }}` | The response from the model. When generating a response, text after this variable is omitted. |
```
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
```
### SYSTEM
The `SYSTEM` instruction specifies the system message to be used in the template, if applicable.

```
SYSTEM """<system message>"""
```
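
For example, a minimal illustrative system message:

```
SYSTEM """You are a concise technical assistant. Keep answers short and include code only when asked."""
```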
### ADAPTER
The `ADAPTER` instruction specifies a fine-tuned LoRA adapter that should be applied to the base model. The value of the adapter should be an absolute path or a path relative to the Modelfile. The base model should be specified with a `FROM` instruction. If the base model is not the same as the base model that the adapter was tuned from, the behaviour will be erratic.
#### Safetensor adapter

```
ADAPTER <path to safetensor adapter>
```
Currently supported Safetensor adapters:
* Llama (including Llama 2, Llama 3, and Llama 3.1)
* Mistral (including Mistral 1, Mistral 2, and Mixtral)
* Gemma (including Gemma 1 and Gemma 2)
#### GGUF adapter

```
ADAPTER ./ollama-lora.gguf
```
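
Putting it together, a sketch of a complete `Modelfile` applying a LoRA adapter (the adapter file name is reused from above for illustration; the `FROM` model must be the same base model the adapter was tuned from):

```
FROM llama3.2
ADAPTER ./ollama-lora.gguf
```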
### LICENSE
The `LICENSE` instruction allows you to specify the legal license under which the model used with this Modelfile is shared or distributed.

```
LICENSE """
<license text>
"""
```
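
For example, an illustrative license block (embed the full text of whichever license actually applies to your model):

```
LICENSE """
This model is distributed under the terms of the MIT License.
"""
```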
### MESSAGE
The `MESSAGE` instruction allows you to specify a message history for the model to use when responding. Use multiple iterations of the MESSAGE command to build up a conversation which will guide the model to answer in a similar way.

```
MESSAGE <role> <message>
```
#### Valid roles
| Role | Description |
| --------- | ------------------------------------------------------------ |
| system | Alternate way of providing the SYSTEM message for the model. |
| user | An example message of what the user could have asked. |
| assistant | An example message of how the model should respond. |
#### Example conversation
```
MESSAGE user Is Toronto in Canada?
MESSAGE assistant yes
MESSAGE user Is Sacramento in Canada?
MESSAGE assistant no
MESSAGE user Is Ontario in Canada?
MESSAGE assistant yes
```
## Notes
- The **`Modelfile` is not case sensitive**. In the examples, uppercase instructions are used to make them easier to distinguish from arguments.
- Instructions can be in any order. In the examples, the `FROM` instruction is first to keep it easily readable.