Feature/support all gpus (#1515)

* add `count` explicitly to ensure backwards compat

* fix main document URL
Bijay Regmi 2024-06-09 02:45:26 +02:00 committed by GitHub
parent 5f2737f9ee
commit 260149b35a
2 changed files with 4 additions and 2 deletions


@@ -2,10 +2,10 @@
 # Deploying Danswer using Docker Compose
-For general information, please read the instructions in this [README](https://github.com/danswer-ai/danswer/blob/main/deployment/docker_compose/README.md).
+For general information, please read the instructions in this [README](https://github.com/danswer-ai/danswer/blob/main/deployment/README.md).
 ## Deploy in a system without GPU support
-This part is elaborated precisely in in this [README](https://github.com/danswer-ai/danswer/blob/main/deployment/docker_compose/README.md) in section *Docker Compose*. If you have any questions, please feel free to open an issue or get in touch in slack for support.
+This part is elaborated precisely in in this [README](https://github.com/danswer-ai/danswer/blob/main/deployment/README.md) in section *Docker Compose*. If you have any questions, please feel free to open an issue or get in touch in slack for support.
 ## Deploy in a system with GPU support
 Running Model servers with GPU support while indexing and querying can result in significant improvements in performance. This is highly recommended if you have access to resources. Currently, Danswer offloads embedding model and tokenizers to the GPU VRAM and the size needed depends on chosen embedding model. Default embedding models `intfloat/e5-base-v2` takes up about 1GB of VRAM and since we need this for inference and embedding pipeline, you would need roughly 2GB of VRAM.
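For readers following the README section above: a GPU-backed model server in Compose is declared with a device reservation under `deploy.resources`, as in this minimal sketch (the service name and image are placeholders for illustration, not the project's actual definitions):

services:
  model_server:                   # placeholder service name
    image: example/model-server   # placeholder image
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia      # the host needs the NVIDIA Container Toolkit installed
              capabilities: [gpu]

The VRAM estimate in the section follows directly: one ~1GB copy of `intfloat/e5-base-v2` for the inference pipeline and one for the embedding (indexing) pipeline, hence roughly 2GB in total.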


@@ -215,6 +215,7 @@ services:
         reservations:
           devices:
             - driver: nvidia
+              count: all
               capabilities: [gpu]
     build:
       context: ../../backend
@@ -253,6 +254,7 @@ services:
         reservations:
           devices:
             - driver: nvidia
+              count: all
               capabilities: [gpu]
     command: >
       /bin/sh -c "if [ \"${DISABLE_MODEL_SERVER:-false}\" = \"True\" ]; then
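Taken together with the commit message, the change pins the previously implicit device count. A sketch of the resulting reservation block, with the semantics as comments (indentation and surrounding context assumed from standard Compose layout):

deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          # `count: all` requests every GPU on the host; an integer such as
          # `count: 1` would reserve only that many. Recent Compose versions
          # treat "all" as the default when `count` and `device_ids` are
          # omitted, but stating it explicitly keeps older versions working,
          # which is presumably the backwards-compat concern in the commit
          # message above.
          count: all
          capabilities: [gpu]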