# Search Quality Test Script

This Python script automates the process of running search quality tests against the Danswer backend.
## Features

- Loads configuration from a YAML file
- Sets up the Docker environment
- Manages environment variables
- Switches to a specified Git branch
- Uploads test documents
- Runs search quality tests
- Cleans up Docker containers (optional)
## Usage

1. Ensure you have the required dependencies installed.
2. Configure the `search_test_config.yaml` file based on the `search_test_config.yaml.template` file.
3. Configure the `.env_eval` file in `deployment/docker_compose` with the correct environment variables.
4. Set up the `PYTHONPATH` permanently:

   Add the following line to your shell configuration file (e.g., `~/.bashrc`, `~/.zshrc`, or `~/.bash_profile`):

   ```
   export PYTHONPATH=$PYTHONPATH:/path/to/danswer/backend
   ```

   Replace `/path/to/danswer` with the actual path to your Danswer repository.

   After adding this line, restart your terminal or run `source ~/.bashrc` (or the appropriate config file) to apply the changes.
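If you prefer not to modify your shell configuration, the backend directory can also be put on the module search path per-process. A minimal sketch, using the same `/path/to/danswer` placeholder as the export above:

```python
import sys

# Placeholder path, as in the export line above; replace with your checkout location.
BACKEND_DIR = "/path/to/danswer/backend"

# Prepend so modules from the repo take precedence over any installed copies.
if BACKEND_DIR not in sys.path:
    sys.path.insert(0, BACKEND_DIR)
```

This has the same effect as the `PYTHONPATH` export, but only for the current Python process.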
5. Navigate to the Danswer repo:

   ```
   cd path/to/danswer
   ```

6. Navigate to the answer_quality folder:

   ```
   cd backend/tests/regression/answer_quality
   ```
7. To launch the evaluation environment, run the launch_eval_env.py script (this step can be skipped if you are running the environment outside of Docker; just leave "environment_name" blank):

   ```
   python launch_eval_env.py
   ```
8. Run the file_uploader.py script to upload the zip file located at the "zipped_documents_file" path:

   ```
   python file_uploader.py
   ```
9. Run the run_qa.py script to ask the questions from the jsonl file located at the "questions_file" path. This will hit the "query/answer-with-quote" API endpoint:

   ```
   python run_qa.py
   ```
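The questions file is jsonl, i.e. one JSON object per line. A minimal sketch of reading such a file, useful when preparing your own question set (the exact fields run_qa.py expects are defined in the repo and not assumed here):

```python
import json


def read_questions(path: str) -> list[dict]:
    """Parse a .jsonl file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Blank lines are skipped so that trailing newlines don't break parsing.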
Note: All data will be saved even after the containers are shut down. There are instructions below for re-launching Docker containers using this data.

If you decide to run multiple UIs at the same time, the ports will increment upwards from 3000 (e.g. http://localhost:3001).

To see which port the desired instance is on, look at the ports on the nginx container by running `docker ps` or using Docker Desktop.

The Docker daemon must be running for this to work.
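Assuming instances are numbered in launch order, the port assignment described above amounts to:

```python
def ui_port(instance_index: int) -> int:
    """UI ports increment upwards from 3000: instance 0 -> 3000, instance 1 -> 3001, ..."""
    return 3000 + instance_index
```

So with two instances running, the second would be reachable at http://localhost:3001.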
## Configuration

Edit `search_test_config.yaml` to set:
- output_folder
  - The folder where the folders for each test will go
  - These folders will contain the Postgres/Vespa data as well as the results for each test
- zipped_documents_file
  - The path to the zip file containing the files you'd like to test against
- questions_file
  - The path to the jsonl file containing the questions you'd like to test with
- commit_sha
  - Set this to the SHA of the commit you want to run the test against
  - You must clear all local changes if you want to use this option
  - Set this to null to use the code as-is
- clean_up_docker_containers
  - Set this to true to automatically delete all Docker containers, networks, and volumes after the test
- launch_web_ui
  - Set this to true if you want to use the UI during/after the testing process
- only_state
  - Whether to only run Vespa and Postgres
- only_retrieve_docs
  - Set this to true to only retrieve documents, skipping the LLM response
  - This is to save on API costs
- use_cloud_gpu
  - Set to true or false depending on whether you want to use the remote GPU
- model_server_ip
  - The IP of the remote model server
  - Only needs to be set if use_cloud_gpu is true
- model_server_port
  - The port of the remote model server
  - Only needs to be set if use_cloud_gpu is true
- environment_name
  - Use this if you would like to relaunch a previous test instance
  - Input the env_name of the test you'd like to re-launch
  - Leave empty to launch referencing local default network locations
- limit
  - The max number of questions you'd like to ask against the dataset
  - Set to null for no limit
- llm
  - Fill this out according to the normal LLM seeding
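Putting these options together, a filled-out `search_test_config.yaml` might look like the following. All values here are illustrative placeholders; `search_test_config.yaml.template` is the authoritative reference, including the structure of the `llm` section:

```yaml
output_folder: "/path/to/test_output"
zipped_documents_file: "/path/to/test_docs.zip"
questions_file: "/path/to/questions.jsonl"
commit_sha: null                  # null -> run against the code as-is
clean_up_docker_containers: false
launch_web_ui: true
only_state: false
only_retrieve_docs: true          # skip LLM responses to save on API costs
use_cloud_gpu: false
model_server_ip: ""               # only needed when use_cloud_gpu is true
model_server_port: ""             # only needed when use_cloud_gpu is true
environment_name: ""              # empty -> launch a fresh local environment
limit: null                       # null -> no limit on questions asked
llm:
  # Fill this out according to the normal LLM seeding
```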
## Relaunching From Existing Data

To launch an existing set of containers that has already completed indexing, set the environment_name variable. This will launch the Docker containers mounted on the volumes of the indicated env_name and will not automatically index any documents or run any QA.

Once these containers are launched, you can run file_uploader.py or run_qa.py (assuming you have completed the steps in the Usage section above).

- file_uploader.py will upload and index additional zipped files located at the zipped_documents_file path.
- run_qa.py will ask the questions located at the questions_file path against the indexed documents.