<!-- DANSWER_METADATA={"link": "https://github.com/danswer-ai/danswer/blob/main/CONTRIBUTING.md"} -->
# Contributing to Danswer
Hey there! We are so excited that you're interested in Danswer.
As an open source project in a rapidly changing space, we welcome all contributions.
## 💃 Guidelines
### Contribution Opportunities
The [GitHub Issues](https://github.com/danswer-ai/danswer/issues) page is a great place to start for contribution ideas.
Issues that have been explicitly approved by the maintainers (aligned with the direction of the project)
will be marked with the `approved by maintainers` label.
Issues marked `good first issue` are an especially great place to start.
**Connectors** to other tools are another great place to contribute. For details on how, refer to this
[README.md](https://github.com/danswer-ai/danswer/blob/main/backend/danswer/connectors/README.md).
If you have a new/different contribution in mind, we'd love to hear about it!
Your input is vital to making sure that Danswer moves in the right direction.
Before starting on implementation, please raise a GitHub issue.
And always feel free to message us (Chris Weaver / Yuhong Sun) on
[Slack](https://join.slack.com/t/danswer/shared_invite/zt-2afut44lv-Rw3kSWu6_OmdAXRpCv80DQ) /
[Discord](https://discord.gg/TDJ59cGV2X) directly about anything at all.
### Contributing Code
To contribute to this project, please follow the
["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
When opening a pull request, mention related issues and feel free to tag relevant maintainers.
Before creating a pull request, please make sure that the new changes conform to the formatting and linting requirements.
See the [Formatting and Linting](#formatting-and-linting) section for how to run these checks locally.
### Getting Help 🙋
Our goal is to make contributing as easy as possible. If you run into any issues, please don't hesitate to reach out.
That way, we can help future contributors and users avoid the same issue.
We also have support channels and generally interesting discussions on our
[Slack](https://join.slack.com/t/danswer/shared_invite/zt-2afut44lv-Rw3kSWu6_OmdAXRpCv80DQ)
and
[Discord](https://discord.gg/TDJ59cGV2X).
We would love to see you there!
## Get Started 🚀
As a fully functional app, Danswer relies on some external software, specifically:
- [Postgres](https://www.postgresql.org/) (Relational DB)
- [Vespa](https://vespa.ai/) (Vector DB/Search Engine)
- [Redis](https://redis.io/) (Cache)
- [Nginx](https://nginx.org/) (Not needed for development flows generally)
> **Note:**
> This guide provides instructions to set up the Danswer-specific services outside of Docker because it's easier for
> development purposes. However, you can also run them in containers and rebuild with your local changes by passing the
> `--build` flag.
### Local Set Up
Be sure to use Python version 3.11.
If using a lower version, modifications will have to be made to the code.
If using a higher version, some libraries may not be available (e.g. we have had problems with Tensorflow on higher versions of Python in the past).
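To confirm which interpreter you are about to use, a quick check along these lines can help. This is a minimal sketch and not part of the Danswer tooling: `is_python_311` is a hypothetical helper name, and the snippet assumes the interpreter is available on your PATH as `python3`.

```bash
# Succeeds when the given version string is 3.11 or 3.11.x.
# Hypothetical helper for illustration; not part of Danswer.
is_python_311() {
  case "$1" in
    3.11|3.11.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Ask the interpreter on PATH for its version; fall back to "none" if it is missing.
detected="$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || echo none)"

if is_python_311 "$detected"; then
  echo "Python $detected matches the recommended 3.11"
else
  echo "Found Python '$detected'; this guide recommends 3.11"
fi
```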
#### Installing Requirements
Currently, we use pip and recommend creating a virtual environment.
For convenience here's a command for it:
```bash
python -m venv .venv
source .venv/bin/activate
```
> **Note:**
> This virtual environment MUST NOT be set up WITHIN the danswer directory if you plan on using mypy within certain IDEs.
> For simplicity, we recommend setting up the virtual environment outside of the danswer directory.
_For Windows, activate the virtual environment using Command Prompt:_
```bash
.venv\Scripts\activate
```
If using PowerShell, the command differs slightly:
```powershell
.venv\Scripts\Activate.ps1
```
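Once activated, you can double check that the environment is active and that it sits outside the repository. The following is a minimal sketch for bash-like shells: `venv_is_outside` is a hypothetical helper, and `$HOME/danswer` is only an assumed checkout location, so substitute your own path. `VIRTUAL_ENV` is the variable that the venv activation script sets.

```bash
# Succeeds when the directory in $1 is NOT located inside the directory in $2.
# Both paths are expected without a trailing slash.
# Hypothetical helper for illustration; not part of Danswer.
venv_is_outside() {
  case "$1/" in
    "$2"/*) return 1 ;;
    *) return 0 ;;
  esac
}

repo_dir="$HOME/danswer"   # assumed checkout location; adjust to yours

if [ -z "${VIRTUAL_ENV:-}" ]; then
  echo "No virtual environment is active"
elif venv_is_outside "$VIRTUAL_ENV" "$repo_dir"; then
  echo "Active venv: $VIRTUAL_ENV (outside $repo_dir)"
else
  echo "Active venv: $VIRTUAL_ENV is inside $repo_dir; consider moving it"
fi
```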
Install the required Python dependencies:
```bash
pip install -r danswer/backend/requirements/default.txt
pip install -r danswer/backend/requirements/dev.txt
pip install -r danswer/backend/requirements/ee.txt
pip install -r danswer/backend/requirements/model_server.txt
```
Install [Node.js and npm](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm) for the frontend.
Once the above is done, navigate to `danswer/web` and run:
```bash
npm i
```
Install Playwright (the headless browser required by the Web Connector).

> **Note:**
> If you have just run the pip install, open a new terminal and activate the Python virtual environment again.
> This picks up the updated PATH, which includes `playwright`.

Then install Playwright by running:
```bash
playwright install
```
#### Dependent Docker Containers
You will need Docker installed to run these containers.
First navigate to `danswer/deployment/docker_compose`, then start up Postgres/Vespa/Redis with:
```bash
docker compose -f docker-compose.dev.yml -p danswer-stack up -d index relational_db cache
```
(`index` refers to Vespa, `relational_db` refers to Postgres, and `cache` refers to Redis)
#### Running Danswer
To start the frontend, navigate to `danswer/web` and run:
```bash
npm run dev
```
Next, start the model server which runs the local NLP models.
Navigate to `danswer/backend` and run:
```bash
uvicorn model_server.main:app --reload --port 9000
```
_For Windows (for compatibility with both PowerShell and Command Prompt):_
```bash
powershell -Command "uvicorn model_server.main:app --reload --port 9000"
```
The first time you run Danswer, you will need to run the DB migrations for Postgres.
After the first time, this is no longer required unless the DB models change.
Navigate to `danswer/backend` and, with the venv active, run:
```bash
alembic upgrade head
```
Next, start the task queue which orchestrates the background jobs.
Longer-running jobs are run asynchronously from the API server.
Still in `danswer/backend`, run:
```bash
python ./scripts/dev_run_background_jobs.py
```
To run the backend API server, navigate back to `danswer/backend` and run:
```bash
AUTH_TYPE=disabled uvicorn danswer.main:app --reload --port 8080
```
_For Windows (for compatibility with both PowerShell and Command Prompt):_
```bash
powershell -Command "
$env:AUTH_TYPE='disabled'
uvicorn danswer.main:app --reload --port 8080
"
```
> **Note:**
> If you need finer logging, add the additional environment variable `LOG_LEVEL=DEBUG` to the relevant services.
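In a bash-like shell, one way to do this is to prefix the start command with the variable, the same way `AUTH_TYPE=disabled` is passed above. The sketch below uses a stand-in `sh -c` command instead of a real Danswer service, purely to show that the variable reaches only the process it prefixes.

```bash
# Stand-in for a service process: it just reports the LOG_LEVEL it was given.
report='echo "LOG_LEVEL seen by the process: ${LOG_LEVEL:-unset}"'

# Prefixing the command sets the variable for that one process only.
LOG_LEVEL=DEBUG sh -c "$report"

# A later command started without the prefix does not receive that value.
sh -c "$report"
```

With a real service, the same prefix goes in front of the `uvicorn` or `python` command, for example `LOG_LEVEL=DEBUG AUTH_TYPE=disabled uvicorn danswer.main:app --reload --port 8080`.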
### Formatting and Linting
#### Backend
For the backend, you'll need to set up pre-commit hooks (black / reorder-python-imports).
First, install pre-commit (if you don't have it already) following the instructions
[here](https://pre-commit.com/#installation).
With the virtual environment active, install the pre-commit library with:
```bash
pip install pre-commit
```
Then, from the `danswer/backend` directory, run:
```bash
pre-commit install
```
Additionally, we use `mypy` for static type checking.
Danswer is fully type-annotated, and we want to keep it that way!
To run the mypy checks manually, run `python -m mypy .` from the `danswer/backend` directory.
#### Web
We use `prettier` for formatting. The desired version (2.8.8) will be installed by running `npm i` from the `danswer/web` directory.
To run the formatter, use `npx prettier --write .` from the `danswer/web` directory.
Please double check that prettier passes before creating a pull request.
### Release Process
Danswer loosely follows the SemVer versioning standard.
Major changes are released with a "minor" version bump. Currently we use patch release versions to indicate small feature changes.
A set of Docker containers will be pushed automatically to DockerHub with every tag.
You can see the containers [here](https://hub.docker.com/search?q=danswer%2F).
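As an illustration of the versioning convention described above, the sketch below classifies the bump between two versions. `bump_kind` is a hypothetical helper, and the `v0.4.9`-style tag names are made-up examples rather than actual Danswer releases.

```bash
# Prints which version component changed between two "vMAJOR.MINOR.PATCH" tags.
# Assumes the two versions differ; identical versions are reported as "patch".
# Hypothetical helper for illustration; not part of Danswer.
bump_kind() {
  old="${1#v}"                 # strip an optional leading "v"
  new="${2#v}"
  old_major="${old%%.*}"       # text before the first dot
  new_major="${new%%.*}"
  old_rest="${old#*.}"         # text after the first dot
  new_rest="${new#*.}"
  old_minor="${old_rest%%.*}"
  new_minor="${new_rest%%.*}"
  if [ "$old_major" != "$new_major" ]; then
    echo "major"
  elif [ "$old_minor" != "$new_minor" ]; then
    echo "minor"               # per the text above: used for major changes
  else
    echo "patch"               # per the text above: used for small feature changes
  fi
}

bump_kind v0.4.9 v0.5.0   # prints: minor
bump_kind v0.5.0 v0.5.1   # prints: patch
```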