mirror of https://github.com/danswer-ai/danswer.git synced 2025-07-06 04:32:47 +02:00



Writing a new Onyx Connector

This README covers how to contribute a new Connector for Onyx. It includes an overview of the design, interfaces, and required changes.

Thank you for your contribution!

Connector Overview

Connectors come in 4 different flows:

Load Connector:
- Bulk indexes documents to reflect a point in time. This type of connector generally works by either pulling all documents via the source's API or loading the documents from some sort of dump file.
Poll Connector:
- Incrementally updates documents based on a provided time range. It is used by the background job to pull the latest changes and additions since the last round of polling. This connector helps keep the document index up to date without needing to fetch/embed/index every document, which would be too slow to do frequently on large sets of documents.
Slim Connector:
- This connector should be a lighter-weight way of checking all documents in the source to see if they still exist.
- This connector should be identical to the Poll or Load Connector except that it only fetches the IDs of the documents, not the documents themselves.
- This is used by our pruning job which removes old documents from the index.
- The optional start and end datetimes can be ignored.
Event Based Connector:
- Listens to events and updates documents accordingly.
- Currently not used by the background job. It exists for future design purposes.
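
The pruning step that the Slim Connector supports can be pictured as a set difference between the IDs already in the index and the IDs still present in the source. The sketch below is only an illustration under that assumption; the function name `ids_to_prune` and the plain string IDs are hypothetical and are not taken from the actual pruning job.

```python
from collections.abc import Iterable


def ids_to_prune(indexed_ids: Iterable[str], source_ids: Iterable[str]) -> set[str]:
    """Return the IDs that are in the index but no longer exist in the source."""
    return set(indexed_ids) - set(source_ids)


if __name__ == "__main__":
    # IDs currently stored in the index (hypothetical values).
    indexed = ["doc-1", "doc-2", "doc-3"]
    # IDs a Slim Connector reported as still existing in the source.
    still_in_source = ["doc-1", "doc-3", "doc-4"]
    # doc-2 was deleted at the source, so it is the only candidate for removal.
    print(sorted(ids_to_prune(indexed, still_in_source)))  # ['doc-2']
```

Because only IDs are compared, the Slim Connector never needs to download document contents, which is what makes it cheaper than a full Load or Poll run.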

Connector Implementation

Refer to interfaces.py and the first contributor-created Pull Request for a new connector (shoutout to Dan Brown): Reference Pull Request

For implementing a Slim Connector, refer to the comments in this PR: Slim Connector PR

All new connectors should have tests added to the backend/tests/daily/connectors directory. Refer to the above PR for an example of adding tests for a new connector.

Implementing the new Connector

The connector must subclass one or more of LoadConnector, PollConnector, SlimConnector, or EventConnector.

The __init__ should take arguments that configure which documents the connector will fetch and where it finds them. For example, if you have a wiki site, it may include the configuration for the team, topic, folder, etc. of the documents to fetch. It may also include the base domain of the wiki. Alternatively, if all the access information of the connector is stored in the credential/token, then there may be no required arguments.

load_credentials should take a dictionary that provides all the access information the connector might need. For example, this could be the user's username and access token.

Refer to the existing connectors for load_from_state and poll_source examples. There is not yet a process to listen for EventConnector events; this will come down the line.
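
To make the pieces above concrete, here is a minimal, self-contained sketch of how __init__, load_credentials, load_from_state, and poll_source might fit together. It is not the real interface: `ExampleWikiConnector`, `ExampleDocument`, the `batch_size` argument, and the fake in-memory data are all hypothetical, and yielding documents in batches is an assumption. A real connector would subclass the classes in interfaces.py and use the document model the project defines.

```python
import time
from collections.abc import Iterator
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ExampleDocument:
    """Hypothetical stand-in for the project's real document model."""
    id: str
    text: str
    updated_at: float  # seconds since the epoch


class ExampleWikiConnector:
    """Plain class so the sketch runs alone; a real connector would subclass
    LoadConnector and/or PollConnector from interfaces.py."""

    def __init__(self, space: str, batch_size: int = 2) -> None:
        # Configuration describing which documents to fetch and from where.
        self.space = space
        self.batch_size = batch_size
        self.access_token: Optional[str] = None

    def load_credentials(self, credentials: dict[str, Any]) -> None:
        # Keep whatever access information later API calls will need.
        self.access_token = credentials["access_token"]

    def _fetch_all(self) -> list[ExampleDocument]:
        # Placeholder for real API calls: returns five fake pages,
        # each one hour older than the previous one.
        if self.access_token is None:
            raise RuntimeError("load_credentials must be called first")
        now = time.time()
        return [
            ExampleDocument(f"{self.space}-{i}", f"page {i}", now - i * 3600)
            for i in range(5)
        ]

    def _batches(self, docs: list[ExampleDocument]) -> Iterator[list[ExampleDocument]]:
        for i in range(0, len(docs), self.batch_size):
            yield docs[i : i + self.batch_size]

    def load_from_state(self) -> Iterator[list[ExampleDocument]]:
        # Load flow: every document in the source.
        yield from self._batches(self._fetch_all())

    def poll_source(self, start: float, end: float) -> Iterator[list[ExampleDocument]]:
        # Poll flow: only documents updated inside [start, end].
        recent = [d for d in self._fetch_all() if start <= d.updated_at <= end]
        yield from self._batches(recent)


if __name__ == "__main__":
    connector = ExampleWikiConnector(space="engineering")
    connector.load_credentials({"user_id": "foobar", "access_token": "fake_token"})
    print(sum(len(batch) for batch in connector.load_from_state()), "documents loaded")
```

Sharing one private fetch helper between the two flows keeps the load and poll paths consistent; in a real connector the poll path would normally push the time filter down to the source's API instead of filtering locally.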

Development Tip

It may be handy to test your new connector separately from the rest of the stack while developing. Follow the template below:

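if __name__ == "__main__":
    import time

    # Replace NewConnector and its arguments with your connector and its configuration.
    test_connector = NewConnector(space="engineering")
    test_connector.load_credentials({
        "user_id": "foobar",
        "access_token": "fake_token",
    })

    # Load flow: fetch everything in the source.
    all_docs = test_connector.load_from_state()

    # Poll flow: fetch only what changed in the last 24 hours.
    current = time.time()
    one_day_ago = current - 24 * 60 * 60  # 1 day
    latest_docs = test_connector.poll_source(one_day_ago, current)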

Note: Be sure to set PYTHONPATH to onyx/backend before running the above main.

Additional Required Changes:

Backend Changes

Add a new type to DocumentSource
Add a mapping from DocumentSource (and optionally connector type) to the right connector class here
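
The two backend steps above roughly amount to adding an enum member and a dictionary entry. The sketch below shows the idea in isolation; the enum is a hypothetical two-member subset, and `MY_WIKI`, `MyWikiConnector`, `CONNECTOR_CLASS_MAP`, and `connector_class_for` are illustrative names, not the project's actual code. Follow the linked files for the real definitions.

```python
from enum import Enum


class DocumentSource(str, Enum):
    # Hypothetical subset of the real enum.
    CONFLUENCE = "confluence"
    MY_WIKI = "my_wiki"  # step 1: the new source type


class ConfluenceConnector:
    """Placeholder for an existing connector class."""


class MyWikiConnector:
    """Placeholder for the connector being contributed."""


# Step 2: map the new source to its connector class.
CONNECTOR_CLASS_MAP: dict[DocumentSource, type] = {
    DocumentSource.CONFLUENCE: ConfluenceConnector,
    DocumentSource.MY_WIKI: MyWikiConnector,
}


def connector_class_for(source: DocumentSource) -> type:
    """Look up the connector class, failing loudly for unmapped sources."""
    try:
        return CONNECTOR_CLASS_MAP[source]
    except KeyError:
        raise ValueError(f"No connector registered for source: {source.value}") from None


if __name__ == "__main__":
    print(connector_class_for(DocumentSource.MY_WIKI).__name__)  # MyWikiConnector
```

If a source supports more than one flow, the mapping could be keyed by connector type as well, as the step above notes; that variant is omitted here to keep the sketch short.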

Frontend Changes

Add the new Connector definition to the SOURCE_METADATA_MAP here.
Add the definition for the new Form to the connectorConfigs object here.

Docs Changes

Create the new connector page (with guiding images!) with how to get the connector credentials and how to set up the connector in Onyx. Then create a Pull Request in https://github.com/onyx-dot-app/onyx-docs.

Before opening PR

Be sure to fully test changes end to end with setting up the connector and updating the index with new docs from the new connector. To make it easier to review, please attach a video showing the successful creation of the connector via the UI (starting from the Add Connector page).
Add a folder + tests under the backend/tests/daily/connectors directory. For an example, check out the test for Confluence. In the PR description, include a guide on how to set up the new source to pass the test. Before merging, we will re-create the environment and make sure the test(s) pass.
Be sure to run the linting/formatting. Refer to the formatting and linting section in CONTRIBUTING.md.