548 Commits

Author SHA1 Message Date
Chris Weaver
132a9f750d
Add Github Action to run mypy / reorder-python-imports / black on all PRs ()
Also fixes import ordering (previously, local imports weren't grouped together as they should have been)
2023-07-29 16:53:38 -07:00
Yuhong Sun
87fe6f7575
Add ingestion metrics () 2023-07-29 16:37:22 -07:00
Yuhong Sun
fe40e72b5c
Require Semantic Identifier to not be None () 2023-07-29 14:12:30 -07:00
jabdoa2
63780113d3
Add support for openid connect ()
This allows using Danswer in typical (non-Google) enterprise environments.

* Access tokens can be very large. A token without claims is already 1100 bytes for me (larger than Danswer allows by default). With roles I got a 12kB token. For that reason I changed the field to TEXT in the database.
* Danswer used to swallow most errors when OIDC failed. Node.js forwards a request to the backend and swallows all errors. Even within the backend we caught all ValueErrors and only returned the last exception with the request. Added full stack trace logging to allow debugging issues with userinfo and other endpoints.
* Allow changing the name of the login provider on the login button.
* Changed variables and URLs to generic OAUTH_XX (without Google in the name) but kept compatibility with the existing Google integration
* Tested against Keycloak with OpenID Connect

Next steps:
* Claim to role mappings
* Auto login/SSO (Login button is just an extra click)
2023-07-29 14:04:32 -07:00
jabdoa2
878d4e367f
prevent crash when semantic_identifier is None ()
This is a workaround for intermittent issues where semantic_identifier becomes None for an unknown reason. It usually recovers when documents are rescraped.

Obviously, we do not yet understand the issue and are interested in a better solution.
2023-07-29 12:37:02 -07:00
Yuhong Sun
17e2008027
Add TODOs and minor style changes to web connector () 2023-07-29 12:35:38 -07:00
jabdoa2
0d7d54fddb
Improve Web Connector Output, Add Config Options and add OAuth Backend Flow () 2023-07-29 12:21:23 -07:00
Chris Weaver
3e8f5fa47e
Fix a few bugs with Google Drive polling ()
- Adds some offset to the `start` for the Google Drive connector to give time for `modifiedTime` to propagate so we don't miss updates
- Moves fetching folders into a separate call since folder `modifiedTime` doesn't get updated when a file in the folder is updated
- Uses `connector_credential_pair.last_successful_index_time` instead of `updated_at` to determine the `start` for poll connectors
2023-07-28 18:27:32 -07:00
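The first and third points of the commit above (offsetting `start`, and deriving it from the last successful index time) could be sketched roughly as follows. This is only an illustration: the helper name `poll_window` and the 10-minute buffer are hypothetical and do not come from the commit, which does not state the actual offset.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical buffer; the real offset used by the connector is not given in the commit.
MODIFIED_TIME_BUFFER = timedelta(minutes=10)
EPOCH = datetime.fromtimestamp(0, tz=timezone.utc)


def poll_window(last_successful_index_time, now):
    """Return the (start, end) window for one poll.

    `start` is moved back by a buffer so that files whose `modifiedTime`
    had not yet propagated during the previous poll are picked up now.
    """
    if last_successful_index_time is None:
        # Nothing has been indexed successfully yet: poll from the beginning.
        return EPOCH, now
    start = max(last_successful_index_time - MODIFIED_TIME_BUFFER, EPOCH)
    return start, now
```

Because the windows of consecutive polls overlap, the same file can be returned twice, so a design like this only works if re-indexing a document is idempotent.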
Yuhong Sun
55adde5e27
Fix import location and mypy issue () 2023-07-28 16:06:25 -07:00
Yuhong Sun
2a339ec34b
Prevent too many tokens to GPT () 2023-07-28 16:00:26 -07:00
Yuhong Sun
d03ac44744
Guru Connector ()
Co-authored-by: Weves <chrisweaver101@gmail.com>
2023-07-28 14:27:02 -07:00
Yuhong Sun
4d0732395d
Standalone Script to Test OpenAI API Key () 2023-07-27 16:33:04 -07:00
Yuhong Sun
2a0d3b38e9
Google Drive Connector Debug Logging () 2023-07-27 09:27:57 -07:00
Weves
9e6467a0c9 Fix specifying folders for Google Drive connector 2023-07-26 21:39:31 -07:00
meherhendi
1a22666810
Adding vscode run & debug config ()
Also adds `.env` to `.gitignore` files outside of the `deployment` dir
2023-07-26 12:35:31 -07:00
Yuhong Sun
273802eff0
Disable GPT4All because macOS does not currently support it () 2023-07-25 22:19:15 -07:00
Yuhong Sun
e019db0bc7
Indexing Job has timezone discrepancy with DB making Poll timeframes incorrect () 2023-07-23 21:59:00 -07:00
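A timezone discrepancy of the kind named in the commit above typically arises when naive timestamps read from a database are compared with timezone-aware ones. The sketch below shows one common way to normalize both to UTC before computing a poll window. The helper `to_utc` is hypothetical, and treating naive values as UTC is an assumption, not something the commit states.

```python
from datetime import datetime, timedelta, timezone


def to_utc(ts):
    """Return `ts` as a timezone-aware UTC datetime.

    Assumption: naive timestamps are already expressed in UTC.
    """
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


# A naive timestamp and an aware one at UTC-7 become directly comparable.
naive = to_utc(datetime(2023, 7, 23, 21, 59))
aware = to_utc(datetime(2023, 7, 23, 21, 59, tzinfo=timezone(timedelta(hours=-7))))
```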
Sid Ravinutala
d6d3d5291b added docx2txt 2023-07-24 01:42:39 +00:00
Sid Ravinutala
a4b47e0243 added support for docx in gdrive
rebase from main
2023-07-24 01:41:35 +00:00
Yuhong Sun
d6ca865034
Support GPT4All in memory () 2023-07-23 12:26:14 -07:00
Chris Weaver
dd084d40f6
Product board connector ()
Also fixes misc mypy issues across the repo
2023-07-22 13:00:51 -07:00
Yuhong Sun
25a028c4a7
Merge pull request from pkabra/notion-connector
Notion connector
2023-07-21 00:04:12 -07:00
Pratik Kabra
b33c8b1d7c Reorg public-private functions 2023-07-20 18:04:48 -05:00
Pratik Kabra
7ad98480be Black fixes for python files 2023-07-20 18:01:23 -05:00
Pratik Kabra
ab3bb13493 Fix notion titles missing in some cases 2023-07-20 17:58:09 -05:00
Yuhong Sun
0708002953
Check for Credential delete before running queued index attempt () 2023-07-19 23:52:48 -07:00
Yuhong Sun
191c166ab6
Merge pull request from jabdoa2/do_not_crash_when_deleting_source
catch crash when deleting a datasource
2023-07-19 23:46:14 -07:00
Chris Weaver
4958962855
Merge pull request from chrisedington/ce/slack-archive-fix
Fix: Don't include archived Slack channels
2023-07-19 21:47:25 -07:00
Chris Edington
dac2fdc163 Fix: Don't include archived Slack channels, as the conversations.join API cannot be called on them 2023-07-18 22:04:30 +02:00
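The idea behind the fix above can be sketched as a filter over channel objects before any join is attempted. Slack channel objects carry an `is_archived` field. The helper name `joinable_channels` and the sample data are hypothetical, and the commit may implement this differently.

```python
def joinable_channels(channels):
    """Keep only channels that are not archived.

    `channels` is a list of dicts shaped like Slack channel objects;
    a missing `is_archived` key is treated as "not archived".
    """
    return [ch for ch in channels if not ch.get("is_archived", False)]


# Illustrative data only.
sample = [
    {"id": "C1", "name": "general", "is_archived": False},
    {"id": "C2", "name": "old-project", "is_archived": True},
    {"id": "C3", "name": "random"},
]
active = joinable_channels(sample)
```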
Jan Kantert
7290f1893d catch crash when deleting a datasource
Danswer background crashes when the index task for a deleted source is still in the task queue. Without this fix, it won't recover without manual database cleanup.
2023-07-18 13:42:16 +02:00
Pratik Kabra
af921fb179 Add some more docstrings 2023-07-17 20:06:43 -05:00
Pratik Kabra
4c263b7130 Notion connector backend 2023-07-17 20:06:43 -05:00
Chris Weaver
3b1a8274a9
Allow specifying particular Google Drive folders to index () 2023-07-17 14:51:16 -07:00
Chris Weaver
676538da61
Better error message on GPT failures ()
* Better error message on GPT-call failures

* Add support for disabling Generative AI
2023-07-16 16:25:33 -07:00
Yuhong Sun
554f6f3fe7
Combine Images Cleanup () 2023-07-16 15:31:52 -07:00
Yuhong Sun
4b699fdab3
Better Logging () 2023-07-16 01:41:48 -07:00
Yuhong Sun
3436b864a3
Fix missing Import () 2023-07-15 18:11:24 -07:00
Yuhong Sun
1c042a8e95
Update README.md 2023-07-15 16:23:23 -07:00
Yuhong Sun
c5c1b01a4e
Update README.md 2023-07-15 16:16:22 -07:00
Yuhong Sun
cdd097a4bb
connectors README () 2023-07-15 16:15:18 -07:00
Yuhong Sun
20589d8d78
Merge pull request from ssddanbrown/merge_images
Merged background and api-server images
2023-07-15 11:29:48 -07:00
Yuhong Sun
e4820045f9
Add metadata to GPT () 2023-07-14 16:54:42 -07:00
Chris Weaver
33463b45e8
Fix issue with web connector for pages not ending with / () 2023-07-13 22:30:10 -07:00
Dan Brown
f27364a442
Merged background and api-server images 2023-07-13 23:59:22 +01:00
Yuhong Sun
d53ec8a905
DAN-169 Users whitelist () 2023-07-11 21:23:35 -07:00
Chris Weaver
d135bc7efa
Merge pull request from ssddanbrown/bookstack_connector
BookStack connector
2023-07-08 17:18:59 -07:00
Yuhong Sun
367330d27a
DAN-165 Option to pull image from hub () 2023-07-08 15:53:21 -07:00
Weves
3494d6a13a Replace IDs with names in Slack connector 2023-07-07 18:10:19 -07:00
Yuhong Sun
79013ac9fd
DAN-164 Background slack job to give up after 5 tries
also minor docker compose change
2023-07-07 17:19:24 -07:00
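A "give up after 5 tries" policy like the one named in the commit above can be sketched as a bounded retry loop. The function `run_with_retries` and its parameters are hypothetical; only the limit of 5 comes from the commit title.

```python
import time

MAX_TRIES = 5  # limit taken from the commit title


def run_with_retries(job, max_tries=MAX_TRIES, delay_seconds=0.0):
    """Call `job()` until it succeeds or `max_tries` attempts have failed."""
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return job()
        except Exception as error:  # broad on purpose: any failure counts as a try
            last_error = error
            if attempt < max_tries and delay_seconds > 0:
                time.sleep(delay_seconds)
    # All attempts failed: stop retrying and surface the last error as the cause.
    raise RuntimeError(f"giving up after {max_tries} tries") from last_error
```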
Chris Weaver
b4759403ac
Adjust slack bot ()
* Add handling for cases where an answer is not found

* Make danswer bot slightly more configurable

* Don't respond to messages in thread + add better formatting for slack messages
2023-07-07 09:56:01 -07:00