79 Commits

Author SHA1 Message Date
pablodanswer
f40c5ca9bd
Add tenant context (#2596)
* add proper tenant context to background tasks

* update for new session logic

* remove unnecessary functions

* add additional tenant context

* update ports

* proper format / directory structure

* update ports

* ensure tenant context properly passed to ee bg tasks

* add user provisioning

* nit

* validated for multi tenant

* auth

* nit

* nit

* nit

* nit

* validate pruning

* evaluate integration tests

* at long last, validated celery beat

* nit: minor edge case patched

* minor

* validate update

* nit
2024-10-10 16:34:32 +00:00
Yuhong Sun
5d356cc971
Remove Perm Sync Script Dev (#2712) 2024-10-07 13:50:30 -07:00
rkuo-danswer
fbf51b70d0
Feature/celery multi (#2470)
* first cut at redis

* some new helper functions for the db

* ignore kombu tables in alembic migrations (used by celery)

* multiline commands for readability, add vespa_metadata_sync queue to worker

* typo fix

* fix returning tuple fields

* add constants

* fix _get_access_for_document

* docstrings!

* fix double function declaration and typing

* fix type hinting

* add a global redis pool

* Add get_document function

* use task_logger in various celery tasks

* add celeryconfig.py to simplify configuration. Will be used in a subsequent commit

* Add celery redis helper. used in a subsequent PR

* kombu warning getting spammy since celery is not self managing its queue in Postgres any more

* add last_modified and last_synced to documents

* fix task naming convention

* use celeryconfig.py

* the big one. adds queues and tasks, updates functions to use the queues with priorities, etc

* change vespa index log line to debug

* mypy fixes

* update alembic migration

* fix fence ordering, rename to "monitor", fix fetch_versioned_implementation call

* mypy

* switch to monotonic time

* fix startup dependencies on redis

* rebase alembic migration

* kombu cleanup - fail silently

* mypy

* add redis_host environment override

* update REDIS_HOST env var in docker-compose.dev.yml

* update the rest of the docker files

* in flight

* harden indexing-status endpoint against db changes happening in the background. Needs further improvement but OK for now.

* allow no task syncs to run because we create certain objects with no entries but initially marked as out of date

* add back writing to vespa on indexing

* actually working connector deletion

* update contributing guide

* backporting fixes from background_deletion

* renaming cache to cache_volume

* add redis password to various deployments

* try setting up pr testing for helm

* fix indent

* hopefully this release version actually exists

* fix command line option to --chart-dirs

* fetch-depth 0

* edit values.yaml

* try setting ct working directory

* bypass testing only on change for now

* move files and lint them

* update helm testing

* some issues suggest using --config works

* add vespa repo

* add postgresql repo

* increase timeout

* try amd64 runner

* fix redis password reference

* add comment to helm chart testing workflow

* rename helm testing workflow to disable it

* adding clarifying comments

* address code review

* missed a file

* remove commented warning ... just not needed

* fix imports

* refactor to use update_single

* mypy fixes

* add vespa test

* multiple celery workers

* update logs as well and set prefetch multipliers appropriate to the worker intent

* add db refresh to connector deletion

* add some preliminary locking

* organize tasks into separate files

* celery auto associates tasks created inside another task, which bloats the result metadata considerably. trail=False prevents this.

* code review fixes

* move monitor_usergroup_taskset to ee, improve logging

* add multi workers to dev_run_background_jobs.py

* update supervisord with some recommended settings for celery

* name celery workers and shorten dev script prefixing

* add configurable sql alchemy engine settings on startup (needed for various intents like API server, different celery workers and tasks, etc)

* fix comments

* autoscale sqlalchemy pool size to celery concurrency (allow override later?)

* supervisord needs the percent symbols escaped

* use name as primary check, some minor refactoring and type hinting too.

* addressing code review

* fix import

* fix prune_documents_task references

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-09-27 00:50:55 +00:00
hagen-danswer
2274cab554
Added permission syncing (#2340)
* Added permission syncing on the backend

* Reworked to work with celery

alembic fix

fixed test

* frontend changes

* got groups working

* added comments and fixed public docs

* fixed merge issues

* frontend complete!

* frontend cleanup and mypy fixes

* refactored connector access_type selection

* mypy fixes

* minor refactor and frontend improvements

* get to fetch

* renames and comments

* minor change to var names

* got curator stuff working

* addressed pablo's comments

* refactored user_external_group to reference users table

* implemented polling

* small refactor

* fixed a whoopsie on the frontend

* added scripts to seed dummy docs and test query times

* fixed frontend build issue

* alembic fix

* handled is_public overlap

* yuhong feedback

* added more checks for sync

* black

* mypy

* fixed circular import

* todos

* alembic fix

* alembic
2024-09-19 22:07:36 +00:00
hagen-danswer
f3cea79c1c
Deleting a connector should redirect to the indexing status page (#2504)
* Deleting a connector should redirect to the indexing status page

* minor update to dev background jobs

* update refresh logic

* remove print statement

---------

Co-authored-by: pablodanswer <pablo@danswer.ai>
2024-09-18 21:38:35 +00:00
Yuhong Sun
6cec31088d
CONTRIBUTING updates (#2354) 2024-09-07 14:05:36 -07:00
Weves
39c946536c Fix deletion due to foreign key issue 2024-09-02 17:56:43 -07:00
Yuhong Sun
5ab4d94d94
Logging Level Update (#2165) 2024-08-18 21:53:40 -07:00
Weves
0853d1a8f1 Update force deletion script 2024-08-14 23:29:26 -07:00
Nathan Schwerdfeger
c7e5b11c63
EE Connector Deletion Bugfix + Refactor (#2042)
---------

Co-authored-by: Weves <chrisweaver101@gmail.com>
2024-08-11 20:33:07 -07:00
Yuhong Sun
c8ead6a0dc
Need Reindexing Flag Setup (#2102) 2024-08-09 17:44:57 -07:00
rkuo-danswer
7c283b090d
Feature/postgres connection names (#1998)
* avoid reindexing secondary indexes after they succeed

* use postgres application names to facilitate connection debugging

* centralize all postgres application_name constants in the constants file

* missed a couple of files

* mypy fixes

* update dev background script
2024-07-31 20:36:30 +00:00
rkuo-danswer
546bfbd24b
autoscale with pool=thread crashes celery. remove and use concurrency… (#1929)
* autoscale with pool=thread crashes celery. remove and use concurrency instead (to be improved later)

* update dev background script as well
2024-07-25 00:15:27 +00:00
rkuo-danswer
6ee74bd0d1
fix pointers to various background tasks and scripts (#1914) 2024-07-24 10:12:51 -07:00
Weves
6222f533be Update force delete script to handle user groups 2024-07-21 22:22:37 -07:00
Brent Kwok
07b2ed3d8f
Fix HTTP 422 error for api_inference_sample.py (#1868) 2024-07-19 18:54:43 -07:00
versecafe
86d1804eb0 Add GPT-4o-Mini & fix a missing gpt-4o 2024-07-19 12:10:27 -07:00
hagen-danswer
ac14369716
Added search quality testing pipeline (#1774) 2024-07-06 11:51:50 -07:00
Yuhong Sun
0c827d1e6c Permission Sync Framework (#44) 2024-06-25 15:07:56 -07:00
Chris Weaver
17cc262f5d Private personas doc sets (#52)
Private Personas and Document Sets

---------

Co-authored-by: Yuhong Sun <yuhongsun96@gmail.com>
2024-06-25 15:07:56 -07:00
pablodanswer
6c71bc05ea
modify script deletion name (#1690) 2024-06-23 08:29:37 -07:00
pablodanswer
7253316b9e
Add script for forced connector deletion (#1683) 2024-06-22 17:15:25 -07:00
Yuhong Sun
c798ade127
Code for ease of eval (#1656) 2024-06-17 20:32:12 -07:00
Yuhong Sun
546815dc8c
Consolidate File Processing (#1449) 2024-05-11 23:11:22 -07:00
Yuhong Sun
34d05f4599
Mypy fixes for default configs (#1442) 2024-05-10 16:46:28 -07:00
Yuhong Sun
23bf6ad4c7
Sample API Script (#1079) 2024-02-13 14:47:28 -08:00
Yuhong Sun
517c27c5ed
Dev Script to Restart Containers (#1063) 2024-02-08 17:34:15 -08:00
Yuhong Sun
6768c24723
Default LLM Update (#1042) 2024-02-05 01:25:51 -08:00
Yuhong Sun
d7141df5fc
Metadata and Title Search (#903) 2024-01-02 11:25:50 -08:00
Yuhong Sun
65fde8f1b3
Chat Backend (#801) 2023-12-14 22:14:37 -08:00
Chris Weaver
37daf4f3e4
Remove AI Thoughts by default (#783)
- Removes AI Thoughts by default - only shows when validation fails
- Removes punctuation "words" from queries in addition to stopwords (Vespa ignores punctuation anyways)
- Fixes Vespa deletion script for larger doc counts
2023-11-29 01:00:53 -08:00
Yuhong Sun
05c2b7d34e
Update LLM related Libs (#771) 2023-11-26 19:54:16 -08:00
Yuhong Sun
39d09a162a
Danswer APIs Document Ingestion Endpoint (#716) 2023-11-26 19:09:22 -08:00
Yuhong Sun
13001ede98
Search Regression Test and Save/Load State updates (#761) 2023-11-23 00:00:30 -08:00
mattboret
e78aefb408
Add script to analyse the sources selection (#721)
---------

Co-authored-by: Matthieu Boret <matthieu.boret@fr.clara.net>
2023-11-21 18:35:26 -08:00
Weves
e8786e1a20 Small formatting fixes 2023-11-01 21:46:23 -07:00
Bryan Peterson
44e3dcb19f
support for zendesk help center (#661) 2023-11-01 21:11:56 -07:00
Yuhong Sun
26b491fb0c
Prep for Hybrid Search (#648) 2023-10-29 00:13:21 -07:00
Yuhong Sun
fe117513b0
Reorganize and Cleanup for Hybrid Search (#643) 2023-10-28 14:24:28 -07:00
Yuhong Sun
9a51745fc9
Updated Contributing for Celery (#629) 2023-10-25 18:26:02 -07:00
Yuhong Sun
8403b94722
Default Personas to have Document Sets (#614) 2023-10-22 16:57:16 -07:00
Yuhong Sun
e279918f95
Introduce Time Filters (#610) 2023-10-22 15:06:52 -07:00
Yuhong Sun
b5982c10c3
Celery Beat (#575) 2023-10-16 14:59:42 -07:00
Yuhong Sun
595f61ea3a
Add Retrieval to Chat History (#577) 2023-10-15 13:40:07 -07:00
Yuhong Sun
af510cc965
API support for Chat to have citations (#569) 2023-10-13 17:38:25 -07:00
Weves
7afcf3489f Auto-populate ACL fields on server startup 2023-09-26 22:53:01 -07:00
Chris Weaver
8594bac30b
Transition to using access_control_list to manage access in Vespa (#450) 2023-09-26 12:26:39 -07:00
Yuhong Sun
e549d2bb4a
Chat with Context Backend (#441) 2023-09-15 12:17:05 -07:00
Yuhong Sun
4a0c2bf866
Vespa Save and Load (#422) 2023-09-09 20:25:31 -07:00
Yuhong Sun
d73d81c867
Scripts to Reset Postgres and Vespa (#382) 2023-09-01 14:43:04 -07:00