28 Commits

Author SHA1 Message Date
rkuo-danswer
5258e6ed67 update indexing and slack bot to use stdout options (#2752) 2024-10-10 00:55:56 +00:00
Richard Kuo (Danswer)
cb668bcff5 backport: rely on stdout redirection for supervisord logging (#2711) 2024-10-07 15:13:43 -07:00
rkuo-danswer
fbf51b70d0
Feature/celery multi (#2470)
* first cut at redis

* some new helper functions for the db

* ignore kombu tables in alembic migrations (used by celery)

* multiline commands for readability, add vespa_metadata_sync queue to worker

* typo fix

* fix returning tuple fields

* add constants

* fix _get_access_for_document

* docstrings!

* fix double function declaration and typing

* fix type hinting

* add a global redis pool

* Add get_document function

* use task_logger in various celery tasks

* add celeryconfig.py to simplify configuration. Will be used in a subsequent commit

* Add celery redis helper. used in a subsequent PR

* kombu warning getting spammy since celery is not self managing its queue in Postgres any more

* add last_modified and last_synced to documents

* fix task naming convention

* use celeryconfig.py

* the big one. adds queues and tasks, updates functions to use the queues with priorities, etc

* change vespa index log line to debug

* mypy fixes

* update alembic migration

* fix fence ordering, rename to "monitor", fix fetch_versioned_implementation call

* mypy

* switch to monotonic time

* fix startup dependencies on redis

* rebase alembic migration

* kombu cleanup - fail silently

* mypy

* add redis_host environment override

* update REDIS_HOST env var in docker-compose.dev.yml

* update the rest of the docker files

* in flight

* harden indexing-status endpoint against db changes happening in the background.  Needs further improvement but OK for now.

* allow no task syncs to run because we create certain objects with no entries but initially marked as out of date

* add back writing to vespa on indexing

* actually working connector deletion

* update contributing guide

* backporting fixes from background_deletion

* renaming cache to cache_volume

* add redis password to various deployments

* try setting up pr testing for helm

* fix indent

* hopefully this release version actually exists

* fix command line option to --chart-dirs

* fetch-depth 0

* edit values.yaml

* try setting ct working directory

* bypass testing only on change for now

* move files and lint them

* update helm testing

* some issues suggest using --config works

* add vespa repo

* add postgresql repo

* increase timeout

* try amd64 runner

* fix redis password reference

* add comment to helm chart testing workflow

* rename helm testing workflow to disable it

* adding clarifying comments

* address code review

* missed a file

* remove commented warning ... just not needed

* fix imports

* refactor to use update_single

* mypy fixes

* add vespa test

* multiple celery workers

* update logs as well and set prefetch multipliers appropriate to the worker intent

* add db refresh to connector deletion

* add some preliminary locking

* organize tasks into separate files

* celery auto associates tasks created inside another task, which bloats the result metadata considerably. trail=False prevents this.

* code review fixes

* move monitor_usergroup_taskset to ee, improve logging

* add multi workers to dev_run_background_jobs.py

* update supervisord with some recommended settings for celery

* name celery workers and shorten dev script prefixing

* add configurable sql alchemy engine settings on startup (needed for various intents like API server, different celery workers and tasks, etc)

* fix comments

* autoscale sqlalchemy pool size to celery concurrency (allow override later?)

* supervisord needs the percent symbols escaped

* use name as primary check, some minor refactoring and type hinting too.

* addressing code review

* fix import

* fix prune_documents_task references

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-09-27 00:50:55 +00:00
rkuo-danswer
f531d071af
Feature/background deletion (#2337)
* first cut at redis

* some new helper functions for the db

* ignore kombu tables in alembic migrations (used by celery)

* multiline commands for readability, add vespa_metadata_sync queue to worker

* typo fix

* fix returning tuple fields

* add constants

* fix _get_access_for_document

* docstrings!

* fix double function declaration and typing

* fix type hinting

* add a global redis pool

* Add get_document function

* use task_logger in various celery tasks

* add celeryconfig.py to simplify configuration. Will be used in a subsequent commit

* Add celery redis helper. used in a subsequent PR

* kombu warning getting spammy since celery is not self managing its queue in Postgres any more

* add last_modified and last_synced to documents

* fix task naming convention

* use celeryconfig.py

* the big one. adds queues and tasks, updates functions to use the queues with priorities, etc

* change vespa index log line to debug

* mypy fixes

* update alembic migration

* fix fence ordering, rename to "monitor", fix fetch_versioned_implementation call

* mypy

* switch to monotonic time

* fix startup dependencies on redis

* rebase alembic migration

* kombu cleanup - fail silently

* mypy

* add redis_host environment override

* update REDIS_HOST env var in docker-compose.dev.yml

* update the rest of the docker files

* in flight

* harden indexing-status endpoint against db changes happening in the background.  Needs further improvement but OK for now.

* allow no task syncs to run because we create certain objects with no entries but initially marked as out of date

* add back writing to vespa on indexing

* actually working connector deletion

* update contributing guide

* backporting fixes from background_deletion

* renaming cache to cache_volume

* add redis password to various deployments

* try setting up pr testing for helm

* fix indent

* hopefully this release version actually exists

* fix command line option to --chart-dirs

* fetch-depth 0

* edit values.yaml

* try setting ct working directory

* bypass testing only on change for now

* move files and lint them

* update helm testing

* some issues suggest using --config works

* add vespa repo

* add postgresql repo

* increase timeout

* try amd64 runner

* fix redis password reference

* add comment to helm chart testing workflow

* rename helm testing workflow to disable it

* adding clarifying comments

* address code review

* missed a file

* remove commented warning ... just not needed

* fix imports

* refactor to use update_single

* mypy fixes

* add vespa test

* add db refresh to connector deletion

* code review fixes

* move monitor_usergroup_taskset to ee, improve logging

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-09-18 16:50:11 +00:00
pablodanswer
7f7559e3d2
Allow users to share assistants (#2434)
* enable assistant sharing

* functional

* remove logs

* revert ports

* remove accidental update

* minor updates to copy

* update formatting

* update for merge queue
2024-09-17 01:35:29 +00:00
rkuo-danswer
f1c5e80f17
Feature/background processing (#2275)
* first cut at redis

* some new helper functions for the db

* ignore kombu tables in alembic migrations (used by celery)

* multiline commands for readability, add vespa_metadata_sync queue to worker

* typo fix

* fix returning tuple fields

* add constants

* fix _get_access_for_document

* docstrings!

* fix double function declaration and typing

* fix type hinting

* add a global redis pool

* Add get_document function

* use task_logger in various celery tasks

* add celeryconfig.py to simplify configuration. Will be used in a subsequent commit

* Add celery redis helper. used in a subsequent PR

* kombu warning getting spammy since celery is not self managing its queue in Postgres any more

* add last_modified and last_synced to documents

* fix task naming convention

* use celeryconfig.py

* the big one. adds queues and tasks, updates functions to use the queues with priorities, etc

* change vespa index log line to debug

* mypy fixes

* update alembic migration

* fix fence ordering, rename to "monitor", fix fetch_versioned_implementation call

* mypy

* switch to monotonic time

* fix startup dependencies on redis

* rebase alembic migration

* kombu cleanup - fail silently

* mypy

* add redis_host environment override

* update REDIS_HOST env var in docker-compose.dev.yml

* update the rest of the docker files

* harden indexing-status endpoint against db changes happening in the background.  Needs further improvement but OK for now.

* allow no task syncs to run because we create certain objects with no entries but initially marked as out of date

* add back writing to vespa on indexing

* update contributing guide

* backporting fixes from background_deletion

* renaming cache to cache_volume

* add redis password to various deployments

* try setting up pr testing for helm

* fix indent

* hopefully this release version actually exists

* fix command line option to --chart-dirs

* fetch-depth 0

* edit values.yaml

* try setting ct working directory

* bypass testing only on change for now

* move files and lint them

* update helm testing

* some issues suggest using --config works

* add vespa repo

* add postgresql repo

* increase timeout

* try amd64 runner

* fix redis password reference

* add comment to helm chart testing workflow

* rename helm testing workflow to disable it

* adding clarifying comments

* address code review

* missed a file

* remove commented warning ... just not needed

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-09-10 16:28:19 +00:00
Yuhong Sun
12f0dbcfc5
Background Container Logs (#2176) 2024-08-20 11:26:45 -07:00
Yuhong Sun
119aefba88
Add log files to containers (#2164) 2024-08-18 19:18:28 -07:00
rkuo-danswer
288e6fa606
Bugfix/pg connections (#2002)
* increase max_connections to 150 in all docker files

* lower celery worker concurrency to 6
2024-07-31 19:49:20 +00:00
rkuo-danswer
546bfbd24b
autoscale with pool=thread crashes celery. remove and use concurrency… (#1929)
* autoscale with pool=thread crashes celery. remove and use concurrency instead (to be improved later)

* update dev background script as well
2024-07-25 00:15:27 +00:00
Weves
1ee8ee9e8b Prepare EE to merge with MIT 2024-06-25 15:07:56 -07:00
Yuhong Sun
22fb7c3352
Slack LLM Filter Enabled by Default (#943) 2024-01-13 17:37:51 -08:00
Yuhong Sun
503a709e37
Update Slack Link (#936) 2024-01-11 17:42:26 -08:00
Weves
3cec854c5c Allow different model servers for different models / indexing jobs 2023-11-23 23:39:03 -08:00
Yuhong Sun
0618b59de6
Fix Indexing Frozen (#660) 2023-10-30 20:49:39 -07:00
Yuhong Sun
11d96b2807
Rename DanswerBot (#644) 2023-10-28 14:41:36 -07:00
Chris Weaver
89807c8c05
Fix deletion status display + add celery util + fix seg faults (#615) 2023-10-22 19:41:29 -07:00
Yuhong Sun
b5982c10c3
Celery Beat (#575) 2023-10-16 14:59:42 -07:00
Chris Weaver
0c58c8d6cb
Adding Document Sets (#477)
Adds:
- name for connector credential pairs + frontend changes to start populating this field
- document set table migration
- during indexing, document sets are now checked and inserted into Vespa
- background job to check if document sets need to be synced
- document set management APIs
- document set management dashboard in the UI
2023-09-26 12:53:19 -07:00
Weves
c4e0face9b Move connector / credential pair deletion to celery 2023-09-14 16:23:13 -07:00
Weves
cea3e1f3d5 Add support for multiple allowed email domains + make slack bot logs go to stdout 2023-08-30 13:47:46 -07:00
Chris Weaver
038f646c09
Fe for feedback (#346) 2023-08-30 12:52:24 -07:00
Chris Weaver
067503bc84
Background logs to stdout (#315) 2023-08-18 19:02:32 -07:00
Chris Weaver
89f71ac335
Support deletion of documents when a connector is deleted (#271) 2023-08-09 00:53:42 -07:00
Yuhong Sun
79013ac9fd
DAN-164 Background slack job to give up after 5 tries
also minor docker compose change
2023-07-07 17:19:24 -07:00
Chris Weaver
2f54795631
Basic Slack Bot Support (#128) 2023-07-03 14:26:33 -07:00
Yuhong Sun
590fbbc253
DAN-120 Kubernetes (#98)
Sample Kubernetes deployment with Auth default on
Also includes a bugfix for credentialed connectors with Auth turned on
2023-06-14 00:11:25 -07:00
Chris Weaver
f20563c9bc
File connector (#93)
* Initial backend changes for file connector

* Add another background job to clean up files

* UI + tweaks for backend
2023-06-09 21:28:50 -07:00