96 Commits

Author SHA1 Message Date
rkuo-danswer
6848337445
add validation for pruning/group sync etc (#3882)
* add validation for pruning

* fix missing class

* get external group sync validation working

* backport fix for pruning check

* fix pruning

* log the payload id

* remove scan_iter from pruning

* missed removed scan_iter, also remove other scan_iters and replace with sscan_iter of the lookup table

* external group sync needs active signal. h

* log the payload id when the task starts

* log the payload id in more places

* use the replica

* increase primary pool and slow down beat

* scale sql pool based on concurrency

* fix concurrency

* add debugging for external group sync and tenant

* remove debugging and fix payload id

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-02-10 03:12:21 +00:00
rkuo-danswer
a222fae7c8
Bugfix/beat templates (#3754)
* WIP

* migrate most beat tasks to fan out strategy

* fix kwargs

* migrate EE tasks

* lock on the task_name level

* typo fix

* transform beat tasks for cloud

* cloud multiplier is only for cloud tasks

* bumpity

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-02-08 06:57:57 +00:00
rkuo-danswer
6ccb3f085a
select only doc_id (#3920)
* select only doc_id

* select more doc ids

* fix user group

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-02-06 07:00:40 +00:00
rkuo-danswer
bd08e6d787
alert if revisions are null or query fails (#3910)
* alert if revisions are null or query fails

* comment

* mypy

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-02-05 23:45:38 +00:00
rkuo-danswer
47e6192b99
fix bug in validation logic (#3915)
* fix bug in validation logic

* test

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-02-05 22:49:18 +00:00
Weves
8eb4320f76 Support not pausing connectors on initialization failure 2025-02-04 19:32:55 -08:00
rkuo-danswer
5854b39dd4
Merge pull request #3893 from onyx-dot-app/mypy_random
Mypy random fixes
2025-02-04 16:02:18 -08:00
Richard Kuo (Danswer)
aff4ee5ebf commented code 2025-02-04 15:56:18 -08:00
pablodanswer
42a0f45a96 update 2025-02-04 12:06:11 -08:00
pablodanswer
125e5eaab1 various mypy improvements 2025-02-04 12:06:10 -08:00
Richard Kuo
02a068a68b multiplier from 8 to 4 2025-02-03 23:59:36 -08:00
Richard Kuo (Danswer)
6f018d75ee use replica, remove some commented code 2025-02-03 10:10:05 -08:00
Richard Kuo (Danswer)
fd947aadea slow down to 8 again 2025-02-03 00:32:23 -08:00
Richard Kuo (Danswer)
3a950721b9 get rid of some more scan_iter 2025-02-02 01:14:10 -08:00
Richard Kuo (Danswer)
bbee2865e9 Merge branch 'main' of https://github.com/onyx-dot-app/onyx into feature/no_scan_iter 2025-02-01 10:46:38 -08:00
Chris Weaver
a5d2f0d9ac
Fix airtable connector w/ mt cloud + move telem logic to match new st… (#3868)
* Fix airtable connector w/ mt cloud + move telem logic to match new standard

* Address Greptile comment

* Small fixes/improvements

* Revert back monitoring frequency

* Small monitoring fix
2025-01-31 16:29:04 -08:00
Richard Kuo (Danswer)
d3cf18160e lower CLOUD_BEAT_SCHEDULE_MULTIPLIER to 4 2025-01-31 16:13:13 -08:00
Richard Kuo (Danswer)
618e4addd8 better signal names 2025-01-31 13:25:27 -08:00
Richard Kuo (Danswer)
69f16cc972 dont add to the lookup table if it already exists 2025-01-31 13:23:52 -08:00
Richard Kuo (Danswer)
b64545c7c7 build a lookup table every so often to handle cloud migration 2025-01-31 12:12:52 -08:00
Richard Kuo (Danswer)
5232aeacad Merge branch 'main' of https://github.com/onyx-dot-app/onyx into feature/no_scan_iter
# Conflicts:
#	backend/onyx/background/celery/tasks/vespa/tasks.py
#	backend/onyx/redis/redis_connector_doc_perm_sync.py
2025-01-31 10:38:10 -08:00
rkuo-danswer
261150e81a
Validate permission locks (#3799)
* WIP for external group sync lock fixes

* prototyping permissions validation

* validate permission sync tasks in celery

* mypy

* cleanup and wire off external group sync checks for now

* add active key to reset

* improve logging

* reset on payload format change

* return False on exception

* missed a return

* add count of tasks scanned

* add comment

* better logging

* add return

* more return

* catch payload exceptions

* code review fixes

* push to restart test

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-01-31 17:33:07 +00:00
Richard Kuo (Danswer)
30e8fb12e4 remove commented code 2025-01-30 15:34:00 -08:00
Richard Kuo (Danswer)
d8578bc1cb first full cut 2025-01-30 15:21:52 -08:00
Richard Kuo (Danswer)
7ccfe85ee5 WIP 2025-01-29 22:52:21 -08:00
Chris Weaver
95701db1bd
Add more sync records + fix small bug in monitoring task causing deletion metrics to never be emitted (#3837)
Double check we don't double-emit + fix pruning metric

Add log

Fix comment

rename
2025-01-29 18:03:49 -08:00
rkuo-danswer
24105254ac
fix race condition with permission sync and fences (#3841)
* fix race condition with permission sync and fences

* comments

* set the fence

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-01-29 23:40:44 +00:00
rkuo-danswer
4fe99d05fd
add timings for syncing (#3798)
* add timings for syncing

* add more logging

* more debugging

* refactor multipass/db check out of VespaIndex

* circular imports?

* more debugging

* add logs

* various improvements

* additional logs to narrow down issue

* use global httpx pool for the main vespa flows in celery. Use in more places eventually.

* cleanup debug logging, etc

* remove debug logging

* this should use the secondary index

* mypy

* missed some logging

* review fixes

* refactor get_default_document_index to use search settings

* more missed logging

* fix circular refs

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
Co-authored-by: pablodanswer <pablo@danswer.ai>
2025-01-29 23:24:44 +00:00
rkuo-danswer
fa78f50fe3
Bugfix/celery ignore result (#3770)
* try using a redis replica in some areas

* harden up replica usage

* ignore results

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-01-27 08:53:01 +00:00
pablonyx
70795a4047
Sync status improvements (#3782)
* minor improvments / clarity

* additional comment for clarity

* typing

* quick updates to monitoring

* connector deletion

* quick nit

* fix typing

* update values

* quick nit

* functioning

* improvements to monitoring

* update

* minutes -> seconds
2025-01-26 17:35:26 +00:00
rkuo-danswer
d8a17a7238
try using a redis replica in some areas (#3748)
* try using a redis replica in some areas

* harden up replica usage

* comment

* slow down cloud dispatch temporarily

* add ignored syncing list back

* raise multiplier to 8

* comment out per tenant code (no longer used by fanout)

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-01-26 03:48:25 +00:00
Chris Weaver
5d6a18f358
Add support for more /models/list formats (#3739) 2025-01-24 18:25:19 +00:00
rkuo-danswer
a2d8e815f6
Feature/more celery fanout (#3740)
* WIP

* migrate most beat tasks to fan out strategy

* fix kwargs

* migrate EE tasks

* lock on the task_name level

* typo fix

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-01-23 19:08:42 +00:00
rkuo-danswer
b1e05bb909
Merge pull request #3751 from onyx-dot-app/bugfix/remove_index_debugging
remove debugging for specific problem tenants
2025-01-23 10:20:36 -08:00
pablonyx
ccb16b7484
Indexing latency check fix (#3747)
* add logs + update dev script

* update conig

* remove prints

* temporarily turn off

* va

* update

* fix

* finalize monitoring updates

* update
2025-01-23 17:14:26 +00:00
Richard Kuo (Danswer)
32f220e02c remove debugging for specific problem tenants 2025-01-22 16:23:24 -08:00
rkuo-danswer
69c60feda4
cloud check for migrations (#3734)
* cloud check for migrations

* fix table declaration

* change back interval

* Fix usage of POSTGRES_DEFAULT_SCHEMA

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-01-22 22:41:28 +00:00
rkuo-danswer
b095e17827
Bugfix/watchdog signal (#3699)
* signal from the watchdog so that the monitor task doesn't try to clean up before it can exit

* ttl constants

* improve comment

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-01-22 17:51:06 +00:00
pablonyx
8ddd95d0d4
Fix exceptional seeding delay (#3723)
* fix seeding waiting

* k

* updated
2025-01-21 01:02:13 +00:00
Chris Weaver
342bb9f685
Fix document counts (#3671)
* Various fixes/improvements to document counting

* Add new column + index

* Avoid double scan

* comment fixes

* Fix revision history

* Fix IT

* Fix IT

* Fix migration

* Rebase
2025-01-19 05:36:07 +00:00
Weves
a72bd31f5d Small background telemetry fix 2025-01-18 16:19:28 -08:00
rkuo-danswer
6fc52c81ab
Bugfix/beat redux (#3639)
* WIP

* WIP

* try spinning out check for indexing into a system task

* check for the correct delimiter

* use constants

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2025-01-17 20:59:43 +00:00
Chris Weaver
a05addec19
Add is_cloud info to telemetry + get consistent customer_uuid's for a… (#3684)
* Add is_cloud info to telemetry + get consistent customer_uuid's for a given tenant

* Address Richard's comments
2025-01-16 02:43:21 +00:00
rkuo-danswer
c9a420ec49
better logging and reduce long sessions (#3673)
* testing some tweaks based on issues seen with okteto

* shorten session usage in indexing. still a couple of long running sessions to clean up

* merge sessions

* fixing detached session issues

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
2025-01-16 01:27:12 +00:00
Chris Weaver
3b7695539f
Add monitoring worker (#3677)
* Add monitoring worker

* Add locks

* Add tenant id to lock

* Remove unneeded tenant postfix
2025-01-15 01:39:56 +00:00
hagen-danswer
b1957737f2
refactored _add_user_filter usage (#3674)
* refactored db.connector_credential_pair

* Rerfactored the db.credentials user filtering

* the restr
2025-01-14 23:35:52 +00:00
Richard Kuo (Danswer)
46cfaa96b7 Merge branch 'main' of https://github.com/danswer-ai/danswer into bugfix/index_attempt_query 2025-01-13 13:23:30 -08:00
rkuo-danswer
cb66aadd80
Merge pull request #3648 from onyx-dot-app/bugfix/light_cpu
figuring out why multiprocessing set_start_method isn't working.
2025-01-13 13:08:55 -08:00
Chris Weaver
9ea2ae267e
Performance monitoring (#3658)
* Initial scaffolding for metrics

* iterate

* more

* More metrics + SyncRecord concept

* Add indices, standardize timing

* Small cleanup

* Address comments
2025-01-13 12:36:45 -08:00
Richard Kuo (Danswer)
7d86b28335 maybe we don't need pre ping yet 2025-01-13 12:14:32 -08:00