* try using a redis replica in some areas
* harden up replica usage
* comment
* slow down cloud dispatch temporarily
* add ignored syncing list back
* raise multiplier to 8
* comment out per tenant code (no longer used by fanout)
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
* WIP
* migrate most beat tasks to fan out strategy
* fix kwargs
* migrate EE tasks
* lock on the task_name level
* typo fix
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
* signal from the watchdog so that the monitor task doesn't try to clean up before it can exit
* ttl constants
* improve comment
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
* WIP
* WIP
* try spinning out check for indexing into a system task
* check for the correct delimiter
* use constants
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
* testing some tweaks based on issues seen with okteto
* shorten session usage in indexing. still a couple of long running sessions to clean up
* merge sessions
* fixing detached session issues
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
* more debugging
* test reacquire outside of loop
* more logging
* move lock_beat test outside the try catch so that we don't worry about testing locks we never took
* use a larger scan_iter value for performance
* batch stale document sync batches
* add debug logging for a particular timeout issue
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
* try fixing exception in cloud
* raise beat expiry ... 60 seconds might be starving certain tasks completely
* adjust expiry down to 10 min
* raise concurrency overflow for indexing worker.
* parent pid check
* fix comment
* fix parent pid check, also actually raise an exception from the task if the spawned task exit status is bad
* fix pid check
* some cleanup and task wait fixes
* review fixes
* comment some code so we don't change too many things at once
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
* temporarily disabling validate indexing fences
* add back a few startup checks in the cloud
* use common vespa client to perform health check
* log vespa url and try using http1 on light worker index methods
---------
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
* allow beat tasks to expire. it isn't important that they all run
* validate fences are in a good state and cancel/fail them if not
* add function timings for important beat tasks
* optimize lookups, add lots of comments
* review changes
---------
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>