* try fixing exception in cloud
* raise beat expiry ... 60 seconds might be starving certain tasks completely
* adjust expiry down to 10 min
* raise concurrency overflow for indexing worker.
* parent pid check
* fix comment
* fix parent pid check, also actually raise an exception from the task if the spawned task exit status is bad
* fix pid check
* some cleanup and task wait fixes
* review fixes
* comment some code so we don't change too many things at once
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
* temporarily disabling validate indexing fences
* add back a few startup checks in the cloud
* use common vespa client to perform health check
* log vespa url and try using http1 on light worker index methods
---------
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
* allow beat tasks to expire. it isn't important that they all run
* validate fences are in a good state and cancel/fail them if not
* add function timings for important beat tasks
* optimize lookups, add lots of comments
* review changes
---------
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>