* add utility function
* add utility functions to DocExternalAccess
* refactor db access out of individual celery tasks and put it directly into the heavy task
* code review and remove leftovers
* fix circular imports
---------
Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
* Update mode to be a default parameter in `FileStore.read`
* Move query history exporting process to be a background job instead
* Move hardcoded report-file-naming to a common utility function
* Add type annotations
* Update download component
* Implement button to re-ping and download CSV file; fix up some backend file-checking logic
* De-indent logic (w/ early return)
* Return different error codes dependings on the type of task status
* Add more resistant failure retrying mechanisms
* Remove default parameter in helper function
* Use popup for error messaging
* Update return code
* Update web/src/app/ee/admin/performance/query-history/DownloadAsCSV.tsx
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Add type to useState call
* Update backend/ee/onyx/server/query_history/api.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Update backend/onyx/file_store/file_store.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Update backend/ee/onyx/background/celery/apps/primary.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Move rerender call to after check
* Run formatter
* Add type conversions back (smh greptile)
* Remove duplicated call to save_file
* Move non-fallible logic out of try-except block
* Pass date-ranges into API call
* Convert to ISO strings before passing it into the API call
* Add API to list all tasks
* Create new pydantic model to represent tasks to return instead
* Change helper to only fetch query-history tasks
* Use `shared_tasks` instead of old method
* Address more comments from PR; consolidate how task name is generated
* Mark task as failed if any exception is raised
* Change the task object which is returned back to the FE
* Add a table to display previously generated query-history-csv's
* Add timestamps to task; delete tasks as soon as file finishes processing
* Raise exception if start_time is not present
* Convert hard-coded string to constant
* Add "Generated At" field to table
* Return task list in sorted order (based off of start-time)
* Implement pagination
* Remove unused props and cleanup tailwind classes
* Change the name of kickoff button
* Redesign how previous query exports are viewed
* Make button a constant width even when contents change
* Remove timezone information before comparing
* Decrease interval time for re-pinging API
* Add timezone to start-time creation
* Add a refreshInterval for getting updated task status
* Add new background queue
* Edit small verbiage and remove error popup when max-retries is hit
* Change up heavy worker to recognize new task in new module
* Ensure `celery_app` is imported
* Change how `celery_app` is imported and defined
* Update comment on why `celery_app` must be imported
* Add basic skeleton for new beat task to cleanup any dead / failed query-history-export tasks
* Move cleanup task to different worker / queue
* Implement cleanup task
* Add return type
* Address comment on PR
* Remove delimiter from prefix
* Change name of function to be more descriptive
* Remove delimiter from prefix constant
* Move function invocation closer to usage location
* Move imports to top of file
* Move variable up a scope due to undefined error
* Remove dangling if-statement
* Make function more pure-functional
* Remove redefinition
---------
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* debug script + slight refactor of db class
* better comments
* move setup logger
---------
Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
* tool to generate vespa schema variations for our cloud
* extraneous assign
* use a real templating system instead of search/replace
* fix float
* maybe this should be double
* remove redundant var
* template the other files
* try a spawned process
* move the wrapper
* fix args
* increase timeout
---------
Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
* tool to generate vespa schema variations for our cloud
* extraneous assign
* float, not double
* back to double
---------
Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
* initial working version
* ranking profile
* modification for keyword/instruction retrieval
* mypy fixes
* EL comments
* added env var (True for now)
* flipped default to False
* mypy & final EL/CW comments + import issue
* bump fastapi and starlette
* bumping llama index and nltk and associated deps
* bump to fix python-multipart
* bump aiohttp
* update package lock for examples/widget
* bump black
* sentencesplitter has changed namespaces
* fix reorder import check, fix missing passlib
* update package-lock.json
* black formatter updated
* reformatted again
* change to black compatible reorder
* change to black compatible reorder-python-imports fork
* fix pytest dependency
* black format again
* we don't need cdk.txt. update packages to be consistent across all packages
---------
Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
* fix acl prefixing
* increase timeout a tad
* block access to init'ing DocumentAccess directly, fix test to work with ee/MIT
* fix env var checks
---------
Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
* remove title for slack
* initial working code
* simplification
* improvements
* name change to information_content_model
* avoid boost_score > 1.0
* nit
* EL comments and improvements
Improvements:
- proper import of information content model from cache or HF
- warm up for information content model
Other:
- EL PR review comments
* nit
* requirements version update
* fixed docker file
* new home for model_server configs
* default off
* small updates
* YS comments - pt 1
* renaming to chunk_boost & chunk table def
* saving and deleting chunk stats in new table
* saving and updating chunk stats
* improved dict score update
* create columns for individual boost factors
* RK comments
* Update migration
* manual import reordering
* early work in progress
* rename utility script
* move actual data seeding to a shareable function
* add test
* make the test pass with the fix
* fix comment
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
* wip checkpointing/continue on failure
more stuff for checkpointing
Basic implementation
FE stuff
More checkpointing/failure handling
rebase
rebase
initial scaffolding for IT
IT to test checkpointing
Cleanup
cleanup
Fix it
Rebase
Add todo
Fix actions IT
Test more
Pagination + fixes + cleanup
Fix IT networking
fix it
* rebase
* Address misc comments
* Address comments
* Remove unused router
* rebase
* Fix mypy
* Fixes
* fix it
* Fix tests
* Add drop index
* Add retries
* reset lock timeout
* Try hard drop of schema
* Add timeout/retries to downgrade
* rebase
* test
* test
* test
* Close all connections
* test closing idle only
* Fix it
* fix
* try using null pool
* Test
* fix
* rebase
* log
* Fix
* apply null pool
* Fix other test
* Fix quality checks
* Test not using the fixture
* Fix ordering
* fix test
* Change pooling behavior
* add timings for syncing
* add more logging
* more debugging
* refactor multipass/db check out of VespaIndex
* circular imports?
* more debugging
* add logs
* various improvements
* additional logs to narrow down issue
* use global httpx pool for the main vespa flows in celery. Use in more places eventually.
* cleanup debug logging, etc
* remove debug logging
* this should use the secondary index
* mypy
* missed some logging
* review fixes
* refactor get_default_document_index to use search settings
* more missed logging
* fix circular refs
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
Co-authored-by: pablodanswer <pablo@danswer.ai>
* prototype tools for handling prod issues
* add some commands
* add batching and dry run options
* custom redis tool
* comment
* default to app config settings for redis
---------
Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>