Commit Graph

57 Commits

Author SHA1 Message Date
Chris Weaver
ba805f766f New assistants api (#3097) 2024-11-11 07:55:23 -08:00
pablodanswer
9272d6ebfe Remove ee (#3093)
* move api key to non-ee

* finalize previous migration

* move token rate limit to non-ee

* general cleanup

* update

* update

* finalize

* finalize

* ensure callable

* k
2024-11-09 20:51:36 +00:00
rkuo-danswer
5f5cc9a724 Feature/redis connector refactor (#2992)
* refactor RedisConnectorDeletion into RedisConnector

* refactor redis stop and deletion

* port pruning

* nest pruning

* port deletion

* port indexing

* refactor into individual files

* refactor redis connector index to take search settings at init

* move back to debug level log

* refactor doc set and user group (mostly)

* mypy fixes
2024-11-02 19:53:04 +00:00
Chris Weaver
ecf4923a3a Fix answer with specified doc ids (#2703)
* Fix

Fix

Refactor

more

more

fix

refactor

Fix circular imports

Refactor

Move tests around

* Add quote support

* Testing

* More testing

* Fix image generation slowness

* Remove unused exception

* Fix UT

* fix stop generating

* minor typo

* minor logging updates for clarity

---------

Co-authored-by: pablodanswer <pablo@danswer.ai>
2024-11-01 19:50:20 +00:00
Yuhong Sun
b34f5862d7 Remove License Issues (#3013)
* k

* k

* k

* k

* k
2024-11-01 00:31:19 +00:00
rkuo-danswer
e05846db9f change test port to 8889 (docker desktop is now using port 8888 which blocks the test from working on mac) (#2972) 2024-10-28 18:33:32 +00:00
pablodanswer
1261d859ac Tenant aware JWT strategy (#2943)
* add tenantJWTSrategy

* nit
2024-10-26 23:27:40 +00:00
pablodanswer
9b147ae437 Tenant integration tests (#2913)
* check for index swap

* initial bones

* kk

* k

* k:

* nit

* nit

* rebase + update

* nit

* minor update

* k

* minor integration test fixes

* nit

* ensure we build test docker image

* remove one space

* k

* ensure we wipe volumes

* remove log

* typo

* nit

* k

* k
2024-10-25 18:47:17 +00:00
Chris Weaver
bd63119684 Fix structured outputs (#2923)
* Fix structured outputs

* Add back rest
2024-10-25 18:19:54 +00:00
Weves
4ca38201d1 Fix IT fixture ordering 2024-10-24 22:43:38 -07:00
Chris Weaver
4a47e9a841 Add strict json mode (#2917) 2024-10-24 22:38:46 -07:00
Yuhong Sun
b49a9ab171 Seeding (#2902)
* checkpoint

* k

* k

* k

* fixed slack api calls

* missed one

---------

Co-authored-by: hagen-danswer <hagen@danswer.ai>
2024-10-24 23:45:48 +00:00
rkuo-danswer
e4779c29a7 tighter signaling to prevent indexing cleanup from hitting tasks that are just starting (#2867)
* better indexing synchronization

* add logging for fence wait

* handle the task not creating

* add more logging

* add more logging

* raise retry count
2024-10-21 23:46:23 +00:00
hagen-danswer
802086ee57 Refactored Confluence Connector (#2859)
* Refactored Confluence Connector

* rename metadataconnector to slimconnector

Finish rename

* danswer->onyx

* added rec

* typo

* refactored doc_sync for confluence

* mypy + enable tests

* tested and fixed for confluence cloud

* fixed all server syncing

* fixed connector test

* mypy+connector test fixes

* addressed Richard's comments

* minor fix
2024-10-21 23:03:40 +00:00
rkuo-danswer
6913efef90 fresh indexing feature branch (#2790)
* fresh indexing feature branch

* cherry pick test

* Revert "cherry pick test"

This reverts commit 2a62422068.

* set multitenant so that vespa fields match when indexing

* cleanup pass

* mypy

* pass through env var to control celery indexing concurrency

* comments on task kickoff and some logging improvements

* use get_session_with_tenant

* comment out all of update.py

* rename to RedisConnectorIndexingFenceData

* first check num_indexing_workers

* refactor RedisConnectorIndexingFenceData

* comment out on_worker_process_init

* fix where num_indexing_workers falls back

* remove extra brace
2024-10-18 22:40:05 +00:00
hagen-danswer
deb66a88aa don't fail flaky tests 2024-10-17 13:37:50 -07:00
hagen-danswer
28ad01a51a py 2024-10-17 12:37:34 -07:00
hagen-danswer
0c102ebb5c simplified the document search function 2024-10-17 12:13:42 -07:00
hagen-danswer
5063b944ec Make flakey test still run but not fail CI 2024-10-17 11:36:59 -07:00
pablodanswer
db0779dd02 Session id: int -> UUID (#2814)
* session id: int -> UUID

* nit

* validated

* validated downgrade + upgrade + all functionality

* nit

* minor nit

* fix test case
2024-10-16 22:18:45 +00:00
rkuo-danswer
0a0215ceee check last_pruned instead of is_pruning (#2748)
* check last_pruned instead of is_pruning

* try using the ThreadingHTTPServer class for stability and avoiding blocking single-threaded behavior

* add startup delay to web server in test

* just explicitly return None if we can't parse the datetime

* switch to uvicorn for test stability
2024-10-16 18:52:27 +00:00
rkuo-danswer
aa187c86e2 Merge pull request #2726 from danswer-ai/bugfix/docker-web-runners
try porting docker web build to runs-on
2024-10-08 14:42:43 -07:00
Richard Kuo (Danswer)
c72c5619f0 remove more flaky tests 2024-10-08 14:42:04 -07:00
Richard Kuo (Danswer)
057321a59f disable flaky test 2024-10-08 13:40:35 -07:00
Richard Kuo (Danswer)
a52485bda2 Fix all LegacyKeyValueFormat docker warnings 2024-10-07 15:22:28 -07:00
rkuo-danswer
3404c7eb1d Feature/background prune 2 (#2583)
* first cut at redis

* some new helper functions for the db

* ignore kombu tables in alembic migrations (used by celery)

* multiline commands for readability, add vespa_metadata_sync queue to worker

* typo fix

* fix returning tuple fields

* add constants

* fix _get_access_for_document

* docstrings!

* fix double function declaration and typing

* fix type hinting

* add a global redis pool

* Add get_document function

* use task_logger in various celery tasks

* add celeryconfig.py to simplify configuration. Will be used in a subsequent commit

* Add celery redis helper. used in a subsequent PR

* kombu warning getting spammy since celery is not self managing its queue in Postgres any more

* add last_modified and last_synced to documents

* fix task naming convention

* use celeryconfig.py

* the big one. adds queues and tasks, updates functions to use the queues with priorities, etc

* change vespa index log line to debug

* mypy fixes

* update alembic migration

* fix fence ordering, rename to "monitor", fix fetch_versioned_implementation call

* mypy

* switch to monotonic time

* fix startup dependencies on redis

* rebase alembic migration

* kombu cleanup - fail silently

* mypy

* add redis_host environment override

* update REDIS_HOST env var in docker-compose.dev.yml

* update the rest of the docker files

* in flight

* harden indexing-status endpoint against db changes happening in the background.  Needs further improvement but OK for now.

* allow no task syncs to run because we create certain objects with no entries but initially marked as out of date

* add back writing to vespa on indexing

* actually working connector deletion

* update contributing guide

* backporting fixes from background_deletion

* renaming cache to cache_volume

* add redis password to various deployments

* try setting up pr testing for helm

* fix indent

* hopefully this release version actually exists

* fix command line option to --chart-dirs

* fetch-depth 0

* edit values.yaml

* try setting ct working directory

* bypass testing only on change for now

* move files and lint them

* update helm testing

* some issues suggest using --config works

* add vespa repo

* add postgresql repo

* increase timeout

* try amd64 runner

* fix redis password reference

* add comment to helm chart testing workflow

* rename helm testing workflow to disable it

* adding clarifying comments

* address code review

* missed a file

* remove commented warning ... just not needed

* fix imports

* refactor to use update_single

* mypy fixes

* add vespa test

* multiple celery workers

* update logs as well and set prefetch multipliers appropriate to the worker intent

* add db refresh to connector deletion

* add some preliminary locking

* organize tasks into separate files

* celery auto associates tasks created inside another task, which bloats the result metadata considerably. trail=False prevents this.

* code review fixes

* move monitor_usergroup_taskset to ee, improve logging

* add multi workers to dev_run_background_jobs.py

* update supervisord with some recommended settings for celery

* name celery workers and shorten dev script prefixing

* add configurable sql alchemy engine settings on startup (needed for various intents like API server, different celery workers and tasks, etc)

* fix comments

* autoscale sqlalchemy pool size to celery concurrency (allow override later?)

* supervisord needs the percent symbols escaped

* use name as primary check, some minor refactoring and type hinting too.

* stash merge (may not function yet)

* remove dead code

* more cleanup

* remove dead file

* we shouldn't be checking for deletion attempts in the db any more

* print cc_pair_id

* print status on status mismatch again

* add logging when cc_pair isn't present

* don't index any ingestion type connectors, and don't pause any connectors that aren't active

* add more specific check for deletion completion

* remove flaky mediawiki test site

* move is_pruning

* remove unused code

* remove old function

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-10-07 18:16:17 +00:00
pablodanswer
0da736bed9 Tenant provisioning in the dataplane (#2694)
* add tenant provisioning to data plane

* minor typing update

* ensure tenant router included

* proper auth check

* update disabling logic

* validated basic provisioning

* use new kv store
2024-10-06 04:08:35 +00:00
rkuo-danswer
4f47004d47 disable another flaky assert (#2678) 2024-10-04 00:25:46 +00:00
hagen-danswer
c2088602e1 Implement source testing framework + Slack (#2650)
* Added permission sync tests for Slack

* moved folders

* prune test + mypy

* added wait for indexing to cc_pair creation

* commented out check

* should fix other tests

* added slack channel pool

* fixed everything and mypy

* reduced flake
2024-10-02 23:16:07 +00:00
rkuo-danswer
140c5b3957 don't push integration testing docker images (#2584)
* experiment with build and no push

* use slightly more descriptive and consistent tags and names

* name integration test workflow consistently with other workflows

* put the tag back

* try runs-on s3 backend

* try adding runs-on cache

* add with key

* add a dummy path

* forget about multiline

* maybe we don't need runs-on cache immediately

* lower ram slightly, name test with a version bump

* don't need to explicitly include runs-on/cache for docker caching

* comment out flaky portion of knowledge chat test

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-10-01 01:00:47 +00:00
hagen-danswer
1cff2b82fd Global Curator Fix + Testing (#2591)
* Global Curator Fix

* test fix
2024-09-28 20:14:39 +00:00
hagen-danswer
b73d66c84a Cleaned up foreign key cleanup for user group deletion (#2559)
* cleaned up fk cleanup for user group deletion

* added test for user group deletion
2024-09-26 03:38:01 +00:00
rkuo-danswer
c5a61f4820 Feature/test pruning (#2556)
* add test to exercise pruning

* add prettierignore

* mypy fix

* mypy again

* try getting all the env vars set up correctly

* fix ports and hostnames
2024-09-25 23:34:13 +00:00
rkuo-danswer
c8d13922a9 rename classes and ignore deprecation warnings we mostly don't have c… (#2546)
* rename classes and ignore deprecation warnings we mostly don't have control over

* copy pytest.ini

* ignore CryptographyDeprecationWarning

* fully qualify the warning
2024-09-24 00:21:42 +00:00
hagen-danswer
19dae1d870 Wrote tests for the chat apis (#2525)
* Wrote tests for the chat apis

* slight changes to the case
2024-09-20 19:00:03 +00:00
hagen-danswer
2274cab554 Added permission syncing (#2340)
* Added permission syncing on the backend

* Reworked to work with celery

alembic fix

fixed test

* frontend changes

* got groups working

* added comments and fixed public docs

* fixed merge issues

* frontend complete!

* frontend cleanup and mypy fixes

* refactored connector access_type selection

* mypy fixes

* minor refactor and frontend improvements

* get to fetch

* renames and comments

* minor change to var names

* got curator stuff working

* addressed pablo's comments

* refactored user_external_group to reference users table

* implemented polling

* small refactor

* fixed a whoopsies on the frontend

* added scripts to seed dummy docs and test query times

* fixed frontend build issue

* alembic fix

* handled is_public overlap

* yuhong feedback

* added more checks for sync

* black

* mypy

* fixed circular import

* todos

* alembic fix

* alembic
2024-09-19 22:07:36 +00:00
rkuo-danswer
bb279a8580 add pip retries. should help with github's occasional flaky network during build/test (#2506) 2024-09-19 00:46:41 +00:00
Chris Weaver
7ba829a585 Add top_documents to APIs (#2469)
* Add top_documents

* Fix test

---------

Co-authored-by: hagen-danswer <hagen@danswer.ai>
2024-09-16 23:48:33 +00:00
pablodanswer
e2c37d6847 Test stream + Update Copy (#2317)
* update copy + conditional ordering

* answer stream checks

* update

* add basic tests for chat streams

* slightly simplify

* fix typing

* quick typing updates + nits
2024-09-15 19:40:48 +00:00
rkuo-danswer
d807ad7699 fix document set connection removal sync, add tests for document set and user group removal (#2437) 2024-09-14 01:01:26 +00:00
rkuo-danswer
f4f2fb5943 Bugfix/connector deletion test (#2402)
* fixes a bug with deleting connectors and foreign keys

* test foreign key handling on deletion
2024-09-11 12:04:27 -07:00
rkuo-danswer
f52d1142eb Fail instead of continuing if vespa cannot be reached within the time… (#2379)
* Fail instead of continuing if vespa cannot be reached within the timeout period

* improve startup readability

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-09-10 03:10:25 +00:00
Chris Weaver
ccf986808c Add retries (#2358)
* Add retries

* fix

* add

* remove --build

* Remove cache-to

* Don't push

* Add back push

* Add newline

* Remove alembic logs
2024-09-08 00:12:32 +00:00
hagen-danswer
ebce3ff6ba added wait for sync after creating document set in tests (#2319) 2024-09-04 00:34:40 +00:00
Weves
7520fae068 Add back test 2024-09-02 18:04:55 -07:00
Weves
39c946536c Fix deletion due to foreign key issue 2024-09-02 17:56:43 -07:00
hagen-danswer
aa84846298 Connector deletion fix (#2293)
---------

Co-authored-by: Weves <chrisweaver101@gmail.com>
2024-09-01 23:32:20 -07:00
hagen-danswer
8d443ada5b Integration tests (#2256)
* initial commit

* almost done

* finished 3 tests

* minor refactor

* built out initial permission tests

* reworked test_deletion

* removed logging

* all original tests have been converted

* renamed user_groups to user_group

* mypy

* added test for doc set permissions

* unified naming for manager methods

* Refactored models and added new deletion test

* minor additions

* better logging+fixed input variables

* commented out failed tests

* Added readme

* readme update

* Added auth to IT

set auth_type to basic and require_email_verification to false

* Update run-it.yml

* used verify and added to readme

* added api key manager
2024-09-01 22:21:00 +00:00
josvdw
50c17438d5 Litellm bump (#2195)
* ran bump-pydantic

* replace root_validator with model_validator

* mostly working. some alternate assistant error. changed root_validator and typing_extensions

* working generation chat. changed type

* replacing .dict with .model_dump

* argument needed to bring model_dump up to parity with dict()

* fix a few remaining issues -- working with llama and gpt

* updating requirements file

* more requirement updates

* more requirement updates

* fix to make search work

* return type fix:

* halfway types change

* fixes for mypy and pydantic:

* endpoint fix

* fix pydantic protected namespaces

* it works!

* removed unnecessary None initializations

* better logging

* changed default values to empty lists

* mypy fixes

* fixed array defaulting

---------

Co-authored-by: hagen-danswer <hagen@danswer.ai>
2024-08-28 00:00:27 +00:00
pablodanswer
97ba71e1b3 Db search (#2235)
* k

* update enum imports

* add functional types + model swaps

* remove a log

* remove kv

* fully functional + robustified for kv swap

* validated with hosted + cloud

* ensure not updating current search settings when reindexing

* add instance check

* revert back to updating search settings (will need a slight refactor for endpoint)

* protect advanced config override1

* run pretty

* fix typing

* update typing

* remove unnecessary function

* update model name

* clearer interface names

* validated foreign key constraint

* proper migration

* squash

---------

Co-authored-by: Yuhong Sun <yuhongsun96@gmail.com>
2024-08-27 04:26:51 +00:00