1465 Commits

Author SHA1 Message Date
pablodanswer
cb2169f2a3
Warm up reranker on model switch (#2408)
* warm up reranker on model switch

* properly type

* fix issue

* Update search_settings.py
2024-09-12 22:12:17 +00:00
hagen-danswer
604ebafe6c
simple apis now cited/context doc indices (#2419)
* simple apis now cited/context doc indices

* minor fixes
2024-09-12 21:29:24 +00:00
rkuo-danswer
641690e3f7
fix enabling ssl in connection pool (#2418) 2024-09-12 19:18:04 +00:00
rkuo-danswer
eebf98e3a6
fix setting redis_scheme (#2416) 2024-09-12 18:07:38 +00:00
rkuo-danswer
4bc4da29f5
add SSL parameter support for redis (#2389)
* add SSL parameter support for redis

* add ssl support to redis pool
2024-09-12 16:18:11 +00:00
pablodanswer
7af572d0e7
display only failed (#2413) 2024-09-12 16:01:17 +00:00
pablodanswer
58bdf9d684
Add connector deletion failure message (#2392) 2024-09-11 22:38:15 -07:00
pablodanswer
f69922fff7
Add environment variable for setting vespa search threads (#2400) 2024-09-11 22:37:38 -07:00
pablodanswer
aee5fcd4e0
Add env variables for overriding embedding batch size (#2395)
* add env variables for overriding

* proper ports

* proper overrides
2024-09-12 00:51:45 +00:00
pablodanswer
2c77dd241b
Add error table to re-indexing (#2388)
* add error table to re-indexing

* robustify

* update with proper comment

* add popup

* update typo
2024-09-11 22:55:55 +00:00
rkuo-danswer
f4f2fb5943
Bugfix/connector deletion test (#2402)
* fixes a bug with deleting connectors and foreign keys

* test foreign key handling on deletion
2024-09-11 12:04:27 -07:00
rkuo-danswer
71f2f1a90a
fixes a bug with deleting connectors and foreign keys (#2398) 2024-09-11 12:03:51 -07:00
hagen-danswer
74a2271422
Added HARD_DELETE_CHATS to environment variables (#2397) 2024-09-11 18:08:29 +00:00
pablodanswer
0d749ebd46
add ccpair id to logging (#2391) 2024-09-11 01:27:03 +00:00
rkuo-danswer
f1c5e80f17
Feature/background processing (#2275)
* first cut at redis

* some new helper functions for the db

* ignore kombu tables in alembic migrations (used by celery)

* multiline commands for readability, add vespa_metadata_sync queue to worker

* typo fix

* fix returning tuple fields

* add constants

* fix _get_access_for_document

* docstrings!

* fix double function declaration and typing

* fix type hinting

* add a global redis pool

* Add get_document function

* use task_logger in various celery tasks

* add celeryconfig.py to simplify configuration. Will be used in a subsequent commit

* Add celery redis helper. used in a subsequent PR

* kombu warning getting spammy since celery is not self managing its queue in Postgres any more

* add last_modified and last_synced to documents

* fix task naming convention

* use celeryconfig.py

* the big one. adds queues and tasks, updates functions to use the queues with priorities, etc

* change vespa index log line to debug

* mypy fixes

* update alembic migration

* fix fence ordering, rename to "monitor", fix fetch_versioned_implementation call

* mypy

* switch to monotonic time

* fix startup dependencies on redis

* rebase alembic migration

* kombu cleanup - fail silently

* mypy

* add redis_host environment override

* update REDIS_HOST env var in docker-compose.dev.yml

* update the rest of the docker files

* harden indexing-status endpoint against db changes happening in the background.  Needs further improvement but OK for now.

* allow no task syncs to run because we create certain objects with no entries but initially marked as out of date

* add back writing to vespa on indexing

* update contributing guide

* backporting fixes from background_deletion

* renaming cache to cache_volume

* add redis password to various deployments

* try setting up pr testing for helm

* fix indent

* hopefully this release version actually exists

* fix command line option to --chart-dirs

* fetch-depth 0

* edit values.yaml

* try setting ct working directory

* bypass testing only on change for now

* move files and lint them

* update helm testing

* some issues suggest using --config works

* add vespa repo

* add postgresql repo

* increase timeout

* try amd64 runner

* fix redis password reference

* add comment to helm chart testing workflow

* rename helm testing workflow to disable it

* adding clarifying comments

* address code review

* missed a file

* remove commented warning ... just not needed

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-09-10 16:28:19 +00:00
rkuo-danswer
f52d1142eb
Fail instead of continuing if vespa cannot be reached within the time… (#2379)
* Fail instead of continuing if vespa cannot be reached within the timeout period

* improve startup readability

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-09-10 03:10:25 +00:00
pablodanswer
e563746730
Consent screen (#2381)
* update

* add consent popup

* rm
2024-09-10 02:40:32 +00:00
Yuhong Sun
aa86830bde mypy 2024-09-09 16:43:45 -07:00
James Jordan
4558351801
Zendesk tickets (#2192) 2024-09-09 16:36:53 -07:00
Sebastian Müller
a4dcae57cd
Google Drive Plaintext Types (#2371) 2024-09-09 15:37:47 -07:00
hj-danswer
e4e4765c60
Add user when they interact outside of UI (e.g. Slack bot) (#2369)
* Add user when they interact outside of UI (e.g. Slack bot)

* fix mypy errors

* don't use user manager to avoid async messiness

* fix email is none scenario

* fix mypy

* make code slightly clearer

* PR comments

* get slack email in generate button as well

* fix alembic migration

* update name to be more descriptive

---------

Co-authored-by: Hyeong Joon Suh <hyeongjoonsuh@Hyeongs-MacBook-Pro.local>
2024-09-09 20:21:31 +00:00
pablodanswer
3a9b964d5c
Add Litellm Rerank proxy (#2346)
* add ability to set reranking litellm proxy

* add fully functional rerank litellm cards

* minor formatting enforcement

* remove logs
2024-09-09 15:57:01 +00:00
Yuhong Sun
f04ecbf87a Un-bump nltk due to llamaindex issue 2024-09-08 16:39:19 -07:00
Shukant Pal
362156f97e
Model inference for connector classifier on queries (#2137) 2024-09-08 14:46:00 -07:00
Andres Jose Sebastian Rincon Gonzalez
3fa9676478
[1802] adjust the code to support different db schemas (#1803) 2024-09-08 14:16:54 -07:00
Yuhong Sun
148c2a7375
Remove wordnet (#2365) 2024-09-08 12:34:09 -07:00
pablodanswer
1555ac9dab
More explicit credential creation flow (#2363)
* more explicit drive credential creation flow

* remove logs

* update naming

* fix user-contributed formatting

* fix (^) v2
2024-09-08 12:09:23 -07:00
Weves
80de408cef Fix formatting 2024-09-08 12:09:14 -07:00
Cola Chen
e20c825e16
Notion Connector to skip reading external blocks in NotionConnector
The commit skips reading 'external_object_instance_page' blocks in the NotionConnector due to the lack of support in the Notion API. This change is in response to the issue #1761.

Co-authored-by: Cola Chen <6825116+colachg@users.noreply.github.com>
2024-09-08 11:34:04 -07:00
mattboret
b0568ac8ae
Sharepoint: Fix get all sites (#1700)
Co-authored-by: Matthieu Boret <matthieu.boret@fr.clara.net>
2024-09-08 11:28:11 -07:00
Art Matsak
0896d3b7da
Fix content extraction from JIRA with API v2 vs. v3 (#1678) 2024-09-08 11:27:14 -07:00
dependabot[bot]
5e9c6d1499
Bump aiohttp from 3.9.4 to 3.10.2 in /backend/requirements (#2097)
Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.9.4 to 3.10.2.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.9.4...v3.10.2)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-08 10:59:47 -07:00
dependabot[bot]
50211ec401
Bump nltk from 3.8.1 to 3.9 in /backend/requirements (#2174)
Bumps [nltk](https://github.com/nltk/nltk) from 3.8.1 to 3.9.
- [Changelog](https://github.com/nltk/nltk/blob/develop/ChangeLog)
- [Commits](https://github.com/nltk/nltk/compare/3.8.1...3.9)

---
updated-dependencies:
- dependency-name: nltk
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-08 10:50:36 -07:00
dependabot[bot]
1e4b27185d
Bump torch from 2.0.1 to 2.2.0 in /backend/requirements (#1933)
Bumps [torch](https://github.com/pytorch/pytorch) from 2.0.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v2.0.1...v2.2.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-08 10:17:17 -07:00
Moshe Zada
0c66da17bb
Web Connector - Get doc_updated_at from Last-Modified header (#1693) 2024-09-08 10:05:04 -07:00
Art Matsak
d985cd4352
Fix JIRA comment indexing when author has no email (#1663) 2024-09-08 09:43:09 -07:00
Yuhong Sun
c8891a5829
Remove LangChain Community (#2362) 2024-09-08 09:41:20 -07:00
Art Matsak
51a13f5fc7
Implement indexing of simple tables in Word files (#1651) 2024-09-08 09:38:46 -07:00
dependabot[bot]
e2e04af7e2
Bump msal from 1.26.0 to 1.28.0 in /backend/requirements (#1626)
Bumps [msal](https://github.com/AzureAD/microsoft-authentication-library-for-python) from 1.26.0 to 1.28.0.
- [Release notes](https://github.com/AzureAD/microsoft-authentication-library-for-python/releases)
- [Commits](https://github.com/AzureAD/microsoft-authentication-library-for-python/compare/1.26.0...1.28.0)

---
updated-dependencies:
- dependency-name: msal
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-07 21:05:11 -07:00
lombax85
c1735fcd3a
Google Drive connector - txt and markdown support (#1469) 2024-09-07 20:28:23 -07:00
hj-danswer
b43e5735d7
Use user information in Slack bot DMs (#2360)
* Use user information from Slack bot DMs

* fix lint

---------

Co-authored-by: Hyeong Joon Suh <hyeongjoonsuh@Hyeongs-MacBook-Pro.local>
2024-09-08 03:08:24 +00:00
pablodanswer
7d4f8ef4e8
Minor Confluence Fixes for Robustification (#2349)
* add connector config

* update confluence connector
2024-09-08 01:39:49 +00:00
Weves
7c03b6f521 Fix responses for HTTPExceptions 2024-09-07 17:40:21 -07:00
Chris Weaver
ccf986808c
Add retries (#2358)
* Add retries

* fix

* add

* remove --build

* Remove cache-to

* Don't push

* Add back push

* Add newline

* Remove alembic logs
2024-09-08 00:12:32 +00:00
Yuhong Sun
6cec31088d
CONTRIBUTING updates (#2354) 2024-09-07 14:05:36 -07:00
pablodanswer
5abf67fbf0
PDF metadata + list defaults (#2341)
* validate web list

* update pdf extraction of metadata

* remove pdf + log

* stricter type enforcing

* fix up indexing widths

* minor formatting

* add list case

* check for empty metadata
2024-09-06 21:21:24 +00:00
rkuo-danswer
2933c3598b
first cut at redis (#2226)
* first cut at redis

* fix startup dependencies on redis

* kombu cleanup - fail silently

* mypy

* add redis_host environment override

* update REDIS_HOST env var in docker-compose.dev.yml

* update the rest of the docker files

* update contributing guide

* renaming cache to cache_volume

* add redis password to various deployments

* try setting up pr testing for helm

* fix indent

* hopefully this release version actually exists

* fix command line option to --chart-dirs

* fetch-depth 0

* edit values.yaml

* try setting ct working directory

* bypass testing only on change for now

* move files and lint them

* update helm testing

* some issues suggest using --config works

* add vespa repo

* add postgresql repo

* increase timeout

* try amd64 runner

* fix redis password reference

* add comment to helm chart testing workflow

* rename helm testing workflow to disable it

---------

Co-authored-by: Richard Kuo <rkuo@rkuo.com>
2024-09-06 19:21:29 +00:00
pablodanswer
aeb6060854
Add ability to delete users (#2342)
* add ability to delete users

* fix tiny build issue

* Add comments
2024-09-06 17:37:04 +00:00
hagen-danswer
8977b1b5fc
Paginate connector page (#2328)
* Added pagination to individual connector pages

* I cooked

* Gordon Ramsay in this b

* meepe

* properly calculated max chunk and switch dict to array

* chunks -> batches

* increased max page size

* renamed var
2024-09-06 17:00:25 +00:00
pablodanswer
69c0419146
Updated refreshing (#2327)
* clean up + add environment variables

* remove log

* update

* update api settings

* somewhat cleaner refresh functionality

* fully functional

* update settings

* validated

* remove random logs

* remove unneeded parameter + log

* move to ee + remove comments

* Cleanup unused

---------

Co-authored-by: Weves <chrisweaver101@gmail.com>
2024-09-06 04:36:55 +00:00