danswer

mirror of https://github.com/danswer-ai/danswer.git synced 2025-07-30 14:52:52 +02:00

Author	SHA1	Message	Date
joachim-danswer	c5adbe4180	Knowledge Graph v1 (#4626 ) * db setup * transfer 1 - incomplete * more adjustments * relationship table + query update * temp view creation * restructuring * nits * updates * separate read_only engine * extraction revamp * focus on metadata relatonships 1 * dev * migration downgrade fix * rebase migration change * a3+ * progress * base * new extraction * progress * fixed KG extraction * nits * updates * simplifications & cleanup * fixes * updates * more feature flag checks * fixes * extraction process fix * read-only user creation as part of setup * fix for missing entity attributes * kg read-only user creation as part of migration * typo * EL initial comments * initial Account/SF Connector chnges * SF Connector update - include account information * base w/ salesforce * evan updates + quite a bit more * kg-filtered search * EL changes pt 2 * migrations and env vars * quick migration fix * migration update * post_rebase fixes * mypy fixes * test fixes * test fix * test fix * read_only pool + misc * nf * env vars * test improvements * salesforce fix * test update * small changes * small adjustments * SF Connector fix & kg_stage removal for one table * mypy fix * small fixes * EL + RK (pt 1) comments * nit * setting updated * Salesforce test update * EL comments * read-only user replacement & cleanup * SQL View fix * converting entity type-name separators * sql view group ownership * view fix * SQL tweak * dealing with docs that were skipped by indexing * increased error handling * more error handling * Output formatting fix * kg-incremental-reindexing * 0-doc found improvement * celery * migration correction * timeout adjustments * nit * Updated migration * Entity Normalization for KG Dev 1 (#4746) * feat: trigrams column * fix: reranking and db * feat: v1 * fix: convert to orm * feat: parallel * fix: default to id_name * fix: renamed semantic_id and semantic_id_trigrams * fix: scalar subquery * fix: tuning + redundancy * fix: threshold * 
fix: typo * fix: shorten names * wip * fix: reverted * feat: config * feat: works but it was dumb * feat: clustering works * fix: mypy * normalization <-> language awareness for SQL generation * small type fixes --------- Co-authored-by: joachim-danswer <joachim@danswer.ai> * mypy * typo and dead code * kg_time_fencing * feat: remove temp views on migration downgrade * remove functions and triggers for now * rebase adjustments * EL code review results * quick fix + trigger/funcs for single tenant * fix: typo, mypy, dead code * fix: autoflake * small updatesd * nit * fix: typo * early + faster view creation * Extension creation in MT migration * nit changes to default ETs * Incremental Clustering and KG Refactor V1 (#4784) Optimized/restructured incremental clustering. New pipeline actually that moves vespa updates to clustering. Also, celery configuration has been updated. --------- Co-authored-by: joachim-danswer <joachim@danswer.ai> * prompt tweak & ET extraction reset * more general hierarchical structure * feat: better vespa reset logic * prompt optimization and entity replacemants * small prompt changes * KG Refactor V2 (#4814) Clustering & Extraction improvements & various nits Co-authored-by: joachim-danswer <joachim@danswer.ai> * add connector-level coverage days * fix: nit * initial EL responses * refactor: helper functions for formatting * fix: more helper fns & comments * fix: comment code that's been implemented elsewhere * fix: tenant_id missing arg * fix: removed debugging stuff * fix: moved kg_interactions db query to helper fn * fix: tenant_id * fix: tenant_id & removed outdated helper fn * fix always set entity class * fix: typo * fix alembic heads * fix: celery logging * fix: migrations fix * fix: multi tenant permissions * fix: temp connector fix * fix: downgrade * Fix upgrade migration * fix: tenant for normalization * added additional acl * stray EL comments * fix: connector test * fix mypy * fix: temporary connector test fix * fix: jira 
connector test * nit * small nits * fix: black * fix: mypy * fix: mypy --------- Co-authored-by: Rei Meguro <36625832+Orbital-Web@users.noreply.github.com>	2025-06-07 23:14:20 +00:00
Wenxi	dc4b9bc003	Fixed indexing when no sites are specified (#4822 ) * Fixed indexing when no sites are specificed * Added test for Sharepoint all sites index * Accounted for paginated results. * Typing * Typing --------- Co-authored-by: Wenxi Onyx <wenxi-onyx@Wenxis-MacBook-Pro.local>	2025-06-05 23:25:20 +00:00
Rei Meguro	4bb3ee03a0	Update GitHub Connector metadata (#4769 ) * feat: updated github metadata * feat: nullity check * feat: more metadata * feat: userinfo * test: connector test + more metadata * feat: num files changed * feat str * feat: list of str	2025-06-04 18:33:14 +00:00
Chris Weaver	094cc940a4	Small embedding model cleanups (#4820 ) * Small embedding model cleanups * fix * address greptile * fix build	2025-06-04 00:10:44 +00:00
joachim-danswer	80ecdb711d	New metadata for Jira for KG (#4785 ) * new metadata components * nits & tests	2025-06-03 20:12:56 +00:00
Chris Weaver	a599176bbf	Improve reasoning detection (#4817 ) * Improve reasoning detection * Address greptile comments * Fix mypy	2025-06-03 20:01:12 +00:00
Chris Weaver	84d916e210	Fix hard delete of agentic chats (#4803 ) * Fix hard delete of agentic chats * Update backend/tests/integration/tests/chat/test_chat_deletion.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Address Greptile comments * fix tests --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2025-06-03 11:14:11 -07:00
rkuo-danswer	ef4d5dcec3	new slack rate limiting approach (#4779 ) * fix slack rate limit retry handler for groups * trying to mitigate memory usage during csv download * Revert "trying to mitigate memory usage during csv download" This reverts commit `48262eacf6`. * integrated approach to rate limiting * code review * try no redis setting * add pytest-dotenv * add more debugging * added comments * add more stats --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>	2025-05-29 19:49:32 +00:00
Evan Lohn	ca20e527fc	fix tool calling for bedrock claude models (#4761 ) * fix tool calling for bedrock claude models * unit test * fix unit test	2025-05-23 01:13:18 +00:00
Rei Meguro	9dbe12cea8	Feat: Search Eval Testing Overhaul (provide ground truth, categorize query, etc.) (#4739 ) * fix: autoflake & import order * docs: readme * fix: mypy * feat: eval * docs: readme * fix: oops forgot to remove comment * fix: typo * fix: rename var * updated default config * fix: config issue * oops * fix: black * fix: eval and config * feat: non tool calling query mod	2025-05-21 19:25:10 +00:00
rkuo-danswer	e78637d632	mitigate memory usage during csv download (#4745 ) * mitigate memory usage during csv download * more gc tweaks * missed some small changes --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>	2025-05-21 00:44:27 +00:00
Evan Lohn	cac03c07f7	v1 answer refactor (#4721 ) * v1 answer refactor * fix tests * good catch, tests * more cleanup	2025-05-20 23:34:27 +00:00
Raunak Bhagat	95dabfaa18	fix: Add back Teams' replies processing (#4744 ) * Add replies to document construction and edit tests * Update tests * Add replies processing to teams * Fix test * Add try-except block around potential failure * Update entity-id during ConnectorFailure raise	2025-05-20 22:55:28 +00:00
rkuo-danswer	e92c418e0f	Feature/openapi (#4710 ) * starting openapi support * fix app / app_fn * send gitignore * dedupe function names * add readme * modify gitignore * update launch template * fix unused path param * fix mypy * local tests pass * first pass at making integration tests work * fixes * fix script path * set python path * try full path * fix output dir * fix integration test * more build fixes * add generated directory * use the config * add a comment * add * modify tsconfig.json * fix index linting bugs * tons of lint fixes * new gitignore * remove generated dir * add tasks template * check for undefined explicitly * fix hooks.ts * refactor destructureValue * improve readme --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>	2025-05-20 21:33:18 +00:00
Evan Lohn	e0f5b95cfc	full drive perm sync	2025-05-19 21:06:43 -07:00
Evan Lohn	b76e4754bf	anthropic fix (#4733 ) * anthropic fix * naming	2025-05-19 20:34:29 +00:00
Rei Meguro	d64f479c9f	feat: error handling & optimization (#4722 )	2025-05-19 20:27:22 +00:00
Rei Meguro	30d9ce1310	feat: search quality eval (#4720 ) * fix: import order * test examples * fix: import * wip: reranker based eval * fix: import order * feat: adjuted score * fix: mypy * fix: suggestions * sorry cvs, you must go Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: mypy * fix: suggestions --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2025-05-15 23:44:33 +00:00
Evan Lohn	2af2b7f130	fix connector tests and drive indexing (#4715 ) * fix connector tests and drive indexing * fix other test * fix checkpoint data bug	2025-05-15 19:15:46 +00:00
Raunak Bhagat	312e3b92bc	perf: Implement checkpointing for Teams Connector. (#4601 ) * Add basic foundation for teams checkpointing classes * Fix slack connector main entrypoint * Saving changes * Finish teams checkpointing impl * Remove commented out code * Remove more unused code * Move code around * Add threadpool to process requests in parallel * Fix mypy errors / warnings * Move test import to main function only * Address nits on PR * Remove unnecessary check prior to entering while-loop * Remove print statement * Change exception message * Address more nits * Use indexing instead of destructuring * Add back invocation of `run_with_timeout` instead of a direct call * Revert slack testing code * Move early return to before second API call * Pull fetch to team outside of loop * Address nits on PR * Add back client-side filtering * Updated connector to return after a team's indexing is finished * Add type ignore * Implement proper datetime range fetching * Address comment on PR * Rename function * Change exception type when no team with the given id was found * Address nit on PR * Add comment on why `page_loaded` is needed to be specified explicitly * Remove duplicated calls to fetching channels * Use helper function for thread-based yielding instead of manual logic * Move datetime filtering to message-level instead * Address more comments on PR * Add new utility function for yielding sections * Add additional utility function * Add teams tests * Edit error message * Address nits on PR * Promote url-prefix to be a class level constant * Fix mypy error * Remove start/end parameters from function that doesn't use them anymore; move around comments * Address more nits on PR * Add comment	2025-05-14 04:30:57 +00:00
Chris Weaver	b19515e25d	Fix window_start (#4689 ) * Fix window_start * Add comment	2025-05-12 00:11:20 +00:00
rkuo-danswer	1a8b7abd00	add test (#4676 ) * add test * comment --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>	2025-05-09 21:38:51 +00:00
Evan Lohn	4c0423f27b	fix github cursor pagination infinite loop (#4673 ) * fix infinite loop * unit test for infinite loop issue * mypy version * more logging * unbound locals	2025-05-09 21:35:37 +00:00
Chris Weaver	519aeb6a1f	Drive perm sync enhancement (#4672 ) * Enhance drive perm sync * add tests * more stuff * fixes * Fix * Speed up * Add missing file * Address EL comments * Add ondelete=CASCADE * Improve comment	2025-05-08 03:12:41 +00:00
Raunak Bhagat	d744c0dab4	fix: Fix error in which channel names would not have the leading "#" removed (#4664 ) * Fix failing entrypoint into slack connector * Pre-filter channel names upon instantiation of slack connector class * Add decrypt script * Add slack connector tests * Fix mypy errors on decrypt.py * Add property to SlackConnector class * Add some basic tests * Move location of tests * Change name of env token * Add secrets for Slack * Add more parameterized cases * Change env variable name * Change names * Update channel names * Edit tests * Modify tests * Only import type in __main__ * Fix tests to actually test connectors * Pass parameter to fixture directly	2025-05-07 04:55:21 +00:00
Chris Weaver	70df685709	Non default schema fix (#4667 ) * Use correct postgres schema * Remove raw Session() use * Refactor + add test * Fix comment	2025-05-06 20:35:59 -07:00
Chris Weaver	f85ef78238	Add more logging for confluence perm-sync + handle case where permiss… (#4586 ) * Add more logging for confluence perm-sync + handle case where permissions are removed from the access token * Make required permissions are explicit * more * Add slim fetch limit + mark all cc pairs of source type as successful upon group sync * Add to dev compose * Small teams fix * Add file * Add single limit pagination for confluence * Restrict to server only * more logging * cleanup * Cleanup * Remove CONFLUENCE_CONNECTOR_SLIM_FETCH_LIMIT * Handle teams error * Fix ut * Remove db dependency from confluence_doc_sync * move stuff back to debug	2025-05-06 18:35:14 +00:00
Evan Lohn	113876b276	id not set in checkpoint FINAL (#4656 ) * it will never happen again. * fix perm sync issue * fix perm sync issue2 * ensure member emails map is populated * other fix for perm sync * address CW comments * nit	2025-05-03 00:10:21 +00:00
rkuo-danswer	0db2ad2132	memory optimize task generation for connector deletion (#4645 ) * memory optimize task generation for connector deletion * test * fix up integration test docker file * more no-cache --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>	2025-05-01 10:47:26 -07:00
Evan Lohn	6436b60763	github cursor pagination (#4642 ) * v1 of cursor pagination * mypy * unit tests * CW comments	2025-04-30 19:09:20 -07:00
rkuo-danswer	e254fdc066	add sendgrid as option (#4639 ) * add sendgrid as option * code review * mypy --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>	2025-04-30 07:33:15 +00:00
Chris Weaver	9be3da2357	Fix gitlab (#4629 ) * Fix gitlab * Add back assert	2025-04-28 17:42:49 -07:00
Evan Lohn	eebfa5be18	Confluence server api time fix (#4589 ) * tolerance of confluence api weirdness * remove checkpointing * remove skipping logic from checkpointing * add back checkpointing * switch confluence checkpointing to be based on page starts * address CW comments and fix unit tests * some mitigations of bad confluence api * new checkpointing approach and testing fixes * fix test * CW comments	2025-04-28 06:06:29 +00:00
Evan Lohn	5db676967f	no more duplicate files during folder indexing (#4579 ) * no more duplicate files during folder indexing * cleanup checkpoint after a shared folder has been finished * cleanup * lint	2025-04-28 01:01:20 +00:00
Evan Lohn	5ca7a7def9	fix migration and add test (#4615 )	2025-04-25 21:27:59 +00:00
Chris Weaver	92b5e1adf4	Add support for overriding user list (#4616 ) * Add support for overriding user list * Fix * Add typing * pythonify	2025-04-25 15:15:23 -07:00
Chris Weaver	23c6e0f3bf	Single source of truth for image capability (#4612 ) * Single source of truth for image capability * Update web/src/app/admin/assistants/AssistantEditor.tsx Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Fix tests --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2025-04-25 20:37:16 +00:00
Evan Lohn	151aabea73	specific user emails for drive connector (#4608 ) * specific user emails for drive connector * fix drive connector tests * fix connector tests	2025-04-25 18:49:20 +00:00
Chris Weaver	d711680069	Add e2e test for assistant creation/edit (#4597 ) * Add e2e test for assistant creation/edit * Skip initial full reset to have seeded connector	2025-04-25 13:21:34 -07:00
rkuo-danswer	c9a609b7d8	Bugfix/slack bot channel config (#4585 ) * friendlier handling of slack channel retrieval * retry on downgrade_postgres deadlock * fix comment * text --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>	2025-04-23 20:00:03 +00:00
rkuo-danswer	0d4c600852	out of process retry for multitenant test reset (#4566 ) * tool to generate vespa schema variations for our cloud * extraneous assign * use a real templating system instead of search/replace * fix float * maybe this should be double * remove redundant var * template the other files * try a spawned process * move the wrapper * fix args * increase timeout * run multitenant reset operations out of process as well --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app> Co-authored-by: Richard Kuo <rkuo@rkuo.com>	2025-04-21 23:30:18 +00:00
Evan Lohn	eb569bf79d	add emails to retry with on 403 (#4565 ) * add emails to retry with on 403 * attempted fix for connector test * CW comments * connector test fix * test fixes and continue on 403 * fix tests * fix tests * fix concurrency tests * fix integration tests with llmprovider eager loading	2025-04-21 23:27:31 +00:00
Raunak Bhagat	b97628070e	feat: Add ability to specify max input token limit for custom LLM providers (#4510 ) * Add multi text array field * Add multiple values to model configuration for a custom LLM provider * Fix reference to old field name * Add migration * Update all instances of model_names / display_model_names to use new schema migration * Update background task * Update endpoints to not throw errors * Add test * Update backend/alembic/versions/7a70b7664e37_add_models_configuration_table.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update backend/onyx/background/celery/tasks/llm_model_update/tasks.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Fix list comprehension nits * Update web/src/components/admin/connectors/Field.tsx Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update web/src/app/admin/configuration/llm/interfaces.ts Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Implement greptile recommendations * Update backend/onyx/db/llm.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update backend/onyx/server/manage/llm/api.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update backend/onyx/background/celery/tasks/llm_model_update/tasks.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update backend/onyx/db/llm.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Fix more greptile suggestions * Run formatter again * Update backend/onyx/db/models.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Add relationship to `LLMProvider` and `ModelConfigurations` classes * Use sqlalchemy ORM relationships instead of manually populating fields * 
Upgrade migration * Update interface * Remove all instances of model_names and display_model_names from backend * Add more tests and fix bugs * Run prettier * Add types * Update migration to perform data transformation * Ensure native llm providers don't have custom max input tokens * Start updating frontend logic to support custom max input tokens * Pass max input tokens to LLM class (to be passed into `litellm.completion` call later) * Add ModelConfigurationField component for custom llm providers * Edit spacing and styling of model configuration matrix * Fix error message displaying bug * Edit opacity of `FiX` field for first index * Change opacity back * Change roundness * Address comments on PR * Perform fetching of `max_input_tokens` at the beginning of the callgraph and rope it throughout the entire callstack * Change `add` to `execute` * Move `max_input_tokens` into `LLMConfig` * Fix bug with error messages not being cleared * Change field used to fetch LLMProvider * Fix model-configuration UI * Address comments * Remove circular import * Fix failing tests in GH * Fix failing tests * Use `isSubset` instead of equality to determine native vs custom LLM Provider * Remove unused import * Make responses always display max_input_tokens * Fix api endpoint to hit * Update types in web application * Update object field * Fix more type errors * Fix failing llm provider tests --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2025-04-21 04:30:21 -07:00
rkuo-danswer	2111eccf07	Feature/vespa jinja (#4558 ) * tool to generate vespa schema variations for our cloud * extraneous assign * use a real templating system instead of search/replace * fix float * maybe this should be double * remove redundant var * template the other files * try a spawned process * move the wrapper * fix args * increase timeout --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app> Co-authored-by: Richard Kuo <rkuo@rkuo.com>	2025-04-20 22:28:55 +00:00
evan-danswer	dc62d83a06	File connector tests (#4561 ) * danswer to onyx plus tests for file connector * actually add test	2025-04-19 15:54:30 -07:00
evan-danswer	5681df9095	address getting attachments forever (#4562 ) * address getting attachments forever * fix unit tests	2025-04-19 15:53:27 -07:00
Chris Weaver	6666300f37	Fix flakey web test (#4551 ) * Fix flakey web test * Increase wait time * Another attempt to fix * Simplify + add new test * Fix web tests	2025-04-19 15:12:11 -07:00
evan-danswer	953a4e3793	v1 file connector with metadata (#4552 )	2025-04-17 23:02:34 +00:00
Chris Weaver	6df1c6c72f	Pull in more fields for Jira (#4547 ) * Pull in more fields for Jira * Fix tests * Fix * more fix * Fix * Fix S3 test * fix	2025-04-17 01:52:50 +00:00
evan-danswer	5acae2dc80	fix re-processing of previously seen docs Confluence (#4544 ) * fix re-processing of previously seen docs * performance	2025-04-16 23:16:21 +00:00

431 Commits