joachim-danswer c5adbe4180 Knowledge Graph v1 (#4626)
* db setup

* transfer 1 - incomplete

* more adjustments

* relationship table + query update

* temp view creation

* restructuring

* nits

* updates

* separate read_only engine

* extraction revamp

* focus on metadata relationships 1

* dev

* migration downgrade fix

* rebase migration change

* a3+

* progress

* base

* new extraction

* progress

* fixed KG extraction

* nits

* updates

* simplifications & cleanup

* fixes

* updates

* more feature flag checks

* fixes

* extraction process fix

* read-only user creation as part of setup

* fix for missing entity attributes

* kg read-only user creation as part of migration

* typo

* EL initial comments

* initial Account/SF Connector changes

* SF Connector update

 - include account information

* base w/ salesforce

* evan updates + quite a bit more

* kg-filtered search

* EL changes pt 2

* migrations and env vars

* quick migration fix

* migration update

* post_rebase fixes

* mypy fixes

* test fixes

* test fix

* test fix

* read_only pool + misc

* nf

* env vars

* test improvements

* salesforce fix

* test update

* small changes

* small adjustments

* SF Connector fix & kg_stage removal for one table

* mypy fix

* small fixes

* EL + RK (pt 1) comments

* nit

* setting updated

* Salesforce test update

* EL comments

* read-only user replacement & cleanup

* SQL View fix

* converting entity type-name separators

* sql view group ownership

* view fix

* SQL tweak

* dealing with docs that were skipped by indexing

* increased error handling

* more error handling

* Output formatting fix

* kg-incremental-reindexing

* 0-doc found improvement

* celery

* migration correction

* timeout adjustments

* nit

* Updated migration

* Entity Normalization for KG Dev 1 (#4746)

* feat: trigrams column

* fix: reranking and db

* feat: v1

* fix: convert to orm

* feat: parallel

* fix: default to id_name

* fix: renamed semantic_id and semantic_id_trigrams

* fix: scalar subquery

* fix: tuning + redundancy

* fix: threshold

* fix: typo

* fix: shorten names

* wip

* fix: reverted

* feat: config

* feat: works but it was dumb

* feat: clustering works

* fix: mypy

* normalization <-> language awareness for SQL generation

* small type fixes

---------

Co-authored-by: joachim-danswer <joachim@danswer.ai>

* mypy

* typo and dead code

* kg_time_fencing

* feat: remove temp views on migration downgrade

* remove functions and triggers for now

* rebase adjustments

* EL code review results

* quick fix + trigger/funcs for single tenant

* fix: typo, mypy, dead code

* fix: autoflake

* small updates

* nit

* fix: typo

* early + faster view creation

* Extension creation in MT migration

* nit changes to default ETs

* Incremental Clustering and KG Refactor V1 (#4784)

Optimized and restructured incremental clustering, with a new pipeline that moves Vespa updates into the clustering step.
The Celery configuration has also been updated.
---------

Co-authored-by: joachim-danswer <joachim@danswer.ai>

* prompt tweak & ET extraction reset

* more general hierarchical structure

* feat: better vespa reset logic

* prompt optimization and entity replacements

* small prompt changes

* KG Refactor V2 (#4814)

Clustering & Extraction improvements & various nits 

Co-authored-by: joachim-danswer <joachim@danswer.ai>

* add connector-level coverage days

* fix: nit

* initial EL responses

* refactor: helper functions for formatting

* fix: more helper fns & comments

* fix: comment code that's been implemented elsewhere

* fix: tenant_id missing arg

* fix: removed debugging stuff

* fix: moved kg_interactions db query to helper fn

* fix: tenant_id

* fix: tenant_id & removed outdated helper fn

* fix always set entity class

* fix: typo

* fix alembic heads

* fix: celery logging

* fix: migrations fix

* fix: multi tenant permissions

* fix: temp connector fix

* fix: downgrade

* Fix upgrade migration

* fix: tenant for normalization

* added additional acl

* stray EL comments

* fix: connector test

* fix mypy

* fix: temporary connector test fix

* fix: jira connector test

* nit

* small nits

* fix: black

* fix: mypy

* fix: mypy

---------

Co-authored-by: Rei Meguro <36625832+Orbital-Web@users.noreply.github.com>
2025-06-07 23:14:20 +00:00

Open Source Gen-AI + Enterprise Search.


Onyx (formerly Danswer) is the AI platform connected to your company's docs, apps, and people. Onyx provides a feature-rich Chat interface and plugs into any LLM of your choice. Keep knowledge and access controls synced across more than 40 connectors such as Google Drive, Slack, Confluence, and Salesforce. Create custom AI agents with unique prompts, knowledge, and actions that the agents can take. Onyx can be deployed securely anywhere and at any scale: on a laptop, on-premises, or in the cloud.

Feature Highlights

Deep research over your team's knowledge:

https://private-user-images.githubusercontent.com/32520769/414509312-48392e83-95d0-4fb5-8650-a396e05e0a32.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5Mjg2MzYsIm5iZiI6MTczOTkyODMzNiwicGF0aCI6Ii8zMjUyMDc2OS80MTQ1MDkzMTItNDgzOTJlODMtOTVkMC00ZmI1LTg2NTAtYTM5NmUwNWUwYTMyLm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE5VDAxMjUzNlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWFhMzk5Njg2Y2Y5YjFmNDNiYTQ2YzM5ZTg5YWJiYTU2NWMyY2YwNmUyODE2NWUxMDRiMWQxZWJmODI4YTA0MTUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.a9D8A0sgKE9AoaoE-mfFbJ6_OKYeqaf7TZ4Han2JfW8

Use Onyx as a secure AI Chat with any LLM:

Onyx Chat Silent Demo

Easily set up connectors to your apps:

Onyx Connector Silent Demo

Access Onyx where your team already works:

Onyx Bot Demo

Deployment

To try it out for free and get started in seconds, check out Onyx Cloud.

Onyx can also be run locally (even on a laptop) or deployed on a virtual machine with a single docker compose command. Check out our docs to learn more.

We also have built-in support for high-availability/scalable deployment on Kubernetes. References here.

🔍 Other Notable Benefits of Onyx

  • Custom deep learning models for indexing and inference time, available only through Onyx, plus learning from user feedback.
  • Flexible security features like SSO (OIDC/SAML/OAuth2), RBAC, encryption of credentials, etc.
  • Knowledge curation features like document-sets, query history, usage analytics, etc.
  • Scalable deployment options tested up to tens of thousands of users and hundreds of millions of documents.

🚧 Roadmap

  • New methods in information retrieval (StructRAG, LightGraphRAG, etc.)
  • Personalized Search
  • Organizational understanding and ability to locate and suggest experts from your team.
  • Code Search
  • SQL and Structured Query Language

🔌 Connectors

Keep knowledge and access in sync across 40+ connectors:

  • Google Drive
  • Confluence
  • Slack
  • Gmail
  • Salesforce
  • Microsoft SharePoint
  • GitHub
  • Jira
  • Zendesk
  • Gong
  • Microsoft Teams
  • Dropbox
  • Local Files
  • Websites
  • And more ...

See the full list here.

📚 Licensing

There are two editions of Onyx:

  • Onyx Community Edition (CE) is available freely under the MIT Expat license. Simply follow the Deployment guide above.
  • Onyx Enterprise Edition (EE) includes extra features that are primarily useful for larger organizations. For feature details, check out our website.

To try the Onyx Enterprise Edition:

  1. Check out Onyx Cloud.
  2. For self-hosting the Enterprise Edition, contact us at founders@onyx.app or book a call with us on our Cal.

💡 Contributing

Looking to contribute? Please check out the Contribution Guide for more details.

Description
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
Languages
Python 65.3%
TypeScript 30.4%
JavaScript 2.2%
HTML 0.8%
CSS 0.8%
Other 0.5%