joachim-danswer 463340b8a1
Reduce ranking scores for short chunks without actual information (#4098)
* remove title for slack

* initial working code

* simplification

* improvements

* name change to information_content_model

* avoid boost_score > 1.0

* nit

* EL comments and improvements

Improvements:
  - proper import of information content model from cache or HF
  - warm up for information content model

Other:
  - EL PR review comments

* nit

* requirements version update

* fixed docker file

* new home for model_server configs

* default off

* small updates

* YS comments - pt 1

* renaming to chunk_boost & chunk table def

* saving and deleting chunk stats in new table

* saving and updating chunk stats

* improved dict score update

* create columns for individual boost factors

* RK comments

* Update migration

* manual import reordering
2025-03-13 17:35:45 +00:00

black==23.7.0
boto3-stubs[s3]==1.34.133
celery-types==0.19.0
cohere==5.6.1
lxml==5.3.0
lxml_html_clean==0.2.2
mypy-extensions==1.0.0
mypy==1.8.0
pandas-stubs==2.2.3.241009
pandas==2.2.3
posthog==3.7.4
pre-commit==3.2.2
pytest-asyncio==0.22.0
pytest==7.4.4
reorder-python-imports==3.9.0
ruff==0.0.286
sentence-transformers==3.4.1
trafilatura==1.12.2
types-beautifulsoup4==4.12.0.3
types-html5lib==1.1.11.13
types-oauthlib==3.2.0.9
types-passlib==1.7.7.20240106
types-Pillow==10.2.0.20240822
types-psutil==5.9.5.17
types-psycopg2==2.9.21.10
types-python-dateutil==2.8.19.13
types-pytz==2023.3.1.1
types-PyYAML==6.0.12.11
types-regex==2023.3.23.1
types-requests==2.28.11.17
types-retry==0.9.9.3
types-setuptools==68.0.0.3
types-urllib3==1.26.25.11
voyageai==0.2.3