Files
Daniel Hiltgen d6f7233a1c test: improve scheduler/concurrency stress tests (#11906)
* test: improve scheduler/concurrency stress tests

The scheduler test used to use approximate memory figures and would often
over or under shoot a systems capcity leading to flaky test results.
This should improve the reliability of this scenario by leveraging
ps output to determinie exactly how many models it takes to
trigger thrashing.

The concurrency test is also refined to target num_parallel + 1 and handle
timeouts better.

With these refinements, TestMultiModelConcurrency was redundant

* test: add parallel generate with history

TestGenerateWithHistory will help verify caching and context
are properly handled while making requests

* test: focus embed tests on embedding models

remove non-embedding models from the embedding tests
2025-08-15 14:37:54 -07:00
..