Detect CUDA OS Overhead

This adds logic to detect skew between the memory reported by the
driver library and by the management library. The skew is attributed
to OS overhead and is recorded so that subsequent free VRAM updates
from the management library can be adjusted to avoid OOM scenarios.
Daniel Hiltgen
2024-07-09 10:27:53 -07:00
parent 9544a57ee4
commit f6f759fc5f
2 changed files with 29 additions and 1 deletions


@@ -52,7 +52,8 @@ type CPUInfo struct {
 type CudaGPUInfo struct {
 	GpuInfo
-	index int //nolint:unused,nolintlint
+	OSOverhead uint64 // Memory overhead between the driver library and management library
+	index      int    //nolint:unused,nolintlint
 }
 type CudaGPUInfoList []CudaGPUInfo
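
The second changed file is not shown here, so the following is only a minimal sketch of how a recorded overhead like `OSOverhead` could be detected and then applied. Only the `OSOverhead` field name comes from the hunk above. The `gpuMem` type and the `detectOverhead` and `adjustedFree` helpers are hypothetical, and the sketch assumes the management library reports more free VRAM than the driver library does.

```go
package main

import "fmt"

// gpuMem is a hypothetical stand-in for per-GPU bookkeeping.
// Only the OSOverhead field name is taken from the diff.
type gpuMem struct {
	FreeMemory uint64 // free VRAM as last reported by the driver library
	OSOverhead uint64 // recorded skew between the two libraries
}

// detectOverhead records the skew between the two readings.
// Assumption: the management library over-reports free memory, so a
// positive difference is treated as OS overhead.
func detectOverhead(g *gpuMem, driverFree, mgmtFree uint64) {
	g.FreeMemory = driverFree
	if mgmtFree > driverFree {
		g.OSOverhead = mgmtFree - driverFree
	}
}

// adjustedFree subtracts the recorded overhead from a later management
// library reading, clamping at zero to avoid unsigned underflow.
func adjustedFree(g *gpuMem, mgmtFree uint64) uint64 {
	if mgmtFree < g.OSOverhead {
		return 0
	}
	return mgmtFree - g.OSOverhead
}

func main() {
	g := &gpuMem{}
	// Illustrative values in MiB, not measurements.
	detectOverhead(g, 7000, 7500)
	fmt.Println("overhead:", g.OSOverhead)
	fmt.Println("adjusted free:", adjustedFree(g, 7200))
}
```

Clamping at zero is a design choice of this sketch: because the fields are unsigned, a reading smaller than the recorded overhead would otherwise wrap around to a very large value and defeat the purpose of the adjustment.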