19 Commits

Author SHA1 Message Date
bc9e0535c3 Refactor command building and argument handling 2025-09-25 22:05:46 +02:00
76ac93bedc Implement Docker command handling for Llama, MLX, and vLLM backends 2025-09-24 21:31:58 +02:00
313666ea17 Fix missing vllm proxy setup 2025-09-22 20:51:00 +02:00
4df02a6519 Initial vLLM backend support 2025-09-19 18:05:12 +02:00
468688cdbc Pass backend options to instances 2025-09-16 21:37:48 +02:00
d9542ba117 Refactor instance management to support backend types and options 2025-09-01 21:59:18 +02:00
905e685107 Add LRU eviction tests for instance management 2025-08-31 11:30:57 +02:00
c1fa0faf4b Add LastRequestTime method and LRU eviction logic for instance management 2025-08-30 23:59:37 +02:00
b41ebdc604 Set instance status to Failed when restart conditions are not met 2025-08-27 19:47:36 +02:00
1443746add Refactor instance status management: replace Running boolean with InstanceStatus enum and update related methods 2025-08-27 19:44:38 +02:00
615c2ac54e Add MaxRunningInstances to InstancesConfig and implement IsRunning method 2025-08-27 18:42:34 +02:00
1939b45312 Refactor WaitForHealthy method to use direct health check URL and simplify health check logic 2025-08-20 15:58:08 +02:00
287a5e0817 Implement WaitForHealthy method and enhance OpenAIProxy to support on-demand instance start 2025-08-20 14:19:12 +02:00
d70bb634cd Implement instance tests for timeout 2025-08-17 21:50:16 +02:00
c45fa13206 Initialize last request time on instance start and update timeout handling logic 2025-08-17 21:15:28 +02:00
c734bcae4a Move UpdateLastRequestTime method to timeout.go and add ShouldTimeout method for idle timeout handling 2025-08-17 20:37:20 +02:00
e4e7a82294 Implement last request time tracking for instance management 2025-08-17 19:44:57 +02:00
2abe9c282e Rename config and instance struct to avoid awkward naming 2025-08-04 19:30:50 +02:00
6a7a9a2d09 Split large package into subpackages 2025-08-04 19:23:56 +02:00