| 313666ea17 | Fix missing vllm proxy setup | 2025-09-22 20:51:00 +02:00 |
| 4df02a6519 | Initial vLLM backend support | 2025-09-19 18:05:12 +02:00 |
| 468688cdbc | Pass backend options to instances | 2025-09-16 21:37:48 +02:00 |
| d9542ba117 | Refactor instance management to support backend types and options | 2025-09-01 21:59:18 +02:00 |
| 905e685107 | Add LRU eviction tests for instance management | 2025-08-31 11:30:57 +02:00 |
| c1fa0faf4b | Add LastRequestTime method and LRU eviction logic for instance management | 2025-08-30 23:59:37 +02:00 |
| b41ebdc604 | Set instance status to Failed when restart conditions are not met | 2025-08-27 19:47:36 +02:00 |
| 1443746add | Refactor instance status management: replace Running boolean with InstanceStatus enum and update related methods | 2025-08-27 19:44:38 +02:00 |
| 615c2ac54e | Add MaxRunningInstances to InstancesConfig and implement IsRunning method | 2025-08-27 18:42:34 +02:00 |
| 1939b45312 | Refactor WaitForHealthy method to use direct health check URL and simplify health check logic | 2025-08-20 15:58:08 +02:00 |
| 287a5e0817 | Implement WaitForHealthy method and enhance OpenAIProxy to support on-demand instance start | 2025-08-20 14:19:12 +02:00 |
| d70bb634cd | Implement instance tests for timeout | 2025-08-17 21:50:16 +02:00 |
| c45fa13206 | Initialize last request time on instance start and update timeout handling logic | 2025-08-17 21:15:28 +02:00 |
| c734bcae4a | Move UpdateLastRequestTime method to timeout.go and add ShouldTimeout method for idle timeout handling | 2025-08-17 20:37:20 +02:00 |
| e4e7a82294 | Implement last request time tracking for instance management | 2025-08-17 19:44:57 +02:00 |
| 2abe9c282e | Rename config and instance struct to avoid awkward naming | 2025-08-04 19:30:50 +02:00 |
| 6a7a9a2d09 | Split large package into subpackages | 2025-08-04 19:23:56 +02:00 |