Commit Graph

201 Commits

SHA1 Message Date
bc9e0535c3 Refactor command building and argument handling 2025-09-25 22:05:46 +02:00
ba0f877185 Fix tests 2025-09-24 21:35:44 +02:00
840a7bc650 Add Docker command handling for backend options and refactor command building 2025-09-24 21:34:54 +02:00
76ac93bedc Implement Docker command handling for Llama, MLX, and vLLM backends 2025-09-24 21:31:58 +02:00
72d2a601c8 Update Docker args in LoadConfig and tests to include 'run --rm' prefix 2025-09-24 21:27:51 +02:00
9a56660f68 Refactor backend configuration to use structured settings and update environment variable handling 2025-09-24 20:31:20 +02:00
30e40ecd30 Refactor API endpoints to use /backends/llama-cpp path and update related documentation 2025-09-23 21:27:58 +02:00
46622d2107 Update documentation and add README synchronization 2025-09-22 22:37:53 +02:00
184d6df1bc Fix vllm command parsing 2025-09-22 21:25:50 +02:00
313666ea17 Fix missing vllm proxy setup 2025-09-22 20:51:00 +02:00
c3ca5b95f7 Update BuildCommandArgs to use positional argument for model and adjust tests accordingly 2025-09-22 20:32:03 +02:00
7eb59aa7e0 Remove unused JSON unmarshal test and clean up command argument checks 2025-09-19 20:46:25 +02:00
64842e74b0 Refactor command parsing and building 2025-09-19 20:23:25 +02:00
34a949d22e Refactor command argument building and parsing 2025-09-19 19:59:46 +02:00
ec5485bd0e Refactor command argument building across backends 2025-09-19 19:46:54 +02:00
9eecb37aec Refactor MLX and VLLM server options parsing and args building 2025-09-19 19:39:36 +02:00
c7136d5206 Refactor command parsing logic across backends to utilize a unified CommandParserConfig structure 2025-09-19 18:36:23 +02:00
4df02a6519 Initial vLLM backend support 2025-09-19 18:05:12 +02:00
6a580667ed Remove LlamaExecutable checks from default and file loading tests 2025-09-18 20:30:26 +02:00
2a20817078 Remove redundant LlamaExecutable field from instance configuration in tests 2025-09-18 20:29:04 +02:00
5121f0e302 Remove PythonPath references from MlxServerOptions and related configurations 2025-09-17 21:59:55 +02:00
cc5d8acd92 Refactor instance and manager tests to use BackendConfig for LlamaExecutable and MLXLMExecutable 2025-09-16 21:45:50 +02:00
154b754aff Add MLX command parsing and routing support 2025-09-16 21:39:08 +02:00
63fea02d66 Add MLX backend support in CreateInstanceOptions and validation 2025-09-16 21:38:33 +02:00
468688cdbc Pass backend options to instances 2025-09-16 21:37:48 +02:00
988c4aca40 Add MLX backend config options 2025-09-16 21:14:19 +02:00
1b5934303b Enhance command parsing in ParseLlamaCommand and improve error handling in ParseCommandRequest 2025-09-15 22:12:56 +02:00
e7b06341c3 Enhance command parsing in ParseLlamaCommand 2025-09-15 21:29:46 +02:00
323056096c Implement llama-server command parsing and add UI components for command input 2025-09-15 21:04:14 +02:00
d697f83b46 Update GetProxy method to use BackendTypeLlamaCpp constant for backend type 2025-09-02 21:56:38 +02:00
712d28ea42 Remove port marking logic from CreateInstance method 2025-09-02 21:56:25 +02:00
d9542ba117 Refactor instance management to support backend types and options 2025-09-01 21:59:18 +02:00
9579930a6a Simplify LRU eviction tests 2025-08-31 11:46:16 +02:00
447f441fd0 Move LRU eviction to timeout.go 2025-08-31 11:42:32 +02:00
27012b6de6 Split manager tests into multiple test files 2025-08-31 11:39:44 +02:00
905e685107 Add LRU eviction tests for instance management 2025-08-31 11:30:57 +02:00
d6d4792a0c Skip eviction for instances without a valid idle timeout 2025-08-31 00:59:26 +02:00
894f3c3213 Refactor StartInstance method to improve max running instances check 2025-08-31 00:14:29 +02:00
c1fa0faf4b Add LastRequestTime method and LRU eviction logic for instance management 2025-08-30 23:59:37 +02:00
4581d67165 Enhance instance management: improve on-demand start handling and add LRU eviction logic 2025-08-30 23:13:08 +02:00
58cb36bd18 Refactor instance management: replace CanStartInstance with IsMaxRunningInstancesReached method 2025-08-30 23:12:58 +02:00
68253be3e8 Add CanStartInstance method to check instance start conditions 2025-08-30 22:47:15 +02:00
a9f1c1a619 Add LRU eviction configuration for instances 2025-08-30 22:26:02 +02:00
74495f8163 Refactor Shutdown method to improve instance stopping logic and avoid deadlocks 2025-08-30 22:04:43 +02:00
9d548e6dda Remove wrong MaxRunningInstancesError type 2025-08-28 20:42:56 +02:00
41d8c41188 Introduce MaxRunningInstancesError type and handle it in StartInstance handler 2025-08-28 20:07:03 +02:00
e319731239 Remove unnecessary read locks from GetStatus and IsRunning methods 2025-08-28 19:19:28 +02:00
b698c1d0ea Remove locks from SetStatus 2025-08-28 19:08:20 +02:00
227ca7927a Refactor SetStatus method to capture onStatusChange callback reference before unlocking mutex 2025-08-28 18:59:26 +02:00
0b058237fe Enforce maximum running instances limit in StartInstance method 2025-08-27 21:18:38 +02:00