20 Commits

SHA1 Message Date
aae3f84d49 Implement caching for remote instance proxies and enhance proxy request handling 2025-10-07 18:44:23 +02:00
670f8ff81b Split up handlers 2025-10-02 23:11:20 +02:00
da56456504 Add node management endpoints to handle listing and retrieving node details 2025-10-02 22:51:41 +02:00
2ed67eb672 Add remote instance proxying functionality to handler 2025-10-01 22:17:19 +02:00
30e40ecd30 Refactor API endpoints to use /backends/llama-cpp path and update related documentation 2025-09-23 21:27:58 +02:00
46622d2107 Update documentation and add README synchronization 2025-09-22 22:37:53 +02:00
4df02a6519 Initial vLLM backend support 2025-09-19 18:05:12 +02:00
154b754aff Add MLX command parsing and routing support 2025-09-16 21:39:08 +02:00
1b5934303b Enhance command parsing in ParseLlamaCommand and improve error handling in ParseCommandRequest 2025-09-15 22:12:56 +02:00
323056096c Implement llama-server command parsing and add UI components for command input 2025-09-15 21:04:14 +02:00
4581d67165 Enhance instance management: improve on-demand start handling and add LRU eviction logic 2025-08-30 23:13:08 +02:00
41d8c41188 Introduce MaxRunningInstancesError type and handle it in StartInstance handler 2025-08-28 20:07:03 +02:00
1443746add Refactor instance status management: replace Running boolean with InstanceStatus enum and update related methods 2025-08-27 19:44:38 +02:00
ddb54763f6 Add OnDemandStartTimeout configuration and update OpenAIProxy to use it 2025-08-20 14:25:43 +02:00
287a5e0817 Implement WaitForHealthy method and enhance OpenAIProxy to support on-demand instance start 2025-08-20 14:19:12 +02:00
e4e7a82294 Implement last request time tracking for instance management 2025-08-17 19:44:57 +02:00
a87652937f Move swagger documentation to apidoc 2025-08-07 19:48:03 +02:00
e2b64620b5 Expose version endpoint 2025-08-07 19:10:06 +02:00
2abe9c282e Rename config and instance struct to avoid awkward naming 2025-08-04 19:30:50 +02:00
6a7a9a2d09 Split large package into subpackages 2025-08-04 19:23:56 +02:00