Commit Graph

55 Commits

Author SHA1 Message Date
e65f4f1641 Remove unsupported error wrapping from log.Printf 2025-10-27 18:01:58 +01:00
5ef0654cdd Use %w for error wrapping in log messages across multiple files 2025-10-27 17:54:39 +01:00
249ff2a7aa Capitalize godoc tags 2025-10-26 16:49:27 +01:00
59c954811d Update API routes in godoc 2025-10-26 16:35:42 +01:00
58c8899fd9 Update import path for API documentation 2025-10-26 14:08:48 +01:00
f98b09ea78 Move apidocs to docs folder 2025-10-26 14:04:53 +01:00
969fee837f Fix instance name retrieval 2025-10-26 11:34:45 +01:00
4e587953d8 Refactor llama server command handlers to use a common execution function 2025-10-26 11:00:10 +01:00
356c5be2c6 Improve comments 2025-10-26 10:34:36 +01:00
836e918fc5 Rename ProxyToInstance to InstanceProxy for clarity in routing 2025-10-26 10:22:37 +01:00
a7593e9a58 Split LlamaCppProxy handler 2025-10-26 10:21:40 +01:00
9259763054 Add getInstance method to handlers 2025-10-26 09:54:24 +01:00
94dce4c9bb Implement helper response handling functions 2025-10-26 00:12:33 +02:00
a3f9213f04 Implement ensureInstanceRunning helper 2025-10-25 23:44:21 +02:00
de5a38e7fd Refactor command parsing 2025-10-25 20:23:08 +02:00
c038aac91b Remove redundant UpdateLastRequestTime calls 2025-10-25 16:09:57 +02:00
7d9b983f93 Don't strip remote llama-cpp proxy prefix 2025-10-25 16:02:09 +02:00
ff719f3ef9 Remove remote instance proxy handling from handlers 2025-10-25 14:07:11 +02:00
a9fb0d613d Validate instance name in openai proxy 2025-10-22 18:55:57 +02:00
3b8bc658e3 Add name validation to backend handlers 2025-10-22 18:50:51 +02:00
c794e4f98b Move instance name validation to handlers 2025-10-22 18:40:39 +02:00
9da2433a7c Refactor instance and manager tests to use BackendOptions structure 2025-10-19 18:07:14 +02:00
2a7010d0e1 Flatten backends package structure 2025-10-19 15:50:42 +02:00
113b51eda2 Refactor instance node handling to use a map 2025-10-18 00:33:16 +02:00
4b30791be2 Refactor instance options structure and related code 2025-10-16 20:53:24 +02:00
80ca0cbd4f Rename Process to Instance 2025-10-16 19:38:44 +02:00
8a16a195de Fix getting remote instance logs 2025-10-09 20:22:32 +02:00
7f6725da96 Refactor NodeConfig handling to use a map 2025-10-08 19:24:24 +02:00
6298b03636 Refactor RemoteOpenAIProxy to use cached proxies and restore request body handling 2025-10-07 18:57:08 +02:00
aae3f84d49 Implement caching for remote instance proxies and enhance proxy request handling 2025-10-07 18:44:23 +02:00
16b28bac05 Merge branch 'main' into feat/multi-host 2025-10-07 18:04:24 +02:00
Anuruth Lertpiya 997bd1b063 Changed status code to StatusBadRequest (400) if requested invalid model name. 2025-10-05 14:53:20 +00:00
Anuruth Lertpiya fa43f9e967 Added support for proxying llama.cpp native API endpoints via /llama-cpp/{name}/ 2025-10-05 14:28:33 +00:00
8ebdb1a183 Fix double read of json response when content-length header is missing 2025-10-04 22:16:28 +02:00
Anuruth Lertpiya 0e1bc8a352 Added support for configuring CORS headers 2025-10-04 09:13:40 +00:00
670f8ff81b Split up handlers 2025-10-02 23:11:20 +02:00
da56456504 Add node management endpoints to handle listing and retrieving node details 2025-10-02 22:51:41 +02:00
2ed67eb672 Add remote instance proxying functionality to handler 2025-10-01 22:17:19 +02:00
30e40ecd30 Refactor API endpoints to use /backends/llama-cpp path and update related documentation 2025-09-23 21:27:58 +02:00
46622d2107 Update documentation and add README synchronization 2025-09-22 22:37:53 +02:00
4df02a6519 Initial vLLM backend support 2025-09-19 18:05:12 +02:00
154b754aff Add MLX command parsing and routing support 2025-09-16 21:39:08 +02:00
1b5934303b Enhance command parsing in ParseLlamaCommand and improve error handling in ParseCommandRequest 2025-09-15 22:12:56 +02:00
323056096c Implement llama-server command parsing and add UI components for command input 2025-09-15 21:04:14 +02:00
4581d67165 Enhance instance management: improve on-demand start handling and add LRU eviction logic 2025-08-30 23:13:08 +02:00
41d8c41188 Introduce MaxRunningInstancesError type and handle it in StartInstance handler 2025-08-28 20:07:03 +02:00
1443746add Refactor instance status management: replace Running boolean with InstanceStatus enum and update related methods 2025-08-27 19:44:38 +02:00
ddb54763f6 Add OnDemandStartTimeout configuration and update OpenAIProxy to use it 2025-08-20 14:25:43 +02:00
287a5e0817 Implement WaitForHealthy method and enhance OpenAIProxy to support on-demand instance start 2025-08-20 14:19:12 +02:00
e4e7a82294 Implement last request time tracking for instance management 2025-08-17 19:44:57 +02:00