Mirror of https://github.com/lordmathis/llamactl.git (synced 2025-11-06 00:54:23 +00:00)

# Compare commits (40 commits)
a6e3cb4a9b · 9181c3d7bc · 1939b45312 · 8265a94bf7 · 4bc9362f7a · ddb54763f6 · 496ab3aa5d · 287a5e0817 · 7b4adfa0cd · 651c8b9b2c · 7194e1fdd1 · 492c3ff270 · 00a3cba717 · eb1d4ab55f · a9e3801eae · 1aaab96cec · 78eda77e44 · d70bb634cd · 41eaebc927 · c45fa13206 · 5e3a28398d · c734bcae4a · e4e7a82294 · ccffbca6b2 · 902be409d5 · eb9599f26a · ebf8dfdeab · f15c0840c4 · e702bcb694 · 4895fbff15 · 282fe67355 · 96a36e1119 · 759fc58326 · afef3d0180 · a87652937f · 7bde12db47 · e2b64620b5 · 3ba62af01a · 0150429e82 · 2ecf096024
## .github/workflows/release.yaml (vendored, 2 changes)

```diff
@@ -29,6 +29,8 @@ jobs:
           npm ci

       - name: Build Web UI
+        env:
+          VITE_APP_VERSION: ${{ github.ref_name }}
         run: |
           cd webui
           npm run build
```
## CONTRIBUTING.md (new file, 138 lines)

# Contributing to Llamactl

Thank you for considering contributing to Llamactl! This document outlines the development setup and contribution process.

## Development Setup

### Prerequisites

- Go 1.24 or later
- Node.js 22 or later
- `llama-server` executable (from [llama.cpp](https://github.com/ggml-org/llama.cpp))

### Getting Started

1. **Clone the repository**
   ```bash
   git clone https://github.com/lordmathis/llamactl.git
   cd llamactl
   ```

2. **Install dependencies**
   ```bash
   # Go dependencies
   go mod download

   # Frontend dependencies
   cd webui && npm ci && cd ..
   ```

3. **Run for development**
   ```bash
   # Start the backend server
   go run ./cmd/server
   ```
   The server will be available at `http://localhost:8080`.

   ```bash
   # In a separate terminal, start the frontend dev server
   cd webui && npm run dev
   ```
   The development UI will be available at `http://localhost:5173`.

4. **Common development commands**
   ```bash
   # Backend
   go test ./... -v              # Run tests
   go test -race ./... -v        # Run tests with the race detector
   go fmt ./... && go vet ./...  # Format and vet code

   # Frontend (run from the webui/ directory)
   npm run test:run              # Run tests once
   npm run test                  # Run tests in watch mode
   npm run type-check            # TypeScript type checking
   npm run lint:fix              # Lint and fix issues
   ```

## Before Submitting a Pull Request

### Required Checks

All of the following must pass:

1. **Backend**
   ```bash
   go test ./... -v
   go test -race ./... -v
   go fmt ./... && go vet ./...
   go build -o llamactl ./cmd/server
   ```

2. **Frontend**
   ```bash
   cd webui
   npm run test:run
   npm run type-check
   npm run build
   ```

### API Documentation

If your changes affect API endpoints, update the Swagger documentation:

```bash
# Install swag if needed
go install github.com/swaggo/swag/cmd/swag@latest

# Update the Swagger comments in pkg/server/handlers.go,
# then regenerate the docs
swag init -g cmd/server/main.go -o apidocs
```

## Pull Request Guidelines

### Pull Request Titles

Use this format for pull request titles:
- `feat:` for new features
- `fix:` for bug fixes
- `docs:` for documentation changes
- `test:` for test additions or modifications
- `refactor:` for code refactoring

### Submission Process

1. Create a feature branch from `main`
2. Make changes following the coding standards
3. Run all required checks listed above
4. Update documentation if necessary
5. Submit a pull request with:
   - A clear description of the changes
   - References to any related issues
   - Screenshots for UI changes

## Code Style and Testing

### Testing Strategy

- Backend tests use Go's built-in testing framework
- Frontend tests use Vitest and React Testing Library
- Run tests frequently during development
- Add tests for new features and bug fixes

### Go

- Follow standard Go formatting (`go fmt`)
- Use meaningful variable and function names
- Add comments for exported functions and types
- Handle errors appropriately

### TypeScript/React

- Use TypeScript strictly (avoid `any` when possible)
- Follow React hooks best practices
- Use meaningful component and variable names
- Prefer functional components over class components

## Getting Help

- Check the existing [issues](https://github.com/lordmathis/llamactl/issues)
- Review the [README.md](README.md) for usage documentation
- Look at existing code for patterns and conventions

Thank you for contributing to Llamactl!
## README.md (39 changes)

````diff
@@ -2,7 +2,7 @@
 
-**Management server for multiple llama.cpp instances with OpenAI-compatible API routing.**
+**Management server and proxy for multiple llama.cpp instances with OpenAI-compatible API routing.**
 
 ## Why llamactl?
 
@@ -11,7 +11,11 @@
 🌐 **Web Dashboard**: Modern React UI for visual management (unlike CLI-only tools)
 🔐 **API Key Authentication**: Separate keys for management vs inference access
 📊 **Instance Monitoring**: Health checks, auto-restart, log management
-⚡ **Persistent State**: Instances survive server restarts
+⏳ **Idle Timeout Management**: Automatically stop idle instances after a configurable period
+💡 **On-Demand Instance Start**: Automatically launch instances upon receiving OpenAI-compatible API requests
+💾 **State Persistence**: Ensure instances remain intact across server restarts
+
+![](docs/images/screenshot.png)
 
 **Choose llamactl if**: You need authentication, health monitoring, auto-restart, and centralized management of multiple llama-server instances
 **Choose Ollama if**: You want the simplest setup with strong community ecosystem and third-party integrations
@@ -113,6 +117,10 @@ instances:
   default_auto_restart: true      # Auto-restart new instances by default
   default_max_restarts: 3         # Max restarts for new instances
   default_restart_delay: 5        # Restart delay (seconds) for new instances
+  default_on_demand_start: true   # Default on-demand start setting
+  on_demand_start_timeout: 120    # Default on-demand start timeout in seconds
+  timeout_check_interval: 5       # Idle instance timeout check in minutes
 
 auth:
   require_inference_auth: true    # Require auth for inference endpoints
@@ -170,16 +178,19 @@ server:
 
 ```yaml
 instances:
-  port_range: [8000, 9000]                          # Port range for instances (default: [8000, 9000])
-  data_dir: "~/.local/share/llamactl"               # Directory for all llamactl data (default varies by OS)
-  configs_dir: "~/.local/share/llamactl/instances"  # Directory for instance configs (default: data_dir/instances)
-  logs_dir: "~/.local/share/llamactl/logs"          # Directory for instance logs (default: data_dir/logs)
-  auto_create_dirs: true                            # Automatically create data/config/logs directories (default: true)
-  max_instances: -1                                 # Maximum instances (-1 = unlimited)
-  llama_executable: "llama-server"                  # Path to llama-server executable
-  default_auto_restart: true                        # Default auto-restart setting
-  default_max_restarts: 3                           # Default maximum restart attempts
-  default_restart_delay: 5                          # Default restart delay in seconds
+  port_range: [8000, 9000]                          # Port range for instances (default: [8000, 9000])
+  data_dir: "~/.local/share/llamactl"               # Directory for all llamactl data (default varies by OS)
+  configs_dir: "~/.local/share/llamactl/instances"  # Directory for instance configs (default: data_dir/instances)
+  logs_dir: "~/.local/share/llamactl/logs"          # Directory for instance logs (default: data_dir/logs)
+  auto_create_dirs: true                            # Automatically create data/config/logs directories (default: true)
+  max_instances: -1                                 # Maximum instances (-1 = unlimited)
+  llama_executable: "llama-server"                  # Path to llama-server executable
+  default_auto_restart: true                        # Default auto-restart setting
+  default_max_restarts: 3                           # Default maximum restart attempts
+  default_restart_delay: 5                          # Default restart delay in seconds
+  default_on_demand_start: true                     # Default on-demand start setting
+  on_demand_start_timeout: 120                      # Default on-demand start timeout in seconds
+  timeout_check_interval: 5                         # Default instance timeout check interval in minutes
 ```
 
 **Environment Variables:**
@@ -193,6 +204,10 @@ instances:
 - `LLAMACTL_DEFAULT_AUTO_RESTART` - Default auto-restart setting (true/false)
 - `LLAMACTL_DEFAULT_MAX_RESTARTS` - Default maximum restarts
 - `LLAMACTL_DEFAULT_RESTART_DELAY` - Default restart delay in seconds
+- `LLAMACTL_DEFAULT_ON_DEMAND_START` - Default on-demand start setting (true/false)
+- `LLAMACTL_ON_DEMAND_START_TIMEOUT` - Default on-demand start timeout in seconds
+- `LLAMACTL_TIMEOUT_CHECK_INTERVAL` - Default instance timeout check interval in minutes
 
 #### Authentication Configuration
````
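For illustration, a minimal Go sketch of how these environment overrides are picked up; it assumes `config.LoadConfig` applies the `loadEnvVars` logic shown in the `pkg/config` section below, and the values are illustrative:

```go
package main

import (
	"fmt"
	"os"

	"llamactl/pkg/config"
)

func main() {
	// Illustrative values; the variable names come from the README list above.
	os.Setenv("LLAMACTL_DEFAULT_ON_DEMAND_START", "true")
	os.Setenv("LLAMACTL_ON_DEMAND_START_TIMEOUT", "60") // seconds
	os.Setenv("LLAMACTL_TIMEOUT_CHECK_INTERVAL", "1")   // minutes

	cfg, err := config.LoadConfig(os.Getenv("LLAMACTL_CONFIG_PATH"))
	if err != nil {
		fmt.Println("no config file found, using defaults:", err)
	}
	fmt.Println(cfg.Instances.DefaultOnDemandStart, // true
		cfg.Instances.OnDemandStartTimeout, // 60
		cfg.Instances.TimeoutCheckInterval) // 1
}
```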
## apidocs/docs.go (generated by swaggo/swag)

```diff
@@ -1,5 +1,5 @@
-// Package docs Code generated by swaggo/swag. DO NOT EDIT
-package docs
+// Package apidocs Code generated by swaggo/swag. DO NOT EDIT
+package apidocs
 
 import "github.com/swaggo/swag"
 
@@ -37,7 +37,7 @@ const docTemplate = `{
                         "schema": {
                             "type": "array",
                             "items": {
-                                "$ref": "#/definitions/llamactl.Instance"
+                                "$ref": "#/definitions/instance.Process"
                             }
                         }
                     },
@@ -75,7 +75,7 @@ const docTemplate = `{
                     "200": {
                         "description": "Instance details",
                         "schema": {
-                            "$ref": "#/definitions/llamactl.Instance"
+                            "$ref": "#/definitions/instance.Process"
                         }
                     },
                     "400": {
@@ -120,7 +120,7 @@ const docTemplate = `{
                         "in": "body",
                         "required": true,
                         "schema": {
-                            "$ref": "#/definitions/llamactl.CreateInstanceOptions"
+                            "$ref": "#/definitions/instance.CreateInstanceOptions"
                         }
                     }
                 ],
@@ -128,7 +128,7 @@ const docTemplate = `{
                     "200": {
                         "description": "Updated instance details",
                         "schema": {
-                            "$ref": "#/definitions/llamactl.Instance"
+                            "$ref": "#/definitions/instance.Process"
                         }
                     },
                     "400": {
@@ -173,7 +173,7 @@ const docTemplate = `{
                         "in": "body",
                         "required": true,
                         "schema": {
-                            "$ref": "#/definitions/llamactl.CreateInstanceOptions"
+                            "$ref": "#/definitions/instance.CreateInstanceOptions"
                         }
                     }
                 ],
@@ -181,7 +181,7 @@ const docTemplate = `{
                     "201": {
                         "description": "Created instance details",
                         "schema": {
-                            "$ref": "#/definitions/llamactl.Instance"
+                            "$ref": "#/definitions/instance.Process"
                         }
                     },
                     "400": {
@@ -401,7 +401,7 @@ const docTemplate = `{
                     "200": {
                         "description": "Restarted instance details",
                         "schema": {
-                            "$ref": "#/definitions/llamactl.Instance"
+                            "$ref": "#/definitions/instance.Process"
                         }
                     },
                     "400": {
@@ -444,7 +444,7 @@ const docTemplate = `{
                     "200": {
                         "description": "Started instance details",
                         "schema": {
-                            "$ref": "#/definitions/llamactl.Instance"
+                            "$ref": "#/definitions/instance.Process"
                         }
                     },
                     "400": {
@@ -487,7 +487,7 @@ const docTemplate = `{
                     "200": {
                         "description": "Stopped instance details",
                         "schema": {
-                            "$ref": "#/definitions/llamactl.Instance"
+                            "$ref": "#/definitions/instance.Process"
                         }
                     },
                     "400": {
@@ -639,7 +639,35 @@ const docTemplate = `{
                     "200": {
                         "description": "List of OpenAI-compatible instances",
                         "schema": {
-                            "$ref": "#/definitions/llamactl.OpenAIListInstancesResponse"
+                            "$ref": "#/definitions/server.OpenAIListInstancesResponse"
                         }
                     },
                     "500": {
                         "description": "Internal Server Error",
                         "schema": {
                             "type": "string"
                         }
                     }
                 }
             }
+        },
+        "/version": {
+            "get": {
+                "security": [
+                    {
+                        "ApiKeyAuth": []
+                    }
+                ],
+                "description": "Returns the version of the llamactl command",
+                "tags": [
+                    "version"
+                ],
+                "summary": "Get llamactl version",
+                "responses": {
+                    "200": {
+                        "description": "Version information",
+                        "schema": {
+                            "type": "string"
+                        }
+                    },
+                    "500": {
@@ -653,7 +681,7 @@ const docTemplate = `{
         }
     },
     "definitions": {
-        "llamactl.CreateInstanceOptions": {
+        "instance.CreateInstanceOptions": {
             "type": "object",
             "properties": {
                 "alias": {
@@ -751,7 +779,6 @@ const docTemplate = `{
                     "type": "string"
                 },
                 "draft_max": {
-                    "description": "Speculative decoding params",
                     "type": "integer"
                 },
                 "draft_min": {
@@ -955,7 +982,7 @@ const docTemplate = `{
                     "type": "boolean"
                 },
                 "no_context_shift": {
-                    "description": "Server/Example-specific params",
+                    "description": "Example-specific params",
                     "type": "boolean"
                 },
                 "no_escape": {
@@ -1027,10 +1054,10 @@ const docTemplate = `{
                 "presence_penalty": {
                     "type": "number"
                 },
-                "priority": {
+                "prio": {
                     "type": "integer"
                 },
-                "priority_batch": {
+                "prio_batch": {
                     "type": "integer"
                 },
                 "props": {
@@ -1101,7 +1128,7 @@ const docTemplate = `{
                 "ssl_key_file": {
                     "type": "string"
                 },
-                "temperature": {
+                "temp": {
                     "type": "number"
                 },
                 "tensor_split": {
@@ -1167,7 +1194,7 @@ const docTemplate = `{
                 }
             }
         },
-        "llamactl.Instance": {
+        "instance.Process": {
            "type": "object",
            "properties": {
                "created": {
@@ -1183,7 +1210,7 @@ const docTemplate = `{
                }
            }
        },
-        "llamactl.OpenAIInstance": {
+        "server.OpenAIInstance": {
            "type": "object",
            "properties": {
                "created": {
@@ -1200,13 +1227,13 @@ const docTemplate = `{
                }
            }
        },
-        "llamactl.OpenAIListInstancesResponse": {
+        "server.OpenAIListInstancesResponse": {
            "type": "object",
            "properties": {
                "data": {
                    "type": "array",
                    "items": {
-                        "$ref": "#/definitions/llamactl.OpenAIInstance"
+                        "$ref": "#/definitions/server.OpenAIInstance"
                    }
                },
                "object": {
```
## apidocs/swagger.json (generated)

The regenerated `swagger.json` carries the same changes as the `docTemplate` above, hunk for hunk (offsets shift by seven lines because the JSON is not wrapped in a Go string constant): definition references move from `llamactl.*` to `instance.Process`, `instance.CreateInstanceOptions`, `server.OpenAIInstance`, and `server.OpenAIListInstancesResponse`; the `draft_max` description line is removed; `no_context_shift` is re-described as "Example-specific params"; `priority`/`priority_batch`/`temperature` become `prio`/`prio_batch`/`temp`; and the `/version` endpoint is added.
## apidocs/swagger.yaml (generated)

```diff
@@ -1,6 +1,6 @@
 basePath: /api/v1
 definitions:
-  llamactl.CreateInstanceOptions:
+  instance.CreateInstanceOptions:
     properties:
       alias:
         type: string
@@ -66,7 +66,6 @@ definitions:
       device_draft:
         type: string
       draft_max:
-        description: Speculative decoding params
         type: integer
       draft_min:
         type: integer
@@ -203,7 +202,7 @@ definitions:
       no_cont_batching:
         type: boolean
       no_context_shift:
-        description: Server/Example-specific params
+        description: Example-specific params
         type: boolean
       no_escape:
         type: boolean
@@ -251,9 +250,9 @@ definitions:
         type: integer
       presence_penalty:
         type: number
-      priority:
+      prio:
         type: integer
-      priority_batch:
+      prio_batch:
         type: integer
       props:
         type: boolean
@@ -301,7 +300,7 @@ definitions:
         type: string
       ssl_key_file:
         type: string
-      temperature:
+      temp:
         type: number
       tensor_split:
         type: string
@@ -345,7 +344,7 @@ definitions:
       yarn_orig_ctx:
         type: integer
     type: object
-  llamactl.Instance:
+  instance.Process:
     properties:
       created:
         description: Creation time
@@ -356,7 +355,7 @@ definitions:
         description: Status
         type: boolean
     type: object
-  llamactl.OpenAIInstance:
+  server.OpenAIInstance:
     properties:
       created:
         type: integer
@@ -367,11 +366,11 @@ definitions:
       owned_by:
         type: string
     type: object
-  llamactl.OpenAIListInstancesResponse:
+  server.OpenAIListInstancesResponse:
     properties:
       data:
         items:
-          $ref: '#/definitions/llamactl.OpenAIInstance'
+          $ref: '#/definitions/server.OpenAIInstance'
         type: array
       object:
         type: string
@@ -393,7 +392,7 @@ paths:
           description: List of instances
           schema:
             items:
-              $ref: '#/definitions/llamactl.Instance'
+              $ref: '#/definitions/instance.Process'
             type: array
         "500":
           description: Internal Server Error
@@ -441,7 +440,7 @@ paths:
         "200":
           description: Instance details
          schema:
-            $ref: '#/definitions/llamactl.Instance'
+            $ref: '#/definitions/instance.Process'
        "400":
          description: Invalid name format
          schema:
@@ -470,12 +469,12 @@ paths:
        name: options
        required: true
        schema:
-          $ref: '#/definitions/llamactl.CreateInstanceOptions'
+          $ref: '#/definitions/instance.CreateInstanceOptions'
      responses:
        "201":
          description: Created instance details
          schema:
-            $ref: '#/definitions/llamactl.Instance'
+            $ref: '#/definitions/instance.Process'
        "400":
          description: Invalid request body
          schema:
@@ -504,12 +503,12 @@ paths:
        name: options
        required: true
        schema:
-          $ref: '#/definitions/llamactl.CreateInstanceOptions'
+          $ref: '#/definitions/instance.CreateInstanceOptions'
      responses:
        "200":
          description: Updated instance details
          schema:
-            $ref: '#/definitions/llamactl.Instance'
+            $ref: '#/definitions/instance.Process'
        "400":
          description: Invalid name format
          schema:
@@ -627,7 +626,7 @@ paths:
        "200":
          description: Restarted instance details
          schema:
-            $ref: '#/definitions/llamactl.Instance'
+            $ref: '#/definitions/instance.Process'
        "400":
          description: Invalid name format
          schema:
@@ -654,7 +653,7 @@ paths:
        "200":
          description: Started instance details
          schema:
-            $ref: '#/definitions/llamactl.Instance'
+            $ref: '#/definitions/instance.Process'
        "400":
          description: Invalid name format
          schema:
@@ -681,7 +680,7 @@ paths:
        "200":
          description: Stopped instance details
          schema:
-            $ref: '#/definitions/llamactl.Instance'
+            $ref: '#/definitions/instance.Process'
        "400":
          description: Invalid name format
          schema:
@@ -777,7 +776,7 @@ paths:
        "200":
          description: List of OpenAI-compatible instances
          schema:
-            $ref: '#/definitions/llamactl.OpenAIListInstancesResponse'
+            $ref: '#/definitions/server.OpenAIListInstancesResponse'
        "500":
          description: Internal Server Error
          schema:
@@ -787,4 +786,21 @@ paths:
       summary: List instances in OpenAI-compatible format
       tags:
       - openai
+  /version:
+    get:
+      description: Returns the version of the llamactl command
+      responses:
+        "200":
+          description: Version information
+          schema:
+            type: string
+        "500":
+          description: Internal Server Error
+          schema:
+            type: string
+      security:
+      - ApiKeyAuth: []
+      summary: Get llamactl version
+      tags:
+      - version
 swagger: "2.0"
```
## cmd/server/main.go

```diff
@@ -11,6 +11,11 @@ import (
 	"syscall"
 )
 
+// version is set at build time using -ldflags "-X main.version=1.0.0"
+var version string = "unknown"
+var commitHash string = "unknown"
+var buildTime string = "unknown"
+
 // @title llamactl API
 // @version 1.0
 // @description llamactl is a control server for managing Llama Server instances.
@@ -19,6 +24,14 @@ import (
 // @basePath /api/v1
 func main() {
 
+	// --version flag to print the version
+	if len(os.Args) > 1 && os.Args[1] == "--version" {
+		fmt.Printf("llamactl version: %s\n", version)
+		fmt.Printf("Commit hash: %s\n", commitHash)
+		fmt.Printf("Build time: %s\n", buildTime)
+		return
+	}
+
 	configPath := os.Getenv("LLAMACTL_CONFIG_PATH")
 	cfg, err := config.LoadConfig(configPath)
 	if err != nil {
@@ -26,6 +39,11 @@ func main() {
 		fmt.Println("Using default configuration.")
 	}
 
+	// Set version information
+	cfg.Version = version
+	cfg.CommitHash = commitHash
+	cfg.BuildTime = buildTime
+
 	// Create the data directory if it doesn't exist
 	if cfg.Instances.AutoCreateDirs {
 		if err := os.MkdirAll(cfg.Instances.InstancesDir, 0755); err != nil {
```
## docs/images/screenshot.png (new binary file, 47 KiB; binary content not shown)
## pkg/config

```diff
@@ -12,9 +12,12 @@ import (
 
 // AppConfig represents the configuration for llamactl
 type AppConfig struct {
-	Server    ServerConfig    `yaml:"server"`
-	Instances InstancesConfig `yaml:"instances"`
-	Auth      AuthConfig      `yaml:"auth"`
+	Server     ServerConfig    `yaml:"server"`
+	Instances  InstancesConfig `yaml:"instances"`
+	Auth       AuthConfig      `yaml:"auth"`
+	Version    string          `yaml:"-"`
+	CommitHash string          `yaml:"-"`
+	BuildTime  string          `yaml:"-"`
 }
 
 // ServerConfig contains HTTP server configuration
@@ -63,6 +66,15 @@ type InstancesConfig struct {
 
 	// Default restart delay for new instances (in seconds)
 	DefaultRestartDelay int `yaml:"default_restart_delay"`
+
+	// Default on-demand start setting for new instances
+	DefaultOnDemandStart bool `yaml:"default_on_demand_start"`
+
+	// How long to wait for an instance to start on demand (in seconds)
+	OnDemandStartTimeout int `yaml:"on_demand_start_timeout,omitempty"`
+
+	// Interval for checking instance timeouts (in minutes)
+	TimeoutCheckInterval int `yaml:"timeout_check_interval"`
 }
 
 // AuthConfig contains authentication settings
@@ -95,16 +107,19 @@ func LoadConfig(configPath string) (AppConfig, error) {
 			EnableSwagger: false,
 		},
 		Instances: InstancesConfig{
-			PortRange:           [2]int{8000, 9000},
-			DataDir:             getDefaultDataDirectory(),
-			InstancesDir:        filepath.Join(getDefaultDataDirectory(), "instances"),
-			LogsDir:             filepath.Join(getDefaultDataDirectory(), "logs"),
-			AutoCreateDirs:      true,
-			MaxInstances:        -1, // -1 means unlimited
-			LlamaExecutable:     "llama-server",
-			DefaultAutoRestart:  true,
-			DefaultMaxRestarts:  3,
-			DefaultRestartDelay: 5,
+			PortRange:            [2]int{8000, 9000},
+			DataDir:              getDefaultDataDirectory(),
+			InstancesDir:         filepath.Join(getDefaultDataDirectory(), "instances"),
+			LogsDir:              filepath.Join(getDefaultDataDirectory(), "logs"),
+			AutoCreateDirs:       true,
+			MaxInstances:         -1, // -1 means unlimited
+			LlamaExecutable:      "llama-server",
+			DefaultAutoRestart:   true,
+			DefaultMaxRestarts:   3,
+			DefaultRestartDelay:  5,
+			DefaultOnDemandStart: true,
+			OnDemandStartTimeout: 120, // 2 minutes
+			TimeoutCheckInterval: 5,   // Check timeouts every 5 minutes
 		},
 		Auth: AuthConfig{
 			RequireInferenceAuth: true,
@@ -214,6 +229,21 @@ func loadEnvVars(cfg *AppConfig) {
 			cfg.Instances.DefaultRestartDelay = seconds
 		}
 	}
+	if onDemandStart := os.Getenv("LLAMACTL_DEFAULT_ON_DEMAND_START"); onDemandStart != "" {
+		if b, err := strconv.ParseBool(onDemandStart); err == nil {
+			cfg.Instances.DefaultOnDemandStart = b
+		}
+	}
+	if onDemandTimeout := os.Getenv("LLAMACTL_ON_DEMAND_START_TIMEOUT"); onDemandTimeout != "" {
+		if seconds, err := strconv.Atoi(onDemandTimeout); err == nil {
+			cfg.Instances.OnDemandStartTimeout = seconds
+		}
+	}
+	if timeoutCheckInterval := os.Getenv("LLAMACTL_TIMEOUT_CHECK_INTERVAL"); timeoutCheckInterval != "" {
+		if minutes, err := strconv.Atoi(timeoutCheckInterval); err == nil {
+			cfg.Instances.TimeoutCheckInterval = minutes
+		}
+	}
 	// Auth config
 	if requireInferenceAuth := os.Getenv("LLAMACTL_REQUIRE_INFERENCE_AUTH"); requireInferenceAuth != "" {
 		if b, err := strconv.ParseBool(requireInferenceAuth); err == nil {
```
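A quick test-style sketch of the new defaults; it assumes `LoadConfig` still returns the built-in defaults when no config file is present (the behavior `cmd/server/main.go` above relies on) and that no `LLAMACTL_*` overrides are set in the environment:

```go
package config_test

import (
	"testing"

	"llamactl/pkg/config"
)

func TestOnDemandAndTimeoutDefaults(t *testing.T) {
	cfg, _ := config.LoadConfig("") // no config file: built-in defaults apply
	if !cfg.Instances.DefaultOnDemandStart {
		t.Error("expected DefaultOnDemandStart to default to true")
	}
	if cfg.Instances.OnDemandStartTimeout != 120 {
		t.Errorf("expected OnDemandStartTimeout 120, got %d", cfg.Instances.OnDemandStartTimeout)
	}
	if cfg.Instances.TimeoutCheckInterval != 5 {
		t.Errorf("expected TimeoutCheckInterval 5, got %d", cfg.Instances.TimeoutCheckInterval)
	}
}
```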
## pkg/instance (options and Process)

```diff
@@ -13,16 +13,32 @@ import (
 	"net/url"
 	"os/exec"
 	"sync"
+	"sync/atomic"
 	"time"
 )
 
+// TimeProvider interface allows for testing with mock time
+type TimeProvider interface {
+	Now() time.Time
+}
+
+// realTimeProvider implements TimeProvider using the actual time
+type realTimeProvider struct{}
+
+func (realTimeProvider) Now() time.Time {
+	return time.Now()
+}
+
 type CreateInstanceOptions struct {
 	// Auto restart
-	AutoRestart *bool `json:"auto_restart,omitempty"`
-	MaxRestarts *int  `json:"max_restarts,omitempty"`
-	// RestartDelay duration in seconds
-	RestartDelay *int `json:"restart_delay_seconds,omitempty"`
-
+	AutoRestart  *bool `json:"auto_restart,omitempty"`
+	MaxRestarts  *int  `json:"max_restarts,omitempty"`
+	RestartDelay *int  `json:"restart_delay,omitempty"`
+	// On demand start
+	OnDemandStart *bool `json:"on_demand_start,omitempty"`
+	// Idle timeout
+	IdleTimeout *int `json:"idle_timeout,omitempty"`
 	// LlamaServerOptions contains the options for the llama server
 	llamacpp.LlamaServerOptions `json:",inline"`
 }
@@ -32,9 +48,11 @@ type CreateInstanceOptions struct {
 func (c *CreateInstanceOptions) UnmarshalJSON(data []byte) error {
 	// First, unmarshal into a temporary struct without the embedded type
 	type tempCreateOptions struct {
-		AutoRestart  *bool `json:"auto_restart,omitempty"`
-		MaxRestarts  *int  `json:"max_restarts,omitempty"`
-		RestartDelay *int  `json:"restart_delay_seconds,omitempty"`
+		AutoRestart   *bool `json:"auto_restart,omitempty"`
+		MaxRestarts   *int  `json:"max_restarts,omitempty"`
+		RestartDelay  *int  `json:"restart_delay,omitempty"`
+		OnDemandStart *bool `json:"on_demand_start,omitempty"`
+		IdleTimeout   *int  `json:"idle_timeout,omitempty"`
 	}
 
 	var temp tempCreateOptions
@@ -46,6 +64,8 @@ func (c *CreateInstanceOptions) UnmarshalJSON(data []byte) error {
 	c.AutoRestart = temp.AutoRestart
 	c.MaxRestarts = temp.MaxRestarts
 	c.RestartDelay = temp.RestartDelay
+	c.OnDemandStart = temp.OnDemandStart
+	c.IdleTimeout = temp.IdleTimeout
 
 	// Now unmarshal the embedded LlamaServerOptions
 	if err := json.Unmarshal(data, &c.LlamaServerOptions); err != nil {
@@ -83,6 +103,10 @@ type Process struct {
 	// Restart control
 	restartCancel context.CancelFunc `json:"-"` // Cancel function for pending restarts
 	monitorDone   chan struct{}      `json:"-"` // Channel to signal monitor goroutine completion
+
+	// Timeout management
+	lastRequestTime atomic.Int64 // Unix timestamp of last request
+	timeProvider    TimeProvider `json:"-"` // Time provider for testing
 }
 
 // validateAndCopyOptions validates and creates a deep copy of the provided options
@@ -117,6 +141,20 @@ func validateAndCopyOptions(name string, options *CreateInstanceOptions) *CreateInstanceOptions {
 			}
 			optionsCopy.RestartDelay = &restartDelay
 		}
+
+		if options.OnDemandStart != nil {
+			onDemandStart := *options.OnDemandStart
+			optionsCopy.OnDemandStart = &onDemandStart
+		}
+
+		if options.IdleTimeout != nil {
+			idleTimeout := *options.IdleTimeout
+			if idleTimeout < 0 {
+				log.Printf("Instance %s IdleTimeout value (%d) cannot be negative, setting to 0 minutes", name, idleTimeout)
+				idleTimeout = 0
+			}
+			optionsCopy.IdleTimeout = &idleTimeout
+		}
 	}
 
 	return optionsCopy
@@ -142,6 +180,16 @@ func applyDefaultOptions(options *CreateInstanceOptions, globalSettings *config.InstancesConfig) {
 		defaultRestartDelay := globalSettings.DefaultRestartDelay
 		options.RestartDelay = &defaultRestartDelay
 	}
+
+	if options.OnDemandStart == nil {
+		defaultOnDemandStart := globalSettings.DefaultOnDemandStart
+		options.OnDemandStart = &defaultOnDemandStart
+	}
+
+	if options.IdleTimeout == nil {
+		defaultIdleTimeout := 0
+		options.IdleTimeout = &defaultIdleTimeout
+	}
 }
 
 // NewInstance creates a new instance with the given name, log path, and options
@@ -158,10 +206,8 @@ func NewInstance(name string, globalSettings *config.InstancesConfig, options *CreateInstanceOptions) *Process {
 		options:        optionsCopy,
 		globalSettings: globalSettings,
 		logger:         logger,
-
-		Running: false,
-
-		Created: time.Now().Unix(),
+		timeProvider:   realTimeProvider{},
+		Created:        time.Now().Unix(),
 	}
 }
@@ -189,6 +235,11 @@ func (i *Process) SetOptions(options *CreateInstanceOptions) {
 	i.proxy = nil
 }
 
+// SetTimeProvider sets a custom time provider for testing
+func (i *Process) SetTimeProvider(tp TimeProvider) {
+	i.timeProvider = tp
+}
+
 // GetProxy returns the reverse proxy for this instance, creating it if needed
 func (i *Process) GetProxy() (*httputil.ReverseProxy, error) {
 	i.mu.Lock()
```
## pkg/instance (tests)

```diff
@@ -91,38 +91,6 @@ func TestNewInstance_WithRestartOptions(t *testing.T) {
 	}
 }
 
-func TestNewInstance_ValidationAndDefaults(t *testing.T) {
-	globalSettings := &config.InstancesConfig{
-		LogsDir:             "/tmp/test",
-		DefaultAutoRestart:  true,
-		DefaultMaxRestarts:  3,
-		DefaultRestartDelay: 5,
-	}
-
-	// Test with invalid negative values
-	invalidMaxRestarts := -5
-	invalidRestartDelay := -10
-
-	options := &instance.CreateInstanceOptions{
-		MaxRestarts:  &invalidMaxRestarts,
-		RestartDelay: &invalidRestartDelay,
-		LlamaServerOptions: llamacpp.LlamaServerOptions{
-			Model: "/path/to/model.gguf",
-		},
-	}
-
-	instance := instance.NewInstance("test-instance", globalSettings, options)
-	opts := instance.GetOptions()
-
-	// Check that negative values were corrected to 0
-	if opts.MaxRestarts == nil || *opts.MaxRestarts != 0 {
-		t.Errorf("Expected MaxRestarts to be corrected to 0, got %v", opts.MaxRestarts)
-	}
-	if opts.RestartDelay == nil || *opts.RestartDelay != 0 {
-		t.Errorf("Expected RestartDelay to be corrected to 0, got %v", opts.RestartDelay)
-	}
-}
-
 func TestSetOptions(t *testing.T) {
 	globalSettings := &config.InstancesConfig{
 		LogsDir: "/tmp/test",
@@ -164,33 +132,6 @@ func TestSetOptions(t *testing.T) {
 	}
 }
 
-func TestSetOptions_NilOptions(t *testing.T) {
-	globalSettings := &config.InstancesConfig{
-		LogsDir:             "/tmp/test",
-		DefaultAutoRestart:  true,
-		DefaultMaxRestarts:  3,
-		DefaultRestartDelay: 5,
-	}
-
-	options := &instance.CreateInstanceOptions{
-		LlamaServerOptions: llamacpp.LlamaServerOptions{
-			Model: "/path/to/model.gguf",
-		},
-	}
-
-	instance := instance.NewInstance("test-instance", globalSettings, options)
-	originalOptions := instance.GetOptions()
-
-	// Try to set nil options
-	instance.SetOptions(nil)
-
-	// Options should remain unchanged
-	currentOptions := instance.GetOptions()
-	if currentOptions.Model != originalOptions.Model {
-		t.Error("Options should not change when setting nil options")
-	}
-}
-
 func TestGetProxy(t *testing.T) {
 	globalSettings := &config.InstancesConfig{
 		LogsDir: "/tmp/test",
@@ -317,58 +258,6 @@ func TestUnmarshalJSON(t *testing.T) {
 	}
 }
 
-func TestUnmarshalJSON_PartialOptions(t *testing.T) {
-	jsonData := `{
-		"name": "test-instance",
-		"running": false,
-		"options": {
-			"model": "/path/to/model.gguf"
-		}
-	}`
-
-	var inst instance.Process
-	err := json.Unmarshal([]byte(jsonData), &inst)
-	if err != nil {
-		t.Fatalf("JSON unmarshal failed: %v", err)
-	}
-
-	opts := inst.GetOptions()
-	if opts.Model != "/path/to/model.gguf" {
-		t.Errorf("Expected model '/path/to/model.gguf', got %q", opts.Model)
-	}
-
-	// Note: Defaults are NOT applied during unmarshaling
-	// They should only be applied by NewInstance or SetOptions
-	if opts.AutoRestart != nil {
-		t.Error("Expected AutoRestart to be nil (no defaults applied during unmarshal)")
-	}
-}
-
-func TestUnmarshalJSON_NoOptions(t *testing.T) {
-	jsonData := `{
-		"name": "test-instance",
-		"running": false
-	}`
-
-	var inst instance.Process
-	err := json.Unmarshal([]byte(jsonData), &inst)
-	if err != nil {
-		t.Fatalf("JSON unmarshal failed: %v", err)
-	}
-
-	if inst.Name != "test-instance" {
-		t.Errorf("Expected name 'test-instance', got %q", inst.Name)
-	}
-	if inst.Running {
-		t.Error("Expected running to be false")
-	}
-
-	opts := inst.GetOptions()
-	if opts != nil {
-		t.Error("Expected options to be nil when not provided in JSON")
-	}
-}
-
 func TestCreateInstanceOptionsValidation(t *testing.T) {
 	tests := []struct {
 		name string
@@ -377,13 +266,6 @@ func TestCreateInstanceOptionsValidation(t *testing.T) {
 		expectedMax   int
 		expectedDelay int
 	}{
-		{
-			name:          "nil values",
-			maxRestarts:   nil,
-			restartDelay:  nil,
-			expectedMax:   0, // Should remain nil, but we can't easily test nil in this structure
-			expectedDelay: 0,
-		},
 		{
 			name:        "valid positive values",
 			maxRestarts: testutil.IntPtr(10),
@@ -424,20 +306,16 @@ func TestCreateInstanceOptionsValidation(t *testing.T) {
 			instance := instance.NewInstance("test", globalSettings, options)
 			opts := instance.GetOptions()
 
-			if tt.maxRestarts != nil {
-				if opts.MaxRestarts == nil {
-					t.Error("Expected MaxRestarts to be set")
-				} else if *opts.MaxRestarts != tt.expectedMax {
-					t.Errorf("Expected MaxRestarts %d, got %d", tt.expectedMax, *opts.MaxRestarts)
-				}
+			if opts.MaxRestarts == nil {
+				t.Error("Expected MaxRestarts to be set")
+			} else if *opts.MaxRestarts != tt.expectedMax {
+				t.Errorf("Expected MaxRestarts %d, got %d", tt.expectedMax, *opts.MaxRestarts)
 			}
 
-			if tt.restartDelay != nil {
-				if opts.RestartDelay == nil {
-					t.Error("Expected RestartDelay to be set")
-				} else if *opts.RestartDelay != tt.expectedDelay {
-					t.Errorf("Expected RestartDelay %d, got %d", tt.expectedDelay, *opts.RestartDelay)
-				}
+			if opts.RestartDelay == nil {
+				t.Error("Expected RestartDelay to be set")
+			} else if *opts.RestartDelay != tt.expectedDelay {
+				t.Errorf("Expected RestartDelay %d, got %d", tt.expectedDelay, *opts.RestartDelay)
 			}
 		})
 	}
```
## pkg/instance (lifecycle: Start/Stop/WaitForHealthy)

```diff
@@ -4,6 +4,7 @@ import (
 	"context"
 	"fmt"
 	"log"
+	"net/http"
 	"os/exec"
 	"runtime"
 	"syscall"
@@ -30,6 +31,9 @@ func (i *Process) Start() error {
 		i.restarts = 0
 	}
 
+	// Initialize last request time to current time when starting
+	i.lastRequestTime.Store(i.timeProvider.Now().Unix())
+
 	// Create log files
 	if err := i.logger.Create(); err != nil {
 		return fmt.Errorf("failed to create log files: %w", err)
@@ -140,6 +144,74 @@ func (i *Process) Stop() error {
 	return nil
 }
 
+func (i *Process) WaitForHealthy(timeout int) error {
+	if !i.Running {
+		return fmt.Errorf("instance %s is not running", i.Name)
+	}
+
+	if timeout <= 0 {
+		timeout = 30 // Default to 30 seconds if no timeout is specified
+	}
+
+	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeout)*time.Second)
+	defer cancel()
+
+	// Get instance options to build the health check URL
+	opts := i.GetOptions()
+	if opts == nil {
+		return fmt.Errorf("instance %s has no options set", i.Name)
+	}
+
+	// Build the health check URL directly
+	host := opts.Host
+	if host == "" {
+		host = "localhost"
+	}
+	healthURL := fmt.Sprintf("http://%s:%d/health", host, opts.Port)
+
+	// Create a dedicated HTTP client for health checks
+	client := &http.Client{
+		Timeout: 5 * time.Second, // 5 second timeout per request
+	}
+
+	// Helper function to check health directly
+	checkHealth := func() bool {
+		req, err := http.NewRequestWithContext(ctx, "GET", healthURL, nil)
+		if err != nil {
+			return false
+		}
+
+		resp, err := client.Do(req)
+		if err != nil {
+			return false
+		}
+		defer resp.Body.Close()
+
+		return resp.StatusCode == http.StatusOK
+	}
+
+	// Try immediate check first
+	if checkHealth() {
+		return nil // Instance is healthy
+	}
+
+	// If immediate check failed, start polling
+	ticker := time.NewTicker(1 * time.Second)
+	defer ticker.Stop()
+
+	for {
+		select {
+		case <-ctx.Done():
+			return fmt.Errorf("timeout waiting for instance %s to become healthy after %d seconds", i.Name, timeout)
+		case <-ticker.C:
+			if checkHealth() {
+				return nil // Instance is healthy
+			}
+			// Continue polling
+		}
+	}
+}
+
 func (i *Process) monitorProcess() {
 	defer func() {
 		i.mu.Lock()
```
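How `WaitForHealthy` combines with on-demand start is not itself part of this diff; a hedged sketch of the expected call pattern, using only the methods shown above (`ensureRunning` is a hypothetical helper name):

```go
package server

import "llamactl/pkg/instance"

// ensureRunning starts a stopped instance and blocks until its /health
// endpoint returns 200 OK or the timeout (in seconds) elapses.
func ensureRunning(i *instance.Process, onDemandStartTimeout int) error {
	if !i.Running {
		if err := i.Start(); err != nil {
			return err
		}
	}
	return i.WaitForHealthy(onDemandStartTimeout)
}
```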
## pkg/instance/timeout.go (new file, 28 lines)

```go
package instance

// UpdateLastRequestTime updates the last request access time for the instance via proxy
func (i *Process) UpdateLastRequestTime() {
	i.mu.Lock()
	defer i.mu.Unlock()

	lastRequestTime := i.timeProvider.Now().Unix()
	i.lastRequestTime.Store(lastRequestTime)
}

func (i *Process) ShouldTimeout() bool {
	i.mu.RLock()
	defer i.mu.RUnlock()

	if !i.Running || i.options.IdleTimeout == nil || *i.options.IdleTimeout <= 0 {
		return false
	}

	// Check if the last request time exceeds the idle timeout
	lastRequest := i.lastRequestTime.Load()
	idleTimeoutMinutes := *i.options.IdleTimeout

	// Convert timeout from minutes to seconds for comparison
	idleTimeoutSeconds := int64(idleTimeoutMinutes * 60)

	return (i.timeProvider.Now().Unix() - lastRequest) > idleTimeoutSeconds
}
```
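The doc comment says the last request time is updated "via proxy", but the proxy handler itself is not in this diff. A hypothetical sketch of where the call would sit, using only methods defined above:

```go
package server

import (
	"net/http"

	"llamactl/pkg/instance"
)

// proxyRequest records activity just before forwarding, so ShouldTimeout
// measures idleness from the last proxied request.
func proxyRequest(i *instance.Process, w http.ResponseWriter, r *http.Request) {
	proxy, err := i.GetProxy()
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	i.UpdateLastRequestTime()
	proxy.ServeHTTP(w, r)
}
```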
## pkg/instance/timeout_test.go (new file, 195 lines)

```go
package instance_test

import (
	"llamactl/pkg/backends/llamacpp"
	"llamactl/pkg/config"
	"llamactl/pkg/instance"
	"llamactl/pkg/testutil"
	"sync/atomic"
	"testing"
	"time"
)

// MockTimeProvider implements TimeProvider for testing
type MockTimeProvider struct {
	currentTime atomic.Int64 // Unix timestamp
}

func NewMockTimeProvider(t time.Time) *MockTimeProvider {
	m := &MockTimeProvider{}
	m.currentTime.Store(t.Unix())
	return m
}

func (m *MockTimeProvider) Now() time.Time {
	return time.Unix(m.currentTime.Load(), 0)
}

func (m *MockTimeProvider) SetTime(t time.Time) {
	m.currentTime.Store(t.Unix())
}

// Timeout-related tests

func TestUpdateLastRequestTime(t *testing.T) {
	globalSettings := &config.InstancesConfig{
		LogsDir: "/tmp/test",
	}

	options := &instance.CreateInstanceOptions{
		LlamaServerOptions: llamacpp.LlamaServerOptions{
			Model: "/path/to/model.gguf",
		},
	}

	inst := instance.NewInstance("test-instance", globalSettings, options)

	// Test that UpdateLastRequestTime doesn't panic
	inst.UpdateLastRequestTime()
}

func TestShouldTimeout_NotRunning(t *testing.T) {
	globalSettings := &config.InstancesConfig{
		LogsDir: "/tmp/test",
	}

	idleTimeout := 1 // 1 minute
	options := &instance.CreateInstanceOptions{
		IdleTimeout: &idleTimeout,
		LlamaServerOptions: llamacpp.LlamaServerOptions{
			Model: "/path/to/model.gguf",
		},
	}

	inst := instance.NewInstance("test-instance", globalSettings, options)

	// Instance is not running, should not timeout regardless of configuration
	if inst.ShouldTimeout() {
		t.Error("Non-running instance should never timeout")
	}
}

func TestShouldTimeout_NoTimeoutConfigured(t *testing.T) {
	globalSettings := &config.InstancesConfig{
		LogsDir: "/tmp/test",
	}

	tests := []struct {
		name        string
		idleTimeout *int
	}{
		{"nil timeout", nil},
		{"zero timeout", testutil.IntPtr(0)},
		{"negative timeout", testutil.IntPtr(-5)},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			options := &instance.CreateInstanceOptions{
				IdleTimeout: tt.idleTimeout,
				LlamaServerOptions: llamacpp.LlamaServerOptions{
					Model: "/path/to/model.gguf",
				},
			}

			inst := instance.NewInstance("test-instance", globalSettings, options)
			// Simulate running state
			inst.Running = true

			if inst.ShouldTimeout() {
				t.Errorf("Instance with %s should not timeout", tt.name)
			}
		})
	}
}

func TestShouldTimeout_WithinTimeLimit(t *testing.T) {
	globalSettings := &config.InstancesConfig{
		LogsDir: "/tmp/test",
	}

	idleTimeout := 5 // 5 minutes
	options := &instance.CreateInstanceOptions{
		IdleTimeout: &idleTimeout,
		LlamaServerOptions: llamacpp.LlamaServerOptions{
			Model: "/path/to/model.gguf",
		},
	}

	inst := instance.NewInstance("test-instance", globalSettings, options)
	inst.Running = true

	// Update last request time to now
	inst.UpdateLastRequestTime()

	// Should not timeout immediately
	if inst.ShouldTimeout() {
		t.Error("Instance should not timeout when last request was recent")
	}
}

func TestShouldTimeout_ExceedsTimeLimit(t *testing.T) {
	globalSettings := &config.InstancesConfig{
		LogsDir: "/tmp/test",
	}

	idleTimeout := 1 // 1 minute
	options := &instance.CreateInstanceOptions{
		IdleTimeout: &idleTimeout,
		LlamaServerOptions: llamacpp.LlamaServerOptions{
			Model: "/path/to/model.gguf",
		},
	}

	inst := instance.NewInstance("test-instance", globalSettings, options)
	inst.Running = true

	// Use MockTimeProvider to simulate old last request time
	mockTime := NewMockTimeProvider(time.Now())
	inst.SetTimeProvider(mockTime)

	// Set last request time to now
	inst.UpdateLastRequestTime()

	// Advance time by 2 minutes (exceeds 1 minute timeout)
	mockTime.SetTime(time.Now().Add(2 * time.Minute))

	if !inst.ShouldTimeout() {
		t.Error("Instance should timeout when last request exceeds idle timeout")
	}
}

func TestTimeoutConfiguration_Validation(t *testing.T) {
	globalSettings := &config.InstancesConfig{
		LogsDir: "/tmp/test",
	}

	tests := []struct {
		name            string
		inputTimeout    *int
		expectedTimeout int
	}{
		{"default value when nil", nil, 0},
		{"positive value", testutil.IntPtr(10), 10},
		{"zero value", testutil.IntPtr(0), 0},
		{"negative value gets corrected", testutil.IntPtr(-5), 0},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			options := &instance.CreateInstanceOptions{
				IdleTimeout: tt.inputTimeout,
				LlamaServerOptions: llamacpp.LlamaServerOptions{
					Model: "/path/to/model.gguf",
				},
			}

			inst := instance.NewInstance("test-instance", globalSettings, options)
			opts := inst.GetOptions()

			if opts.IdleTimeout == nil || *opts.IdleTimeout != tt.expectedTimeout {
				t.Errorf("Expected IdleTimeout %d, got %v", tt.expectedTimeout, opts.IdleTimeout)
			}
		})
	}
}
```
## pkg/manager

```diff
@@ -10,6 +10,7 @@ import (
 	"path/filepath"
 	"strings"
 	"sync"
+	"time"
 )
 
 // InstanceManager defines the interface for managing instances of the llama server.
@@ -31,20 +32,48 @@ type instanceManager struct {
 	instances       map[string]*instance.Process
 	ports           map[int]bool
 	instancesConfig config.InstancesConfig
+
+	// Timeout checker
+	timeoutChecker *time.Ticker
+	shutdownChan   chan struct{}
+	shutdownDone   chan struct{}
+	isShutdown     bool
 }
 
 // NewInstanceManager creates a new instance of InstanceManager.
 func NewInstanceManager(instancesConfig config.InstancesConfig) InstanceManager {
+	if instancesConfig.TimeoutCheckInterval <= 0 {
+		instancesConfig.TimeoutCheckInterval = 5 // Default to 5 minutes if not set
+	}
 	im := &instanceManager{
 		instances:       make(map[string]*instance.Process),
 		ports:           make(map[int]bool),
 		instancesConfig: instancesConfig,
+
+		timeoutChecker: time.NewTicker(time.Duration(instancesConfig.TimeoutCheckInterval) * time.Minute),
+		shutdownChan:   make(chan struct{}),
+		shutdownDone:   make(chan struct{}),
 	}
 
 	// Load existing instances from disk
 	if err := im.loadInstances(); err != nil {
 		log.Printf("Error loading instances: %v", err)
 	}
 
+	// Start the timeout checker goroutine after initialization is complete
+	go func() {
+		defer close(im.shutdownDone)
+
+		for {
+			select {
+			case <-im.timeoutChecker.C:
+				im.checkAllTimeouts()
+			case <-im.shutdownChan:
+				return // Exit goroutine on shutdown
+			}
+		}
+	}()
+
 	return im
 }
 
@@ -94,6 +123,27 @@ func (im *instanceManager) Shutdown() {
 	im.mu.Lock()
 	defer im.mu.Unlock()
 
+	// Check if already shutdown
+	if im.isShutdown {
+		return
+	}
+	im.isShutdown = true
+
+	// Signal the timeout checker to stop
+	close(im.shutdownChan)
+
+	// Release lock temporarily to wait for goroutine
+	im.mu.Unlock()
+	// Wait for the timeout checker goroutine to actually stop
+	<-im.shutdownDone
+	// Reacquire lock
+	im.mu.Lock()
+
+	// Now stop the ticker
+	if im.timeoutChecker != nil {
+		im.timeoutChecker.Stop()
+	}
+
 	var wg sync.WaitGroup
 	wg.Add(len(im.instances))
```
*(One file's diff was suppressed because it is too large.)*
## pkg/manager (CreateInstance)

```diff
@@ -27,10 +27,6 @@ func (im *instanceManager) CreateInstance(name string, options *instance.CreateInstanceOptions) (*instance.Process, error) {
 		return nil, fmt.Errorf("instance options cannot be nil")
 	}
 
-	if len(im.instances) >= im.instancesConfig.MaxInstances && im.instancesConfig.MaxInstances != -1 {
-		return nil, fmt.Errorf("maximum number of instances (%d) reached", im.instancesConfig.MaxInstances)
-	}
-
 	name, err := validation.ValidateInstanceName(name)
 	if err != nil {
 		return nil, err
@@ -44,6 +40,11 @@ func (im *instanceManager) CreateInstance(name string, options *instance.CreateInstanceOptions) (*instance.Process, error) {
 	im.mu.Lock()
 	defer im.mu.Unlock()
 
+	// Check max instances limit after acquiring the lock
+	if len(im.instances) >= im.instancesConfig.MaxInstances && im.instancesConfig.MaxInstances != -1 {
+		return nil, fmt.Errorf("maximum number of instances (%d) reached", im.instancesConfig.MaxInstances)
+	}
+
 	// Check if instance with this name already exists
 	if im.instances[name] != nil {
 		return nil, fmt.Errorf("instance with name %s already exists", name)
```
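Moving the capacity check after `im.mu.Lock()` closes a check-then-act race: two concurrent `CreateInstance` calls could both pass a pre-lock check and overshoot `MaxInstances`. A self-contained sketch of the corrected pattern (all names here are illustrative, not from the repository):

```go
package main

import (
	"fmt"
	"sync"
)

// limitedSet checks capacity and inserts under one lock, so concurrent
// callers cannot both observe "room left" and then both insert.
type limitedSet struct {
	mu    sync.Mutex
	items map[string]bool
	max   int // -1 means unlimited
}

func (s *limitedSet) add(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.max != -1 && len(s.items) >= s.max { // checked while holding the lock
		return fmt.Errorf("maximum number of instances (%d) reached", s.max)
	}
	s.items[name] = true
	return nil
}

func main() {
	s := &limitedSet{items: make(map[string]bool), max: 1}
	var wg sync.WaitGroup
	for _, n := range []string{"a", "b"} {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			if err := s.add(n); err != nil {
				fmt.Println(err)
			}
		}(n)
	}
	wg.Wait() // exactly one add succeeds
}
```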
## pkg/manager/timeout.go (new file, 26 lines)

```go
package manager

import "log"

func (im *instanceManager) checkAllTimeouts() {
	im.mu.RLock()
	var timeoutInstances []string

	// Identify instances that should timeout
	for _, inst := range im.instances {
		if inst.ShouldTimeout() {
			timeoutInstances = append(timeoutInstances, inst.Name)
		}
	}
	im.mu.RUnlock() // Release read lock before calling StopInstance

	// Stop the timed-out instances
	for _, name := range timeoutInstances {
		log.Printf("Instance %s has timed out, stopping it", name)
		if _, err := im.StopInstance(name); err != nil {
			log.Printf("Error stopping instance %s: %v", name, err)
		} else {
			log.Printf("Instance %s stopped successfully", name)
		}
	}
}
```
@@ -28,7 +28,23 @@ func NewHandler(im manager.InstanceManager, cfg config.AppConfig) *Handler {
 	}
 }
 
-// HelpHandler godoc
+// VersionHandler godoc
+// @Summary Get llamactl version
+// @Description Returns the version of the llamactl command
+// @Tags version
+// @Security ApiKeyAuth
+// @Produces text/plain
+// @Success 200 {string} string "Version information"
+// @Failure 500 {string} string "Internal Server Error"
+// @Router /version [get]
+func (h *Handler) VersionHandler() http.HandlerFunc {
+	return func(w http.ResponseWriter, r *http.Request) {
+		w.Header().Set("Content-Type", "text/plain")
+		fmt.Fprintf(w, "Version: %s\nCommit: %s\nBuild Time: %s\n", h.cfg.Version, h.cfg.CommitHash, h.cfg.BuildTime)
+	}
+}
+
+// LlamaServerHelpHandler godoc
 // @Summary Get help for llama server
 // @Description Returns the help text for the llama server command
 // @Tags server
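The new VersionHandler reports Version, CommitHash, and BuildTime from config. Values like these are commonly injected at build time via -ldflags; a hedged sketch of that mechanism (the variable names and package path below are illustrative, not necessarily what llamactl uses):

```go
// Package-level variables that a build can override, e.g.:
//   go build -ldflags "-X main.version=$(git describe --tags) \
//     -X main.commitHash=$(git rev-parse --short HEAD)" ./cmd/server
package main

import "fmt"

var (
	version    = "dev" // overridden via -ldflags at release time
	commitHash = "unknown"
	buildTime  = "unknown"
)

func main() {
	fmt.Printf("Version: %s\nCommit: %s\nBuild Time: %s\n", version, commitHash, buildTime)
}
```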
@@ -37,7 +53,7 @@ func NewHandler(im manager.InstanceManager, cfg config.AppConfig) *Handler {
 // @Success 200 {string} string "Help text"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /server/help [get]
-func (h *Handler) HelpHandler() http.HandlerFunc {
+func (h *Handler) LlamaServerHelpHandler() http.HandlerFunc {
 	return func(w http.ResponseWriter, r *http.Request) {
 		helpCmd := exec.Command("llama-server", "--help")
 		output, err := helpCmd.CombinedOutput()
@@ -50,7 +66,7 @@ func (h *Handler) HelpHandler() http.HandlerFunc {
 	}
 }
 
-// VersionHandler godoc
+// LlamaServerVersionHandler godoc
 // @Summary Get version of llama server
 // @Description Returns the version of the llama server command
 // @Tags server
@@ -59,7 +75,7 @@ func (h *Handler) HelpHandler() http.HandlerFunc {
 // @Success 200 {string} string "Version information"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /server/version [get]
-func (h *Handler) VersionHandler() http.HandlerFunc {
+func (h *Handler) LlamaServerVersionHandler() http.HandlerFunc {
 	return func(w http.ResponseWriter, r *http.Request) {
 		versionCmd := exec.Command("llama-server", "--version")
 		output, err := versionCmd.CombinedOutput()
@@ -72,7 +88,7 @@ func (h *Handler) VersionHandler() http.HandlerFunc {
 	}
 }
 
-// ListDevicesHandler godoc
+// LlamaServerListDevicesHandler godoc
 // @Summary List available devices for llama server
 // @Description Returns a list of available devices for the llama server
 // @Tags server
@@ -81,7 +97,7 @@ func (h *Handler) VersionHandler() http.HandlerFunc {
 // @Success 200 {string} string "List of devices"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /server/devices [get]
-func (h *Handler) ListDevicesHandler() http.HandlerFunc {
+func (h *Handler) LlamaServerListDevicesHandler() http.HandlerFunc {
 	return func(w http.ResponseWriter, r *http.Request) {
 		listCmd := exec.Command("llama-server", "--list-devices")
 		output, err := listCmd.CombinedOutput()
@@ -100,7 +116,7 @@ func (h *Handler) ListDevicesHandler() http.HandlerFunc {
 // @Tags instances
 // @Security ApiKeyAuth
 // @Produces json
-// @Success 200 {array} Instance "List of instances"
+// @Success 200 {array} instance.Process "List of instances"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /instances [get]
 func (h *Handler) ListInstances() http.HandlerFunc {
@@ -127,8 +143,8 @@ func (h *Handler) ListInstances() http.HandlerFunc {
 // @Accept json
 // @Produces json
 // @Param name path string true "Instance Name"
-// @Param options body CreateInstanceOptions true "Instance configuration options"
-// @Success 201 {object} Instance "Created instance details"
+// @Param options body instance.CreateInstanceOptions true "Instance configuration options"
+// @Success 201 {object} instance.Process "Created instance details"
 // @Failure 400 {string} string "Invalid request body"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /instances/{name} [post]
@@ -168,7 +184,7 @@ func (h *Handler) CreateInstance() http.HandlerFunc {
 // @Security ApiKeyAuth
 // @Produces json
 // @Param name path string true "Instance Name"
-// @Success 200 {object} Instance "Instance details"
+// @Success 200 {object} instance.Process "Instance details"
 // @Failure 400 {string} string "Invalid name format"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /instances/{name} [get]
@@ -202,8 +218,8 @@ func (h *Handler) GetInstance() http.HandlerFunc {
 // @Accept json
 // @Produces json
 // @Param name path string true "Instance Name"
-// @Param options body CreateInstanceOptions true "Instance configuration options"
-// @Success 200 {object} Instance "Updated instance details"
+// @Param options body instance.CreateInstanceOptions true "Instance configuration options"
+// @Success 200 {object} instance.Process "Updated instance details"
 // @Failure 400 {string} string "Invalid name format"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /instances/{name} [put]
@@ -242,7 +258,7 @@ func (h *Handler) UpdateInstance() http.HandlerFunc {
 // @Security ApiKeyAuth
 // @Produces json
 // @Param name path string true "Instance Name"
-// @Success 200 {object} Instance "Started instance details"
+// @Success 200 {object} instance.Process "Started instance details"
 // @Failure 400 {string} string "Invalid name format"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /instances/{name}/start [post]
@@ -275,7 +291,7 @@ func (h *Handler) StartInstance() http.HandlerFunc {
 // @Security ApiKeyAuth
 // @Produces json
 // @Param name path string true "Instance Name"
-// @Success 200 {object} Instance "Stopped instance details"
+// @Success 200 {object} instance.Process "Stopped instance details"
 // @Failure 400 {string} string "Invalid name format"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /instances/{name}/stop [post]
@@ -308,7 +324,7 @@ func (h *Handler) StopInstance() http.HandlerFunc {
 // @Security ApiKeyAuth
 // @Produces json
 // @Param name path string true "Instance Name"
-// @Success 200 {object} Instance "Restarted instance details"
+// @Success 200 {object} instance.Process "Restarted instance details"
 // @Failure 400 {string} string "Invalid name format"
 // @Failure 500 {string} string "Internal Server Error"
 // @Router /instances/{name}/restart [post]
@@ -456,6 +472,9 @@ func (h *Handler) ProxyToInstance() http.HandlerFunc {
 		proxyPath = "/" + proxyPath
 	}
 
+	// Update the last request time for the instance
+	inst.UpdateLastRequestTime()
+
 	// Modify the request to remove the proxy prefix
 	originalPath := r.URL.Path
 	r.URL.Path = proxyPath
@@ -556,8 +575,23 @@ func (h *Handler) OpenAIProxy() http.HandlerFunc {
 	}
 
 	if !inst.Running {
-		http.Error(w, "Instance is not running", http.StatusServiceUnavailable)
-		return
+		if inst.GetOptions().OnDemandStart != nil && *inst.GetOptions().OnDemandStart {
+			// If on-demand start is enabled, start the instance
+			if _, err := h.InstanceManager.StartInstance(modelName); err != nil {
+				http.Error(w, "Failed to start instance: "+err.Error(), http.StatusInternalServerError)
+				return
+			}
+
+			// Wait for the instance to become healthy before proceeding
+			if err := inst.WaitForHealthy(h.cfg.Instances.OnDemandStartTimeout); err != nil {
+				http.Error(w, "Instance failed to become healthy: "+err.Error(), http.StatusServiceUnavailable)
+				return
+			}
+
+		} else {
+			http.Error(w, "Instance is not running", http.StatusServiceUnavailable)
+			return
+		}
 	}
 
 	proxy, err := inst.GetProxy()
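WaitForHealthy itself is not shown in this diff; a common shape for it is polling the instance's health endpoint until success or a deadline. A sketch under that assumption (the endpoint path and poll interval below are invented for illustration):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// waitForHealthy polls a health URL until it returns 200 or the timeout
// elapses. This mirrors what WaitForHealthy presumably does; the endpoint
// is an assumption, not llamactl's documented behavior.
func waitForHealthy(url string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		resp, err := http.Get(url)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
		}
		time.Sleep(500 * time.Millisecond)
	}
	return fmt.Errorf("instance did not become healthy within %s", timeout)
}

func main() {
	if err := waitForHealthy("http://localhost:8081/health", 5*time.Second); err != nil {
		fmt.Println(err)
	}
}
```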
@@ -566,6 +600,9 @@ func (h *Handler) OpenAIProxy() http.HandlerFunc {
 		return
 	}
 
+	// Update last request time for the instance
+	inst.UpdateLastRequestTime()
+
 	// Recreate the request body from the bytes we read
 	r.Body = io.NopCloser(bytes.NewReader(bodyBytes))
 	r.ContentLength = int64(len(bodyBytes))
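Both proxy paths now call UpdateLastRequestTime, which is presumably what ShouldTimeout in timeout.go reads when deciding whether an instance went idle. One lock-free way to implement that pair is an atomic Unix timestamp; a sketch (field and type names below are illustrative):

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// instanceClock stores the last request time as Unix seconds so proxy
// handlers can update it on every request without taking a lock.
type instanceClock struct {
	lastRequest atomic.Int64
	idleTimeout time.Duration
}

func (c *instanceClock) UpdateLastRequestTime() {
	c.lastRequest.Store(time.Now().Unix())
}

func (c *instanceClock) ShouldTimeout() bool {
	last := time.Unix(c.lastRequest.Load(), 0)
	return time.Since(last) > c.idleTimeout
}

func main() {
	c := &instanceClock{idleTimeout: time.Minute}
	c.UpdateLastRequestTime()
	fmt.Println(c.ShouldTimeout()) // false right after a request
}
```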
@@ -8,7 +8,7 @@ import (
 	"github.com/go-chi/cors"
 	httpSwagger "github.com/swaggo/http-swagger"
 
-	_ "llamactl/docs"
+	_ "llamactl/apidocs"
 	"llamactl/webui"
 )
@@ -42,10 +42,12 @@ func SetupRouter(handler *Handler) *chi.Mux {
 			r.Use(authMiddleware.AuthMiddleware(KeyTypeManagement))
 		}
 
+		r.Get("/version", handler.VersionHandler()) // Get server version
+
 		r.Route("/server", func(r chi.Router) {
-			r.Get("/help", handler.HelpHandler())
-			r.Get("/version", handler.VersionHandler())
-			r.Get("/devices", handler.ListDevicesHandler())
+			r.Get("/help", handler.LlamaServerHelpHandler())
+			r.Get("/version", handler.LlamaServerVersionHandler())
+			r.Get("/devices", handler.LlamaServerListDevicesHandler())
 		})
 
 		// Instance management endpoints
@@ -7,6 +7,7 @@ import SystemInfoDialog from "./components/SystemInfoDialog";
 import { type CreateInstanceOptions, type Instance } from "@/types/instance";
 import { useInstances } from "@/contexts/InstancesContext";
 import { useAuth } from "@/contexts/AuthContext";
+import { ThemeProvider } from "@/contexts/ThemeContext";
 
 function App() {
   const { isAuthenticated, isLoading: authLoading } = useAuth();
@@ -42,44 +43,50 @@ function App() {
   // Show loading spinner while checking auth
   if (authLoading) {
     return (
-      <div className="min-h-screen bg-gray-50 flex items-center justify-center">
-        <div className="text-center">
-          <div className="animate-spin rounded-full h-8 w-8 border-b-2 border-blue-600 mx-auto mb-4"></div>
-          <p className="text-gray-600">Loading...</p>
-        </div>
-      </div>
+      <ThemeProvider>
+        <div className="min-h-screen bg-background flex items-center justify-center">
+          <div className="text-center">
+            <div className="animate-spin rounded-full h-8 w-8 border-b-2 border-primary mx-auto mb-4"></div>
+            <p className="text-muted-foreground">Loading...</p>
+          </div>
+        </div>
+      </ThemeProvider>
     );
   }
 
   // Show login dialog if not authenticated
   if (!isAuthenticated) {
     return (
-      <div className="min-h-screen bg-gray-50">
-        <LoginDialog open={true} />
-      </div>
+      <ThemeProvider>
+        <div className="min-h-screen bg-background">
+          <LoginDialog open={true} />
+        </div>
+      </ThemeProvider>
     );
   }
 
   // Show main app if authenticated
   return (
-    <div className="min-h-screen bg-gray-50">
-      <Header onCreateInstance={handleCreateInstance} onShowSystemInfo={handleShowSystemInfo} />
-      <main className="container mx-auto max-w-4xl px-4 py-8">
-        <InstanceList editInstance={handleEditInstance} />
-      </main>
+    <ThemeProvider>
+      <div className="min-h-screen bg-background">
+        <Header onCreateInstance={handleCreateInstance} onShowSystemInfo={handleShowSystemInfo} />
+        <main className="container mx-auto max-w-4xl px-4 py-8">
+          <InstanceList editInstance={handleEditInstance} />
+        </main>
 
-      <InstanceDialog
-        open={isInstanceModalOpen}
-        onOpenChange={setIsInstanceModalOpen}
-        onSave={handleSaveInstance}
-        instance={editingInstance}
-      />
+        <InstanceDialog
+          open={isInstanceModalOpen}
+          onOpenChange={setIsInstanceModalOpen}
+          onSave={handleSaveInstance}
+          instance={editingInstance}
+        />
 
-      <SystemInfoDialog
-        open={isSystemInfoModalOpen}
-        onOpenChange={setIsSystemInfoModalOpen}
-      />
-    </div>
+        <SystemInfoDialog
+          open={isSystemInfoModalOpen}
+          onOpenChange={setIsSystemInfoModalOpen}
+        />
+      </div>
+    </ThemeProvider>
   );
 }
@@ -55,6 +55,21 @@ describe('App Component - Critical Business Logic Only', () => {
     vi.mocked(instancesApi.list).mockResolvedValue(mockInstances)
     window.sessionStorage.setItem('llamactl_management_key', 'test-api-key-123')
     global.fetch = vi.fn(() => Promise.resolve(new Response(null, { status: 200 })))
+
+    // Mock window.matchMedia for dark mode functionality
+    Object.defineProperty(window, 'matchMedia', {
+      writable: true,
+      value: vi.fn().mockImplementation((query: string) => ({
+        matches: false,
+        media: query,
+        onchange: null,
+        addListener: vi.fn(),
+        removeListener: vi.fn(),
+        addEventListener: vi.fn(),
+        removeEventListener: vi.fn(),
+        dispatchEvent: vi.fn(),
+      })),
+    })
   })
 
   afterEach(() => {
@@ -1,6 +1,7 @@
 import { Button } from "@/components/ui/button";
-import { HelpCircle, LogOut } from "lucide-react";
+import { HelpCircle, LogOut, Moon, Sun } from "lucide-react";
 import { useAuth } from "@/contexts/AuthContext";
+import { useTheme } from "@/contexts/ThemeContext";
 
 interface HeaderProps {
   onCreateInstance: () => void;
@@ -9,6 +10,7 @@ interface HeaderProps {
 
 function Header({ onCreateInstance, onShowSystemInfo }: HeaderProps) {
   const { logout } = useAuth();
+  const { theme, toggleTheme } = useTheme();
 
   const handleLogout = () => {
     if (confirm("Are you sure you want to logout?")) {
@@ -17,10 +19,10 @@ function Header({ onCreateInstance, onShowSystemInfo }: HeaderProps) {
   };
 
   return (
-    <header className="bg-white border-b border-gray-200">
+    <header className="bg-card border-b border-border">
       <div className="container mx-auto max-w-4xl px-4 py-4">
         <div className="flex items-center justify-between">
-          <h1 className="text-2xl font-bold text-gray-900">
+          <h1 className="text-2xl font-bold text-foreground">
             Llamactl Dashboard
           </h1>
 
|
||||
Create Instance
|
||||
</Button>
|
||||
|
||||
<Button
|
||||
variant="outline"
|
||||
size="icon"
|
||||
onClick={toggleTheme}
|
||||
data-testid="theme-toggle-button"
|
||||
title={`Switch to ${theme === 'light' ? 'dark' : 'light'} mode`}
|
||||
>
|
||||
{theme === 'light' ? <Moon className="h-4 w-4" /> : <Sun className="h-4 w-4" />}
|
||||
</Button>
|
||||
|
||||
<Button
|
||||
variant="outline"
|
||||
size="icon"
|
||||
|
||||
@@ -18,8 +18,8 @@ function InstanceList({ editInstance }: InstanceListProps) {
     return (
       <div className="flex items-center justify-center py-12" aria-label="Loading">
         <div className="text-center">
-          <div className="animate-spin rounded-full h-8 w-8 border-b-2 border-blue-600 mx-auto mb-4"></div>
-          <p className="text-gray-600">Loading instances...</p>
+          <div className="animate-spin rounded-full h-8 w-8 border-b-2 border-primary mx-auto mb-4"></div>
+          <p className="text-muted-foreground">Loading instances...</p>
         </div>
       </div>
     )
@@ -28,7 +28,7 @@ function InstanceList({ editInstance }: InstanceListProps) {
   if (error) {
     return (
       <div className="text-center py-12">
-        <div className="text-red-600 mb-4">
+        <div className="text-destructive mb-4">
           <p className="text-lg font-semibold">Error loading instances</p>
           <p className="text-sm">{error}</p>
         </div>
@@ -39,15 +39,15 @@ function InstanceList({ editInstance }: InstanceListProps) {
   if (instances.length === 0) {
     return (
       <div className="text-center py-12">
-        <p className="text-gray-600 text-lg mb-2">No instances found</p>
-        <p className="text-gray-500 text-sm">Create your first instance to get started</p>
+        <p className="text-foreground text-lg mb-2">No instances found</p>
+        <p className="text-muted-foreground text-sm">Create your first instance to get started</p>
       </div>
     )
   }
 
   return (
     <div className="space-y-4">
-      <h2 className="text-xl font-semibold text-gray-900 mb-6">
+      <h2 className="text-xl font-semibold text-foreground mb-6">
         Instances ({instances.length})
       </h2>
 
@@ -19,6 +19,15 @@ import {
 } from 'lucide-react'
 import { serverApi } from '@/lib/api'
 
+// Helper to get version from environment
+const getAppVersion = (): string => {
+  try {
+    return (import.meta.env as Record<string, string>).VITE_APP_VERSION || 'unknown'
+  } catch {
+    return 'unknown'
+  }
+}
+
 interface SystemInfoModalProps {
   open: boolean
   onOpenChange: (open: boolean) => void
@@ -109,9 +118,20 @@ const SystemInfoDialog: React.FC<SystemInfoModalProps> = ({
             </div>
           ) : systemInfo ? (
             <div className="space-y-6">
-              {/* Version Section */}
+              {/* Llamactl Version Section */}
               <div className="space-y-3">
-                <h3 className="font-semibold">Version</h3>
+                <h3 className="font-semibold">Llamactl Version</h3>
+
+                <div className="bg-gray-900 rounded-lg p-4">
+                  <pre className="text-sm text-gray-300 whitespace-pre-wrap font-mono">
+                    {getAppVersion()}
+                  </pre>
+                </div>
+              </div>
+
+              {/* Llama Server Version Section */}
+              <div className="space-y-3">
+                <h3 className="font-semibold">Llama Server Version</h3>
 
                 <div className="bg-gray-900 rounded-lg p-4">
                   <div className="mb-2">
webui/src/contexts/ThemeContext.tsx (new file, 54 lines)
@@ -0,0 +1,54 @@
+import { createContext, useContext, useEffect, useState, type ReactNode } from "react";
+
+type Theme = "light" | "dark";
+
+interface ThemeContextType {
+  theme: Theme;
+  toggleTheme: () => void;
+}
+
+const ThemeContext = createContext<ThemeContextType | undefined>(undefined);
+
+interface ThemeProviderProps {
+  children: ReactNode;
+}
+
+export function ThemeProvider({ children }: ThemeProviderProps) {
+  const [theme, setTheme] = useState<Theme>(() => {
+    const stored = localStorage.getItem("theme");
+    if (stored === "light" || stored === "dark") {
+      return stored;
+    }
+    return window.matchMedia("(prefers-color-scheme: dark)").matches ? "dark" : "light";
+  });
+
+  useEffect(() => {
+    const root = document.documentElement;
+
+    if (theme === "dark") {
+      root.classList.add("dark");
+    } else {
+      root.classList.remove("dark");
+    }
+
+    localStorage.setItem("theme", theme);
+  }, [theme]);
+
+  const toggleTheme = () => {
+    setTheme(prevTheme => prevTheme === "light" ? "dark" : "light");
+  };
+
+  return (
+    <ThemeContext.Provider value={{ theme, toggleTheme }}>
+      {children}
+    </ThemeContext.Provider>
+  );
+}
+
+export function useTheme() {
+  const context = useContext(ThemeContext);
+  if (context === undefined) {
+    throw new Error("useTheme must be used within a ThemeProvider");
+  }
+  return context;
+}
@@ -21,6 +21,15 @@ export const basicFieldsConfig: Record<string, {
     placeholder: '5',
     description: 'Delay in seconds before attempting restart'
   },
+  idle_timeout: {
+    label: 'Idle Timeout (minutes)',
+    placeholder: '60',
+    description: 'Time in minutes before instance is considered idle and stopped'
+  },
+  on_demand_start: {
+    label: 'On-Demand Start',
+    description: 'Start instance upon receiving OpenAI-compatible API request'
+  },
   model: {
     label: 'Model Path',
     placeholder: '/path/to/model.gguf',
@@ -6,6 +6,8 @@ export const CreateInstanceOptionsSchema = z.object({
   auto_restart: z.boolean().optional(),
   max_restarts: z.number().optional(),
   restart_delay: z.number().optional(),
+  idle_timeout: z.number().optional(),
+  on_demand_start: z.boolean().optional(),
 
   // Common params
   verbose_prompt: z.boolean().optional(),
webui/src/vite-env.d.ts (new vendored file, 13 lines)
@@ -0,0 +1,13 @@
+/// <reference types="vite/client" />
+
+declare global {
+  interface ImportMetaEnv {
+    readonly VITE_APP_VERSION?: string
+  }
+
+  interface ImportMeta {
+    readonly env: ImportMetaEnv
+  }
+}
+
+export {}
@@ -18,8 +18,9 @@
     "baseUrl": ".",
     "paths": {
       "@/*": ["./src/*"]
-    }
+    },
+    "types": ["vite/client"]
   },
-  "include": ["src"],
+  "include": ["src", "src/vite-env.d.ts"],
   "references": [{ "path": "./tsconfig.node.json" }]
 }