llamactl
A control server for managing multiple Llama Server instances with a web-based dashboard.
Features
- Multi-instance Management: Create, start, stop, restart, and delete multiple llama-server instances
- Web Dashboard: Modern React-based UI for managing instances
- Auto-restart: Configurable automatic restart on instance failure
- Instance Monitoring: Real-time health checks and status monitoring
- Log Management: View, search, and download instance logs
- Data Persistence: Persistent storage of instance state
- REST API: Full API for programmatic control
- OpenAI Compatible: Route requests to instances by instance name
- Configuration Management: Comprehensive llama-server parameter support
- System Information: View llama-server version, devices, and help
- API Key Authentication: Secure access with separate management and inference keys
Prerequisites
This project requires llama-server from llama.cpp to be installed and available in your PATH.
Install llama.cpp: Follow the installation instructions at https://github.com/ggml-org/llama.cpp
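Before starting llamactl, you can confirm the binary is reachable; a quick check in a POSIX shell:
# Verify llama-server is on your PATH
command -v llama-server || echo "llama-server not found in PATH"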
Installation
Download Prebuilt Binaries
The easiest way to install llamactl is to download a prebuilt binary from the releases page.
Linux/macOS:
# Download the latest release for your platform
curl -L https://github.com/lordmathis/llamactl/releases/latest/download/llamactl-$(curl -s https://api.github.com/repos/lordmathis/llamactl/releases/latest | grep tag_name | cut -d '"' -f 4)-linux-amd64.tar.gz | tar -xz
# Move to PATH
sudo mv llamactl /usr/local/bin/
# Run the server
llamactl
Manual Download:
- Go to the releases page
- Download the appropriate archive for your platform
- Extract the archive and move the binary to a directory in your PATH
Build from Source
If you prefer to build from source or need the latest development version:
Build Requirements
- Go 1.24 or later
- Node.js 22 or later (for building the web UI)
Building with Web UI
# Clone the repository
git clone https://github.com/lordmathis/llamactl.git
cd llamactl
# Install Node.js dependencies
cd webui
npm ci
# Build the web UI
npm run build
# Return to project root and build
cd ..
go build -o llamactl ./cmd/server
# Run the server
./llamactl
Configuration
llamactl can be configured via configuration files or environment variables. Configuration sources are applied in the following order, with later sources overriding earlier ones:
- Hardcoded defaults
- Configuration file
- Environment variables
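For example, an environment variable overrides the same setting from a config file; a minimal sketch using the documented LLAMACTL_PORT variable:
# The config file asks for port 8080...
cat > llamactl.yaml <<'EOF'
server:
  port: 8080
EOF
# ...but the environment variable wins, so the server binds to port 9090
LLAMACTL_PORT=9090 ./llamactl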
Configuration Files
Configuration File Locations
Configuration files are searched in the following locations (in order of precedence):
Linux/macOS:
- ./llamactl.yaml or ./config.yaml (current directory)
- $HOME/.config/llamactl/config.yaml
- /etc/llamactl/config.yaml
Windows:
- ./llamactl.yaml or ./config.yaml (current directory)
- %APPDATA%\llamactl\config.yaml
- %USERPROFILE%\llamactl\config.yaml
- %PROGRAMDATA%\llamactl\config.yaml
You can specify the path to the config file with the LLAMACTL_CONFIG_PATH environment variable.
Configuration Options
Server Configuration
server:
  host: "0.0.0.0"          # Server host to bind to (default: "0.0.0.0")
  port: 8080               # Server port to bind to (default: 8080)
  allowed_origins: ["*"]   # CORS allowed origins (default: ["*"])
  enable_swagger: false    # Enable Swagger UI (default: false)
Environment Variables:
- LLAMACTL_HOST - Server host
- LLAMACTL_PORT - Server port
- LLAMACTL_ALLOWED_ORIGINS - Comma-separated CORS origins
- LLAMACTL_ENABLE_SWAGGER - Enable Swagger UI (true/false)
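For example, binding only to localhost and enabling Swagger from the environment (values are illustrative):
LLAMACTL_HOST=127.0.0.1 LLAMACTL_ENABLE_SWAGGER=true ./llamactl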
Instance Configuration
instances:
  port_range: [8000, 9000]                         # Port range for instances (default: [8000, 9000])
  data_dir: "~/.local/share/llamactl"              # Directory for all llamactl data (default varies by OS)
  configs_dir: "~/.local/share/llamactl/instances" # Directory for instance configs (default: data_dir/instances)
  logs_dir: "~/.local/share/llamactl/logs"         # Directory for instance logs (default: data_dir/logs)
  auto_create_dirs: true                           # Automatically create data/config/logs directories (default: true)
  max_instances: -1                                # Maximum instances (-1 = unlimited)
  llama_executable: "llama-server"                 # Path to llama-server executable
  default_auto_restart: true                       # Default auto-restart setting
  default_max_restarts: 3                          # Default maximum restart attempts
  default_restart_delay: 5                         # Default restart delay in seconds
Environment Variables:
- LLAMACTL_INSTANCE_PORT_RANGE - Port range (format: "8000-9000" or "8000,9000")
- LLAMACTL_DATA_DIRECTORY - Data directory path
- LLAMACTL_INSTANCES_DIR - Instance configs directory path
- LLAMACTL_LOGS_DIR - Log directory path
- LLAMACTL_AUTO_CREATE_DATA_DIR - Auto-create data/config/logs directories (true/false)
- LLAMACTL_MAX_INSTANCES - Maximum number of instances
- LLAMACTL_LLAMA_EXECUTABLE - Path to llama-server executable
- LLAMACTL_DEFAULT_AUTO_RESTART - Default auto-restart setting (true/false)
- LLAMACTL_DEFAULT_MAX_RESTARTS - Default maximum restarts
- LLAMACTL_DEFAULT_RESTART_DELAY - Default restart delay in seconds
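For example, narrowing the port range and capping the instance count from the environment (values are illustrative):
export LLAMACTL_INSTANCE_PORT_RANGE="8001-8100"
export LLAMACTL_MAX_INSTANCES=10
./llamactl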
Authentication Configuration
auth:
  require_inference_auth: true    # Require API key for OpenAI endpoints (default: true)
  inference_keys: []              # List of valid inference API keys
  require_management_auth: true   # Require API key for management endpoints (default: true)
  management_keys: []             # List of valid management API keys
Environment Variables:
- LLAMACTL_REQUIRE_INFERENCE_AUTH - Require auth for OpenAI endpoints (true/false)
- LLAMACTL_INFERENCE_KEYS - Comma-separated inference API keys
- LLAMACTL_REQUIRE_MANAGEMENT_AUTH - Require auth for management endpoints (true/false)
- LLAMACTL_MANAGEMENT_KEYS - Comma-separated management API keys
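A minimal sketch of supplying your own keys via the environment (key values are placeholders):
export LLAMACTL_MANAGEMENT_KEYS="sk-management-xyz456"
export LLAMACTL_INFERENCE_KEYS="sk-inference-abc123,sk-inference-def789"
./llamactl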
Example Configuration
server:
  host: "0.0.0.0"
  port: 8080

instances:
  port_range: [8001, 8100]
  data_dir: "/var/lib/llamactl"
  configs_dir: "/var/lib/llamactl/instances"
  logs_dir: "/var/log/llamactl"
  auto_create_dirs: true
  max_instances: 10
  llama_executable: "/usr/local/bin/llama-server"
  default_auto_restart: true
  default_max_restarts: 5
  default_restart_delay: 10

auth:
  require_inference_auth: true
  inference_keys: ["sk-inference-abc123"]
  require_management_auth: true
  management_keys: ["sk-management-xyz456"]
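To apply this configuration system-wide on Linux, place it at one of the documented search paths:
sudo mkdir -p /etc/llamactl
sudo cp config.yaml /etc/llamactl/config.yaml
./llamactl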
Usage
Starting the Server
# Start with default configuration
./llamactl
# Start with custom config file
LLAMACTL_CONFIG_PATH=/path/to/config.yaml ./llamactl
# Start with environment variables
LLAMACTL_PORT=9090 LLAMACTL_LOGS_DIR=/custom/logs ./llamactl
Authentication
llamactl supports API key authentication with separate keys for the management and inference (OpenAI-compatible) endpoints:
- Management keys grant full access to instance management
- Inference keys grant access to OpenAI-compatible endpoints
- Management keys also work for inference endpoints (higher privilege)
How to Use: Pass your API key in requests using one of:
- Authorization: Bearer <key> header
- X-API-Key: <key> header
- api_key=<key> query parameter
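For example, the same list request authenticated each of the three ways (key value is a placeholder):
# Bearer token header
curl -H "Authorization: Bearer sk-management-your-key" http://localhost:8080/api/v1/instances
# X-API-Key header
curl -H "X-API-Key: sk-management-your-key" http://localhost:8080/api/v1/instances
# Query parameter
curl "http://localhost:8080/api/v1/instances?api_key=sk-management-your-key"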
Auto-generated keys: If no keys are set and authentication is required, a key will be generated and printed to the terminal at startup. For production, set your own keys in config or environment variables.
Web Dashboard
Open your browser and navigate to http://localhost:8080 to access the web dashboard.
API Usage
The REST API is available at http://localhost:8080/api/v1. See the Swagger documentation at http://localhost:8080/swagger/ for the complete API reference.
Create an Instance
curl -X POST http://localhost:8080/api/v1/instances/my-instance \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-management-your-key" \
-d '{
"model": "/path/to/model.gguf",
"gpu_layers": 32,
"auto_restart": true
}'
List Instances
curl -H "Authorization: Bearer sk-management-your-key" \
http://localhost:8080/api/v1/instances
Start/Stop Instance
# Start
curl -X POST \
-H "Authorization: Bearer sk-management-your-key" \
http://localhost:8080/api/v1/instances/my-instance/start
# Stop
curl -X POST \
-H "Authorization: Bearer sk-management-your-key" \
http://localhost:8080/api/v1/instances/my-instance/stop
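The features list also mentions restart and delete; the exact routes are documented in Swagger. A plausible delete call, assuming the conventional REST mapping of a DELETE verb onto the instance resource (hypothetical, verify against /swagger/):
# Delete an instance (hypothetical route shape; confirm in the Swagger docs)
curl -X DELETE \
  -H "Authorization: Bearer sk-management-your-key" \
  http://localhost:8080/api/v1/instances/my-instance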
OpenAI Compatible Endpoints
Route requests to instances by including the instance name as the model parameter:
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-inference-your-key" \
-d '{
"model": "my-instance",
"messages": [{"role": "user", "content": "Hello!"}]
}'
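Since the endpoint speaks the OpenAI wire format, existing OpenAI clients can target llamactl by overriding the base URL. A sketch assuming your client reads the standard OPENAI_BASE_URL and OPENAI_API_KEY environment variables (the official SDKs do; other clients may differ):
# Point an OpenAI-compatible client at llamactl (assumes the client honors these variables)
export OPENAI_BASE_URL="http://localhost:8080/v1"
export OPENAI_API_KEY="sk-inference-your-key"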
Development
Running Tests
# Go tests
go test ./...
# Web UI tests
cd webui
npm test
Development Server
# Start Go server in development mode
go run ./cmd/server
# Start web UI development server (in another terminal)
cd webui
npm run dev
API Documentation
Interactive API documentation is available at http://localhost:8080/swagger/ when the server is running.
License
This project is licensed under the MIT License. See the LICENSE file for details.