llamactl
A control server for managing multiple Llama Server instances with a web-based dashboard.
Features
- Multi-instance Management: Create, start, stop, restart, and delete multiple llama-server instances
- Web Dashboard: Modern React-based UI for managing instances
- Auto-restart: Configurable automatic restart on instance failure
- Instance Monitoring: Real-time health checks and status monitoring
- Log Management: View, search, and download instance logs
- Data Persistence: Instance state persists across server restarts
- REST API: Full API for programmatic control
- OpenAI Compatible: Route requests to instances by instance name
- Configuration Management: Comprehensive llama-server parameter support
- System Information: View llama-server version, devices, and help
Prerequisites
This project requires llama-server from llama.cpp to be installed and available in your PATH.
Install llama.cpp: Follow the installation instructions at https://github.com/ggml-org/llama.cpp
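Before starting llamactl, you can confirm the binary is discoverable with a quick check (the --version flag is assumed to be supported by your llama-server build):
# Verify llama-server is installed and on your PATH
which llama-server
llama-server --version   # assumed flag; any invocation that prints usage also confirms the install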
Installation
Download Prebuilt Binaries
The easiest way to install llamactl is to download a prebuilt binary from the releases page.
Linux/macOS:
# Download the latest release for your platform
curl -L https://github.com/lordmathis/llamactl/releases/latest/download/llamactl-$(curl -s https://api.github.com/repos/lordmathis/llamactl/releases/latest | grep tag_name | cut -d '"' -f 4)-linux-amd64.tar.gz | tar -xz
# Move to PATH
sudo mv llamactl /usr/local/bin/
# Run the server
llamactl
Manual Download:
- Go to the releases page
- Download the appropriate archive for your platform
- Extract the archive and move the binary to a directory in your PATH
Build from Source
If you prefer to build from source or need the latest development version:
Build Requirements
- Go 1.24 or later
- Node.js 22 or later (for building the web UI)
Building with Web UI
# Clone the repository
git clone https://github.com/lordmathis/llamactl.git
cd llamactl
# Install Node.js dependencies
cd webui
npm ci
# Build the web UI
npm run build
# Return to project root and build
cd ..
go build -o llamactl ./cmd/server
# Run the server
./llamactl
Configuration
llamactl can be configured via configuration files or environment variables. Configuration is loaded from the following sources, with later sources overriding earlier ones:
- Hardcoded defaults
- Configuration file
- Environment variables
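For example, an environment variable takes precedence over the same setting in a config file (a minimal sketch, assuming your config file sets port: 8080):
# config file sets port: 8080, but the environment variable wins
LLAMACTL_PORT=9090 ./llamactl   # server listens on 9090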
Configuration Files
Configuration File Locations
Configuration files are searched in the following locations (in order of precedence):
Linux/macOS:
- ./llamactl.yaml or ./config.yaml (current directory)
- $HOME/.config/llamactl/config.yaml
- /etc/llamactl/config.yaml
Windows:
- ./llamactl.yaml or ./config.yaml (current directory)
- %APPDATA%\llamactl\config.yaml
- %USERPROFILE%\llamactl\config.yaml
- %PROGRAMDATA%\llamactl\config.yaml
You can specify the path to the config file with the LLAMACTL_CONFIG_PATH environment variable.
API Key Authentication
llamactl supports API key authentication for both management and inference (OpenAI-compatible) endpoints, with separate keys for each. Management keys grant full access; inference keys grant access only to the OpenAI-compatible endpoints.
How to Use:
- Pass your API key in requests using one of:
  - Authorization: Bearer <key> header
  - X-API-Key: <key> header
  - api_key=<key> query parameter
Auto-generated keys: If no keys are set and authentication is required, a key will be generated and printed to the terminal at startup. For production, set your own keys in config or environment variables.
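For example, the three forms below are equivalent ways to authenticate a request to the instance list endpoint (replace <key> with one of your management keys):
# Bearer token header
curl -H "Authorization: Bearer <key>" http://localhost:8080/api/v1/instances
# Custom header
curl -H "X-API-Key: <key>" http://localhost:8080/api/v1/instances
# Query parameter
curl "http://localhost:8080/api/v1/instances?api_key=<key>"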
Configuration Options
Server Configuration
server:
  host: "0.0.0.0"         # Server host to bind to (default: "0.0.0.0")
  port: 8080              # Server port to bind to (default: 8080)
  allowed_origins: ["*"]  # CORS allowed origins (default: ["*"])
  enable_swagger: false   # Enable Swagger UI (default: false)
Environment Variables:
- LLAMACTL_HOST - Server host
- LLAMACTL_PORT - Server port
- LLAMACTL_ALLOWED_ORIGINS - Comma-separated CORS origins
- LLAMACTL_ENABLE_SWAGGER - Enable Swagger UI (true/false)
Instance Configuration
instances:
  port_range: [8000, 9000]             # Port range for instances (default: [8000, 9000])
  data_dir: "~/.local/share/llamactl"  # Directory for all llamactl data (default varies by OS)
  configs_dir: "~/.local/share/llamactl/instances"  # Directory for instance configs (default: data_dir/instances)
  logs_dir: "~/.local/share/llamactl/logs"          # Directory for instance logs (default: data_dir/logs)
  auto_create_dirs: true               # Automatically create data/config/logs directories (default: true)
  max_instances: -1                    # Maximum instances (-1 = unlimited)
  llama_executable: "llama-server"     # Path to llama-server executable
  default_auto_restart: true           # Default auto-restart setting
  default_max_restarts: 3              # Default maximum restart attempts
  default_restart_delay: 5             # Default restart delay in seconds
Environment Variables:
- LLAMACTL_INSTANCE_PORT_RANGE - Port range (format: "8000-9000" or "8000,9000")
- LLAMACTL_DATA_DIRECTORY - Data directory path
- LLAMACTL_INSTANCES_DIR - Instance configs directory path
- LLAMACTL_LOGS_DIR - Log directory path
- LLAMACTL_AUTO_CREATE_DATA_DIR - Auto-create data/config/logs directories (true/false)
- LLAMACTL_MAX_INSTANCES - Maximum number of instances
- LLAMACTL_LLAMA_EXECUTABLE - Path to llama-server executable
- LLAMACTL_DEFAULT_AUTO_RESTART - Default auto-restart setting (true/false)
- LLAMACTL_DEFAULT_MAX_RESTARTS - Default maximum restarts
- LLAMACTL_DEFAULT_RESTART_DELAY - Default restart delay in seconds
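Both documented formats for the port range behave the same, for example:
# Equivalent ways to set the instance port range
LLAMACTL_INSTANCE_PORT_RANGE="8000-9000" ./llamactl
LLAMACTL_INSTANCE_PORT_RANGE="8000,9000" ./llamactl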
Auth Configuration
auth:
  require_inference_auth: true   # Require API key for OpenAI endpoints (default: true)
  inference_keys: []             # List of valid inference API keys
  require_management_auth: true  # Require API key for management endpoints (default: true)
  management_keys: []            # List of valid management API keys
Environment Variables:
- LLAMACTL_REQUIRE_INFERENCE_AUTH - Require auth for OpenAI endpoints (true/false)
- LLAMACTL_INFERENCE_KEYS - Comma-separated inference API keys
- LLAMACTL_REQUIRE_MANAGEMENT_AUTH - Require auth for management endpoints (true/false)
- LLAMACTL_MANAGEMENT_KEYS - Comma-separated management API keys
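A typical production setup exports the keys before starting the server, rather than relying on auto-generated ones (the key values here are placeholders):
# Set your own management and inference keys
export LLAMACTL_MANAGEMENT_KEYS="sk-management-xyz456"
export LLAMACTL_INFERENCE_KEYS="sk-inference-abc123,sk-inference-def789"
./llamactl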
Example Configuration
server:
  host: "0.0.0.0"
  port: 8080

instances:
  port_range: [8001, 8100]
  data_dir: "/var/lib/llamactl"
  configs_dir: "/var/lib/llamactl/instances"
  logs_dir: "/var/log/llamactl"
  auto_create_dirs: true
  max_instances: 10
  llama_executable: "/usr/local/bin/llama-server"
  default_auto_restart: true
  default_max_restarts: 5
  default_restart_delay: 10

auth:
  require_inference_auth: true
  inference_keys: ["sk-inference-abc123"]
  require_management_auth: true
  management_keys: ["sk-management-xyz456"]
Usage
Starting the Server
# Start with default configuration
./llamactl
# Start with custom config file
LLAMACTL_CONFIG_PATH=/path/to/config.yaml ./llamactl
# Start with environment variables
LLAMACTL_PORT=9090 LLAMACTL_LOGS_DIR=/custom/logs ./llamactl
Web Dashboard
Open your browser and navigate to http://localhost:8080 to access the web dashboard.
API Usage
The REST API is available at http://localhost:8080/api/v1. See the Swagger documentation at http://localhost:8080/swagger/ for the complete API reference.
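With the default configuration, management endpoints require a management API key (see Auth Configuration above). The examples below omit the key for brevity; a minimal authenticated request looks like this, using the placeholder key from the example configuration:
curl -H "Authorization: Bearer sk-management-xyz456" http://localhost:8080/api/v1/instances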
Create an Instance
curl -X POST http://localhost:8080/api/v1/instances/my-instance \
  -H "Content-Type: application/json" \
  -d '{
    "model": "/path/to/model.gguf",
    "gpu_layers": 32,
    "auto_restart": true
  }'
List Instances
curl http://localhost:8080/api/v1/instances
Start/Stop Instance
# Start
curl -X POST http://localhost:8080/api/v1/instances/my-instance/start
# Stop
curl -X POST http://localhost:8080/api/v1/instances/my-instance/stop
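The feature list also includes restarting instances. Assuming the restart route mirrors start and stop (an assumption; confirm the exact path in the Swagger documentation):
# Restart (assumed route, following the start/stop pattern)
curl -X POST http://localhost:8080/api/v1/instances/my-instance/restart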
OpenAI Compatible Endpoints
Route requests to instances by including the instance name as the model parameter:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-instance",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
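If inference authentication is enabled (the default), attach an inference key using any of the methods from the authentication section, for example:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-inference-abc123" \
  -d '{
    "model": "my-instance",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'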
Development
Running Tests
# Go tests
go test ./...
# Web UI tests
cd webui
npm test
Development Server
# Start Go server in development mode
go run ./cmd/server
# Start web UI development server (in another terminal)
cd webui
npm run dev
API Documentation
Interactive API documentation is available at http://localhost:8080/swagger/ when the server is running.
License
This project is licensed under the MIT License. See the LICENSE file for details.