
Installation

This guide will walk you through installing Llamactl on your system.

Prerequisites

Backend Dependencies

llamactl supports multiple backends. Install at least one:

For llama.cpp backend (all platforms):

You need llama-server from llama.cpp installed:

# Homebrew (macOS/Linux)
brew install llama.cpp
# Winget (Windows)
winget install llama.cpp

Or build from source - see llama.cpp docs
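
As a quick sanity check (assuming the install put llama-server on your PATH), you can confirm the binary is found and ask it for its version:

# Confirm llama-server is installed and on your PATH
command -v llama-server
llama-server --version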

For MLX backend (macOS only):

MLX provides optimized inference on Apple Silicon. Install MLX-LM:

# Install via pip (requires Python 3.8+)
pip install mlx-lm

# Or in a virtual environment (recommended)
python -m venv mlx-env
source mlx-env/bin/activate
pip install mlx-lm

Note: The MLX backend is only available on macOS with Apple Silicon (M1, M2, M3, etc.)
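
To verify the install (assuming pip installed into the currently active environment), ask pip for the package and try the import:

# Confirm mlx-lm is installed in the active environment
python -m pip show mlx-lm
python -c "import mlx_lm; print('mlx-lm import OK')"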

For vLLM backend:

vLLM provides high-throughput distributed serving for LLMs. Install vLLM:

# Install via pip (requires Python 3.8+, GPU required)
pip install vllm

# Or in a virtual environment (recommended)
python -m venv vllm-env
source vllm-env/bin/activate
pip install vllm

# For production deployments, consider container-based installation
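
A minimal post-install check (again assuming the active environment is the one you installed into) is to import vLLM and print its version:

# Confirm vLLM imports and report its version
python -c "import vllm; print(vllm.__version__)"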

Installation Methods

Option 1: Download Binary

Download the latest release from the GitHub releases page:

# Linux/macOS - Get latest version and download
LATEST_VERSION=$(curl -s https://api.github.com/repos/lordmathis/llamactl/releases/latest | grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/')
curl -L https://github.com/lordmathis/llamactl/releases/download/${LATEST_VERSION}/llamactl-${LATEST_VERSION}-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m).tar.gz | tar -xz
sudo mv llamactl /usr/local/bin/

# Or download manually from:
# https://github.com/lordmathis/llamactl/releases/latest

# Windows - Download from releases page

Option 2: Docker

llamactl provides Dockerfiles for creating Docker images with CUDA support for llama.cpp and vLLM backends. The resulting images include the latest llamactl release with the respective backend pre-installed.

Available Dockerfiles:

  • llamactl with llama.cpp CUDA: Dockerfile.llamacpp (based on ghcr.io/ggml-org/llama.cpp:server)
  • llamactl with vLLM CUDA: Dockerfile.vllm (based on vllm/vllm-openai:latest)

Using Docker Compose

# Clone the repository
git clone https://github.com/lordmathis/llamactl.git
cd llamactl

# Create directories for data and models
mkdir -p data/llamacpp data/vllm models

# Start llamactl with llama.cpp backend
docker-compose up -d llamactl-llamacpp

# Or start llamactl with vLLM backend
docker-compose up -d llamactl-vllm
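
To check that a service came up cleanly, the standard Compose commands work; the service names below are the ones used in the commands above:

# Check service status and follow its logs
docker-compose ps
docker-compose logs -f llamactl-llamacpp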

Access the dashboard at http://localhost:8080 (adjust the port if your compose file maps a different one).

Using Docker Build and Run

llamactl with llama.cpp CUDA:

docker build -f Dockerfile.llamacpp -t llamactl:llamacpp-cuda .
docker run -d \
  --name llamactl-llamacpp \
  --gpus all \
  -p 8080:8080 \
  -v $(pwd)/data/llamacpp:/data \
  -v $(pwd)/models:/models \
  -e LLAMACTL_LLAMACPP_COMMAND=llama-server \
  llamactl:llamacpp-cuda

llamactl with vLLM CUDA:

docker build -f Dockerfile.vllm -t llamactl:vllm-cuda .
docker run -d \
  --name llamactl-vllm \
  --gpus all \
  -p 8080:8080 \
  -v $(pwd)/data/vllm:/data \
  -v $(pwd)/models:/models \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e LLAMACTL_VLLM_COMMAND=vllm \
  -e LLAMACTL_VLLM_ARGS=serve \
  llamactl:vllm-cuda

Docker-Specific Configuration:

  • Set LLAMACTL_LLAMACPP_COMMAND=llama-server to use the pre-installed llama-server
  • Set LLAMACTL_VLLM_COMMAND=vllm to use the pre-installed vLLM
  • Mount /data for llamactl data and /models for your model files
  • Use --gpus all for GPU access
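
After either docker run command, you can confirm the container is running and watch the backend initialize (container names as set via --name above):

# Confirm the container is running and inspect startup logs
docker ps --filter name=llamactl
docker logs -f llamactl-llamacpp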

Option 3: Build from Source

Requirements:

  • Go 1.24 or later
  • Node.js 22 or later
  • Git

If you prefer to build from source:

# Clone the repository
git clone https://github.com/lordmathis/llamactl.git
cd llamactl

# Build the web UI
cd webui && npm ci && npm run build && cd ..

# Build the application
go build -o llamactl ./cmd/server
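
The built binary ends up in the repository root; you can run it in place, or install it the same way as the prebuilt binary above:

# Run the freshly built binary
./llamactl --version

# Optionally install it system-wide (mirrors the binary install step above)
sudo mv llamactl /usr/local/bin/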

Verification

Verify your installation by checking the version:

llamactl --version

Next Steps

Now that Llamactl is installed, continue to the Quick Start guide to get your first instance running!