diff --git a/README.md b/README.md
index d9edfd5..3eed452 100644
--- a/README.md
+++ b/README.md
@@ -123,7 +123,6 @@ instances:
on_demand_start_timeout: 120 # Default on-demand start timeout in seconds
timeout_check_interval: 5 # Idle instance timeout check in minutes
-
auth:
require_inference_auth: true # Require auth for inference endpoints
inference_keys: [] # Keys for inference endpoints
@@ -131,107 +130,7 @@ auth:
management_keys: [] # Keys for management endpoints
```
-Full Configuration Guide
-
-llamactl can be configured via configuration files or environment variables. Configuration is loaded in the following order of precedence:
-
-```
-Defaults < Configuration file < Environment variables
-```
-
-### Configuration Files
-
-#### Configuration File Locations
-
-Configuration files are searched in the following locations (in order of precedence):
-
-**Linux/macOS:**
-- `./llamactl.yaml` or `./config.yaml` (current directory)
-- `$HOME/.config/llamactl/config.yaml`
-- `/etc/llamactl/config.yaml`
-
-**Windows:**
-- `./llamactl.yaml` or `./config.yaml` (current directory)
-- `%APPDATA%\llamactl\config.yaml`
-- `%USERPROFILE%\llamactl\config.yaml`
-- `%PROGRAMDATA%\llamactl\config.yaml`
-
-You can specify the path to config file with `LLAMACTL_CONFIG_PATH` environment variable.
-
-### Configuration Options
-
-#### Server Configuration
-
-```yaml
-server:
- host: "0.0.0.0" # Server host to bind to (default: "0.0.0.0")
- port: 8080 # Server port to bind to (default: 8080)
- allowed_origins: ["*"] # CORS allowed origins (default: ["*"])
- enable_swagger: false # Enable Swagger UI (default: false)
-```
-
-**Environment Variables:**
-- `LLAMACTL_HOST` - Server host
-- `LLAMACTL_PORT` - Server port
-- `LLAMACTL_ALLOWED_ORIGINS` - Comma-separated CORS origins
-- `LLAMACTL_ENABLE_SWAGGER` - Enable Swagger UI (true/false)
-
-#### Instance Configuration
-
-```yaml
-instances:
- port_range: [8000, 9000] # Port range for instances (default: [8000, 9000])
- data_dir: "~/.local/share/llamactl" # Directory for all llamactl data (default varies by OS)
- configs_dir: "~/.local/share/llamactl/instances" # Directory for instance configs (default: data_dir/instances)
- logs_dir: "~/.local/share/llamactl/logs" # Directory for instance logs (default: data_dir/logs)
- auto_create_dirs: true # Automatically create data/config/logs directories (default: true)
- max_instances: -1 # Maximum instances (-1 = unlimited)
- max_running_instances: -1 # Maximum running instances (-1 = unlimited)
- enable_lru_eviction: true # Enable LRU eviction for idle instances
- llama_executable: "llama-server" # Path to llama-server executable
- default_auto_restart: true # Default auto-restart setting
- default_max_restarts: 3 # Default maximum restart attempts
- default_restart_delay: 5 # Default restart delay in seconds
- default_on_demand_start: true # Default on-demand start setting
- on_demand_start_timeout: 120 # Default on-demand start timeout in seconds
- timeout_check_interval: 5 # Default instance timeout check interval in minutes
-```
-
-**Environment Variables:**
-- `LLAMACTL_INSTANCE_PORT_RANGE` - Port range (format: "8000-9000" or "8000,9000")
-- `LLAMACTL_DATA_DIRECTORY` - Data directory path
-- `LLAMACTL_INSTANCES_DIR` - Instance configs directory path
-- `LLAMACTL_LOGS_DIR` - Log directory path
-- `LLAMACTL_AUTO_CREATE_DATA_DIR` - Auto-create data/config/logs directories (true/false)
-- `LLAMACTL_MAX_INSTANCES` - Maximum number of instances
-- `LLAMACTL_MAX_RUNNING_INSTANCES` - Maximum number of running instances
-- `LLAMACTL_ENABLE_LRU_EVICTION` - Enable LRU eviction for idle instances
-- `LLAMACTL_LLAMA_EXECUTABLE` - Path to llama-server executable
-- `LLAMACTL_DEFAULT_AUTO_RESTART` - Default auto-restart setting (true/false)
-- `LLAMACTL_DEFAULT_MAX_RESTARTS` - Default maximum restarts
-- `LLAMACTL_DEFAULT_RESTART_DELAY` - Default restart delay in seconds
-- `LLAMACTL_DEFAULT_ON_DEMAND_START` - Default on-demand start setting (true/false)
-- `LLAMACTL_ON_DEMAND_START_TIMEOUT` - Default on-demand start timeout in seconds
-- `LLAMACTL_TIMEOUT_CHECK_INTERVAL` - Default instance timeout check interval in minutes
-
-
-#### Authentication Configuration
-
-```yaml
-auth:
- require_inference_auth: true # Require API key for OpenAI endpoints (default: true)
- inference_keys: [] # List of valid inference API keys
- require_management_auth: true # Require API key for management endpoints (default: true)
- management_keys: [] # List of valid management API keys
-```
-
-**Environment Variables:**
-- `LLAMACTL_REQUIRE_INFERENCE_AUTH` - Require auth for OpenAI endpoints (true/false)
-- `LLAMACTL_INFERENCE_KEYS` - Comma-separated inference API keys
-- `LLAMACTL_REQUIRE_MANAGEMENT_AUTH` - Require auth for management endpoints (true/false)
-- `LLAMACTL_MANAGEMENT_KEYS` - Comma-separated management API keys
-
-
+For detailed configuration options, including environment variables, file locations, and advanced settings, see the [Configuration Guide](docs/getting-started/configuration.md).
## License
diff --git a/docs/development/building.md b/docs/development/building.md
deleted file mode 100644
index a102915..0000000
--- a/docs/development/building.md
+++ /dev/null
@@ -1,464 +0,0 @@
-# Building from Source
-
-This guide covers building Llamactl from source code for development and production deployment.
-
-## Prerequisites
-
-### Required Tools
-
-- **Go 1.24+**: Download from [golang.org](https://golang.org/dl/)
-- **Node.js 22+**: Download from [nodejs.org](https://nodejs.org/)
-- **Git**: For cloning the repository
-- **Make**: For build automation (optional)
-
-### System Requirements
-
-- **Memory**: 4GB+ RAM for building
-- **Disk**: 2GB+ free space
-- **OS**: Linux, macOS, or Windows
-
-## Quick Build
-
-### Clone and Build
-
-```bash
-# Clone the repository
-git clone https://github.com/lordmathis/llamactl.git
-cd llamactl
-
-# Build the application
-go build -o llamactl cmd/server/main.go
-```
-
-### Run
-
-```bash
-./llamactl
-```
-
-## Development Build
-
-### Setup Development Environment
-
-```bash
-# Clone repository
-git clone https://github.com/lordmathis/llamactl.git
-cd llamactl
-
-# Install Go dependencies
-go mod download
-
-# Install frontend dependencies
-cd webui
-npm ci
-cd ..
-```
-
-### Build Components
-
-```bash
-# Build backend only
-go build -o llamactl cmd/server/main.go
-
-# Build frontend only
-cd webui
-npm run build
-cd ..
-
-# Build everything
-make build
-```
-
-### Development Server
-
-```bash
-# Run backend in development mode
-go run cmd/server/main.go --dev
-
-# Run frontend dev server (separate terminal)
-cd webui
-npm run dev
-```
-
-## Production Build
-
-### Optimized Build
-
-```bash
-# Build with optimizations
-go build -ldflags="-s -w" -o llamactl cmd/server/main.go
-
-# Or use the Makefile
-make build-prod
-```
-
-### Build Flags
-
-Common build flags for production:
-
-```bash
-go build \
- -ldflags="-s -w -X main.version=1.0.0 -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
- -trimpath \
- -o llamactl \
- cmd/server/main.go
-```
-
-**Flag explanations:**
-- `-s`: Strip symbol table
-- `-w`: Strip debug information
-- `-X`: Set variable values at build time
-- `-trimpath`: Remove absolute paths from binary
-
-## Cross-Platform Building
-
-### Build for Multiple Platforms
-
-```bash
-# Linux AMD64
-GOOS=linux GOARCH=amd64 go build -o llamactl-linux-amd64 cmd/server/main.go
-
-# Linux ARM64
-GOOS=linux GOARCH=arm64 go build -o llamactl-linux-arm64 cmd/server/main.go
-
-# macOS AMD64
-GOOS=darwin GOARCH=amd64 go build -o llamactl-darwin-amd64 cmd/server/main.go
-
-# macOS ARM64 (Apple Silicon)
-GOOS=darwin GOARCH=arm64 go build -o llamactl-darwin-arm64 cmd/server/main.go
-
-# Windows AMD64
-GOOS=windows GOARCH=amd64 go build -o llamactl-windows-amd64.exe cmd/server/main.go
-```
-
-### Automated Cross-Building
-
-Use the provided Makefile:
-
-```bash
-# Build all platforms
-make build-all
-
-# Build specific platform
-make build-linux
-make build-darwin
-make build-windows
-```
-
-## Build with Docker
-
-### Development Container
-
-```dockerfile
-# Dockerfile.dev
-FROM golang:1.24-alpine AS builder
-
-WORKDIR /app
-COPY go.mod go.sum ./
-RUN go mod download
-
-COPY . .
-RUN go build -o llamactl cmd/server/main.go
-
-FROM alpine:latest
-RUN apk --no-cache add ca-certificates
-WORKDIR /root/
-COPY --from=builder /app/llamactl .
-
-EXPOSE 8080
-CMD ["./llamactl"]
-```
-
-```bash
-# Build development image
-docker build -f Dockerfile.dev -t llamactl:dev .
-
-# Run container
-docker run -p 8080:8080 llamactl:dev
-```
-
-### Production Container
-
-```dockerfile
-# Dockerfile
-FROM node:22-alpine AS frontend-builder
-
-WORKDIR /app/webui
-COPY webui/package*.json ./
-RUN npm ci
-
-COPY webui/ ./
-RUN npm run build
-
-FROM golang:1.24-alpine AS backend-builder
-
-WORKDIR /app
-COPY go.mod go.sum ./
-RUN go mod download
-
-COPY . .
-COPY --from=frontend-builder /app/webui/dist ./webui/dist
-
-RUN CGO_ENABLED=0 GOOS=linux go build \
- -ldflags="-s -w" \
- -o llamactl \
- cmd/server/main.go
-
-FROM alpine:latest
-
-RUN apk --no-cache add ca-certificates tzdata
-RUN adduser -D -s /bin/sh llamactl
-
-WORKDIR /home/llamactl
-COPY --from=backend-builder /app/llamactl .
-RUN chown llamactl:llamactl llamactl
-
-USER llamactl
-EXPOSE 8080
-
-CMD ["./llamactl"]
-```
-
-## Advanced Build Options
-
-### Static Linking
-
-For deployments without external dependencies:
-
-```bash
-CGO_ENABLED=0 go build \
- -ldflags="-s -w -extldflags '-static'" \
- -o llamactl-static \
- cmd/server/main.go
-```
-
-### Debug Build
-
-Build with debug information:
-
-```bash
-go build -gcflags="all=-N -l" -o llamactl-debug cmd/server/main.go
-```
-
-### Race Detection Build
-
-Build with race detection (development only):
-
-```bash
-go build -race -o llamactl-race cmd/server/main.go
-```
-
-## Build Automation
-
-### Makefile
-
-```makefile
-# Makefile
-VERSION := $(shell git describe --tags --always --dirty)
-BUILD_TIME := $(shell date -u +%Y-%m-%dT%H:%M:%SZ)
-LDFLAGS := -s -w -X main.version=$(VERSION) -X main.buildTime=$(BUILD_TIME)
-
-.PHONY: build clean test install
-
-build:
- @echo "Building Llamactl..."
- @cd webui && npm run build
- @go build -ldflags="$(LDFLAGS)" -o llamactl cmd/server/main.go
-
-build-prod:
- @echo "Building production binary..."
- @cd webui && npm run build
- @CGO_ENABLED=0 go build -ldflags="$(LDFLAGS)" -trimpath -o llamactl cmd/server/main.go
-
-build-all: build-linux build-darwin build-windows
-
-build-linux:
- @GOOS=linux GOARCH=amd64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-linux-amd64 cmd/server/main.go
- @GOOS=linux GOARCH=arm64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-linux-arm64 cmd/server/main.go
-
-build-darwin:
- @GOOS=darwin GOARCH=amd64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-darwin-amd64 cmd/server/main.go
- @GOOS=darwin GOARCH=arm64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-darwin-arm64 cmd/server/main.go
-
-build-windows:
- @GOOS=windows GOARCH=amd64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-windows-amd64.exe cmd/server/main.go
-
-test:
- @go test ./...
-
-clean:
- @rm -f llamactl llamactl-*
- @rm -rf dist/
-
-install: build
- @cp llamactl $(GOPATH)/bin/llamactl
-```
-
-### GitHub Actions
-
-```yaml
-# .github/workflows/build.yml
-name: Build
-
-on:
- push:
- branches: [ main ]
- pull_request:
- branches: [ main ]
-
-jobs:
- test:
- runs-on: ubuntu-latest
- steps:
- - uses: actions/checkout@v4
-
- - name: Set up Go
- uses: actions/setup-go@v4
- with:
- go-version: '1.24'
-
- - name: Set up Node.js
- uses: actions/setup-node@v4
- with:
- node-version: '22'
-
- - name: Install dependencies
- run: |
- go mod download
- cd webui && npm ci
-
- - name: Run tests
- run: |
- go test ./...
- cd webui && npm test
-
- - name: Build
- run: make build
-
- build:
- needs: test
- runs-on: ubuntu-latest
- if: github.ref == 'refs/heads/main'
-
- steps:
- - uses: actions/checkout@v4
-
- - name: Set up Go
- uses: actions/setup-go@v4
- with:
- go-version: '1.24'
-
- - name: Set up Node.js
- uses: actions/setup-node@v4
- with:
- node-version: '22'
-
- - name: Build all platforms
- run: make build-all
-
- - name: Upload artifacts
- uses: actions/upload-artifact@v4
- with:
- name: binaries
- path: dist/
-```
-
-## Build Troubleshooting
-
-### Common Issues
-
-**Go version mismatch:**
-```bash
-# Check Go version
-go version
-
-# Update Go
-# Download from https://golang.org/dl/
-```
-
-**Node.js issues:**
-```bash
-# Clear npm cache
-npm cache clean --force
-
-# Remove node_modules and reinstall
-rm -rf webui/node_modules
-cd webui && npm ci
-```
-
-**Build failures:**
-```bash
-# Clean and rebuild
-make clean
-go mod tidy
-make build
-```
-
-### Performance Issues
-
-**Slow builds:**
-```bash
-# Use build cache
-export GOCACHE=$(go env GOCACHE)
-
-# Parallel builds
-export GOMAXPROCS=$(nproc)
-```
-
-**Large binary size:**
-```bash
-# Use UPX compression
-upx --best llamactl
-
-# Analyze binary size
-go tool nm -size llamactl | head -20
-```
-
-## Deployment
-
-### System Service
-
-Create a systemd service:
-
-```ini
-# /etc/systemd/system/llamactl.service
-[Unit]
-Description=Llamactl Server
-After=network.target
-
-[Service]
-Type=simple
-User=llamactl
-Group=llamactl
-ExecStart=/usr/local/bin/llamactl
-Restart=always
-RestartSec=5
-
-[Install]
-WantedBy=multi-user.target
-```
-
-```bash
-# Enable and start service
-sudo systemctl enable llamactl
-sudo systemctl start llamactl
-```
-
-### Configuration
-
-```bash
-# Create configuration directory
-sudo mkdir -p /etc/llamactl
-
-# Copy configuration
-sudo cp config.yaml /etc/llamactl/
-
-# Set permissions
-sudo chown -R llamactl:llamactl /etc/llamactl
-```
-
-## Next Steps
-
-- Configure [Installation](../getting-started/installation.md)
-- Set up [Configuration](../getting-started/configuration.md)
-- Learn about [Contributing](contributing.md)
diff --git a/docs/development/contributing.md b/docs/development/contributing.md
deleted file mode 100644
index 3b27d90..0000000
--- a/docs/development/contributing.md
+++ /dev/null
@@ -1,373 +0,0 @@
-# Contributing
-
-Thank you for your interest in contributing to Llamactl! This guide will help you get started with development and contribution.
-
-## Development Setup
-
-### Prerequisites
-
-- Go 1.24 or later
-- Node.js 22 or later
-- `llama-server` executable (from [llama.cpp](https://github.com/ggml-org/llama.cpp))
-- Git
-
-### Getting Started
-
-1. **Fork and Clone**
- ```bash
- # Fork the repository on GitHub, then clone your fork
- git clone https://github.com/yourusername/llamactl.git
- cd llamactl
-
- # Add upstream remote
- git remote add upstream https://github.com/lordmathis/llamactl.git
- ```
-
-2. **Install Dependencies**
- ```bash
- # Go dependencies
- go mod download
-
- # Frontend dependencies
- cd webui && npm ci && cd ..
- ```
-
-3. **Run Development Environment**
- ```bash
- # Start backend server
- go run ./cmd/server
- ```
-
- In a separate terminal:
- ```bash
- # Start frontend dev server
- cd webui && npm run dev
- ```
-
-## Development Workflow
-
-### Setting Up Your Environment
-
-1. **Configuration**
- Create a development configuration file:
- ```yaml
- # dev-config.yaml
- server:
- host: "localhost"
- port: 8080
- logging:
- level: "debug"
- ```
-
-2. **Test Data**
- Set up test models and instances for development.
-
-### Making Changes
-
-1. **Create a Branch**
- ```bash
- git checkout -b feature/your-feature-name
- ```
-
-2. **Development Commands**
- ```bash
- # Backend
- go test ./... -v # Run tests
- go test -race ./... -v # Run with race detector
- go fmt ./... && go vet ./... # Format and vet code
- go build ./cmd/server # Build binary
-
- # Frontend (from webui/ directory)
- npm run test # Run tests
- npm run lint # Lint code
- npm run type-check # TypeScript check
- npm run build # Build for production
- ```
-
-3. **Code Quality**
- ```bash
- # Run all checks before committing
- make lint
- make test
- make build
- ```
-
-## Project Structure
-
-### Backend (Go)
-
-```
-cmd/
-├── server/ # Main application entry point
-pkg/
-├── backends/ # Model backend implementations
-├── config/ # Configuration management
-├── instance/ # Instance lifecycle management
-├── manager/ # Instance manager
-├── server/ # HTTP server and routes
-├── testutil/ # Test utilities
-└── validation/ # Input validation
-```
-
-### Frontend (React/TypeScript)
-
-```
-webui/src/
-├── components/ # React components
-├── contexts/ # React contexts
-├── hooks/ # Custom hooks
-├── lib/ # Utility libraries
-├── schemas/ # Zod schemas
-└── types/ # TypeScript types
-```
-
-## Coding Standards
-
-### Go Code
-
-- Follow standard Go formatting (`gofmt`)
-- Use `go vet` and address all warnings
-- Write comprehensive tests for new functionality
-- Include documentation comments for exported functions
-- Use meaningful variable and function names
-
-Example:
-```go
-// CreateInstance creates a new model instance with the given configuration.
-// It validates the configuration and ensures the instance name is unique.
-func (m *Manager) CreateInstance(ctx context.Context, config InstanceConfig) (*Instance, error) {
- if err := config.Validate(); err != nil {
- return nil, fmt.Errorf("invalid configuration: %w", err)
- }
-
- // Implementation...
-}
-```
-
-### TypeScript/React Code
-
-- Use TypeScript strict mode
-- Follow React best practices
-- Use functional components with hooks
-- Implement proper error boundaries
-- Write unit tests for components
-
-Example:
-```typescript
-interface InstanceCardProps {
- instance: Instance;
-  onStart: (name: string) => Promise<void>;
-  onStop: (name: string) => Promise<void>;
-}
-
-export const InstanceCard: React.FC<InstanceCardProps> = ({
- instance,
- onStart,
- onStop,
-}) => {
- // Implementation...
-};
-```
-
-## Testing
-
-### Backend Tests
-
-```bash
-# Run all tests
-go test ./...
-
-# Run tests with coverage
-go test ./... -coverprofile=coverage.out
-go tool cover -html=coverage.out
-
-# Run specific package tests
-go test ./pkg/manager -v
-
-# Run with race detection
-go test -race ./...
-```
-
-### Frontend Tests
-
-```bash
-cd webui
-
-# Run unit tests
-npm run test
-
-# Run tests with coverage
-npm run test:coverage
-
-# Run E2E tests
-npm run test:e2e
-```
-
-### Integration Tests
-
-```bash
-# Run integration tests (requires llama-server)
-go test ./... -tags=integration
-```
-
-## Pull Request Process
-
-### Before Submitting
-
-1. **Update your branch**
- ```bash
- git fetch upstream
- git rebase upstream/main
- ```
-
-2. **Run all tests**
- ```bash
- make test-all
- ```
-
-3. **Update documentation** if needed
-
-4. **Write clear commit messages**
- ```
- feat: add instance health monitoring
-
- - Implement health check endpoint
- - Add periodic health monitoring
- - Update API documentation
-
- Fixes #123
- ```
-
-### Submitting a PR
-
-1. **Push your branch**
- ```bash
- git push origin feature/your-feature-name
- ```
-
-2. **Create Pull Request**
- - Use the PR template
- - Provide clear description
- - Link related issues
- - Add screenshots for UI changes
-
-3. **PR Review Process**
- - Automated checks must pass
- - Code review by maintainers
- - Address feedback promptly
- - Keep PR scope focused
-
-## Issue Guidelines
-
-### Reporting Bugs
-
-Use the bug report template and include:
-
-- Steps to reproduce
-- Expected vs actual behavior
-- Environment details (OS, Go version, etc.)
-- Relevant logs or error messages
-- Minimal reproduction case
-
-### Feature Requests
-
-Use the feature request template and include:
-
-- Clear description of the problem
-- Proposed solution
-- Alternative solutions considered
-- Implementation complexity estimate
-
-### Security Issues
-
-For security vulnerabilities:
-- Do NOT create public issues
-- Email security@llamactl.dev
-- Provide detailed description
-- Allow time for fix before disclosure
-
-## Development Best Practices
-
-### API Design
-
-- Follow REST principles
-- Use consistent naming conventions
-- Provide comprehensive error messages
-- Include proper HTTP status codes
-- Document all endpoints
-
-### Error Handling
-
-```go
-// Wrap errors with context
-if err := instance.Start(); err != nil {
- return fmt.Errorf("failed to start instance %s: %w", instance.Name, err)
-}
-
-// Use structured logging
-log.WithFields(log.Fields{
- "instance": instance.Name,
- "error": err,
-}).Error("Failed to start instance")
-```
-
-### Configuration
-
-- Use environment variables for deployment
-- Provide sensible defaults
-- Validate configuration on startup
-- Support configuration file reloading
-
-### Performance
-
-- Profile code for bottlenecks
-- Use efficient data structures
-- Implement proper caching
-- Monitor resource usage
-
-## Release Process
-
-### Version Management
-
-- Use semantic versioning (SemVer)
-- Tag releases properly
-- Maintain CHANGELOG.md
-- Create release notes
-
-### Building Releases
-
-```bash
-# Build all platforms
-make build-all
-
-# Create release package
-make package
-```
-
-## Getting Help
-
-### Communication Channels
-
-- **GitHub Issues**: Bug reports and feature requests
-- **GitHub Discussions**: General questions and ideas
-- **Code Review**: PR comments and feedback
-
-### Development Questions
-
-When asking for help:
-
-1. Check existing documentation
-2. Search previous issues
-3. Provide minimal reproduction case
-4. Include relevant environment details
-
-## Recognition
-
-Contributors are recognized in:
-
-- CONTRIBUTORS.md file
-- Release notes
-- Documentation credits
-- Annual contributor highlights
-
-Thank you for contributing to Llamactl!
diff --git a/docs/getting-started/configuration.md b/docs/getting-started/configuration.md
index e9ba2d3..3a859ee 100644
--- a/docs/getting-started/configuration.md
+++ b/docs/getting-started/configuration.md
@@ -1,59 +1,144 @@
# Configuration
-Llamactl can be configured through various methods to suit your needs.
+llamactl can be configured via configuration files or environment variables. Configuration is loaded in the following order of precedence:
-## Configuration File
+```
+Defaults < Configuration file < Environment variables
+```
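+
+As a quick illustration of this precedence (the file name and port values here are just examples), a value in a config file overrides the built-in default, and an environment variable overrides both:
+
+```bash
+# ./llamactl.yaml is picked up from the current directory and sets the port to 9090...
+echo 'server: {port: 9090}' > llamactl.yaml
+# ...but the environment variable takes precedence, so the server binds to 10080
+LLAMACTL_PORT=10080 llamactl
+```
+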
-Create a configuration file at `~/.llamactl/config.yaml`:
+llamactl works out of the box with sensible defaults, but you can customize the behavior to suit your needs.
+
+## Default Configuration
+
+Here's the default configuration with all available options:
```yaml
-# Server configuration
server:
- host: "0.0.0.0"
- port: 8080
- cors_enabled: true
+ host: "0.0.0.0" # Server host to bind to
+ port: 8080 # Server port to bind to
+ allowed_origins: ["*"] # Allowed CORS origins (default: all)
+ enable_swagger: false # Enable Swagger UI for API docs
+
+instances:
+ port_range: [8000, 9000] # Port range for instances
+ data_dir: ~/.local/share/llamactl # Data directory (platform-specific, see below)
+ configs_dir: ~/.local/share/llamactl/instances # Instance configs directory
+ logs_dir: ~/.local/share/llamactl/logs # Logs directory
+ auto_create_dirs: true # Auto-create data/config/logs dirs if missing
+ max_instances: -1 # Max instances (-1 = unlimited)
+ max_running_instances: -1 # Max running instances (-1 = unlimited)
+ enable_lru_eviction: true # Enable LRU eviction for idle instances
+ llama_executable: llama-server # Path to llama-server executable
+ default_auto_restart: true # Auto-restart new instances by default
+ default_max_restarts: 3 # Max restarts for new instances
+ default_restart_delay: 5 # Restart delay (seconds) for new instances
+ default_on_demand_start: true # Default on-demand start setting
+ on_demand_start_timeout: 120 # Default on-demand start timeout in seconds
+ timeout_check_interval: 5 # Idle instance timeout check in minutes
-# Authentication (optional)
auth:
- enabled: false
- # When enabled, configure your authentication method
- # jwt_secret: "your-secret-key"
-
-# Default instance settings
-defaults:
- backend: "llamacpp"
- timeout: 300
- log_level: "info"
-
-# Paths
-paths:
- models_dir: "/path/to/your/models"
- logs_dir: "/var/log/llamactl"
- data_dir: "/var/lib/llamactl"
-
-# Instance limits
-limits:
- max_instances: 10
- max_memory_per_instance: "8GB"
+ require_inference_auth: true # Require auth for inference endpoints
+ inference_keys: [] # Keys for inference endpoints
+ require_management_auth: true # Require auth for management endpoints
+ management_keys: [] # Keys for management endpoints
```
-## Environment Variables
+## Configuration Files
-You can also configure Llamactl using environment variables:
+### Configuration File Locations
-```bash
-# Server settings
-export LLAMACTL_HOST=0.0.0.0
-export LLAMACTL_PORT=8080
+Configuration files are searched in the following locations (in order of precedence):
-# Paths
-export LLAMACTL_MODELS_DIR=/path/to/models
-export LLAMACTL_LOGS_DIR=/var/log/llamactl
+**Linux:**
+- `./llamactl.yaml` or `./config.yaml` (current directory)
+- `$HOME/.config/llamactl/config.yaml`
+- `/etc/llamactl/config.yaml`
-# Limits
-export LLAMACTL_MAX_INSTANCES=5
+**macOS:**
+- `./llamactl.yaml` or `./config.yaml` (current directory)
+- `$HOME/Library/Application Support/llamactl/config.yaml`
+- `/Library/Application Support/llamactl/config.yaml`
+
+**Windows:**
+- `./llamactl.yaml` or `./config.yaml` (current directory)
+- `%APPDATA%\llamactl\config.yaml`
+- `%USERPROFILE%\llamactl\config.yaml`
+- `%PROGRAMDATA%\llamactl\config.yaml`
+
+You can specify the path to the config file with the `LLAMACTL_CONFIG_PATH` environment variable.
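+
+For example, to start llamactl with a config file outside the default search paths (the path below is illustrative):
+
+```bash
+LLAMACTL_CONFIG_PATH=/opt/llamactl/config.yaml llamactl
+```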
+
+## Configuration Options
+
+### Server Configuration
+
+```yaml
+server:
+ host: "0.0.0.0" # Server host to bind to (default: "0.0.0.0")
+ port: 8080 # Server port to bind to (default: 8080)
+ allowed_origins: ["*"] # CORS allowed origins (default: ["*"])
+ enable_swagger: false # Enable Swagger UI (default: false)
```
+**Environment Variables:**
+- `LLAMACTL_HOST` - Server host
+- `LLAMACTL_PORT` - Server port
+- `LLAMACTL_ALLOWED_ORIGINS` - Comma-separated CORS origins
+- `LLAMACTL_ENABLE_SWAGGER` - Enable Swagger UI (true/false)
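+
+For example, you might tighten CORS and enable the Swagger UI via the environment before starting llamactl (values are illustrative):
+
+```bash
+export LLAMACTL_ALLOWED_ORIGINS="https://example.com,https://app.example.com"
+export LLAMACTL_ENABLE_SWAGGER=true
+llamactl
+```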
+
+### Instance Configuration
+
+```yaml
+instances:
+ port_range: [8000, 9000] # Port range for instances (default: [8000, 9000])
+ data_dir: "~/.local/share/llamactl" # Directory for all llamactl data (default varies by OS)
+ configs_dir: "~/.local/share/llamactl/instances" # Directory for instance configs (default: data_dir/instances)
+ logs_dir: "~/.local/share/llamactl/logs" # Directory for instance logs (default: data_dir/logs)
+ auto_create_dirs: true # Automatically create data/config/logs directories (default: true)
+ max_instances: -1 # Maximum instances (-1 = unlimited)
+ max_running_instances: -1 # Maximum running instances (-1 = unlimited)
+ enable_lru_eviction: true # Enable LRU eviction for idle instances
+ llama_executable: "llama-server" # Path to llama-server executable
+ default_auto_restart: true # Default auto-restart setting
+ default_max_restarts: 3 # Default maximum restart attempts
+ default_restart_delay: 5 # Default restart delay in seconds
+ default_on_demand_start: true # Default on-demand start setting
+ on_demand_start_timeout: 120 # Default on-demand start timeout in seconds
+ timeout_check_interval: 5 # Default instance timeout check interval in minutes
+```
+
+**Environment Variables:**
+- `LLAMACTL_INSTANCE_PORT_RANGE` - Port range (format: "8000-9000" or "8000,9000")
+- `LLAMACTL_DATA_DIRECTORY` - Data directory path
+- `LLAMACTL_INSTANCES_DIR` - Instance configs directory path
+- `LLAMACTL_LOGS_DIR` - Log directory path
+- `LLAMACTL_AUTO_CREATE_DATA_DIR` - Auto-create data/config/logs directories (true/false)
+- `LLAMACTL_MAX_INSTANCES` - Maximum number of instances
+- `LLAMACTL_MAX_RUNNING_INSTANCES` - Maximum number of running instances
+- `LLAMACTL_ENABLE_LRU_EVICTION` - Enable LRU eviction for idle instances
+- `LLAMACTL_LLAMA_EXECUTABLE` - Path to llama-server executable
+- `LLAMACTL_DEFAULT_AUTO_RESTART` - Default auto-restart setting (true/false)
+- `LLAMACTL_DEFAULT_MAX_RESTARTS` - Default maximum restarts
+- `LLAMACTL_DEFAULT_RESTART_DELAY` - Default restart delay in seconds
+- `LLAMACTL_DEFAULT_ON_DEMAND_START` - Default on-demand start setting (true/false)
+- `LLAMACTL_ON_DEMAND_START_TIMEOUT` - Default on-demand start timeout in seconds
+- `LLAMACTL_TIMEOUT_CHECK_INTERVAL` - Default instance timeout check interval in minutes
+
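+As a sketch, the same options can be supplied through the environment when launching llamactl (the limits and executable path below are placeholders):
+
+```bash
+export LLAMACTL_INSTANCE_PORT_RANGE="8000-9000"   # "8000,9000" also works
+export LLAMACTL_MAX_RUNNING_INSTANCES=2
+export LLAMACTL_LLAMA_EXECUTABLE="/opt/llama.cpp/bin/llama-server"  # placeholder path
+llamactl
+```
+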
+### Authentication Configuration
+
+```yaml
+auth:
+ require_inference_auth: true # Require API key for OpenAI endpoints (default: true)
+ inference_keys: [] # List of valid inference API keys
+ require_management_auth: true # Require API key for management endpoints (default: true)
+ management_keys: [] # List of valid management API keys
+```
+
+**Environment Variables:**
+- `LLAMACTL_REQUIRE_INFERENCE_AUTH` - Require auth for OpenAI endpoints (true/false)
+- `LLAMACTL_INFERENCE_KEYS` - Comma-separated inference API keys
+- `LLAMACTL_REQUIRE_MANAGEMENT_AUTH` - Require auth for management endpoints (true/false)
+- `LLAMACTL_MANAGEMENT_KEYS` - Comma-separated management API keys
+
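+For example, a minimal sketch that generates random keys and passes them in via the environment (manage real keys with a proper secrets store):
+
+```bash
+export LLAMACTL_INFERENCE_KEYS="$(openssl rand -hex 32)"
+export LLAMACTL_MANAGEMENT_KEYS="$(openssl rand -hex 32)"
+llamactl
+```
+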
## Command Line Options
View all available command line options:
@@ -62,90 +147,13 @@ View all available command line options:
llamactl --help
```
-Common options:
-
-```bash
-# Specify config file
-llamactl --config /path/to/config.yaml
-
-# Set log level
-llamactl --log-level debug
-
-# Run on different port
-llamactl --port 9090
-```
-
-## Instance Configuration
-
-When creating instances, you can specify various options:
-
-### Basic Options
-
-- `name`: Unique identifier for the instance
-- `model_path`: Path to the GGUF model file
-- `port`: Port for the instance to listen on
-
-### Advanced Options
-
-- `threads`: Number of CPU threads to use
-- `context_size`: Context window size
-- `batch_size`: Batch size for processing
-- `gpu_layers`: Number of layers to offload to GPU
-- `memory_lock`: Lock model in memory
-- `no_mmap`: Disable memory mapping
-
-### Example Instance Configuration
-
-```json
-{
- "name": "production-model",
- "model_path": "/models/llama-2-13b-chat.gguf",
- "port": 8081,
- "options": {
- "threads": 8,
- "context_size": 4096,
- "batch_size": 512,
- "gpu_layers": 35,
- "memory_lock": true
- }
-}
-```
-
-## Security Configuration
-
-### Enable Authentication
-
-To enable authentication, update your config file:
-
-```yaml
-auth:
- enabled: true
- jwt_secret: "your-very-secure-secret-key"
- token_expiry: "24h"
-```
-
-### HTTPS Configuration
-
-For production deployments, configure HTTPS:
-
-```yaml
-server:
- tls:
- enabled: true
- cert_file: "/path/to/cert.pem"
- key_file: "/path/to/key.pem"
-```
-
-## Logging Configuration
-
-Configure logging levels and outputs:
-
-```yaml
-logging:
- level: "info" # debug, info, warn, error
- format: "json" # json or text
- output: "/var/log/llamactl/app.log"
-```
+You can also override configuration using command line flags when starting llamactl.
+
## Next Steps
diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md
index 9be575e..9ae35ed 100644
--- a/docs/getting-started/installation.md
+++ b/docs/getting-started/installation.md
@@ -4,9 +4,19 @@ This guide will walk you through installing Llamactl on your system.
## Prerequisites
-Before installing Llamactl, ensure you have:
+You need `llama-server` from [llama.cpp](https://github.com/ggml-org/llama.cpp) installed:
-- Go 1.19 or later
+```bash
+# Quick install methods:
+# Homebrew (macOS)
+brew install llama.cpp
+
+# Or build from source - see llama.cpp docs
+```
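+
+To confirm that `llama-server` is available before continuing, check that it is on your `PATH`:
+
+```bash
+command -v llama-server
+```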
+
+Additional requirements for building from source:
+- Go 1.24 or later
+- Node.js 22 or later
- Git
- Sufficient disk space for your models
@@ -14,17 +24,18 @@ Before installing Llamactl, ensure you have:
### Option 1: Download Binary (Recommended)
-Download the latest release from our [GitHub releases page](https://github.com/lordmathis/llamactl/releases):
+Download the latest release from the [GitHub releases page](https://github.com/lordmathis/llamactl/releases):
```bash
-# Download for Linux
-curl -L https://github.com/lordmathis/llamactl/releases/latest/download/llamactl-linux-amd64 -o llamactl
-
-# Make executable
-chmod +x llamactl
-
-# Move to PATH (optional)
+# Linux/macOS - Get latest version and download
+LATEST_VERSION=$(curl -s https://api.github.com/repos/lordmathis/llamactl/releases/latest | grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/')
+curl -L https://github.com/lordmathis/llamactl/releases/download/${LATEST_VERSION}/llamactl-${LATEST_VERSION}-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m).tar.gz | tar -xz
sudo mv llamactl /usr/local/bin/
+
+# Or download manually from:
+# https://github.com/lordmathis/llamactl/releases/latest
+
+# Windows - Download from releases page
```
### Option 2: Build from Source
@@ -36,11 +47,12 @@ If you prefer to build from source:
git clone https://github.com/lordmathis/llamactl.git
cd llamactl
-# Build the application
-go build -o llamactl cmd/server/main.go
-```
+# Build the web UI
+cd webui && npm ci && npm run build && cd ..
-For detailed build instructions, see the [Building from Source](../development/building.md) guide.
+# Build the application
+go build -o llamactl ./cmd/server
+```
## Verification
diff --git a/docs/getting-started/quick-start.md b/docs/getting-started/quick-start.md
index a882b10..11751c0 100644
--- a/docs/getting-started/quick-start.md
+++ b/docs/getting-started/quick-start.md
@@ -28,7 +28,6 @@ You should see the Llamactl web interface.
2. Fill in the instance configuration:
- **Name**: Give your instance a descriptive name
- **Model Path**: Path to your Llama.cpp model file
- - **Port**: Port for the instance to run on
- **Additional Options**: Any extra Llama.cpp parameters
3. Click "Create Instance"
@@ -50,7 +49,6 @@ Here's a basic example configuration for a Llama 2 model:
{
"name": "llama2-7b",
"model_path": "/path/to/llama-2-7b-chat.gguf",
- "port": 8081,
"options": {
"threads": 4,
"context_size": 2048
@@ -72,13 +70,70 @@ curl -X POST http://localhost:8080/api/instances \
-d '{
"name": "my-model",
"model_path": "/path/to/model.gguf",
- "port": 8081
}'
# Start an instance
curl -X POST http://localhost:8080/api/instances/my-model/start
```
+## OpenAI Compatible API
+
+Llamactl provides OpenAI-compatible endpoints, making it easy to integrate with existing OpenAI client libraries and tools.
+
+### Chat Completions
+
+Once you have an instance running, you can use it with the OpenAI-compatible chat completions endpoint:
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "my-model",
+ "messages": [
+ {
+ "role": "user",
+ "content": "Hello! Can you help me write a Python function?"
+ }
+ ],
+ "max_tokens": 150,
+ "temperature": 0.7
+ }'
+```
+
+### Using with Python OpenAI Client
+
+You can also use the official OpenAI Python client:
+
+```python
+from openai import OpenAI
+
+# Point the client to your Llamactl server
+client = OpenAI(
+ base_url="http://localhost:8080/v1",
+    api_key="not-needed" # Replace with an inference API key if authentication is enabled
+)
+
+# Create a chat completion
+response = client.chat.completions.create(
+ model="my-model", # Use the name of your instance
+ messages=[
+ {"role": "user", "content": "Explain quantum computing in simple terms"}
+ ],
+ max_tokens=200,
+ temperature=0.7
+)
+
+print(response.choices[0].message.content)
+```
+
+### List Available Models
+
+Get a list of running instances (models) in OpenAI-compatible format:
+
+```bash
+curl http://localhost:8080/v1/models
+```
+
## Next Steps
- Learn more about the [Web UI](../user-guide/web-ui.md)
diff --git a/docs/index.md b/docs/index.md
index b45cae2..19f7508 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,12 +1,12 @@
# Llamactl Documentation
-Welcome to the Llamactl documentation! Llamactl is a powerful management tool for Llama.cpp instances that provides both a web interface and REST API for managing large language models.
+Welcome to the Llamactl documentation! Llamactl is a powerful management tool for llama-server instances that provides both a web interface and REST API for managing large language models.
## What is Llamactl?
-Llamactl is designed to simplify the deployment and management of Llama.cpp instances. It provides:
+Llamactl is designed to simplify the deployment and management of llama-server instances. It provides:
-- **Instance Management**: Start, stop, and monitor multiple Llama.cpp instances
+- **Instance Management**: Start, stop, and monitor multiple llama-server instances
- **Web UI**: User-friendly interface for managing your models
- **REST API**: Programmatic access to all functionality
- **Health Monitoring**: Real-time status and health checks
@@ -33,8 +33,7 @@ Llamactl is designed to simplify the deployment and management of Llama.cpp inst
If you need help or have questions:
- Check the [Troubleshooting](advanced/troubleshooting.md) guide
-- Visit our [GitHub repository](https://github.com/lordmathis/llamactl)
-- Read the [Contributing guide](development/contributing.md) to help improve Llamactl
+- Visit the [GitHub repository](https://github.com/lordmathis/llamactl)
---
diff --git a/mkdocs.yml b/mkdocs.yml
index f23c70e..4e7e107 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,6 +1,6 @@
-site_name: LlamaCtl Documentation
-site_description: User documentation for LlamaCtl - A management tool for Llama.cpp instances
-site_author: LlamaCtl Team
+site_name: Llamactl Documentation
+site_description: User documentation for Llamactl - A management tool for Llama.cpp instances
+site_author: Llamactl Team
site_url: https://llamactl.org
repo_name: lordmathis/llamactl
@@ -61,9 +61,6 @@ nav:
- Backends: advanced/backends.md
- Monitoring: advanced/monitoring.md
- Troubleshooting: advanced/troubleshooting.md
- - Development:
- - Contributing: development/contributing.md
- - Building from Source: development/building.md
plugins:
- search