diff --git a/README.md b/README.md index d9edfd5..3eed452 100644 --- a/README.md +++ b/README.md @@ -123,7 +123,6 @@ instances: on_demand_start_timeout: 120 # Default on-demand start timeout in seconds timeout_check_interval: 5 # Idle instance timeout check in minutes - auth: require_inference_auth: true # Require auth for inference endpoints inference_keys: [] # Keys for inference endpoints @@ -131,107 +130,7 @@ auth: management_keys: [] # Keys for management endpoints ``` -
Full Configuration Guide - -llamactl can be configured via configuration files or environment variables. Configuration is loaded in the following order of precedence: - -``` -Defaults < Configuration file < Environment variables -``` - -### Configuration Files - -#### Configuration File Locations - -Configuration files are searched in the following locations (in order of precedence): - -**Linux/macOS:** -- `./llamactl.yaml` or `./config.yaml` (current directory) -- `$HOME/.config/llamactl/config.yaml` -- `/etc/llamactl/config.yaml` - -**Windows:** -- `./llamactl.yaml` or `./config.yaml` (current directory) -- `%APPDATA%\llamactl\config.yaml` -- `%USERPROFILE%\llamactl\config.yaml` -- `%PROGRAMDATA%\llamactl\config.yaml` - -You can specify the path to config file with `LLAMACTL_CONFIG_PATH` environment variable. - -### Configuration Options - -#### Server Configuration - -```yaml -server: - host: "0.0.0.0" # Server host to bind to (default: "0.0.0.0") - port: 8080 # Server port to bind to (default: 8080) - allowed_origins: ["*"] # CORS allowed origins (default: ["*"]) - enable_swagger: false # Enable Swagger UI (default: false) -``` - -**Environment Variables:** -- `LLAMACTL_HOST` - Server host -- `LLAMACTL_PORT` - Server port -- `LLAMACTL_ALLOWED_ORIGINS` - Comma-separated CORS origins -- `LLAMACTL_ENABLE_SWAGGER` - Enable Swagger UI (true/false) - -#### Instance Configuration - -```yaml -instances: - port_range: [8000, 9000] # Port range for instances (default: [8000, 9000]) - data_dir: "~/.local/share/llamactl" # Directory for all llamactl data (default varies by OS) - configs_dir: "~/.local/share/llamactl/instances" # Directory for instance configs (default: data_dir/instances) - logs_dir: "~/.local/share/llamactl/logs" # Directory for instance logs (default: data_dir/logs) - auto_create_dirs: true # Automatically create data/config/logs directories (default: true) - max_instances: -1 # Maximum instances (-1 = unlimited) - max_running_instances: -1 # Maximum running instances (-1 = unlimited) - enable_lru_eviction: true # Enable LRU eviction for idle instances - llama_executable: "llama-server" # Path to llama-server executable - default_auto_restart: true # Default auto-restart setting - default_max_restarts: 3 # Default maximum restart attempts - default_restart_delay: 5 # Default restart delay in seconds - default_on_demand_start: true # Default on-demand start setting - on_demand_start_timeout: 120 # Default on-demand start timeout in seconds - timeout_check_interval: 5 # Default instance timeout check interval in minutes -``` - -**Environment Variables:** -- `LLAMACTL_INSTANCE_PORT_RANGE` - Port range (format: "8000-9000" or "8000,9000") -- `LLAMACTL_DATA_DIRECTORY` - Data directory path -- `LLAMACTL_INSTANCES_DIR` - Instance configs directory path -- `LLAMACTL_LOGS_DIR` - Log directory path -- `LLAMACTL_AUTO_CREATE_DATA_DIR` - Auto-create data/config/logs directories (true/false) -- `LLAMACTL_MAX_INSTANCES` - Maximum number of instances -- `LLAMACTL_MAX_RUNNING_INSTANCES` - Maximum number of running instances -- `LLAMACTL_ENABLE_LRU_EVICTION` - Enable LRU eviction for idle instances -- `LLAMACTL_LLAMA_EXECUTABLE` - Path to llama-server executable -- `LLAMACTL_DEFAULT_AUTO_RESTART` - Default auto-restart setting (true/false) -- `LLAMACTL_DEFAULT_MAX_RESTARTS` - Default maximum restarts -- `LLAMACTL_DEFAULT_RESTART_DELAY` - Default restart delay in seconds -- `LLAMACTL_DEFAULT_ON_DEMAND_START` - Default on-demand start setting (true/false) -- `LLAMACTL_ON_DEMAND_START_TIMEOUT` - 
Default on-demand start timeout in seconds -- `LLAMACTL_TIMEOUT_CHECK_INTERVAL` - Default instance timeout check interval in minutes - - -#### Authentication Configuration - -```yaml -auth: - require_inference_auth: true # Require API key for OpenAI endpoints (default: true) - inference_keys: [] # List of valid inference API keys - require_management_auth: true # Require API key for management endpoints (default: true) - management_keys: [] # List of valid management API keys -``` - -**Environment Variables:** -- `LLAMACTL_REQUIRE_INFERENCE_AUTH` - Require auth for OpenAI endpoints (true/false) -- `LLAMACTL_INFERENCE_KEYS` - Comma-separated inference API keys -- `LLAMACTL_REQUIRE_MANAGEMENT_AUTH` - Require auth for management endpoints (true/false) -- `LLAMACTL_MANAGEMENT_KEYS` - Comma-separated management API keys - -
+For detailed configuration options including environment variables, file locations, and advanced settings, see the [Configuration Guide](docs/getting-started/configuration.md). ## License diff --git a/docs/development/building.md b/docs/development/building.md deleted file mode 100644 index a102915..0000000 --- a/docs/development/building.md +++ /dev/null @@ -1,464 +0,0 @@ -# Building from Source - -This guide covers building Llamactl from source code for development and production deployment. - -## Prerequisites - -### Required Tools - -- **Go 1.24+**: Download from [golang.org](https://golang.org/dl/) -- **Node.js 22+**: Download from [nodejs.org](https://nodejs.org/) -- **Git**: For cloning the repository -- **Make**: For build automation (optional) - -### System Requirements - -- **Memory**: 4GB+ RAM for building -- **Disk**: 2GB+ free space -- **OS**: Linux, macOS, or Windows - -## Quick Build - -### Clone and Build - -```bash -# Clone the repository -git clone https://github.com/lordmathis/llamactl.git -cd llamactl - -# Build the application -go build -o llamactl cmd/server/main.go -``` - -### Run - -```bash -./llamactl -``` - -## Development Build - -### Setup Development Environment - -```bash -# Clone repository -git clone https://github.com/lordmathis/llamactl.git -cd llamactl - -# Install Go dependencies -go mod download - -# Install frontend dependencies -cd webui -npm ci -cd .. -``` - -### Build Components - -```bash -# Build backend only -go build -o llamactl cmd/server/main.go - -# Build frontend only -cd webui -npm run build -cd .. - -# Build everything -make build -``` - -### Development Server - -```bash -# Run backend in development mode -go run cmd/server/main.go --dev - -# Run frontend dev server (separate terminal) -cd webui -npm run dev -``` - -## Production Build - -### Optimized Build - -```bash -# Build with optimizations -go build -ldflags="-s -w" -o llamactl cmd/server/main.go - -# Or use the Makefile -make build-prod -``` - -### Build Flags - -Common build flags for production: - -```bash -go build \ - -ldflags="-s -w -X main.version=1.0.0 -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \ - -trimpath \ - -o llamactl \ - cmd/server/main.go -``` - -**Flag explanations:** -- `-s`: Strip symbol table -- `-w`: Strip debug information -- `-X`: Set variable values at build time -- `-trimpath`: Remove absolute paths from binary - -## Cross-Platform Building - -### Build for Multiple Platforms - -```bash -# Linux AMD64 -GOOS=linux GOARCH=amd64 go build -o llamactl-linux-amd64 cmd/server/main.go - -# Linux ARM64 -GOOS=linux GOARCH=arm64 go build -o llamactl-linux-arm64 cmd/server/main.go - -# macOS AMD64 -GOOS=darwin GOARCH=amd64 go build -o llamactl-darwin-amd64 cmd/server/main.go - -# macOS ARM64 (Apple Silicon) -GOOS=darwin GOARCH=arm64 go build -o llamactl-darwin-arm64 cmd/server/main.go - -# Windows AMD64 -GOOS=windows GOARCH=amd64 go build -o llamactl-windows-amd64.exe cmd/server/main.go -``` - -### Automated Cross-Building - -Use the provided Makefile: - -```bash -# Build all platforms -make build-all - -# Build specific platform -make build-linux -make build-darwin -make build-windows -``` - -## Build with Docker - -### Development Container - -```dockerfile -# Dockerfile.dev -FROM golang:1.24-alpine AS builder - -WORKDIR /app -COPY go.mod go.sum ./ -RUN go mod download - -COPY . . -RUN go build -o llamactl cmd/server/main.go - -FROM alpine:latest -RUN apk --no-cache add ca-certificates -WORKDIR /root/ -COPY --from=builder /app/llamactl . 
- -EXPOSE 8080 -CMD ["./llamactl"] -``` - -```bash -# Build development image -docker build -f Dockerfile.dev -t llamactl:dev . - -# Run container -docker run -p 8080:8080 llamactl:dev -``` - -### Production Container - -```dockerfile -# Dockerfile -FROM node:22-alpine AS frontend-builder - -WORKDIR /app/webui -COPY webui/package*.json ./ -RUN npm ci - -COPY webui/ ./ -RUN npm run build - -FROM golang:1.24-alpine AS backend-builder - -WORKDIR /app -COPY go.mod go.sum ./ -RUN go mod download - -COPY . . -COPY --from=frontend-builder /app/webui/dist ./webui/dist - -RUN CGO_ENABLED=0 GOOS=linux go build \ - -ldflags="-s -w" \ - -o llamactl \ - cmd/server/main.go - -FROM alpine:latest - -RUN apk --no-cache add ca-certificates tzdata -RUN adduser -D -s /bin/sh llamactl - -WORKDIR /home/llamactl -COPY --from=backend-builder /app/llamactl . -RUN chown llamactl:llamactl llamactl - -USER llamactl -EXPOSE 8080 - -CMD ["./llamactl"] -``` - -## Advanced Build Options - -### Static Linking - -For deployments without external dependencies: - -```bash -CGO_ENABLED=0 go build \ - -ldflags="-s -w -extldflags '-static'" \ - -o llamactl-static \ - cmd/server/main.go -``` - -### Debug Build - -Build with debug information: - -```bash -go build -gcflags="all=-N -l" -o llamactl-debug cmd/server/main.go -``` - -### Race Detection Build - -Build with race detection (development only): - -```bash -go build -race -o llamactl-race cmd/server/main.go -``` - -## Build Automation - -### Makefile - -```makefile -# Makefile -VERSION := $(shell git describe --tags --always --dirty) -BUILD_TIME := $(shell date -u +%Y-%m-%dT%H:%M:%SZ) -LDFLAGS := -s -w -X main.version=$(VERSION) -X main.buildTime=$(BUILD_TIME) - -.PHONY: build clean test install - -build: - @echo "Building Llamactl..." - @cd webui && npm run build - @go build -ldflags="$(LDFLAGS)" -o llamactl cmd/server/main.go - -build-prod: - @echo "Building production binary..." - @cd webui && npm run build - @CGO_ENABLED=0 go build -ldflags="$(LDFLAGS)" -trimpath -o llamactl cmd/server/main.go - -build-all: build-linux build-darwin build-windows - -build-linux: - @GOOS=linux GOARCH=amd64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-linux-amd64 cmd/server/main.go - @GOOS=linux GOARCH=arm64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-linux-arm64 cmd/server/main.go - -build-darwin: - @GOOS=darwin GOARCH=amd64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-darwin-amd64 cmd/server/main.go - @GOOS=darwin GOARCH=arm64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-darwin-arm64 cmd/server/main.go - -build-windows: - @GOOS=windows GOARCH=amd64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-windows-amd64.exe cmd/server/main.go - -test: - @go test ./... - -clean: - @rm -f llamactl llamactl-* - @rm -rf dist/ - -install: build - @cp llamactl $(GOPATH)/bin/llamactl -``` - -### GitHub Actions - -```yaml -# .github/workflows/build.yml -name: Build - -on: - push: - branches: [ main ] - pull_request: - branches: [ main ] - -jobs: - test: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - - - name: Set up Go - uses: actions/setup-go@v4 - with: - go-version: '1.24' - - - name: Set up Node.js - uses: actions/setup-node@v4 - with: - node-version: '22' - - - name: Install dependencies - run: | - go mod download - cd webui && npm ci - - - name: Run tests - run: | - go test ./... 
- cd webui && npm test - - - name: Build - run: make build - - build: - needs: test - runs-on: ubuntu-latest - if: github.ref == 'refs/heads/main' - - steps: - - uses: actions/checkout@v4 - - - name: Set up Go - uses: actions/setup-go@v4 - with: - go-version: '1.24' - - - name: Set up Node.js - uses: actions/setup-node@v4 - with: - node-version: '22' - - - name: Build all platforms - run: make build-all - - - name: Upload artifacts - uses: actions/upload-artifact@v4 - with: - name: binaries - path: dist/ -``` - -## Build Troubleshooting - -### Common Issues - -**Go version mismatch:** -```bash -# Check Go version -go version - -# Update Go -# Download from https://golang.org/dl/ -``` - -**Node.js issues:** -```bash -# Clear npm cache -npm cache clean --force - -# Remove node_modules and reinstall -rm -rf webui/node_modules -cd webui && npm ci -``` - -**Build failures:** -```bash -# Clean and rebuild -make clean -go mod tidy -make build -``` - -### Performance Issues - -**Slow builds:** -```bash -# Use build cache -export GOCACHE=$(go env GOCACHE) - -# Parallel builds -export GOMAXPROCS=$(nproc) -``` - -**Large binary size:** -```bash -# Use UPX compression -upx --best llamactl - -# Analyze binary size -go tool nm -size llamactl | head -20 -``` - -## Deployment - -### System Service - -Create a systemd service: - -```ini -# /etc/systemd/system/llamactl.service -[Unit] -Description=Llamactl Server -After=network.target - -[Service] -Type=simple -User=llamactl -Group=llamactl -ExecStart=/usr/local/bin/llamactl -Restart=always -RestartSec=5 - -[Install] -WantedBy=multi-user.target -``` - -```bash -# Enable and start service -sudo systemctl enable llamactl -sudo systemctl start llamactl -``` - -### Configuration - -```bash -# Create configuration directory -sudo mkdir -p /etc/llamactl - -# Copy configuration -sudo cp config.yaml /etc/llamactl/ - -# Set permissions -sudo chown -R llamactl:llamactl /etc/llamactl -``` - -## Next Steps - -- Configure [Installation](../getting-started/installation.md) -- Set up [Configuration](../getting-started/configuration.md) -- Learn about [Contributing](contributing.md) diff --git a/docs/development/contributing.md b/docs/development/contributing.md deleted file mode 100644 index 3b27d90..0000000 --- a/docs/development/contributing.md +++ /dev/null @@ -1,373 +0,0 @@ -# Contributing - -Thank you for your interest in contributing to Llamactl! This guide will help you get started with development and contribution. - -## Development Setup - -### Prerequisites - -- Go 1.24 or later -- Node.js 22 or later -- `llama-server` executable (from [llama.cpp](https://github.com/ggml-org/llama.cpp)) -- Git - -### Getting Started - -1. **Fork and Clone** - ```bash - # Fork the repository on GitHub, then clone your fork - git clone https://github.com/yourusername/llamactl.git - cd llamactl - - # Add upstream remote - git remote add upstream https://github.com/lordmathis/llamactl.git - ``` - -2. **Install Dependencies** - ```bash - # Go dependencies - go mod download - - # Frontend dependencies - cd webui && npm ci && cd .. - ``` - -3. **Run Development Environment** - ```bash - # Start backend server - go run ./cmd/server - ``` - - In a separate terminal: - ```bash - # Start frontend dev server - cd webui && npm run dev - ``` - -## Development Workflow - -### Setting Up Your Environment - -1. **Configuration** - Create a development configuration file: - ```yaml - # dev-config.yaml - server: - host: "localhost" - port: 8080 - logging: - level: "debug" - ``` - -2. 
**Test Data** - Set up test models and instances for development. - -### Making Changes - -1. **Create a Branch** - ```bash - git checkout -b feature/your-feature-name - ``` - -2. **Development Commands** - ```bash - # Backend - go test ./... -v # Run tests - go test -race ./... -v # Run with race detector - go fmt ./... && go vet ./... # Format and vet code - go build ./cmd/server # Build binary - - # Frontend (from webui/ directory) - npm run test # Run tests - npm run lint # Lint code - npm run type-check # TypeScript check - npm run build # Build for production - ``` - -3. **Code Quality** - ```bash - # Run all checks before committing - make lint - make test - make build - ``` - -## Project Structure - -### Backend (Go) - -``` -cmd/ -├── server/ # Main application entry point -pkg/ -├── backends/ # Model backend implementations -├── config/ # Configuration management -├── instance/ # Instance lifecycle management -├── manager/ # Instance manager -├── server/ # HTTP server and routes -├── testutil/ # Test utilities -└── validation/ # Input validation -``` - -### Frontend (React/TypeScript) - -``` -webui/src/ -├── components/ # React components -├── contexts/ # React contexts -├── hooks/ # Custom hooks -├── lib/ # Utility libraries -├── schemas/ # Zod schemas -└── types/ # TypeScript types -``` - -## Coding Standards - -### Go Code - -- Follow standard Go formatting (`gofmt`) -- Use `go vet` and address all warnings -- Write comprehensive tests for new functionality -- Include documentation comments for exported functions -- Use meaningful variable and function names - -Example: -```go -// CreateInstance creates a new model instance with the given configuration. -// It validates the configuration and ensures the instance name is unique. -func (m *Manager) CreateInstance(ctx context.Context, config InstanceConfig) (*Instance, error) { - if err := config.Validate(); err != nil { - return nil, fmt.Errorf("invalid configuration: %w", err) - } - - // Implementation... -} -``` - -### TypeScript/React Code - -- Use TypeScript strict mode -- Follow React best practices -- Use functional components with hooks -- Implement proper error boundaries -- Write unit tests for components - -Example: -```typescript -interface InstanceCardProps { - instance: Instance; - onStart: (name: string) => Promise; - onStop: (name: string) => Promise; -} - -export const InstanceCard: React.FC = ({ - instance, - onStart, - onStop, -}) => { - // Implementation... -}; -``` - -## Testing - -### Backend Tests - -```bash -# Run all tests -go test ./... - -# Run tests with coverage -go test ./... -coverprofile=coverage.out -go tool cover -html=coverage.out - -# Run specific package tests -go test ./pkg/manager -v - -# Run with race detection -go test -race ./... -``` - -### Frontend Tests - -```bash -cd webui - -# Run unit tests -npm run test - -# Run tests with coverage -npm run test:coverage - -# Run E2E tests -npm run test:e2e -``` - -### Integration Tests - -```bash -# Run integration tests (requires llama-server) -go test ./... -tags=integration -``` - -## Pull Request Process - -### Before Submitting - -1. **Update your branch** - ```bash - git fetch upstream - git rebase upstream/main - ``` - -2. **Run all tests** - ```bash - make test-all - ``` - -3. **Update documentation** if needed - -4. 
**Write clear commit messages** - ``` - feat: add instance health monitoring - - - Implement health check endpoint - - Add periodic health monitoring - - Update API documentation - - Fixes #123 - ``` - -### Submitting a PR - -1. **Push your branch** - ```bash - git push origin feature/your-feature-name - ``` - -2. **Create Pull Request** - - Use the PR template - - Provide clear description - - Link related issues - - Add screenshots for UI changes - -3. **PR Review Process** - - Automated checks must pass - - Code review by maintainers - - Address feedback promptly - - Keep PR scope focused - -## Issue Guidelines - -### Reporting Bugs - -Use the bug report template and include: - -- Steps to reproduce -- Expected vs actual behavior -- Environment details (OS, Go version, etc.) -- Relevant logs or error messages -- Minimal reproduction case - -### Feature Requests - -Use the feature request template and include: - -- Clear description of the problem -- Proposed solution -- Alternative solutions considered -- Implementation complexity estimate - -### Security Issues - -For security vulnerabilities: -- Do NOT create public issues -- Email security@llamactl.dev -- Provide detailed description -- Allow time for fix before disclosure - -## Development Best Practices - -### API Design - -- Follow REST principles -- Use consistent naming conventions -- Provide comprehensive error messages -- Include proper HTTP status codes -- Document all endpoints - -### Error Handling - -```go -// Wrap errors with context -if err := instance.Start(); err != nil { - return fmt.Errorf("failed to start instance %s: %w", instance.Name, err) -} - -// Use structured logging -log.WithFields(log.Fields{ - "instance": instance.Name, - "error": err, -}).Error("Failed to start instance") -``` - -### Configuration - -- Use environment variables for deployment -- Provide sensible defaults -- Validate configuration on startup -- Support configuration file reloading - -### Performance - -- Profile code for bottlenecks -- Use efficient data structures -- Implement proper caching -- Monitor resource usage - -## Release Process - -### Version Management - -- Use semantic versioning (SemVer) -- Tag releases properly -- Maintain CHANGELOG.md -- Create release notes - -### Building Releases - -```bash -# Build all platforms -make build-all - -# Create release package -make package -``` - -## Getting Help - -### Communication Channels - -- **GitHub Issues**: Bug reports and feature requests -- **GitHub Discussions**: General questions and ideas -- **Code Review**: PR comments and feedback - -### Development Questions - -When asking for help: - -1. Check existing documentation -2. Search previous issues -3. Provide minimal reproduction case -4. Include relevant environment details - -## Recognition - -Contributors are recognized in: - -- CONTRIBUTORS.md file -- Release notes -- Documentation credits -- Annual contributor highlights - -Thank you for contributing to Llamactl! diff --git a/docs/getting-started/configuration.md b/docs/getting-started/configuration.md index e9ba2d3..3a859ee 100644 --- a/docs/getting-started/configuration.md +++ b/docs/getting-started/configuration.md @@ -1,59 +1,144 @@ # Configuration -Llamactl can be configured through various methods to suit your needs. +llamactl can be configured via configuration files or environment variables. 
Configuration is loaded in the following order of precedence: -## Configuration File +``` +Defaults < Configuration file < Environment variables +``` -Create a configuration file at `~/.llamactl/config.yaml`: +llamactl works out of the box with sensible defaults, but you can customize the behavior to suit your needs. + +## Default Configuration + +Here's the default configuration with all available options: ```yaml -# Server configuration server: - host: "0.0.0.0" - port: 8080 - cors_enabled: true + host: "0.0.0.0" # Server host to bind to + port: 8080 # Server port to bind to + allowed_origins: ["*"] # Allowed CORS origins (default: all) + enable_swagger: false # Enable Swagger UI for API docs + +instances: + port_range: [8000, 9000] # Port range for instances + data_dir: ~/.local/share/llamactl # Data directory (platform-specific, see below) + configs_dir: ~/.local/share/llamactl/instances # Instance configs directory + logs_dir: ~/.local/share/llamactl/logs # Logs directory + auto_create_dirs: true # Auto-create data/config/logs dirs if missing + max_instances: -1 # Max instances (-1 = unlimited) + max_running_instances: -1 # Max running instances (-1 = unlimited) + enable_lru_eviction: true # Enable LRU eviction for idle instances + llama_executable: llama-server # Path to llama-server executable + default_auto_restart: true # Auto-restart new instances by default + default_max_restarts: 3 # Max restarts for new instances + default_restart_delay: 5 # Restart delay (seconds) for new instances + default_on_demand_start: true # Default on-demand start setting + on_demand_start_timeout: 120 # Default on-demand start timeout in seconds + timeout_check_interval: 5 # Idle instance timeout check in minutes -# Authentication (optional) auth: - enabled: false - # When enabled, configure your authentication method - # jwt_secret: "your-secret-key" - -# Default instance settings -defaults: - backend: "llamacpp" - timeout: 300 - log_level: "info" - -# Paths -paths: - models_dir: "/path/to/your/models" - logs_dir: "/var/log/llamactl" - data_dir: "/var/lib/llamactl" - -# Instance limits -limits: - max_instances: 10 - max_memory_per_instance: "8GB" + require_inference_auth: true # Require auth for inference endpoints + inference_keys: [] # Keys for inference endpoints + require_management_auth: true # Require auth for management endpoints + management_keys: [] # Keys for management endpoints ``` -## Environment Variables +## Configuration Files -You can also configure Llamactl using environment variables: +### Configuration File Locations -```bash -# Server settings -export LLAMACTL_HOST=0.0.0.0 -export LLAMACTL_PORT=8080 +Configuration files are searched in the following locations (in order of precedence): -# Paths -export LLAMACTL_MODELS_DIR=/path/to/models -export LLAMACTL_LOGS_DIR=/var/log/llamactl +**Linux:** +- `./llamactl.yaml` or `./config.yaml` (current directory) +- `$HOME/.config/llamactl/config.yaml` +- `/etc/llamactl/config.yaml` -# Limits -export LLAMACTL_MAX_INSTANCES=5 +**macOS:** +- `./llamactl.yaml` or `./config.yaml` (current directory) +- `$HOME/Library/Application Support/llamactl/config.yaml` +- `/Library/Application Support/llamactl/config.yaml` + +**Windows:** +- `./llamactl.yaml` or `./config.yaml` (current directory) +- `%APPDATA%\llamactl\config.yaml` +- `%USERPROFILE%\llamactl\config.yaml` +- `%PROGRAMDATA%\llamactl\config.yaml` + +You can specify the path to config file with `LLAMACTL_CONFIG_PATH` environment variable. 
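For example, a minimal sketch of pointing llamactl at a config file outside the default search locations using the `LLAMACTL_CONFIG_PATH` variable described above (the file path is only a placeholder, and this assumes the `llamactl` binary is on your `PATH`):

```bash
# Start llamactl with an explicit config file location (path is illustrative)
LLAMACTL_CONFIG_PATH=/path/to/my-config.yaml llamactl
```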
+ +## Configuration Options + +### Server Configuration + +```yaml +server: + host: "0.0.0.0" # Server host to bind to (default: "0.0.0.0") + port: 8080 # Server port to bind to (default: 8080) + allowed_origins: ["*"] # CORS allowed origins (default: ["*"]) + enable_swagger: false # Enable Swagger UI (default: false) ``` +**Environment Variables:** +- `LLAMACTL_HOST` - Server host +- `LLAMACTL_PORT` - Server port +- `LLAMACTL_ALLOWED_ORIGINS` - Comma-separated CORS origins +- `LLAMACTL_ENABLE_SWAGGER` - Enable Swagger UI (true/false) + +### Instance Configuration + +```yaml +instances: + port_range: [8000, 9000] # Port range for instances (default: [8000, 9000]) + data_dir: "~/.local/share/llamactl" # Directory for all llamactl data (default varies by OS) + configs_dir: "~/.local/share/llamactl/instances" # Directory for instance configs (default: data_dir/instances) + logs_dir: "~/.local/share/llamactl/logs" # Directory for instance logs (default: data_dir/logs) + auto_create_dirs: true # Automatically create data/config/logs directories (default: true) + max_instances: -1 # Maximum instances (-1 = unlimited) + max_running_instances: -1 # Maximum running instances (-1 = unlimited) + enable_lru_eviction: true # Enable LRU eviction for idle instances + llama_executable: "llama-server" # Path to llama-server executable + default_auto_restart: true # Default auto-restart setting + default_max_restarts: 3 # Default maximum restart attempts + default_restart_delay: 5 # Default restart delay in seconds + default_on_demand_start: true # Default on-demand start setting + on_demand_start_timeout: 120 # Default on-demand start timeout in seconds + timeout_check_interval: 5 # Default instance timeout check interval in minutes +``` + +**Environment Variables:** +- `LLAMACTL_INSTANCE_PORT_RANGE` - Port range (format: "8000-9000" or "8000,9000") +- `LLAMACTL_DATA_DIRECTORY` - Data directory path +- `LLAMACTL_INSTANCES_DIR` - Instance configs directory path +- `LLAMACTL_LOGS_DIR` - Log directory path +- `LLAMACTL_AUTO_CREATE_DATA_DIR` - Auto-create data/config/logs directories (true/false) +- `LLAMACTL_MAX_INSTANCES` - Maximum number of instances +- `LLAMACTL_MAX_RUNNING_INSTANCES` - Maximum number of running instances +- `LLAMACTL_ENABLE_LRU_EVICTION` - Enable LRU eviction for idle instances +- `LLAMACTL_LLAMA_EXECUTABLE` - Path to llama-server executable +- `LLAMACTL_DEFAULT_AUTO_RESTART` - Default auto-restart setting (true/false) +- `LLAMACTL_DEFAULT_MAX_RESTARTS` - Default maximum restarts +- `LLAMACTL_DEFAULT_RESTART_DELAY` - Default restart delay in seconds +- `LLAMACTL_DEFAULT_ON_DEMAND_START` - Default on-demand start setting (true/false) +- `LLAMACTL_ON_DEMAND_START_TIMEOUT` - Default on-demand start timeout in seconds +- `LLAMACTL_TIMEOUT_CHECK_INTERVAL` - Default instance timeout check interval in minutes + +### Authentication Configuration + +```yaml +auth: + require_inference_auth: true # Require API key for OpenAI endpoints (default: true) + inference_keys: [] # List of valid inference API keys + require_management_auth: true # Require API key for management endpoints (default: true) + management_keys: [] # List of valid management API keys +``` + +**Environment Variables:** +- `LLAMACTL_REQUIRE_INFERENCE_AUTH` - Require auth for OpenAI endpoints (true/false) +- `LLAMACTL_INFERENCE_KEYS` - Comma-separated inference API keys +- `LLAMACTL_REQUIRE_MANAGEMENT_AUTH` - Require auth for management endpoints (true/false) +- `LLAMACTL_MANAGEMENT_KEYS` - Comma-separated management API keys + ## 
Command Line Options View all available command line options: @@ -62,90 +147,13 @@ View all available command line options: llamactl --help ``` -Common options: - -```bash -# Specify config file -llamactl --config /path/to/config.yaml - -# Set log level -llamactl --log-level debug - -# Run on different port -llamactl --port 9090 -``` - -## Instance Configuration - -When creating instances, you can specify various options: - -### Basic Options - -- `name`: Unique identifier for the instance -- `model_path`: Path to the GGUF model file -- `port`: Port for the instance to listen on - -### Advanced Options - -- `threads`: Number of CPU threads to use -- `context_size`: Context window size -- `batch_size`: Batch size for processing -- `gpu_layers`: Number of layers to offload to GPU -- `memory_lock`: Lock model in memory -- `no_mmap`: Disable memory mapping - -### Example Instance Configuration - -```json -{ - "name": "production-model", - "model_path": "/models/llama-2-13b-chat.gguf", - "port": 8081, - "options": { - "threads": 8, - "context_size": 4096, - "batch_size": 512, - "gpu_layers": 35, - "memory_lock": true - } -} -``` - -## Security Configuration - -### Enable Authentication - -To enable authentication, update your config file: - -```yaml -auth: - enabled: true - jwt_secret: "your-very-secure-secret-key" - token_expiry: "24h" -``` - -### HTTPS Configuration - -For production deployments, configure HTTPS: - -```yaml -server: - tls: - enabled: true - cert_file: "/path/to/cert.pem" - key_file: "/path/to/key.pem" -``` - -## Logging Configuration - -Configure logging levels and outputs: - -```yaml -logging: - level: "info" # debug, info, warn, error - format: "json" # json or text - output: "/var/log/llamactl/app.log" -``` +You can also override configuration using command line flags when starting llamactl. + +## Next Steps + +- Learn about [Managing Instances](../user-guide/managing-instances.md) +- Explore [Advanced Configuration](../advanced/monitoring.md) +- Set up [Monitoring](../advanced/monitoring.md) ## Next Steps diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md index 9be575e..9ae35ed 100644 --- a/docs/getting-started/installation.md +++ b/docs/getting-started/installation.md @@ -4,9 +4,19 @@ This guide will walk you through installing Llamactl on your system. 
## Prerequisites -Before installing Llamactl, ensure you have: +You need `llama-server` from [llama.cpp](https://github.com/ggml-org/llama.cpp) installed: -- Go 1.19 or later +```bash +# Quick install methods: +# Homebrew (macOS) +brew install llama.cpp + +# Or build from source - see llama.cpp docs +``` + +Additional requirements for building from source: +- Go 1.24 or later +- Node.js 22 or later - Git - Sufficient disk space for your models @@ -14,17 +24,18 @@ Before installing Llamactl, ensure you have: ### Option 1: Download Binary (Recommended) -Download the latest release from our [GitHub releases page](https://github.com/lordmathis/llamactl/releases): +Download the latest release from the [GitHub releases page](https://github.com/lordmathis/llamactl/releases): ```bash -# Download for Linux -curl -L https://github.com/lordmathis/llamactl/releases/latest/download/llamactl-linux-amd64 -o llamactl - -# Make executable -chmod +x llamactl - -# Move to PATH (optional) +# Linux/macOS - Get latest version and download +LATEST_VERSION=$(curl -s https://api.github.com/repos/lordmathis/llamactl/releases/latest | grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/') +curl -L https://github.com/lordmathis/llamactl/releases/download/${LATEST_VERSION}/llamactl-${LATEST_VERSION}-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m).tar.gz | tar -xz sudo mv llamactl /usr/local/bin/ + +# Or download manually from: +# https://github.com/lordmathis/llamactl/releases/latest + +# Windows - Download from releases page ``` ### Option 2: Build from Source @@ -36,11 +47,12 @@ If you prefer to build from source: git clone https://github.com/lordmathis/llamactl.git cd llamactl -# Build the application -go build -o llamactl cmd/server/main.go -``` +# Build the web UI +cd webui && npm ci && npm run build && cd .. -For detailed build instructions, see the [Building from Source](../development/building.md) guide. +# Build the application +go build -o llamactl ./cmd/server +``` ## Verification diff --git a/docs/getting-started/quick-start.md b/docs/getting-started/quick-start.md index a882b10..11751c0 100644 --- a/docs/getting-started/quick-start.md +++ b/docs/getting-started/quick-start.md @@ -28,7 +28,6 @@ You should see the Llamactl web interface. 2. Fill in the instance configuration: - **Name**: Give your instance a descriptive name - **Model Path**: Path to your Llama.cpp model file - - **Port**: Port for the instance to run on - **Additional Options**: Any extra Llama.cpp parameters 3. Click "Create Instance" @@ -50,7 +49,6 @@ Here's a basic example configuration for a Llama 2 model: { "name": "llama2-7b", "model_path": "/path/to/llama-2-7b-chat.gguf", - "port": 8081, "options": { "threads": 4, "context_size": 2048 @@ -72,13 +70,70 @@ curl -X POST http://localhost:8080/api/instances \ -d '{ "name": "my-model", "model_path": "/path/to/model.gguf", - "port": 8081 }' # Start an instance curl -X POST http://localhost:8080/api/instances/my-model/start ``` +## OpenAI Compatible API + +Llamactl provides OpenAI-compatible endpoints, making it easy to integrate with existing OpenAI client libraries and tools. + +### Chat Completions + +Once you have an instance running, you can use it with the OpenAI-compatible chat completions endpoint: + +```bash +curl -X POST http://localhost:8080/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{ + "model": "my-model", + "messages": [ + { + "role": "user", + "content": "Hello! Can you help me write a Python function?" 
+ } + ], + "max_tokens": 150, + "temperature": 0.7 + }' +``` + +### Using with Python OpenAI Client + +You can also use the official OpenAI Python client: + +```python +from openai import OpenAI + +# Point the client to your Llamactl server +client = OpenAI( + base_url="http://localhost:8080/v1", + api_key="not-needed" # Llamactl doesn't require API keys by default +) + +# Create a chat completion +response = client.chat.completions.create( + model="my-model", # Use the name of your instance + messages=[ + {"role": "user", "content": "Explain quantum computing in simple terms"} + ], + max_tokens=200, + temperature=0.7 +) + +print(response.choices[0].message.content) +``` + +### List Available Models + +Get a list of running instances (models) in OpenAI-compatible format: + +```bash +curl http://localhost:8080/v1/models +``` + ## Next Steps - Learn more about the [Web UI](../user-guide/web-ui.md) diff --git a/docs/index.md b/docs/index.md index b45cae2..19f7508 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,12 +1,12 @@ # Llamactl Documentation -Welcome to the Llamactl documentation! Llamactl is a powerful management tool for Llama.cpp instances that provides both a web interface and REST API for managing large language models. +Welcome to the Llamactl documentation! Llamactl is a powerful management tool for llama-server instances that provides both a web interface and REST API for managing large language models. ## What is Llamactl? -Llamactl is designed to simplify the deployment and management of Llama.cpp instances. It provides: +Llamactl is designed to simplify the deployment and management of llama-server instances. It provides: -- **Instance Management**: Start, stop, and monitor multiple Llama.cpp instances +- **Instance Management**: Start, stop, and monitor multiple llama-server instances - **Web UI**: User-friendly interface for managing your models - **REST API**: Programmatic access to all functionality - **Health Monitoring**: Real-time status and health checks @@ -33,8 +33,7 @@ Llamactl is designed to simplify the deployment and management of Llama.cpp inst If you need help or have questions: - Check the [Troubleshooting](advanced/troubleshooting.md) guide -- Visit our [GitHub repository](https://github.com/lordmathis/llamactl) -- Read the [Contributing guide](development/contributing.md) to help improve Llamactl +- Visit the [GitHub repository](https://github.com/lordmathis/llamactl) --- diff --git a/mkdocs.yml b/mkdocs.yml index f23c70e..4e7e107 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -1,6 +1,6 @@ -site_name: LlamaCtl Documentation -site_description: User documentation for LlamaCtl - A management tool for Llama.cpp instances -site_author: LlamaCtl Team +site_name: Llamatl Documentation +site_description: User documentation for Llamatl - A management tool for Llama.cpp instances +site_author: Llamatl Team site_url: https://llamactl.org repo_name: lordmathis/llamactl @@ -61,9 +61,6 @@ nav: - Backends: advanced/backends.md - Monitoring: advanced/monitoring.md - Troubleshooting: advanced/troubleshooting.md - - Development: - - Contributing: development/contributing.md - - Building from Source: development/building.md plugins: - search