# Configuration

llamactl can be configured via configuration files or environment variables. Configuration is loaded in the following order of precedence:

```
Defaults < Configuration file < Environment variables
```

llamactl works out of the box with sensible defaults, but you can customize the behavior to suit your needs.

## Default Configuration

Here's the default configuration with all available options:

```yaml
server:
  host: "0.0.0.0"                    # Server host to bind to
  port: 8080                         # Server port to bind to
  allowed_origins: ["*"]             # Allowed CORS origins (default: all)
  allowed_headers: ["*"]             # Allowed CORS headers (default: all)
  enable_swagger: false              # Enable Swagger UI for API docs

backends:
  llama-cpp:
    command: "llama-server"
    args: []
    environment: {}                  # Environment variables for the backend process
    docker:
      enabled: false
      image: "ghcr.io/ggml-org/llama.cpp:server"
      args: ["run", "--rm", "--network", "host", "--gpus", "all"]
      environment: {}
    response_headers: {}             # Additional response headers to send with responses

  vllm:
    command: "vllm"
    args: ["serve"]
    environment: {}                  # Environment variables for the backend process
    docker:
      enabled: false
      image: "vllm/vllm-openai:latest"
      args: ["run", "--rm", "--network", "host", "--gpus", "all", "--shm-size", "1g"]
      environment: {}
    response_headers: {}             # Additional response headers to send with responses

  mlx:
    command: "mlx_lm.server"
    args: []
    environment: {}                  # Environment variables for the backend process
    response_headers: {}             # Additional response headers to send with responses

data_dir: ~/.local/share/llamactl    # Main data directory (database, instances, logs), default varies by OS

instances:
  port_range: [8000, 9000]           # Port range for instances
  configs_dir: data_dir/instances    # Instance configs directory
  logs_dir: data_dir/logs            # Logs directory
  auto_create_dirs: true             # Auto-create data/config/logs dirs if missing
  max_instances: -1                  # Max instances (-1 = unlimited)
  max_running_instances: -1          # Max running instances (-1 = unlimited)
  enable_lru_eviction: true          # Enable LRU eviction for idle instances
  default_auto_restart: true         # Auto-restart new instances by default
  default_max_restarts: 3            # Max restarts for new instances
  default_restart_delay: 5           # Restart delay (seconds) for new instances
  default_on_demand_start: true      # Default on-demand start setting
  on_demand_start_timeout: 120       # Default on-demand start timeout in seconds
  timeout_check_interval: 5          # Idle instance timeout check in minutes

database:
  path: data_dir/llamactl.db         # Database file path
  max_open_connections: 25           # Maximum open database connections
  max_idle_connections: 5            # Maximum idle database connections
  connection_max_lifetime: 5m        # Connection max lifetime

auth:
  require_inference_auth: true       # Require auth for inference endpoints
  require_management_auth: true      # Require auth for management endpoints
  management_keys: []                # Keys for management endpoints

local_node: "main"                   # Name of the local node (default: "main")
nodes:                               # Node configuration for multi-node deployment
  main:                              # Default local node (empty config)
```

## Configuration Files

### Configuration File Locations

Configuration files are searched in the following locations (in order of precedence, first found is used):

**Linux:**

- `./llamactl.yaml` or `./config.yaml` (current directory)
- `$HOME/.config/llamactl/config.yaml`
- `/etc/llamactl/config.yaml`

**macOS:**

- `./llamactl.yaml` or `./config.yaml` (current directory)
- `$HOME/Library/Application Support/llamactl/config.yaml`
- `/Library/Application Support/llamactl/config.yaml`

**Windows:**

- `./llamactl.yaml` or `./config.yaml` (current directory)
- `%APPDATA%\llamactl\config.yaml`
- `%USERPROFILE%\llamactl\config.yaml`
- `%PROGRAMDATA%\llamactl\config.yaml`

You can specify the path to the config file with the `LLAMACTL_CONFIG_PATH` environment variable.
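For example, a minimal `llamactl.yaml` placed in one of the locations above might override just a few defaults; all omitted settings keep their default values (the values below are purely illustrative):

```yaml
# ./llamactl.yaml - minimal example override
server:
  port: 9090                  # listen on a non-default port
  enable_swagger: true        # expose the Swagger UI for API docs

instances:
  max_running_instances: 2    # keep at most two instances running at once
```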
## Configuration Options

### Server Configuration

```yaml
server:
  host: "0.0.0.0"          # Server host to bind to (default: "0.0.0.0")
  port: 8080               # Server port to bind to (default: 8080)
  allowed_origins: ["*"]   # CORS allowed origins (default: ["*"])
  allowed_headers: ["*"]   # CORS allowed headers (default: ["*"])
  enable_swagger: false    # Enable Swagger UI (default: false)
```

**Environment Variables:**

- `LLAMACTL_HOST` - Server host
- `LLAMACTL_PORT` - Server port
- `LLAMACTL_ALLOWED_ORIGINS` - Comma-separated CORS origins
- `LLAMACTL_ENABLE_SWAGGER` - Enable Swagger UI (true/false)

### Backend Configuration

```yaml
backends:
  llama-cpp:
    command: "llama-server"
    args: []
    environment: {}        # Environment variables for the backend process
    docker:
      enabled: false       # Enable Docker runtime (default: false)
      image: "ghcr.io/ggml-org/llama.cpp:server"
      args: ["run", "--rm", "--network", "host", "--gpus", "all"]
      environment: {}
    response_headers: {}   # Additional response headers to send with responses

  vllm:
    command: "vllm"
    args: ["serve"]
    environment: {}        # Environment variables for the backend process
    docker:
      enabled: false       # Enable Docker runtime (default: false)
      image: "vllm/vllm-openai:latest"
      args: ["run", "--rm", "--network", "host", "--gpus", "all", "--shm-size", "1g"]
      environment: {}
    response_headers: {}   # Additional response headers to send with responses

  mlx:
    command: "mlx_lm.server"
    args: []
    environment: {}        # Environment variables for the backend process
    # MLX does not support Docker
    response_headers: {}   # Additional response headers to send with responses
```

**Backend Configuration Fields:**

- `command`: Executable name/path for the backend
- `args`: Default arguments prepended to all instances
- `environment`: Environment variables for the backend process (optional)
- `response_headers`: Additional response headers to send with responses (optional)
- `docker`: Docker-specific configuration (optional)
    - `enabled`: Boolean flag to enable Docker runtime
    - `image`: Docker image to use
    - `args`: Additional arguments passed to `docker run`
    - `environment`: Environment variables for the container (optional)

> If llamactl is behind an NGINX proxy, the `X-Accel-Buffering: no` response header may be required for NGINX to properly stream responses without buffering.
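As an illustration, here is a sketch of a backend override that runs the vLLM backend in Docker and adds the NGINX streaming header mentioned above. The container environment variable is only an example, and the key/value layout of `response_headers` is inferred from its `{}` default:

```yaml
backends:
  vllm:
    docker:
      enabled: true                   # run vLLM inside the configured Docker image
      image: "vllm/vllm-openai:latest"
      environment:
        CUDA_VISIBLE_DEVICES: "0"     # example container environment variable
    response_headers:
      X-Accel-Buffering: "no"         # let NGINX stream responses without buffering
```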
**Environment Variables:**

**LlamaCpp Backend:**

- `LLAMACTL_LLAMACPP_COMMAND` - LlamaCpp executable command
- `LLAMACTL_LLAMACPP_ARGS` - Space-separated default arguments
- `LLAMACTL_LLAMACPP_ENV` - Environment variables in format "KEY1=value1,KEY2=value2"
- `LLAMACTL_LLAMACPP_DOCKER_ENABLED` - Enable Docker runtime (true/false)
- `LLAMACTL_LLAMACPP_DOCKER_IMAGE` - Docker image to use
- `LLAMACTL_LLAMACPP_DOCKER_ARGS` - Space-separated Docker arguments
- `LLAMACTL_LLAMACPP_DOCKER_ENV` - Docker environment variables in format "KEY1=value1,KEY2=value2"
- `LLAMACTL_LLAMACPP_RESPONSE_HEADERS` - Response headers in format "KEY1=value1;KEY2=value2"

**VLLM Backend:**

- `LLAMACTL_VLLM_COMMAND` - VLLM executable command
- `LLAMACTL_VLLM_ARGS` - Space-separated default arguments
- `LLAMACTL_VLLM_ENV` - Environment variables in format "KEY1=value1,KEY2=value2"
- `LLAMACTL_VLLM_DOCKER_ENABLED` - Enable Docker runtime (true/false)
- `LLAMACTL_VLLM_DOCKER_IMAGE` - Docker image to use
- `LLAMACTL_VLLM_DOCKER_ARGS` - Space-separated Docker arguments
- `LLAMACTL_VLLM_DOCKER_ENV` - Docker environment variables in format "KEY1=value1,KEY2=value2"
- `LLAMACTL_VLLM_RESPONSE_HEADERS` - Response headers in format "KEY1=value1;KEY2=value2"

**MLX Backend:**

- `LLAMACTL_MLX_COMMAND` - MLX executable command
- `LLAMACTL_MLX_ARGS` - Space-separated default arguments
- `LLAMACTL_MLX_ENV` - Environment variables in format "KEY1=value1,KEY2=value2"
- `LLAMACTL_MLX_RESPONSE_HEADERS` - Response headers in format "KEY1=value1;KEY2=value2"

### Data Directory Configuration

```yaml
data_dir: "~/.local/share/llamactl"   # Main data directory for database, instances, and logs (default varies by OS)
```

**Environment Variables:**

- `LLAMACTL_DATA_DIRECTORY` - Main data directory path

**Default Data Directory by Platform:**

- **Linux**: `~/.local/share/llamactl`
- **macOS**: `~/Library/Application Support/llamactl`
- **Windows**: `%LOCALAPPDATA%\llamactl` or `%PROGRAMDATA%\llamactl`

### Instance Configuration

```yaml
instances:
  port_range: [8000, 9000]         # Port range for instances (default: [8000, 9000])
  configs_dir: "instances"         # Directory for instance configs, default: data_dir/instances
  logs_dir: "logs"                 # Directory for instance logs, default: data_dir/logs
  auto_create_dirs: true           # Automatically create data/config/logs directories (default: true)
  max_instances: -1                # Maximum instances (-1 = unlimited)
  max_running_instances: -1        # Maximum running instances (-1 = unlimited)
  enable_lru_eviction: true        # Enable LRU eviction for idle instances
  default_auto_restart: true       # Default auto-restart setting
  default_max_restarts: 3          # Default maximum restart attempts
  default_restart_delay: 5         # Default restart delay in seconds
  default_on_demand_start: true    # Default on-demand start setting
  on_demand_start_timeout: 120     # Default on-demand start timeout in seconds
  timeout_check_interval: 5        # Default instance timeout check interval in minutes
```

**Environment Variables:**

- `LLAMACTL_INSTANCE_PORT_RANGE` - Port range (format: "8000-9000" or "8000,9000")
- `LLAMACTL_INSTANCES_DIR` - Instance configs directory path
- `LLAMACTL_LOGS_DIR` - Log directory path
- `LLAMACTL_AUTO_CREATE_DATA_DIR` - Auto-create data/config/logs directories (true/false)
- `LLAMACTL_MAX_INSTANCES` - Maximum number of instances
- `LLAMACTL_MAX_RUNNING_INSTANCES` - Maximum number of running instances
- `LLAMACTL_ENABLE_LRU_EVICTION` - Enable LRU eviction for idle instances
- `LLAMACTL_DEFAULT_AUTO_RESTART` - Default auto-restart setting (true/false)
- `LLAMACTL_DEFAULT_MAX_RESTARTS` - Default maximum restarts
- `LLAMACTL_DEFAULT_RESTART_DELAY` - Default restart delay in seconds
- `LLAMACTL_DEFAULT_ON_DEMAND_START` - Default on-demand start setting (true/false)
- `LLAMACTL_ON_DEMAND_START_TIMEOUT` - Default on-demand start timeout in seconds
- `LLAMACTL_TIMEOUT_CHECK_INTERVAL` - Default instance timeout check interval in minutes
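For example, a sketch of an `instances` override that caps concurrent instances and relies on LRU eviction together with on-demand start; the values are illustrative and the comments paraphrase the option descriptions above:

```yaml
instances:
  port_range: [10000, 10100]     # custom port range for instance allocation
  max_running_instances: 2       # at most two instances running at the same time
  enable_lru_eviction: true      # evict idle instances on an LRU basis
  default_on_demand_start: true  # start stopped instances on demand by default
  on_demand_start_timeout: 300   # allow up to 300 seconds for on-demand starts
```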
### Database Configuration

```yaml
database:
  path: "llamactl.db"            # Database file path, default: data_dir/llamactl.db
  max_open_connections: 25       # Maximum open database connections (default: 25)
  max_idle_connections: 5        # Maximum idle database connections (default: 5)
  connection_max_lifetime: 5m    # Connection max lifetime (default: 5m)
```

**Environment Variables:**

- `LLAMACTL_DATABASE_PATH` - Database file path (relative to data_dir or absolute)
- `LLAMACTL_DATABASE_MAX_OPEN_CONNECTIONS` - Maximum open database connections
- `LLAMACTL_DATABASE_MAX_IDLE_CONNECTIONS` - Maximum idle database connections
- `LLAMACTL_DATABASE_CONN_MAX_LIFETIME` - Connection max lifetime (e.g., "5m", "1h")

### Authentication Configuration

llamactl supports two types of authentication:

- **Management API Keys**: For accessing the web UI and management API (creating/managing instances). These can be configured in the config file or via environment variables.
- **Inference API Keys**: For accessing the OpenAI-compatible inference endpoints. These are managed via the web UI (Settings → API Keys) and stored in the database.

```yaml
auth:
  require_inference_auth: true    # Require API key for OpenAI endpoints (default: true)
  require_management_auth: true   # Require API key for management endpoints (default: true)
  management_keys: []             # List of valid management API keys
```

**Managing Inference API Keys:**

Inference API keys are managed through the web UI or management API and stored in the database. To create and manage inference keys:

1. Open the web UI and log in with a management API key
2. Navigate to **Settings → API Keys**
3. Click **Create API Key**
4. Configure the key:
    - **Name**: A descriptive name for the key
    - **Expiration**: Optional expiration date
    - **Permissions**: Grant access to all instances or specific instances only
5. Copy the generated key - it won't be shown again

**Environment Variables:**

- `LLAMACTL_REQUIRE_INFERENCE_AUTH` - Require auth for OpenAI endpoints (true/false)
- `LLAMACTL_REQUIRE_MANAGEMENT_AUTH` - Require auth for management endpoints (true/false)
- `LLAMACTL_MANAGEMENT_KEYS` - Comma-separated management API keys

### Remote Node Configuration

llamactl supports remote node deployments. Configure remote nodes to deploy instances on remote hosts and manage them centrally.

```yaml
local_node: "main"                       # Name of the local node (default: "main")
nodes:                                   # Node configuration map
  main:                                  # Local node (empty address means local)
    address: ""                          # Not used for local node
    api_key: ""                          # Not used for local node
  worker1:                               # Remote worker node
    address: "http://192.168.1.10:8080"
    api_key: "worker1-api-key"           # Management API key for authentication
```

**Node Configuration Fields:**

- `local_node`: Specifies which node in the `nodes` map represents the local node. Must match exactly what other nodes call this node.
- `nodes`: Map of node configurations
    - `address`: HTTP/HTTPS URL of the remote node (empty for local node)
    - `api_key`: Management API key for authenticating with the remote node

**Environment Variables:**

- `LLAMACTL_LOCAL_NODE` - Name of the local node
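To illustrate the naming rule for `local_node`, here is a sketch of what the configuration on the worker host itself might look like. The node name is hypothetical and must match the name the main node uses for this host in its `nodes` map:

```yaml
# Config on the worker host (hypothetical example)
local_node: "worker1"   # must match what other nodes call this node
nodes:
  worker1:              # the worker's own local node entry (empty config)
```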