diff --git a/dev/getting-started/configuration/index.html b/dev/getting-started/configuration/index.html index 8885de0..94adbb7 100644 --- a/dev/getting-started/configuration/index.html +++ b/dev/getting-started/configuration/index.html
@@ -901,6 +901,10 @@ inference_keys: [] # Keys for inference endpoints require_management_auth: true # Require auth for management endpoints management_keys: [] # Keys for management endpoints + +local_node: "main" # Name of the local node (default: "main") +nodes: # Node configuration for multi-node deployment + main: # Default local node (empty config)

    Configuration Files

    Configuration File Locations

    @@ -1040,16 +1044,29 @@ require_management_auth: true # Require API key for management endpoints (default: true) management_keys: [] # List of valid management API keys -

    Environment Variables:
    -- LLAMACTL_REQUIRE_INFERENCE_AUTH - Require auth for OpenAI endpoints (true/false)
    -- LLAMACTL_INFERENCE_KEYS - Comma-separated inference API keys
    -- LLAMACTL_REQUIRE_MANAGEMENT_AUTH - Require auth for management endpoints (true/false)
    -- LLAMACTL_MANAGEMENT_KEYS - Comma-separated management API keys

    -

    Command Line Options

    -

    View all available command line options:

    -
    llamactl --help
    +

    Environment Variables: +- LLAMACTL_REQUIRE_INFERENCE_AUTH - Require auth for OpenAI endpoints (true/false) +- LLAMACTL_INFERENCE_KEYS - Comma-separated inference API keys +- LLAMACTL_REQUIRE_MANAGEMENT_AUTH - Require auth for management endpoints (true/false) +- LLAMACTL_MANAGEMENT_KEYS - Comma-separated management API keys
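
As a quick sketch (the variable names are those listed above; the key values are placeholders, not real keys), the same authentication settings can be supplied through the environment before starting llamactl:
+# Require API keys on both endpoint groups and register the accepted keys
+export LLAMACTL_REQUIRE_INFERENCE_AUTH=true
+export LLAMACTL_INFERENCE_KEYS="sk-inference-1,sk-inference-2"   # comma-separated
+export LLAMACTL_REQUIRE_MANAGEMENT_AUTH=true
+export LLAMACTL_MANAGEMENT_KEYS="sk-management-1"
+llamactl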

    +

    Remote Node Configuration

    +

llamactl supports remote node deployments. Configure remote nodes to deploy instances on other hosts and manage them centrally from the main llamactl server.

    +
    local_node: "main"               # Name of the local node (default: "main")
    +nodes:                           # Node configuration map
    +  main:                          # Local node (empty address means local)
    +    address: ""                  # Not used for local node
    +    api_key: ""                  # Not used for local node
    +  worker1:                       # Remote worker node
    +    address: "http://192.168.1.10:8080"
    +    api_key: "worker1-api-key"   # Management API key for authentication
     
    -

    You can also override configuration using command line flags when starting llamactl.

    +

    Node Configuration Fields: +- local_node: Specifies which node in the nodes map represents the local node +- nodes: Map of node configurations + - address: HTTP/HTTPS URL of the remote node (empty for local node) + - api_key: Management API key for authenticating with the remote node

    +

    Environment Variables: +- LLAMACTL_LOCAL_NODE - Name of the local node
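
For example (a minimal sketch; "worker1" is simply the illustrative node name from the map above), the local node name can be overridden at startup:
+# Tell this llamactl process which entry in the nodes map it is
+export LLAMACTL_LOCAL_NODE="worker1"
+llamactl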

@@ -1070,7 +1087,7 @@ - October 4, 2025 + October 9, 2025 diff --git a/dev/getting-started/installation/index.html b/dev/getting-started/installation/index.html index af94cbb..4c0c80e 100644 --- a/dev/getting-started/installation/index.html +++ b/dev/getting-started/installation/index.html
@@ -982,12 +1000,17 @@ # Build the application go build -o llamactl ./cmd/server
+

    Remote Node Installation

    +

    For deployments with remote nodes: +- Install llamactl on each node using any of the methods above +- Configure API keys for authentication between nodes
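
As a rough sketch of the second point (assuming the worker exposes its management API to the main server; the key value is illustrative), a worker node's config only needs to require management auth with a key that matches the api_key the main node uses for it:
+# Worker node: accept management requests authenticated with this key
+require_management_auth: true
+management_keys: ["worker1-api-key"]   # must match the api_key configured for this node on the main server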

    Verification

    Verify your installation by checking the version:

    llamactl --version
     

    Next Steps

    Now that Llamactl is installed, continue to the Quick Start guide to get your first instance running!

    +

    For remote node deployments, see the Configuration Guide for node setup instructions.

@@ -1008,7 +1031,7 @@ - September 29, 2025 + October 9, 2025 diff --git a/dev/index.html b/dev/index.html index f6153da..6c1c1a6 100644 --- a/dev/index.html +++ b/dev/index.html
@@ -852,6 +870,12 @@
  • Smart Resource Management: Idle timeout, LRU eviction, and configurable instance limits
  • Environment Variables: Set custom environment variables per instance for advanced configuration
+

    🔗 Remote Instance Deployment

    +

    Dashboard Screenshot

    See Managing Instances for complete configuration options.

    Response: @@ -1655,74 +1674,103 @@ curl -X DELETE -H "Authorization: Bearer your-api-key" \ http://localhost:8080/api/v1/instances/my-model -

    Using the Proxy Endpoint

    -

    You can also directly proxy requests to the llama-server instance:

    -
    # Direct proxy to instance (bypasses OpenAI compatibility layer)
    -curl -X POST http://localhost:8080/api/v1/instances/my-model/proxy/completion \
    +

    Remote Node Instance Example

    +
    # Create instance on specific remote node
    +curl -X POST http://localhost:8080/api/v1/instances/remote-model \
       -H "Content-Type: application/json" \
       -H "Authorization: Bearer your-api-key" \
       -d '{
    -    "prompt": "Hello, world!",
    -    "n_predict": 50
    -  }'
    +    "backend_type": "llama_cpp",
    +    "backend_options": {
    +      "model": "/models/llama-2-7b.gguf",
    +      "gpu_layers": 32
    +    },
    +    "nodes": ["worker1"]
    +  }'
    +
    +# Check status of remote instance
    +curl -H "Authorization: Bearer your-api-key" \
    +  http://localhost:8080/api/v1/instances/remote-model
    +
    +# Use remote instance with OpenAI-compatible API
    +curl -X POST http://localhost:8080/v1/chat/completions \
    +  -H "Content-Type: application/json" \
    +  -H "Authorization: Bearer your-inference-api-key" \
    +  -d '{
    +    "model": "remote-model",
    +    "messages": [
    +      {"role": "user", "content": "Hello from remote node!"}
    +    ]
    +  }'
    +
    +

    Using the Proxy Endpoint

    +

    You can also directly proxy requests to the llama-server instance:

    +
    # Direct proxy to instance (bypasses OpenAI compatibility layer)
    +curl -X POST http://localhost:8080/api/v1/instances/my-model/proxy/completion \
    +  -H "Content-Type: application/json" \
    +  -H "Authorization: Bearer your-api-key" \
    +  -d '{
    +    "prompt": "Hello, world!",
    +    "n_predict": 50
    +  }'
     

    Backend-Specific Endpoints

    Parse Commands

    Llamactl provides endpoints to parse command strings from different backends into instance configuration options.

    Parse Llama.cpp Command

    Parse a llama-server command string into instance options.

    -
    POST /api/v1/backends/llama-cpp/parse-command
    +
    POST /api/v1/backends/llama-cpp/parse-command
     

    Request Body: -

    {
    -  "command": "llama-server -m /path/to/model.gguf -c 2048 --port 8080"
    -}
    +
    {
    +  "command": "llama-server -m /path/to/model.gguf -c 2048 --port 8080"
    +}
     

    Response: -

    {
    -  "backend_type": "llama_cpp",
    -  "llama_server_options": {
    -    "model": "/path/to/model.gguf",
    -    "ctx_size": 2048,
    -    "port": 8080
    -  }
    -}
    +
    {
    +  "backend_type": "llama_cpp",
    +  "llama_server_options": {
    +    "model": "/path/to/model.gguf",
    +    "ctx_size": 2048,
    +    "port": 8080
    +  }
    +}
     

    Parse MLX-LM Command

    Parse an MLX-LM server command string into instance options.

    -
    POST /api/v1/backends/mlx/parse-command
    +
    POST /api/v1/backends/mlx/parse-command
     

    Request Body: -

    {
    -  "command": "mlx_lm.server --model /path/to/model --port 8080"
    -}
    +
    {
    +  "command": "mlx_lm.server --model /path/to/model --port 8080"
    +}
     

    Response: -

    {
    -  "backend_type": "mlx_lm",
    -  "mlx_server_options": {
    -    "model": "/path/to/model",
    -    "port": 8080
    -  }
    -}
    +
    {
    +  "backend_type": "mlx_lm",
    +  "mlx_server_options": {
    +    "model": "/path/to/model",
    +    "port": 8080
    +  }
    +}
     

    Parse vLLM Command

    Parse a vLLM serve command string into instance options.

    -
    POST /api/v1/backends/vllm/parse-command
    +
    POST /api/v1/backends/vllm/parse-command
     

    Request Body: -

    {
    -  "command": "vllm serve /path/to/model --port 8080"
    -}
    +
    {
    +  "command": "vllm serve /path/to/model --port 8080"
    +}
     

    Response: -

    {
    -  "backend_type": "vllm",
    -  "vllm_server_options": {
    -    "model": "/path/to/model",
    -    "port": 8080
    -  }
    -}
    +
    {
    +  "backend_type": "vllm",
    +  "vllm_server_options": {
    +    "model": "/path/to/model",
    +    "port": 8080
    +  }
    +}
     

    Error Responses for Parse Commands: - 400 Bad Request: Invalid request body, empty command, or parse error @@ -1735,7 +1783,7 @@

    Swagger Documentation

    If swagger documentation is enabled in the server configuration, you can access the interactive API documentation at:

    -
    http://localhost:8080/swagger/
    +
    http://localhost:8080/swagger/
     

    This provides a complete interactive interface for testing all API endpoints.

    @@ -1758,7 +1806,7 @@ - September 28, 2025 + October 9, 2025 diff --git a/dev/user-guide/managing-instances/index.html b/dev/user-guide/managing-instances/index.html index 8a3edc2..ca1999b 100644 --- a/dev/user-guide/managing-instances/index.html +++ b/dev/user-guide/managing-instances/index.html @@ -1259,6 +1259,7 @@
    1. Click the "Create Instance" button on the dashboard
    2. Enter a unique Name for your instance (only required field)
    3. +
    4. Select Target Node: Choose which node to deploy the instance to from the dropdown
    5. Choose Backend Type:
      • llama.cpp: For GGUF models using llama-server
      • MLX: For MLX-optimized models (macOS only)
      • @@ -1347,6 +1348,18 @@ "gpu_layers": 32 } }' + +# Create instance on specific remote node +curl -X POST http://localhost:8080/api/instances/remote-llama \ + -H "Content-Type: application/json" \ + -d '{ + "backend_type": "llama_cpp", + "backend_options": { + "model": "/models/llama-7b.gguf", + "gpu_layers": 32 + }, + "nodes": ["worker1"] + }'

    Start Instance

    Via Web UI

@@ -1450,7 +1463,7 @@ - September 28, 2025 + October 9, 2025 diff --git a/dev/user-guide/troubleshooting/index.html b/dev/user-guide/troubleshooting/index.html index 1708031..0de9c30 100644 --- a/dev/user-guide/troubleshooting/index.html +++ b/dev/user-guide/troubleshooting/index.html
@@ -1044,24 +1092,43 @@
+

    Remote Node Issues

    +

    Node Configuration

    +

Problem: Remote instances do not appear or cannot be managed

    +

    Solutions: +1. Verify node configuration: +

    local_node: "main"  # Must match a key in nodes map
    +nodes:
    +  main:
    +    address: ""     # Empty for local node
    +  worker1:
    +    address: "http://worker1.internal:8080"
    +    api_key: "secure-key"  # Must match worker1's management key
    +

    +
      +
    1. Test remote node connectivity: +
      curl -H "Authorization: Bearer remote-node-key" \
      +  http://remote-node:8080/api/v1/instances
      +
    2. +

    Debugging and Logs

    Viewing Instance Logs

    -
    # Get instance logs via API
    -curl http://localhost:8080/api/v1/instances/{name}/logs
    -
    -# Or check log files directly
    -tail -f ~/.local/share/llamactl/logs/{instance-name}.log
    +
    # Get instance logs via API
    +curl http://localhost:8080/api/v1/instances/{name}/logs
    +
    +# Or check log files directly
    +tail -f ~/.local/share/llamactl/logs/{instance-name}.log
     

    Enable Debug Logging

    -
    export LLAMACTL_LOG_LEVEL=debug
    -llamactl
    +
    export LLAMACTL_LOG_LEVEL=debug
    +llamactl
     

    Getting Help

    When reporting issues, include:

    1. System information: -

      llamactl --version
      +   
      llamactl --version
       

    2. @@ -1094,7 +1161,7 @@ - September 3, 2025 + October 9, 2025