Minor docs improvements
@@ -28,13 +28,17 @@ You should see the Llamactl web interface.

 1. Click the "Add Instance" button
 2. Fill in the instance configuration:
-   - **Name**: Give your instance a descriptive name
-   - **Backend Type**: Choose from llama.cpp, MLX, or vLLM
-   - **Model**: Model path or huggingface repo
-   - **Additional Options**: Backend-specific parameters
+   - **Name**: Give your instance a descriptive name
+   - **Node**: Select which node to deploy the instance to (defaults to "main" for single-node setups)
+   - **Backend Type**: Choose from llama.cpp, MLX, or vLLM
+   - **Model**: Model path or huggingface repo
+   - **Additional Options**: Backend-specific parameters

 !!! tip "Auto-Assignment"
     Llamactl automatically assigns ports from the configured port range (default: 8000-9000) and generates API keys if authentication is enabled. You typically don't need to manually specify these values.

+!!! note "Remote Node Deployment"
+    If you have configured remote nodes in your configuration file, you can select which node to deploy the instance to. This allows you to distribute instances across multiple machines. See the [Configuration](configuration.md#remote-node-configuration) guide for details on setting up remote nodes.
+
 3. Click "Create Instance"
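The same instance can also be created without the web UI. Below is a minimal Python sketch of doing so over HTTP; the route, port, and auth header shown here are assumptions for illustration, not llamactl's documented API, so check the project's API reference for the real endpoints.

```python
# Minimal sketch: create an instance over HTTP instead of the web UI.
# The route, port, and auth header are assumptions, not llamactl's
# documented API; consult the API reference for the actual endpoints.
import requests

LLAMACTL_URL = "http://localhost:8080"   # assumed server address
API_KEY = "your-management-api-key"      # assumed; omit if auth is disabled

instance = {
    "backend_type": "llama_cpp",
    "backend_options": {
        "model": "/path/to/model.gguf",
        "gpu_layers": 32,
    },
    "nodes": ["main"],  # defaults to "main" on single-node setups
}

resp = requests.post(
    f"{LLAMACTL_URL}/api/v1/instances/my-model",  # hypothetical route
    json=instance,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```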
@@ -61,7 +65,8 @@ Here are basic example configurations for each backend:
     "threads": 4,
     "ctx_size": 2048,
     "gpu_layers": 32
-  }
+  },
+  "nodes": ["main"]
 }
 ```
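Once an instance like the one above is running, it can be exercised end to end. The sketch below assumes llamactl fronts instances with an OpenAI-compatible endpoint and that the instance name doubles as the model identifier; both details are assumptions here, not confirmed API behavior.

```python
# Sketch: send a chat completion to a running instance, assuming an
# OpenAI-compatible endpoint and instance-name-as-model routing.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed llamactl address
    api_key="your-inference-api-key",     # assumed; any string if auth is off
)

completion = client.chat.completions.create(
    model="my-llama-instance",  # hypothetical instance name
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(completion.choices[0].message.content)
```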
@@ -74,7 +79,8 @@ Here are basic example configurations for each backend:
     "model": "mlx-community/Mistral-7B-Instruct-v0.3-4bit",
     "temp": 0.7,
     "max_tokens": 2048
-  }
+  },
+  "nodes": ["main"]
 }
 ```
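After creating a few instances, it helps to confirm their state programmatically. A small sketch follows, again assuming a hypothetical listing route; the response shape is not specified anywhere in this section, so the code just prints whatever comes back.

```python
# Sketch: list instances and print their reported state.
# The route is an assumption; see the llamactl API reference.
import requests

resp = requests.get("http://localhost:8080/api/v1/instances", timeout=10)
resp.raise_for_status()
for inst in resp.json():
    print(inst)
```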
@@ -87,7 +93,21 @@ Here are basic example configurations for each backend:
     "model": "microsoft/DialoGPT-medium",
     "tensor_parallel_size": 2,
     "gpu_memory_utilization": 0.9
-  }
+  },
+  "nodes": ["main"]
 }
 ```
+
+**Multi-node deployment example:**
+```json
+{
+  "name": "distributed-model",
+  "backend_type": "llama_cpp",
+  "backend_options": {
+    "model": "/path/to/model.gguf",
+    "gpu_layers": 32
+  },
+  "nodes": ["worker1", "worker2"]
+}
+```
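Note that the node names in the multi-node example ("worker1", "worker2") must match nodes declared in the server's configuration file; see the [Configuration](configuration.md#remote-node-configuration) guide referenced above for how remote nodes are set up.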