diff --git a/README.md b/README.md
index 60aafa9..9f2039b 100644
--- a/README.md
+++ b/README.md
@@ -138,7 +138,7 @@ go build -o llamactl ./cmd/server
 1. Open http://localhost:8080
 2. Click "Create Instance"
 3. Choose backend type (llama.cpp, MLX, or vLLM)
-4. Configure your model and options
+4. Configure your model and options (ports and API keys are auto-assigned)
 5. Start the instance and use it with any OpenAI-compatible client
 
 ## Configuration
diff --git a/docs/managing-instances.md b/docs/managing-instances.md
index 9277c6d..c298b15 100644
--- a/docs/managing-instances.md
+++ b/docs/managing-instances.md
@@ -59,6 +59,10 @@ Each instance is displayed as a card showing:
 - **llama.cpp**: Threads, context size, GPU layers, port, etc.
 - **MLX**: Temperature, top-p, adapter path, Python environment, etc.
 - **vLLM**: Tensor parallel size, GPU memory utilization, quantization, etc.
+
+!!! tip "Auto-Assignment"
+    Llamactl automatically assigns ports from the configured port range (default: 8000-9000) and generates API keys if authentication is enabled. You typically don't need to manually specify these values.
+
 8. Click **"Create"** to save the instance
 
 **Via API**
diff --git a/docs/quick-start.md b/docs/quick-start.md
index 7a6dedd..3fc562e 100644
--- a/docs/quick-start.md
+++ b/docs/quick-start.md
@@ -33,6 +33,9 @@ You should see the Llamactl web interface.
 - **Model**: Model path or huggingface repo
 - **Additional Options**: Backend-specific parameters
 
+!!! tip "Auto-Assignment"
+    Llamactl automatically assigns ports from the configured port range (default: 8000-9000) and generates API keys if authentication is enabled. You typically don't need to manually specify these values.
+
 3. Click "Create Instance"
 
 ## Start Your Instance
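
As a usage sketch for the final README step ("use it with any OpenAI-compatible client"): the snippet below assumes the instance was auto-assigned port 8000 from the default range and that you copied its generated API key from the web UI. The base URL, API key, and model name are placeholders, not values produced by this change.

```python
from openai import OpenAI

# Placeholder values: the auto-assigned port and generated API key will
# differ per instance; check the instance card in the Llamactl web UI.
client = OpenAI(
    base_url="http://localhost:8000/v1",   # assumed auto-assigned port
    api_key="sk-llamactl-example",         # assumed generated API key
)

# Standard OpenAI-compatible chat completion request
response = client.chat.completions.create(
    model="my-instance",  # placeholder instance/model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```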