Mirror of https://github.com/lordmathis/llamactl.git (synced 2025-11-05 16:44:22 +00:00)

Minor docs improvements
@@ -80,7 +80,7 @@ nodes: # Node configuration for multi-node deployment
 
 ### Configuration File Locations
 
-Configuration files are searched in the following locations (in order of precedence):
+Configuration files are searched in the following locations (in order of precedence, first found is used):
 
 **Linux:**
 - `./llamactl.yaml` or `./config.yaml` (current directory)

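The "first found is used" wording can be illustrated with a small shell sketch of the lookup for the two current-directory candidates listed above; this is only an illustration of the documented order, not a llamactl command:

```bash
# Mimic the documented lookup order: the first existing candidate wins.
for f in ./llamactl.yaml ./config.yaml; do
  [ -f "$f" ] && { echo "would use: $f"; break; }
done
```
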
@@ -2,6 +2,8 @@
 
 This guide will help you get Llamactl up and running in just a few minutes.
 
+**Before you begin:** Ensure you have at least one backend installed (llama.cpp, MLX, or vLLM). See the [Installation Guide](installation.md#prerequisites) for backend setup.
+
 ## Core Concepts
 
 Before you start, let's clarify a few key terms:

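Since the added note asks readers to have at least one backend installed before continuing, a quick shell check along these lines can confirm that; the binary names are the same backend commands shown later in these docs (`llama-server`, `mlx_lm.server`, `vllm`):

```bash
# Check that at least one supported backend command is available on PATH.
for cmd in llama-server mlx_lm.server vllm; do
  command -v "$cmd" >/dev/null 2>&1 && echo "found backend: $cmd"
done
```
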
@@ -53,7 +55,7 @@ llamactl
 Llamactl server listening on 0.0.0.0:8080
 ```
 
-Copy the **Management API Key** from the terminal - you'll need it to access the web UI.
+Copy the **Management** and **Inference** API Keys from the terminal - you'll need them to access the web UI and make inference requests.
 
 By default, Llamactl will start on `http://localhost:8080`.
 
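To show why the Inference key now gets called out alongside the Management key, a hedged smoke test could look like the sketch below; the `/v1/chat/completions` path, the instance name `my-model`, and the Bearer-token header are assumptions to adapt to your setup:

```bash
# Hypothetical inference request through llamactl's OpenAI-compatible endpoint.
# Endpoint path, instance name, and auth header are assumptions - adjust as needed.
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_INFERENCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model", "messages": [{"role": "user", "content": "Hello"}]}'
```
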
@@ -143,7 +145,7 @@ Here are basic example configurations for each backend:
 }
 ```
 
-**Multi-node deployment example:**
+**Remote node deployment example:**
 ```json
 {
   "name": "distributed-model",

@@ -152,7 +154,7 @@ Here are basic example configurations for each backend:
     "model": "/path/to/model.gguf",
     "gpu_layers": 32
   },
-  "nodes": ["worker1", "worker2"]
+  "nodes": ["worker1"]
 }
 ```
 
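For context, the JSON above is an instance definition; a hedged sketch of submitting it to the management API from the shell could look like this, where the endpoint path is an assumption rather than a documented route:

```bash
# Hypothetical: create the remote-node instance via the management API.
# The /api/v1/instances/<name> route is an assumption - check the llamactl API reference.
curl -X POST http://localhost:8080/api/v1/instances/distributed-model \
  -H "Authorization: Bearer YOUR_MANAGEMENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d @instance.json   # instance.json holds the JSON shown in the hunk above
```
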
@@ -85,7 +85,7 @@ llama-server --model /path/to/model.gguf --port 8081
 mlx_lm.server --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --port 8081
 
 # vLLM
-python -m vllm.entrypoints.openai.api_server --model microsoft/DialoGPT-medium --port 8081
+vllm serve microsoft/DialoGPT-medium --port 8081
 ```
 
 ## API and Network Issues
 
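After starting any of the backends from this hunk directly on port 8081, a quick reachability check is to list its models; `/v1/models` is the standard OpenAI-compatible listing endpoint served by llama-server and vLLM, and is assumed here for `mlx_lm.server`:

```bash
# Quick check that a backend started on port 8081 is reachable.
# /v1/models is the OpenAI-compatible model listing endpoint (assumed for mlx_lm.server).
curl http://localhost:8081/v1/models
```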