Refactor documentation headings

2025-10-26 14:50:42 +01:00
parent 85e21596d9
commit 781921fc5a
4 changed files with 27 additions and 27 deletions

View File

@@ -10,17 +10,17 @@
 ## Features
-### 🚀 Easy Model Management
+**🚀 Easy Model Management**
 - **Multiple Models Simultaneously**: Run different models at the same time (7B for speed, 70B for quality)
 - **Smart Resource Management**: Automatic idle timeout, LRU eviction, and configurable instance limits
 - **Web Dashboard**: Modern React UI for managing instances, monitoring health, and viewing logs
-### 🔗 Flexible Integration
+**🔗 Flexible Integration**
 - **OpenAI API Compatible**: Drop-in replacement - route requests to different models by instance name
 - **Multi-Backend Support**: Native support for llama.cpp, MLX (Apple Silicon optimized), and vLLM
 - **Docker Ready**: Run backends in containers with full GPU support
-### 🌐 Distributed Deployment
+**🌐 Distributed Deployment**
 - **Remote Instances**: Deploy instances on remote hosts
 - **Central Management**: Manage everything from a single dashboard with automatic routing
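The "OpenAI API Compatible" bullet above means clients pick a model simply by naming an instance. As a rough sketch only — the `/v1/chat/completions` path, the use of the instance name in the `model` field, and the auth header are assumptions, not something this diff confirms — a request routed to a hypothetical instance called `llama2-7b` might look like:

```bash
# Hypothetical example: route an OpenAI-style request to an instance by name.
# The instance name "llama2-7b" and the Authorization header are placeholders.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "llama2-7b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```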

View File

@@ -82,7 +82,7 @@ llamactl provides Dockerfiles for creating Docker images with backends pre-insta
 **Note:** These Dockerfiles are configured for CUDA. For other platforms (CPU, ROCm, Vulkan, etc.), adapt the base image. For llama.cpp, see available tags at [llama.cpp Docker docs](https://github.com/ggml-org/llama.cpp/blob/master/docs/docker.md). For vLLM, check [vLLM docs](https://docs.vllm.ai/en/v0.6.5/serving/deploying_with_docker.html).
-#### Using Docker Compose
+**Using Docker Compose**
 ```bash
 # Clone the repository
@@ -103,9 +103,9 @@ Access the dashboard at:
 - llamactl with llama.cpp: http://localhost:8080
 - llamactl with vLLM: http://localhost:8081
-#### Using Docker Build and Run
+**Using Docker Build and Run**
-**llamactl with llama.cpp CUDA:**
+1. llamactl with llama.cpp CUDA:
 ```bash
 docker build -f docker/Dockerfile.llamacpp -t llamactl:llamacpp-cuda .
 docker run -d \
@@ -116,7 +116,7 @@ docker run -d \
 llamactl:llamacpp-cuda
 ```
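The middle of the `docker run` command is elided from this hunk. Purely as an illustrative sketch — the port mapping, GPU flag, container name, and model volume path below are assumptions, not the flags from the actual docs — a complete invocation could look something like:

```bash
# Sketch only: flags and paths are assumptions; adjust to the real Dockerfile and your host.
docker run -d \
  --name llamactl-llamacpp \
  --gpus all \
  -p 8080:8080 \
  -v "$HOME/models:/models" \
  llamactl:llamacpp-cuda
```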
-**llamactl with vLLM CUDA:**
+2. llamactl with vLLM CUDA:
 ```bash
 docker build -f docker/Dockerfile.vllm -t llamactl:vllm-cuda .
 docker run -d \
@@ -127,7 +127,7 @@ docker run -d \
 llamactl:vllm-cuda
 ```
-**llamactl built from source:**
+3. llamactl built from source:
 ```bash
 docker build -f docker/Dockerfile.source -t llamactl:source .
 docker run -d \

View File

@@ -33,7 +33,7 @@ Each instance is displayed as a card showing:
 ## Create Instance
-### Via Web UI
+**Via Web UI**
 ![Create Instance Screenshot](images/create_instance.png)
@@ -61,7 +61,7 @@ Each instance is displayed as a card showing:
 - **vLLM**: Tensor parallel size, GPU memory utilization, quantization, etc.
 8. Click **"Create"** to save the instance
-### Via API
+**Via API**
 ```bash
 # Create llama.cpp instance with local model file
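The request body itself falls outside this hunk. As a hedged sketch only — the JSON field names (`backend_type`, `backend_options`, `model`) and values are assumptions about the API schema, not taken from the docs — a create call might look roughly like:

```bash
# Hypothetical payload: field names and values are assumed, not confirmed by this diff.
curl -X POST http://localhost:8080/api/instances/my-llama \
  -H "Content-Type: application/json" \
  -d '{
    "backend_type": "llama_cpp",
    "backend_options": {
      "model": "/models/llama-2-7b.gguf"
    }
  }'
```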
@@ -138,37 +138,37 @@ curl -X POST http://localhost:8080/api/instances/remote-llama \
 ## Start Instance
-### Via Web UI
+**Via Web UI**
 1. Click the **"Start"** button on an instance card
 2. Watch the status change to "Unknown"
 3. Monitor progress in the logs
 4. Instance status changes to "Ready" when ready
-### Via API
+**Via API**
 ```bash
 curl -X POST http://localhost:8080/api/instances/{name}/start
 ```
 ## Stop Instance
-### Via Web UI
+**Via Web UI**
 1. Click the **"Stop"** button on an instance card
 2. Instance gracefully shuts down
-### Via API
+**Via API**
 ```bash
 curl -X POST http://localhost:8080/api/instances/{name}/stop
 ```
 ## Edit Instance
-### Via Web UI
+**Via Web UI**
 1. Click the **"Edit"** button on an instance card
 2. Modify settings in the configuration dialog
 3. Changes require instance restart to take effect
 4. Click **"Update & Restart"** to apply changes
-### Via API
+**Via API**
 Modify instance settings:
 ```bash
@@ -188,12 +188,12 @@ curl -X PUT http://localhost:8080/api/instances/{name} \
 ## View Logs
-### Via Web UI
+**Via Web UI**
 1. Click the **"Logs"** button on any instance card
 2. Real-time log viewer opens
-### Via API
+**Via API**
 Check instance status in real-time:
 ```bash
@@ -203,12 +203,12 @@ curl http://localhost:8080/api/instances/{name}/logs
 ## Delete Instance
-### Via Web UI
+**Via Web UI**
 1. Click the **"Delete"** button on an instance card
 2. Only stopped instances can be deleted
 3. Confirm deletion in the dialog
-### Via API
+**Via API**
 ```bash
 curl -X DELETE http://localhost:8080/api/instances/{name}
 ```
@@ -229,11 +229,11 @@ All backends provide OpenAI-compatible endpoints. Check the respective documenta
 ### Instance Health
-#### Via Web UI
+**Via Web UI**
 1. The health status badge is displayed on each instance card
-#### Via API
+**Via API**
 Check the health status of your instances:
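The health call itself is outside this hunk. The endpoint below is purely an assumption (a plain GET on the instance resource, by analogy with the routes shown above); the actual docs may use a different path:

```bash
# Assumed endpoint: the real health route is not shown in this diff.
curl http://localhost:8080/api/instances/{name}
```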

View File

@@ -2,7 +2,7 @@
 This guide will help you get Llamactl up and running in just a few minutes.
-## Step 1: Start Llamactl
+## Start Llamactl
 Start the Llamactl server:
@@ -12,7 +12,7 @@ llamactl
 By default, Llamactl will start on `http://localhost:8080`.
-## Step 2: Access the Web UI
+## Access the Web UI
 Open your web browser and navigate to:
@@ -24,18 +24,18 @@ Login with the management API key. By default it is generated during server star
 You should see the Llamactl web interface.
-## Step 3: Create Your First Instance
+## Create Your First Instance
 1. Click the "Add Instance" button
 2. Fill in the instance configuration:
 - **Name**: Give your instance a descriptive name
 - **Backend Type**: Choose from llama.cpp, MLX, or vLLM
-- **Model**: Model path or identifier for your chosen backend
+- **Model**: Model path or huggingface repo
 - **Additional Options**: Backend-specific parameters
 3. Click "Create Instance"
-## Step 4: Start Your Instance
+## Start Your Instance
 Once created, you can: