Mirror of https://github.com/lordmathis/llamactl.git (synced 2025-11-05 16:44:22 +00:00)

Refactor documentation headings

@@ -10,17 +10,17 @@
## Features

-### 🚀 Easy Model Management
+**🚀 Easy Model Management**

- **Multiple Models Simultaneously**: Run different models at the same time (7B for speed, 70B for quality)
- **Smart Resource Management**: Automatic idle timeout, LRU eviction, and configurable instance limits
- **Web Dashboard**: Modern React UI for managing instances, monitoring health, and viewing logs

-### 🔗 Flexible Integration
+**🔗 Flexible Integration**

- **OpenAI API Compatible**: Drop-in replacement - route requests to different models by instance name (see the sketch after this list)
- **Multi-Backend Support**: Native support for llama.cpp, MLX (Apple Silicon optimized), and vLLM
- **Docker Ready**: Run backends in containers with full GPU support
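
To make the routing bullet concrete, here is a minimal sketch of an OpenAI-style request addressed to an instance by name. The `/v1/chat/completions` path, the port, the instance name, and the lack of an API key header are illustrative assumptions, not taken from this diff:

```bash
# Hypothetical request: select an instance by using its name as the "model" value.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "my-llama-instance",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```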

-### 🌐 Distributed Deployment
+**🌐 Distributed Deployment**

- **Remote Instances**: Deploy instances on remote hosts
- **Central Management**: Manage everything from a single dashboard with automatic routing

@@ -82,7 +82,7 @@ llamactl provides Dockerfiles for creating Docker images with backends pre-insta

**Note:** These Dockerfiles are configured for CUDA. For other platforms (CPU, ROCm, Vulkan, etc.), adapt the base image. For llama.cpp, see available tags at [llama.cpp Docker docs](https://github.com/ggml-org/llama.cpp/blob/master/docs/docker.md). For vLLM, check [vLLM docs](https://docs.vllm.ai/en/v0.6.5/serving/deploying_with_docker.html).

-#### Using Docker Compose
+**Using Docker Compose**

```bash
# Clone the repository
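# (The rest of this example falls outside the hunk; the commands below are a
#  sketch, and the compose file location is an assumption.)
git clone https://github.com/lordmathis/llamactl.git
cd llamactl
docker compose up -d
```
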
@@ -103,9 +103,9 @@ Access the dashboard at:

- llamactl with llama.cpp: http://localhost:8080
- llamactl with vLLM: http://localhost:8081

-#### Using Docker Build and Run
+**Using Docker Build and Run**

-**llamactl with llama.cpp CUDA:**
+1. llamactl with llama.cpp CUDA:

```bash
docker build -f docker/Dockerfile.llamacpp -t llamactl:llamacpp-cuda .
docker run -d \
@@ -116,7 +116,7 @@ docker run -d \
llamactl:llamacpp-cuda
```
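
The `docker run` command above is split across the hunk boundary. Below is a minimal sketch of a complete invocation; the port mapping, GPU flag, and model volume are illustrative assumptions, and the same pattern applies to the vLLM and source images that follow:

```bash
# Hypothetical complete invocation; adjust ports, volumes, and GPU flags to your setup.
docker run -d \
  --gpus all \
  -p 8080:8080 \
  -v "$HOME/models:/models" \
  llamactl:llamacpp-cuda
```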

-**llamactl with vLLM CUDA:**
+2. llamactl with vLLM CUDA:

```bash
docker build -f docker/Dockerfile.vllm -t llamactl:vllm-cuda .
docker run -d \
@@ -127,7 +127,7 @@ docker run -d \
llamactl:vllm-cuda
```

-**llamactl built from source:**
+3. llamactl built from source:

```bash
docker build -f docker/Dockerfile.source -t llamactl:source .
docker run -d \

@@ -33,7 +33,7 @@ Each instance is displayed as a card showing:

## Create Instance

-### Via Web UI
+**Via Web UI**



@@ -61,7 +61,7 @@ Each instance is displayed as a card showing:

- **vLLM**: Tensor parallel size, GPU memory utilization, quantization, etc.
8. Click **"Create"** to save the instance

-### Via API
+**Via API**

```bash
# Create llama.cpp instance with local model file
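# (The original example continues beyond this hunk; the request below is a
#  hedged sketch. The instance name, model path, and JSON field names are
#  assumptions, not confirmed by this diff.)
curl -X POST http://localhost:8080/api/instances/my-llama \
  -H "Content-Type: application/json" \
  -d '{
        "backend_type": "llama_cpp",
        "backend_options": {"model": "/path/to/model.gguf"}
      }'
```
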
@@ -138,37 +138,37 @@ curl -X POST http://localhost:8080/api/instances/remote-llama \

## Start Instance

-### Via Web UI
+**Via Web UI**

1. Click the **"Start"** button on an instance card
2. Watch the status change to "Unknown"
3. Monitor progress in the logs
4. Instance status changes to "Ready" once the instance is up

-### Via API
+**Via API**

```bash
curl -X POST http://localhost:8080/api/instances/{name}/start
```

## Stop Instance

-### Via Web UI
+**Via Web UI**

1. Click the **"Stop"** button on an instance card
2. The instance shuts down gracefully

-### Via API
+**Via API**

```bash
curl -X POST http://localhost:8080/api/instances/{name}/stop
```

## Edit Instance

-### Via Web UI
+**Via Web UI**

1. Click the **"Edit"** button on an instance card
2. Modify settings in the configuration dialog
3. Changes require an instance restart to take effect
4. Click **"Update & Restart"** to apply changes

-### Via API
+**Via API**

Modify instance settings:

```bash
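# (This example is cut off by the hunk; the PUT endpoint matches the hunk
#  header below, but the JSON fields here are assumptions.)
curl -X PUT http://localhost:8080/api/instances/{name} \
  -H "Content-Type: application/json" \
  -d '{"backend_options": {"ctx_size": 8192}}'
```
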
@@ -188,12 +188,12 @@ curl -X PUT http://localhost:8080/api/instances/{name} \

## View Logs

-### Via Web UI
+**Via Web UI**

1. Click the **"Logs"** button on any instance card
2. A real-time log viewer opens

-### Via API
+**Via API**

Retrieve instance logs:

```bash
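# (The body of this block falls outside the hunk; the call below matches the
#  endpoint shown in the hunk header that follows.)
curl http://localhost:8080/api/instances/{name}/logs
```
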
@@ -203,12 +203,12 @@ curl http://localhost:8080/api/instances/{name}/logs

## Delete Instance

-### Via Web UI
+**Via Web UI**

1. Click the **"Delete"** button on an instance card
2. Only stopped instances can be deleted
3. Confirm deletion in the dialog

-### Via API
+**Via API**

```bash
curl -X DELETE http://localhost:8080/api/instances/{name}
```

@@ -229,11 +229,11 @@ All backends provide OpenAI-compatible endpoints. Check the respective documenta

### Instance Health

-#### Via Web UI
+**Via Web UI**

1. The health status badge is displayed on each instance card

-#### Via API
+**Via API**

Check the health status of your instances:
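
The request itself falls outside this excerpt. As a hedged sketch, a health check following the same `/api/instances/{name}/...` pattern might look like the following; the `/health` path is an assumption:

```bash
# Hypothetical health check; the exact path is not confirmed by this excerpt.
curl http://localhost:8080/api/instances/{name}/health
```
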
@@ -2,7 +2,7 @@

This guide will help you get Llamactl up and running in just a few minutes.

-## Step 1: Start Llamactl
+## Start Llamactl

Start the Llamactl server:
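
The fenced command block sits outside this hunk; judging from the hunk context below, the invocation is simply the binary (assuming `llamactl` is on your PATH):

```bash
llamactl
```
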
@@ -12,7 +12,7 @@ llamactl

By default, Llamactl will start on `http://localhost:8080`.

-## Step 2: Access the Web UI
+## Access the Web UI

Open your web browser and navigate to:

@@ -24,18 +24,18 @@ Login with the management API key. By default it is generated during server star

You should see the Llamactl web interface.

-## Step 3: Create Your First Instance
+## Create Your First Instance

1. Click the "Add Instance" button
2. Fill in the instance configuration:
   - **Name**: Give your instance a descriptive name
   - **Backend Type**: Choose from llama.cpp, MLX, or vLLM
-   - **Model**: Model path or identifier for your chosen backend
+   - **Model**: Model path or huggingface repo
   - **Additional Options**: Backend-specific parameters

3. Click "Create Instance"

-## Step 4: Start Your Instance
+## Start Your Instance

Once created, you can: