Refactor documentation headings

2025-10-26 14:50:42 +01:00
parent 85e21596d9
commit 781921fc5a
4 changed files with 27 additions and 27 deletions

View File

@@ -10,17 +10,17 @@
 ## Features
-### 🚀 Easy Model Management
+**🚀 Easy Model Management**
 - **Multiple Models Simultaneously**: Run different models at the same time (7B for speed, 70B for quality)
 - **Smart Resource Management**: Automatic idle timeout, LRU eviction, and configurable instance limits
 - **Web Dashboard**: Modern React UI for managing instances, monitoring health, and viewing logs
-### 🔗 Flexible Integration
+**🔗 Flexible Integration**
 - **OpenAI API Compatible**: Drop-in replacement - route requests to different models by instance name
 - **Multi-Backend Support**: Native support for llama.cpp, MLX (Apple Silicon optimized), and vLLM
 - **Docker Ready**: Run backends in containers with full GPU support
-### 🌐 Distributed Deployment
+**🌐 Distributed Deployment**
 - **Remote Instances**: Deploy instances on remote hosts
 - **Central Management**: Manage everything from a single dashboard with automatic routing
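The "OpenAI API Compatible" bullet above means clients pick a model simply by naming an instance. As a rough sketch only — the `/v1/chat/completions` path, the use of the instance name in the `model` field, and the auth header are assumptions, not something this diff confirms — a request routed to a hypothetical instance called `llama2-7b` might look like:

```bash
# Hypothetical example: route an OpenAI-style request to an instance by name.
# The instance name "llama2-7b" and the Authorization header are placeholders.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "llama2-7b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```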

View File

@@ -82,7 +82,7 @@ llamactl provides Dockerfiles for creating Docker images with backends pre-insta
 **Note:** These Dockerfiles are configured for CUDA. For other platforms (CPU, ROCm, Vulkan, etc.), adapt the base image. For llama.cpp, see available tags at [llama.cpp Docker docs](https://github.com/ggml-org/llama.cpp/blob/master/docs/docker.md). For vLLM, check [vLLM docs](https://docs.vllm.ai/en/v0.6.5/serving/deploying_with_docker.html).
-#### Using Docker Compose
+**Using Docker Compose**
 ```bash
 # Clone the repository
@@ -103,9 +103,9 @@ Access the dashboard at:
 - llamactl with llama.cpp: http://localhost:8080
 - llamactl with vLLM: http://localhost:8081
-#### Using Docker Build and Run
+**Using Docker Build and Run**
-**llamactl with llama.cpp CUDA:**
+1. llamactl with llama.cpp CUDA:
 ```bash
 docker build -f docker/Dockerfile.llamacpp -t llamactl:llamacpp-cuda .
 docker run -d \
@@ -116,7 +116,7 @@ docker run -d \
 llamactl:llamacpp-cuda
 ```
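The middle of the `docker run` command is elided from this hunk. Purely as an illustrative sketch — the port mapping, GPU flag, container name, and model volume path below are assumptions, not the flags from the actual docs — a complete invocation could look something like:

```bash
# Sketch only: flags and paths are assumptions; adjust to the real Dockerfile and your host.
docker run -d \
  --name llamactl-llamacpp \
  --gpus all \
  -p 8080:8080 \
  -v "$HOME/models:/models" \
  llamactl:llamacpp-cuda
```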
-**llamactl with vLLM CUDA:**
+2. llamactl with vLLM CUDA:
 ```bash
 docker build -f docker/Dockerfile.vllm -t llamactl:vllm-cuda .
 docker run -d \
@@ -127,7 +127,7 @@ docker run -d \
 llamactl:vllm-cuda
 ```
-**llamactl built from source:**
+3. llamactl built from source:
 ```bash
 docker build -f docker/Dockerfile.source -t llamactl:source .
 docker run -d \

View File

@@ -33,7 +33,7 @@ Each instance is displayed as a card showing:
 ## Create Instance
-### Via Web UI
+**Via Web UI**
 ![Create Instance Screenshot](images/create_instance.png)
@@ -61,7 +61,7 @@ Each instance is displayed as a card showing:
 - **vLLM**: Tensor parallel size, GPU memory utilization, quantization, etc.
 8. Click **"Create"** to save the instance
-### Via API
+**Via API**
 ```bash
 # Create llama.cpp instance with local model file
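The request body itself falls outside this hunk. As a hedged sketch only — the JSON field names (`backend_type`, `backend_options`, `model`) and values are assumptions about the API schema, not taken from the docs — a create call might look roughly like:

```bash
# Hypothetical payload: field names and values are assumed, not confirmed by this diff.
curl -X POST http://localhost:8080/api/instances/my-llama \
  -H "Content-Type: application/json" \
  -d '{
    "backend_type": "llama_cpp",
    "backend_options": {
      "model": "/models/llama-2-7b.gguf"
    }
  }'
```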
@@ -138,37 +138,37 @@ curl -X POST http://localhost:8080/api/instances/remote-llama \
 ## Start Instance
-### Via Web UI
+**Via Web UI**
 1. Click the **"Start"** button on an instance card
 2. Watch the status change to "Unknown"
 3. Monitor progress in the logs
 4. Instance status changes to "Ready" when ready
-### Via API
+**Via API**
 ```bash
 curl -X POST http://localhost:8080/api/instances/{name}/start
 ```
 ## Stop Instance
-### Via Web UI
+**Via Web UI**
 1. Click the **"Stop"** button on an instance card
 2. Instance gracefully shuts down
-### Via API
+**Via API**
 ```bash
 curl -X POST http://localhost:8080/api/instances/{name}/stop
 ```
 ## Edit Instance
-### Via Web UI
+**Via Web UI**
 1. Click the **"Edit"** button on an instance card
 2. Modify settings in the configuration dialog
 3. Changes require instance restart to take effect
 4. Click **"Update & Restart"** to apply changes
-### Via API
+**Via API**
 Modify instance settings:
 ```bash
@@ -188,12 +188,12 @@ curl -X PUT http://localhost:8080/api/instances/{name} \
 ## View Logs
-### Via Web UI
+**Via Web UI**
 1. Click the **"Logs"** button on any instance card
 2. Real-time log viewer opens
-### Via API
+**Via API**
 Check instance status in real-time:
 ```bash
@@ -203,12 +203,12 @@ curl http://localhost:8080/api/instances/{name}/logs
 ## Delete Instance
-### Via Web UI
+**Via Web UI**
 1. Click the **"Delete"** button on an instance card
 2. Only stopped instances can be deleted
 3. Confirm deletion in the dialog
-### Via API
+**Via API**
 ```bash
 curl -X DELETE http://localhost:8080/api/instances/{name}
 ```
@@ -229,11 +229,11 @@ All backends provide OpenAI-compatible endpoints. Check the respective documenta
 ### Instance Health
-#### Via Web UI
+**Via Web UI**
 1. The health status badge is displayed on each instance card
-#### Via API
+**Via API**
 Check the health status of your instances:
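The health call itself is outside this hunk. The endpoint below is purely an assumption (a plain GET on the instance resource, by analogy with the routes shown above); the actual docs may use a different path:

```bash
# Assumed endpoint: the real health route is not shown in this diff.
curl http://localhost:8080/api/instances/{name}
```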

View File

@@ -2,7 +2,7 @@
 This guide will help you get Llamactl up and running in just a few minutes.
-## Step 1: Start Llamactl
+## Start Llamactl
 Start the Llamactl server:
@@ -12,7 +12,7 @@ llamactl
 By default, Llamactl will start on `http://localhost:8080`.
-## Step 2: Access the Web UI
+## Access the Web UI
 Open your web browser and navigate to:
@@ -24,18 +24,18 @@ Login with the management API key. By default it is generated during server star
 You should see the Llamactl web interface.
-## Step 3: Create Your First Instance
+## Create Your First Instance
 1. Click the "Add Instance" button
 2. Fill in the instance configuration:
 - **Name**: Give your instance a descriptive name
 - **Backend Type**: Choose from llama.cpp, MLX, or vLLM
-- **Model**: Model path or identifier for your chosen backend
+- **Model**: Model path or huggingface repo
 - **Additional Options**: Backend-specific parameters
 3. Click "Create Instance"
-## Step 4: Start Your Instance
+## Start Your Instance
 Once created, you can: