Update documentation: remove Web UI guide and adjust navigation links

2025-09-03 22:47:15 +02:00
parent 969b4b14e1
commit 3013a343f1
5 changed files with 129 additions and 320 deletions

View File

@@ -138,6 +138,6 @@ curl http://localhost:8080/v1/models
 ## Next Steps
-- Learn more about the [Web UI](../user-guide/web-ui.md)
+- Manage instances [Managing Instances](../user-guide/managing-instances.md)
 - Explore the [API Reference](../user-guide/api-reference.md)
 - Configure advanced settings in the [Configuration](configuration.md) guide
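The hunk context above (`curl http://localhost:8080/v1/models`) is the quick start's check of llamactl's OpenAI-compatible API. As a minimal sketch of the next step that page points to, assuming requests are routed to an instance by passing its name in the `model` field (conventional for OpenAI-compatible servers, but not stated in this diff), a first completion request might look like:

```bash
# List models exposed through the OpenAI-compatible endpoint
curl http://localhost:8080/v1/models

# Assumption: the instance name selects the backend instance
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-instance",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```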

View File

@@ -37,7 +37,6 @@ Llamactl is designed to simplify the deployment and management of llama-server i
 - [Installation Guide](getting-started/installation.md) - Get Llamactl up and running
 - [Configuration Guide](getting-started/configuration.md) - Detailed configuration options
 - [Quick Start](getting-started/quick-start.md) - Your first steps with Llamactl
-- [Web UI Guide](user-guide/web-ui.md) - Learn to use the web interface
 - [Managing Instances](user-guide/managing-instances.md) - Instance lifecycle management
 - [API Reference](user-guide/api-reference.md) - Complete API documentation

View File

@@ -1,73 +1,121 @@
 # Managing Instances
-Learn how to effectively manage your Llama.cpp instances with Llamactl.
-## Instance Lifecycle
-### Creating Instances
-Instances can be created through the Web UI or API:
-#### Via Web UI
-1. Click "Add Instance" button
-2. Fill in the configuration form
-3. Click "Create"
-#### Via API
+Learn how to effectively manage your Llama.cpp instances with Llamactl through both the Web UI and API.
+## Overview
+Llamactl provides two ways to manage instances:
+- **Web UI**: Accessible at `http://localhost:8080` with an intuitive dashboard
+- **REST API**: Programmatic access for automation and integration
+### Authentication
+If authentication is enabled:
+1. Navigate to the web UI
+2. Enter your credentials
+3. Bearer token is stored for the session
+### Theme Support
+- Switch between light and dark themes
+- Setting is remembered across sessions
+## Instance Cards
+Each instance is displayed as a card showing:
+- **Instance name**
+- **Health status badge** (unknown, ready, error, failed)
+- **Action buttons** (start, stop, edit, logs, delete)
+## Create Instance
+### Via Web UI
+1. Click the **"Add Instance"** button on the dashboard
+2. Enter a unique **Name** for your instance (only required field)
+3. Configure model source (choose one):
+   - **Model Path**: Full path to your downloaded GGUF model file
+   - **HuggingFace Repo**: Repository name (e.g., `microsoft/Phi-3-mini-4k-instruct-gguf`)
+   - **HuggingFace File**: Specific file within the repo (optional, uses default if not specified)
+4. Configure optional instance management settings:
+   - **Auto Restart**: Automatically restart instance on failure
+   - **Max Restarts**: Maximum number of restart attempts
+   - **Restart Delay**: Delay in seconds between restart attempts
+   - **On Demand Start**: Start instance when receiving a request to the OpenAI compatible endpoint
+   - **Idle Timeout**: Minutes before stopping idle instance (set to 0 to disable)
+5. Configure optional llama-server backend options:
+   - **Threads**: Number of CPU threads to use
+   - **Context Size**: Context window size (ctx_size)
+   - **GPU Layers**: Number of layers to offload to GPU
+   - **Port**: Network port (auto-assigned by llamactl if not specified)
+   - **Additional Parameters**: Any other llama-server command line options (see [llama-server documentation](https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md))
+6. Click **"Create"** to save the instance
+### Via API
 ```bash
-curl -X POST http://localhost:8080/api/instances \
+# Create instance with local model file
+curl -X POST http://localhost:8080/api/instances/my-instance \
   -H "Content-Type: application/json" \
   -d '{
-    "name": "my-instance",
-    "model_path": "/path/to/model.gguf",
-    "port": 8081
+    "backend_type": "llama_cpp",
+    "backend_options": {
+      "model": "/path/to/model.gguf",
+      "threads": 8,
+      "ctx_size": 4096
+    }
+  }'
+
+# Create instance with HuggingFace model
+curl -X POST http://localhost:8080/api/instances/phi3-mini \
+  -H "Content-Type: application/json" \
+  -d '{
+    "backend_type": "llama_cpp",
+    "backend_options": {
+      "hf_repo": "microsoft/Phi-3-mini-4k-instruct-gguf",
+      "hf_file": "Phi-3-mini-4k-instruct-q4.gguf",
+      "gpu_layers": 32
+    },
+    "auto_restart": true,
+    "max_restarts": 3
   }'
 ```
-### Starting and Stopping
-#### Start an Instance
+## Start Instance
+### Via Web UI
+1. Click the **"Start"** button on an instance card
+2. Watch the status change to "Unknown"
+3. Monitor progress in the logs
+4. Instance status changes to "Ready" when ready
+### Via API
 ```bash
-# Via API
 curl -X POST http://localhost:8080/api/instances/{name}/start
-# The instance will begin loading the model
 ```
-#### Stop an Instance
+## Stop Instance
+### Via Web UI
+1. Click the **"Stop"** button on an instance card
+2. Instance gracefully shuts down
+### Via API
 ```bash
-# Via API
 curl -X POST http://localhost:8080/api/instances/{name}/stop
-# Graceful shutdown with configurable timeout
 ```
-### Monitoring Status
-Check instance status in real-time:
-```bash
-# Get instance details
-curl http://localhost:8080/api/instances/{name}
-# Get health status
-curl http://localhost:8080/api/instances/{name}/health
-```
-## Instance States
-Instances can be in one of several states:
-- **Stopped**: Instance is not running
-- **Starting**: Instance is initializing and loading the model
-- **Running**: Instance is active and ready to serve requests
-- **Stopping**: Instance is shutting down gracefully
-- **Error**: Instance encountered an error
-## Configuration Management
-### Updating Instance Configuration
+## Edit Instance
+### Via Web UI
+1. Click the **"Edit"** button on an instance card
+2. Modify settings in the configuration dialog
+3. Changes require instance restart to take effect
+4. Click **"Update & Restart"** to apply changes
+### Via API
 Modify instance settings:
 ```bash
@@ -84,82 +132,55 @@ curl -X PUT http://localhost:8080/api/instances/{name} \
 !!! note
     Configuration changes require restarting the instance to take effect.
-### Viewing Configuration
+## View Logs
+### Via Web UI
+1. Click the **"Logs"** button on any instance card
+2. Real-time log viewer opens
+### Via API
+Check instance status in real-time:
 ```bash
-# Get current configuration
-curl http://localhost:8080/api/instances/{name}/config
+# Get instance details
+curl http://localhost:8080/api/instances/{name}/logs
 ```
-## Resource Management
-### Memory Usage
-Monitor memory consumption:
+## Delete Instance
+### Via Web UI
+1. Click the **"Delete"** button on an instance card
+2. Only stopped instances can be deleted
+3. Confirm deletion in the dialog
+### Via API
+```bash
+curl -X DELETE http://localhost:8080/api/instances/{name}
+```
+## Instance Proxy
+Llamactl proxies all requests to the underlying llama-server instances.
 ```bash
-# Get resource usage
-curl http://localhost:8080/api/instances/{name}/stats
+# Get instance details
+curl http://localhost:8080/api/instances/{name}/proxy/
 ```
-### CPU and GPU Usage
-Track performance metrics:
-- CPU thread utilization
-- GPU memory usage (if applicable)
-- Request processing times
-## Troubleshooting Common Issues
-### Instance Won't Start
-1. **Check model path**: Ensure the model file exists and is readable
-2. **Port conflicts**: Verify the port isn't already in use
-3. **Resource limits**: Check available memory and CPU
-4. **Permissions**: Ensure proper file system permissions
-### Performance Issues
-1. **Adjust thread count**: Match to your CPU cores
-2. **Optimize context size**: Balance memory usage and capability
-3. **GPU offloading**: Use `gpu_layers` for GPU acceleration
-4. **Batch size tuning**: Optimize for your workload
-### Memory Problems
-1. **Reduce context size**: Lower memory requirements
-2. **Disable memory mapping**: Use `no_mmap` option
-3. **Enable memory locking**: Use `memory_lock` for performance
-4. **Monitor system resources**: Check available RAM
-## Best Practices
-### Production Deployments
-1. **Resource allocation**: Plan memory and CPU requirements
-2. **Health monitoring**: Set up regular health checks
-3. **Graceful shutdowns**: Use proper stop procedures
-4. **Backup configurations**: Save instance configurations
-5. **Log management**: Configure appropriate logging levels
-### Development Environments
-1. **Resource sharing**: Use smaller models for development
-2. **Quick iterations**: Optimize for fast startup times
-3. **Debug logging**: Enable detailed logging for troubleshooting
-## Batch Operations
-### Managing Multiple Instances
+Check llama-server [docs](https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md) for more information.
+### Instance Health
+#### Via Web UI
+1. The health status badge is displayed on each instance card
+#### Via API
+Check the health status of your instances:
 ```bash
-# Start all instances
-curl -X POST http://localhost:8080/api/instances/start-all
-# Stop all instances
-curl -X POST http://localhost:8080/api/instances/stop-all
-# Get status of all instances
-curl http://localhost:8080/api/instances
+curl http://localhost:8080/api/instances/{name}/proxy/health
 ```
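The rewritten page documents a complete lifecycle API: create (`POST /api/instances/{name}`), start, logs, stop, and delete, plus the `/proxy/` and `/proxy/health` passthroughs; the second hunk's context line also shows `curl -X PUT http://localhost:8080/api/instances/{name}` for updates, which per the note above take effect only after a restart. As a minimal sketch of how these endpoints chain together, assuming that auth (if enabled) is an `Authorization: Bearer` header and that the proxied health endpoint returns HTTP 200 once the model is loaded (neither is stated in this diff), a smoke test might look like:

```bash
#!/usr/bin/env bash
# Smoke-test sketch for the instance lifecycle documented above.
# Assumptions: llamactl on localhost:8080; management auth (if enabled)
# accepts "Authorization: Bearer <key>"; the proxied llama-server
# /health endpoint returns HTTP 200 once the model is loaded.
set -e

BASE=http://localhost:8080/api/instances
NAME=smoke-test
AUTH=()   # e.g. AUTH=(-H "Authorization: Bearer $LLAMACTL_API_KEY")

# Create the instance (same payload shape as the create example above).
curl -fsS -X POST "$BASE/$NAME" "${AUTH[@]}" \
  -H "Content-Type: application/json" \
  -d '{
    "backend_type": "llama_cpp",
    "backend_options": { "model": "/path/to/model.gguf", "ctx_size": 4096 }
  }'

# Start it and poll the proxied health endpoint until the model is ready.
curl -fsS -X POST "$BASE/$NAME/start" "${AUTH[@]}"
until [ "$(curl -s -o /dev/null -w '%{http_code}' "${AUTH[@]}" "$BASE/$NAME/proxy/health")" = "200" ]; do
  echo "waiting for $NAME to become ready..."
  sleep 2
done

# Tail the logs, then shut down and clean up (only stopped instances
# can be deleted, per the Delete Instance section).
curl -fsS "$BASE/$NAME/logs" "${AUTH[@]}" | tail -n 20
curl -fsS -X POST "$BASE/$NAME/stop" "${AUTH[@]}"
curl -fsS -X DELETE "$BASE/$NAME" "${AUTH[@]}"
```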

View File

@@ -1,210 +0,0 @@
# Web UI Guide
The Llamactl Web UI provides an intuitive interface for managing your Llama.cpp instances.
## Overview
The web interface is accessible at `http://localhost:8080` (or your configured host/port) and provides:
- Instance management dashboard
- Real-time status monitoring
- Configuration management
- Log viewing
- System information
## Dashboard
### Instance Cards
Each instance is displayed as a card showing:
- **Instance name** and status indicator
- **Model information** (name, size)
- **Current state** (stopped, starting, running, error)
- **Resource usage** (memory, CPU)
- **Action buttons** (start, stop, configure, logs)
### Status Indicators
- 🟢 **Green**: Instance is running and healthy
- 🟡 **Yellow**: Instance is starting or stopping
- 🔴 **Red**: Instance has encountered an error
- ⚪ **Gray**: Instance is stopped
## Creating Instances
### Add Instance Dialog
1. Click the **"Add Instance"** button
2. Fill in the required fields:
- **Name**: Unique identifier for your instance
- **Model Path**: Full path to your GGUF model file
- **Port**: Port number for the instance
3. Configure optional settings:
- **Threads**: Number of CPU threads
- **Context Size**: Context window size
- **GPU Layers**: Layers to offload to GPU
- **Additional Options**: Advanced Llama.cpp parameters
4. Click **"Create"** to save the instance
### Model Path Helper
Use the file browser to select model files:
- Navigate to your models directory
- Select the `.gguf` file
- Path is automatically filled in the form
## Managing Instances
### Starting Instances
1. Click the **"Start"** button on an instance card
2. Watch the status change to "Starting"
3. Monitor progress in the logs
4. Instance becomes "Running" when ready
### Stopping Instances
1. Click the **"Stop"** button
2. Instance gracefully shuts down
3. Status changes to "Stopped"
### Viewing Logs
1. Click the **"Logs"** button on any instance
2. Real-time log viewer opens
3. Filter by log level (Debug, Info, Warning, Error)
4. Search through log entries
5. Download logs for offline analysis
## Configuration Management
### Editing Instance Settings
1. Click the **"Configure"** button
2. Modify settings in the configuration dialog
3. Changes require instance restart to take effect
4. Click **"Save"** to apply changes
### Advanced Options
Access advanced Llama.cpp options:
```yaml
# Example advanced configuration
options:
rope_freq_base: 10000
rope_freq_scale: 1.0
yarn_ext_factor: -1.0
yarn_attn_factor: 1.0
yarn_beta_fast: 32.0
yarn_beta_slow: 1.0
```
## System Information
### Health Dashboard
Monitor overall system health:
- **System Resources**: CPU, memory, disk usage
- **Instance Summary**: Running/stopped instance counts
- **Performance Metrics**: Request rates, response times
### Resource Usage
Track resource consumption:
- Per-instance memory usage
- CPU utilization
- GPU memory (if applicable)
- Network I/O
## User Interface Features
### Theme Support
Switch between light and dark themes:
1. Click the theme toggle button
2. Setting is remembered across sessions
### Responsive Design
The UI adapts to different screen sizes:
- **Desktop**: Full-featured dashboard
- **Tablet**: Condensed layout
- **Mobile**: Stack-based navigation
### Keyboard Shortcuts
- `Ctrl+N`: Create new instance
- `Ctrl+R`: Refresh dashboard
- `Ctrl+L`: Open logs for selected instance
- `Esc`: Close dialogs
## Authentication
### Login
If authentication is enabled:
1. Navigate to the web UI
2. Enter your credentials
3. JWT token is stored for the session
4. Automatic logout on token expiry
### Session Management
- Sessions persist across browser restarts
- Logout clears authentication tokens
- Configurable session timeout
## Troubleshooting
### Common UI Issues
**Page won't load:**
- Check if Llamactl server is running
- Verify the correct URL and port
- Check browser console for errors
**Instance won't start from UI:**
- Verify model path is correct
- Check for port conflicts
- Review instance logs for errors
**Real-time updates not working:**
- Check WebSocket connection
- Verify firewall settings
- Try refreshing the page
### Browser Compatibility
Supported browsers:
- Chrome/Chromium 90+
- Firefox 88+
- Safari 14+
- Edge 90+
## Mobile Access
### Responsive Features
On mobile devices:
- Touch-friendly interface
- Swipe gestures for navigation
- Optimized button sizes
- Condensed information display
### Limitations
Some features may be limited on mobile:
- Log viewing (use horizontal scrolling)
- Complex configuration forms
- File browser functionality

View File

@@ -55,7 +55,6 @@ nav:
     - Configuration: getting-started/configuration.md
   - User Guide:
     - Managing Instances: user-guide/managing-instances.md
-    - Web UI: user-guide/web-ui.md
     - API Reference: user-guide/api-reference.md
     - Troubleshooting: user-guide/troubleshooting.md