diff --git a/docs/getting-started/quick-start.md b/docs/getting-started/quick-start.md
index 6ea5720..4de1065 100644
--- a/docs/getting-started/quick-start.md
+++ b/docs/getting-started/quick-start.md
@@ -138,6 +138,6 @@ curl http://localhost:8080/v1/models
 ## Next Steps
 
-- Learn more about the [Web UI](../user-guide/web-ui.md)
+- Learn how to manage instances in the [Managing Instances](../user-guide/managing-instances.md) guide
 - Explore the [API Reference](../user-guide/api-reference.md)
 - Configure advanced settings in the [Configuration](configuration.md) guide
diff --git a/docs/index.md b/docs/index.md
index 0637fdc..8dc6b1c 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -37,7 +37,6 @@ Llamactl is designed to simplify the deployment and management of llama-server i
 - [Installation Guide](getting-started/installation.md) - Get Llamactl up and running
 - [Configuration Guide](getting-started/configuration.md) - Detailed configuration options
 - [Quick Start](getting-started/quick-start.md) - Your first steps with Llamactl
-- [Web UI Guide](user-guide/web-ui.md) - Learn to use the web interface
 - [Managing Instances](user-guide/managing-instances.md) - Instance lifecycle management
 - [API Reference](user-guide/api-reference.md) - Complete API documentation
diff --git a/docs/user-guide/managing-instances.md b/docs/user-guide/managing-instances.md
index 14bbd71..9d9e4dc 100644
--- a/docs/user-guide/managing-instances.md
+++ b/docs/user-guide/managing-instances.md
@@ -1,73 +1,121 @@
 # Managing Instances
 
-Learn how to effectively manage your Llama.cpp instances with Llamactl.
+Learn how to effectively manage your Llama.cpp instances with Llamactl through both the Web UI and the API.
 
-## Instance Lifecycle
+## Overview
 
-### Creating Instances
+Llamactl provides two ways to manage instances:
 
-Instances can be created through the Web UI or API:
+- **Web UI**: An intuitive dashboard accessible at `http://localhost:8080`
+- **REST API**: Programmatic access for automation and integration
 
-#### Via Web UI
-1. Click "Add Instance" button
-2. Fill in the configuration form
-3. Click "Create"
+### Authentication
+
+If authentication is enabled:
+
+1. Navigate to the web UI
+2. Enter your credentials
+3. The bearer token is stored for the session
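+
+API requests need the same token. A minimal sketch (the key value is a placeholder and the instance list endpoint is shown for illustration; see the [API Reference](api-reference.md) for details):
+
+```bash
+# Replace YOUR_API_KEY with your configured management key
+curl -H "Authorization: Bearer YOUR_API_KEY" \
+  http://localhost:8080/api/instances
+```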
+
+### Theme Support
+
+- Switch between light and dark themes
+- The setting is remembered across sessions
+
+## Instance Cards
+
+Each instance is displayed as a card showing:
+
+- **Instance name**
+- **Health status badge** (unknown, ready, error, failed)
+- **Action buttons** (start, stop, edit, logs, delete)
+
+## Create Instance
+
+### Via Web UI
+
+1. Click the **"Add Instance"** button on the dashboard
+2. Enter a unique **Name** for your instance (the only required field)
+3. Configure the model source (choose one):
+   - **Model Path**: Full path to your downloaded GGUF model file
+   - **HuggingFace Repo**: Repository name (e.g., `microsoft/Phi-3-mini-4k-instruct-gguf`)
+   - **HuggingFace File**: Specific file within the repo (optional; uses the default if not specified)
+4. Configure optional instance management settings:
+   - **Auto Restart**: Automatically restart the instance on failure
+   - **Max Restarts**: Maximum number of restart attempts
+   - **Restart Delay**: Delay in seconds between restart attempts
+   - **On Demand Start**: Start the instance when a request arrives at the OpenAI-compatible endpoint
+   - **Idle Timeout**: Minutes before stopping an idle instance (set to 0 to disable)
+5. Configure optional llama-server backend options:
+   - **Threads**: Number of CPU threads to use
+   - **Context Size**: Context window size (ctx_size)
+   - **GPU Layers**: Number of layers to offload to GPU
+   - **Port**: Network port (auto-assigned by llamactl if not specified)
+   - **Additional Parameters**: Any other llama-server command line options (see the [llama-server documentation](https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md))
+6. Click **"Create"** to save the instance
+
+### Via API
 
-#### Via API
 ```bash
-curl -X POST http://localhost:8080/api/instances \
+# Create instance with local model file
+curl -X POST http://localhost:8080/api/instances/my-instance \
   -H "Content-Type: application/json" \
   -d '{
-    "name": "my-instance",
-    "model_path": "/path/to/model.gguf",
-    "port": 8081
+    "backend_type": "llama_cpp",
+    "backend_options": {
+      "model": "/path/to/model.gguf",
+      "threads": 8,
+      "ctx_size": 4096
+    }
+  }'
+
+# Create instance with HuggingFace model
+curl -X POST http://localhost:8080/api/instances/phi3-mini \
  -H "Content-Type: application/json" \
  -d '{
+    "backend_type": "llama_cpp",
+    "backend_options": {
+      "hf_repo": "microsoft/Phi-3-mini-4k-instruct-gguf",
+      "hf_file": "Phi-3-mini-4k-instruct-q4.gguf",
+      "gpu_layers": 32
+    },
+    "auto_restart": true,
+    "max_restarts": 3
   }'
 ```
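+
+The instance management settings from the Web UI can be combined with backend options in the same request. A sketch, assuming the on-demand and idle-timeout fields follow the Web UI labels (check the [API Reference](api-reference.md) for the exact field names):
+
+```bash
+# Create an instance that starts on demand and stops when idle
+# ("on_demand_start" and "idle_timeout" are assumed field names)
+curl -X POST http://localhost:8080/api/instances/on-demand-model \
+  -H "Content-Type: application/json" \
+  -d '{
+    "backend_type": "llama_cpp",
+    "backend_options": {
+      "model": "/path/to/model.gguf"
+    },
+    "on_demand_start": true,
+    "idle_timeout": 30
+  }'
+```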
 
-### Starting and Stopping
+## Start Instance
 
-#### Start an Instance
+### Via Web UI
+
+1. Click the **"Start"** button on an instance card
+2. The health badge shows "Unknown" while the instance starts and the model loads
+3. Monitor progress in the logs
+4. The badge changes to "Ready" once the instance can serve requests
+
+### Via API
 ```bash
-# Via API
 curl -X POST http://localhost:8080/api/instances/{name}/start
-
-# The instance will begin loading the model
 ```
 
-#### Stop an Instance
+## Stop Instance
+
+### Via Web UI
+
+1. Click the **"Stop"** button on an instance card
+2. The instance shuts down gracefully
+
+### Via API
 ```bash
-# Via API
 curl -X POST http://localhost:8080/api/instances/{name}/stop
-
-# Graceful shutdown with configurable timeout
 ```
 
-### Monitoring Status
+## Edit Instance
 
-Check instance status in real-time:
-
-```bash
-# Get instance details
-curl http://localhost:8080/api/instances/{name}
-
-# Get health status
-curl http://localhost:8080/api/instances/{name}/health
-```
-
-## Instance States
-
-Instances can be in one of several states:
-
-- **Stopped**: Instance is not running
-- **Starting**: Instance is initializing and loading the model
-- **Running**: Instance is active and ready to serve requests
-- **Stopping**: Instance is shutting down gracefully
-- **Error**: Instance encountered an error
-
-## Configuration Management
-
-### Updating Instance Configuration
+### Via Web UI
+
+1. Click the **"Edit"** button on an instance card
+2. Modify settings in the configuration dialog
+3. Changes require an instance restart to take effect
+4. Click **"Update & Restart"** to apply the changes
 
+### Via API
 Modify instance settings:
 
 ```bash
@@ -84,82 +132,55 @@ curl -X PUT http://localhost:8080/api/instances/{name} \
 !!! note
     Configuration changes require restarting the instance to take effect.
 
-### Viewing Configuration
+
+## View Logs
+
+### Via Web UI
+
+1. Click the **"Logs"** button on any instance card
+2. A real-time log viewer opens
+
+### Via API
+
+Retrieve instance logs:
 
 ```bash
-# Get current configuration
-curl http://localhost:8080/api/instances/{name}/config
+# Get instance logs
+curl http://localhost:8080/api/instances/{name}/logs
 ```
 
-## Resource Management
+## Delete Instance
 
-### Memory Usage
+### Via Web UI
+
+1. Stop the instance first (only stopped instances can be deleted)
+2. Click the **"Delete"** button on an instance card
+3. Confirm the deletion in the dialog
 
-Monitor memory consumption:
+### Via API
+```bash
+curl -X DELETE http://localhost:8080/api/instances/{name}
+```
+
+## Instance Proxy
+
+Llamactl proxies all requests to the underlying llama-server instances.
 
 ```bash
-# Get resource usage
-curl http://localhost:8080/api/instances/{name}/stats
+# Access the proxied llama-server API
+curl http://localhost:8080/api/instances/{name}/proxy/
 ```
 
-### CPU and GPU Usage
+See the llama-server [docs](https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md) for more information.
 
-Track performance metrics:
+### Instance Health
 
-- CPU thread utilization
-- GPU memory usage (if applicable)
-- Request processing times
+#### Via Web UI
 
-## Troubleshooting Common Issues
+The health status badge on each instance card shows the current health state.
 
-### Instance Won't Start
+#### Via API
 
-1. **Check model path**: Ensure the model file exists and is readable
-2. **Port conflicts**: Verify the port isn't already in use
-3. **Resource limits**: Check available memory and CPU
-4. **Permissions**: Ensure proper file system permissions
-
-### Performance Issues
-
-1. **Adjust thread count**: Match to your CPU cores
-2. **Optimize context size**: Balance memory usage and capability
-3. **GPU offloading**: Use `gpu_layers` for GPU acceleration
-4. **Batch size tuning**: Optimize for your workload
-
-### Memory Problems
-
-1. **Reduce context size**: Lower memory requirements
-2. **Disable memory mapping**: Use `no_mmap` option
-3. **Enable memory locking**: Use `memory_lock` for performance
-4. **Monitor system resources**: Check available RAM
-
-## Best Practices
-
-### Production Deployments
-
-1. **Resource allocation**: Plan memory and CPU requirements
-2. **Health monitoring**: Set up regular health checks
-3. **Graceful shutdowns**: Use proper stop procedures
-4. **Backup configurations**: Save instance configurations
-5. **Log management**: Configure appropriate logging levels
-
-### Development Environments
-
-1. **Resource sharing**: Use smaller models for development
-2. **Quick iterations**: Optimize for fast startup times
-3. **Debug logging**: Enable detailed logging for troubleshooting
-
-## Batch Operations
-
-### Managing Multiple Instances
+Check the health status of your instances:
 
 ```bash
-# Start all instances
-curl -X POST http://localhost:8080/api/instances/start-all
-
-# Stop all instances
-curl -X POST http://localhost:8080/api/instances/stop-all
-
-# Get status of all instances
-curl http://localhost:8080/api/instances
+curl http://localhost:8080/api/instances/{name}/proxy/health
 ```
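+
+Running instances can also be reached through Llamactl's OpenAI-compatible endpoint, the same endpoint referenced by the **On Demand Start** setting. A sketch, assuming the instance name is used as the model identifier and the standard chat completions route is exposed (see the [API Reference](api-reference.md)):
+
+```bash
+# Send a chat request to a running (or on-demand) instance
+# ("my-instance" is a placeholder for your instance name)
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "my-instance",
+    "messages": [{"role": "user", "content": "Hello!"}]
+  }'
+```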
diff --git a/docs/user-guide/web-ui.md b/docs/user-guide/web-ui.md
deleted file mode 100644
index 6a3c4c1..0000000
--- a/docs/user-guide/web-ui.md
+++ /dev/null
@@ -1,210 +0,0 @@
-# Web UI Guide
-
-The Llamactl Web UI provides an intuitive interface for managing your Llama.cpp instances.
-
-## Overview
-
-The web interface is accessible at `http://localhost:8080` (or your configured host/port) and provides:
-
-- Instance management dashboard
-- Real-time status monitoring
-- Configuration management
-- Log viewing
-- System information
-
-## Dashboard
-
-### Instance Cards
-
-Each instance is displayed as a card showing:
-
-- **Instance name** and status indicator
-- **Model information** (name, size)
-- **Current state** (stopped, starting, running, error)
-- **Resource usage** (memory, CPU)
-- **Action buttons** (start, stop, configure, logs)
-
-### Status Indicators
-
-- 🟢 **Green**: Instance is running and healthy
-- 🟡 **Yellow**: Instance is starting or stopping
-- 🔴 **Red**: Instance has encountered an error
-- ⚪ **Gray**: Instance is stopped
-
-## Creating Instances
-
-### Add Instance Dialog
-
-1. Click the **"Add Instance"** button
-2. Fill in the required fields:
-   - **Name**: Unique identifier for your instance
-   - **Model Path**: Full path to your GGUF model file
-   - **Port**: Port number for the instance
-
-3. Configure optional settings:
-   - **Threads**: Number of CPU threads
-   - **Context Size**: Context window size
-   - **GPU Layers**: Layers to offload to GPU
-   - **Additional Options**: Advanced Llama.cpp parameters
-
-4. Click **"Create"** to save the instance
-
-### Model Path Helper
-
-Use the file browser to select model files:
-
-- Navigate to your models directory
-- Select the `.gguf` file
-- Path is automatically filled in the form
-
-## Managing Instances
-
-### Starting Instances
-
-1. Click the **"Start"** button on an instance card
-2. Watch the status change to "Starting"
-3. Monitor progress in the logs
-4. Instance becomes "Running" when ready
-
-### Stopping Instances
-
-1. Click the **"Stop"** button
-2. Instance gracefully shuts down
-3. Status changes to "Stopped"
-
-### Viewing Logs
-
-1. Click the **"Logs"** button on any instance
-2. Real-time log viewer opens
-3. Filter by log level (Debug, Info, Warning, Error)
-4. Search through log entries
-5. Download logs for offline analysis
-
-## Configuration Management
-
-### Editing Instance Settings
-
-1. Click the **"Configure"** button
-2. Modify settings in the configuration dialog
-3. Changes require instance restart to take effect
-4. Click **"Save"** to apply changes
-
-### Advanced Options
-
-Access advanced Llama.cpp options:
-
-```yaml
-# Example advanced configuration
-options:
-  rope_freq_base: 10000
-  rope_freq_scale: 1.0
-  yarn_ext_factor: -1.0
-  yarn_attn_factor: 1.0
-  yarn_beta_fast: 32.0
-  yarn_beta_slow: 1.0
-```
-
-## System Information
-
-### Health Dashboard
-
-Monitor overall system health:
-
-- **System Resources**: CPU, memory, disk usage
-- **Instance Summary**: Running/stopped instance counts
-- **Performance Metrics**: Request rates, response times
-
-### Resource Usage
-
-Track resource consumption:
-
-- Per-instance memory usage
-- CPU utilization
-- GPU memory (if applicable)
-- Network I/O
-
-## User Interface Features
-
-### Theme Support
-
-Switch between light and dark themes:
-
-1. Click the theme toggle button
-2. Setting is remembered across sessions
-
-### Responsive Design
-
-The UI adapts to different screen sizes:
-
-- **Desktop**: Full-featured dashboard
-- **Tablet**: Condensed layout
-- **Mobile**: Stack-based navigation
-
-### Keyboard Shortcuts
-
-- `Ctrl+N`: Create new instance
-- `Ctrl+R`: Refresh dashboard
-- `Ctrl+L`: Open logs for selected instance
-- `Esc`: Close dialogs
-
-## Authentication
-
-### Login
-
-If authentication is enabled:
-
-1. Navigate to the web UI
-2. Enter your credentials
-3. JWT token is stored for the session
-4. Automatic logout on token expiry
-
-### Session Management
-
-- Sessions persist across browser restarts
-- Logout clears authentication tokens
-- Configurable session timeout
-
-## Troubleshooting
-
-### Common UI Issues
-
-**Page won't load:**
-- Check if Llamactl server is running
-- Verify the correct URL and port
-- Check browser console for errors
-
-**Instance won't start from UI:**
-- Verify model path is correct
-- Check for port conflicts
-- Review instance logs for errors
-
-**Real-time updates not working:**
-- Check WebSocket connection
-- Verify firewall settings
-- Try refreshing the page
-
-### Browser Compatibility
-
-Supported browsers:
-- Chrome/Chromium 90+
-- Firefox 88+
-- Safari 14+
-- Edge 90+
-
-## Mobile Access
-
-### Responsive Features
-
-On mobile devices:
-
-- Touch-friendly interface
-- Swipe gestures for navigation
-- Optimized button sizes
-- Condensed information display
-
-### Limitations
-
-Some features may be limited on mobile:
-- Log viewing (use horizontal scrolling)
-- Complex configuration forms
-- File browser functionality
diff --git a/mkdocs.yml b/mkdocs.yml
index f9fbe3d..ed4be3a 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -55,7 +55,6 @@ nav:
   - Configuration: getting-started/configuration.md
   - User Guide:
     - Managing Instances: user-guide/managing-instances.md
-    - Web UI: user-guide/web-ui.md
     - API Reference: user-guide/api-reference.md
     - Troubleshooting: user-guide/troubleshooting.md