Update documentation: remove Web UI guide and adjust navigation links

2025-09-03 22:47:15 +02:00
parent 969b4b14e1
commit 3013a343f1
5 changed files with 129 additions and 320 deletions

View File

@@ -138,6 +138,6 @@ curl http://localhost:8080/v1/models
 ## Next Steps
-- Learn more about the [Web UI](../user-guide/web-ui.md)
+- Manage instances [Managing Instances](../user-guide/managing-instances.md)
 - Explore the [API Reference](../user-guide/api-reference.md)
 - Configure advanced settings in the [Configuration](configuration.md) guide
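The hunk context above (`curl http://localhost:8080/v1/models`) is the quick start's check of llamactl's OpenAI-compatible API. As a minimal sketch of the next step that page points to, assuming requests are routed to an instance by passing its name in the `model` field (conventional for OpenAI-compatible servers, but not stated in this diff), a first completion request might look like:

```bash
# List models exposed through the OpenAI-compatible endpoint
curl http://localhost:8080/v1/models

# Assumption: the instance name selects the backend instance
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-instance",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```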

View File

@@ -37,7 +37,6 @@ Llamactl is designed to simplify the deployment and management of llama-server i
 - [Installation Guide](getting-started/installation.md) - Get Llamactl up and running
 - [Configuration Guide](getting-started/configuration.md) - Detailed configuration options
 - [Quick Start](getting-started/quick-start.md) - Your first steps with Llamactl
-- [Web UI Guide](user-guide/web-ui.md) - Learn to use the web interface
 - [Managing Instances](user-guide/managing-instances.md) - Instance lifecycle management
 - [API Reference](user-guide/api-reference.md) - Complete API documentation

View File

@@ -1,73 +1,121 @@
 # Managing Instances
-Learn how to effectively manage your Llama.cpp instances with Llamactl.
-## Instance Lifecycle
-### Creating Instances
-Instances can be created through the Web UI or API:
-#### Via Web UI
-1. Click "Add Instance" button
-2. Fill in the configuration form
-3. Click "Create"
-#### Via API
+Learn how to effectively manage your Llama.cpp instances with Llamactl through both the Web UI and API.
+## Overview
+Llamactl provides two ways to manage instances:
+- **Web UI**: Accessible at `http://localhost:8080` with an intuitive dashboard
+- **REST API**: Programmatic access for automation and integration
+### Authentication
+If authentication is enabled:
+1. Navigate to the web UI
+2. Enter your credentials
+3. Bearer token is stored for the session
+### Theme Support
+- Switch between light and dark themes
+- Setting is remembered across sessions
+## Instance Cards
+Each instance is displayed as a card showing:
+- **Instance name**
+- **Health status badge** (unknown, ready, error, failed)
+- **Action buttons** (start, stop, edit, logs, delete)
+## Create Instance
+### Via Web UI
+1. Click the **"Add Instance"** button on the dashboard
+2. Enter a unique **Name** for your instance (only required field)
+3. Configure model source (choose one):
+   - **Model Path**: Full path to your downloaded GGUF model file
+   - **HuggingFace Repo**: Repository name (e.g., `microsoft/Phi-3-mini-4k-instruct-gguf`)
+   - **HuggingFace File**: Specific file within the repo (optional, uses default if not specified)
+4. Configure optional instance management settings:
+   - **Auto Restart**: Automatically restart instance on failure
+   - **Max Restarts**: Maximum number of restart attempts
+   - **Restart Delay**: Delay in seconds between restart attempts
+   - **On Demand Start**: Start instance when receiving a request to the OpenAI compatible endpoint
+   - **Idle Timeout**: Minutes before stopping idle instance (set to 0 to disable)
+5. Configure optional llama-server backend options:
+   - **Threads**: Number of CPU threads to use
+   - **Context Size**: Context window size (ctx_size)
+   - **GPU Layers**: Number of layers to offload to GPU
+   - **Port**: Network port (auto-assigned by llamactl if not specified)
+   - **Additional Parameters**: Any other llama-server command line options (see [llama-server documentation](https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md))
+6. Click **"Create"** to save the instance
+### Via API
 ```bash
-curl -X POST http://localhost:8080/api/instances \
+# Create instance with local model file
+curl -X POST http://localhost:8080/api/instances/my-instance \
   -H "Content-Type: application/json" \
   -d '{
-    "name": "my-instance",
-    "model_path": "/path/to/model.gguf",
-    "port": 8081
+    "backend_type": "llama_cpp",
+    "backend_options": {
+      "model": "/path/to/model.gguf",
+      "threads": 8,
+      "ctx_size": 4096
+    }
+  }'
+
+# Create instance with HuggingFace model
+curl -X POST http://localhost:8080/api/instances/phi3-mini \
+  -H "Content-Type: application/json" \
+  -d '{
+    "backend_type": "llama_cpp",
+    "backend_options": {
+      "hf_repo": "microsoft/Phi-3-mini-4k-instruct-gguf",
+      "hf_file": "Phi-3-mini-4k-instruct-q4.gguf",
+      "gpu_layers": 32
+    },
+    "auto_restart": true,
+    "max_restarts": 3
   }'
 ```
-### Starting and Stopping
-#### Start an Instance
+## Start Instance
+### Via Web UI
+1. Click the **"Start"** button on an instance card
+2. Watch the status change to "Unknown"
+3. Monitor progress in the logs
+4. Instance status changes to "Ready" when ready
+### Via API
 ```bash
-# Via API
 curl -X POST http://localhost:8080/api/instances/{name}/start
-# The instance will begin loading the model
 ```
-#### Stop an Instance
+## Stop Instance
+### Via Web UI
+1. Click the **"Stop"** button on an instance card
+2. Instance gracefully shuts down
+### Via API
 ```bash
-# Via API
 curl -X POST http://localhost:8080/api/instances/{name}/stop
-# Graceful shutdown with configurable timeout
 ```
-### Monitoring Status
-Check instance status in real-time:
-```bash
-# Get instance details
-curl http://localhost:8080/api/instances/{name}
-# Get health status
-curl http://localhost:8080/api/instances/{name}/health
-```
-## Instance States
-Instances can be in one of several states:
-- **Stopped**: Instance is not running
-- **Starting**: Instance is initializing and loading the model
-- **Running**: Instance is active and ready to serve requests
-- **Stopping**: Instance is shutting down gracefully
-- **Error**: Instance encountered an error
-## Configuration Management
-### Updating Instance Configuration
+## Edit Instance
+### Via Web UI
+1. Click the **"Edit"** button on an instance card
+2. Modify settings in the configuration dialog
+3. Changes require instance restart to take effect
+4. Click **"Update & Restart"** to apply changes
+### Via API
 Modify instance settings:
 ```bash
@@ -84,82 +132,55 @@ curl -X PUT http://localhost:8080/api/instances/{name} \
 !!! note
     Configuration changes require restarting the instance to take effect.
-### Viewing Configuration
+## View Logs
+### Via Web UI
+1. Click the **"Logs"** button on any instance card
+2. Real-time log viewer opens
+### Via API
+Check instance status in real-time:
 ```bash
-# Get current configuration
-curl http://localhost:8080/api/instances/{name}/config
+# Get instance details
+curl http://localhost:8080/api/instances/{name}/logs
 ```
-## Resource Management
-### Memory Usage
-Monitor memory consumption:
+## Delete Instance
+### Via Web UI
+1. Click the **"Delete"** button on an instance card
+2. Only stopped instances can be deleted
+3. Confirm deletion in the dialog
+### Via API
+```bash
+curl -X DELETE http://localhost:8080/api/instances/{name}
+```
+## Instance Proxy
+Llamactl proxies all requests to the underlying llama-server instances.
 ```bash
-# Get resource usage
-curl http://localhost:8080/api/instances/{name}/stats
+# Get instance details
+curl http://localhost:8080/api/instances/{name}/proxy/
 ```
-### CPU and GPU Usage
-Track performance metrics:
-- CPU thread utilization
-- GPU memory usage (if applicable)
-- Request processing times
-## Troubleshooting Common Issues
-### Instance Won't Start
-1. **Check model path**: Ensure the model file exists and is readable
-2. **Port conflicts**: Verify the port isn't already in use
-3. **Resource limits**: Check available memory and CPU
-4. **Permissions**: Ensure proper file system permissions
-### Performance Issues
-1. **Adjust thread count**: Match to your CPU cores
-2. **Optimize context size**: Balance memory usage and capability
-3. **GPU offloading**: Use `gpu_layers` for GPU acceleration
-4. **Batch size tuning**: Optimize for your workload
-### Memory Problems
-1. **Reduce context size**: Lower memory requirements
-2. **Disable memory mapping**: Use `no_mmap` option
-3. **Enable memory locking**: Use `memory_lock` for performance
-4. **Monitor system resources**: Check available RAM
-## Best Practices
-### Production Deployments
-1. **Resource allocation**: Plan memory and CPU requirements
-2. **Health monitoring**: Set up regular health checks
-3. **Graceful shutdowns**: Use proper stop procedures
-4. **Backup configurations**: Save instance configurations
-5. **Log management**: Configure appropriate logging levels
-### Development Environments
-1. **Resource sharing**: Use smaller models for development
-2. **Quick iterations**: Optimize for fast startup times
-3. **Debug logging**: Enable detailed logging for troubleshooting
-## Batch Operations
-### Managing Multiple Instances
+Check llama-server [docs](https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md) for more information.
+### Instance Health
+#### Via Web UI
+1. The health status badge is displayed on each instance card
+#### Via API
+Check the health status of your instances:
 ```bash
-# Start all instances
-curl -X POST http://localhost:8080/api/instances/start-all
-# Stop all instances
-curl -X POST http://localhost:8080/api/instances/stop-all
-# Get status of all instances
-curl http://localhost:8080/api/instances
+curl http://localhost:8080/api/instances/{name}/proxy/health
 ```
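The rewritten page documents a complete lifecycle API: create (`POST /api/instances/{name}`), start, logs, stop, and delete, plus the `/proxy/` and `/proxy/health` passthroughs; the second hunk's context line also shows `curl -X PUT http://localhost:8080/api/instances/{name}` for updates, which per the note above take effect only after a restart. As a minimal sketch of how these endpoints chain together, assuming that auth (if enabled) is an `Authorization: Bearer` header and that the proxied health endpoint returns HTTP 200 once the model is loaded (neither is stated in this diff), a smoke test might look like:

```bash
#!/usr/bin/env bash
# Smoke-test sketch for the instance lifecycle documented above.
# Assumptions: llamactl on localhost:8080; management auth (if enabled)
# accepts "Authorization: Bearer <key>"; the proxied llama-server
# /health endpoint returns HTTP 200 once the model is loaded.
set -e

BASE=http://localhost:8080/api/instances
NAME=smoke-test
AUTH=()   # e.g. AUTH=(-H "Authorization: Bearer $LLAMACTL_API_KEY")

# Create the instance (same payload shape as the create example above).
curl -fsS -X POST "$BASE/$NAME" "${AUTH[@]}" \
  -H "Content-Type: application/json" \
  -d '{
    "backend_type": "llama_cpp",
    "backend_options": { "model": "/path/to/model.gguf", "ctx_size": 4096 }
  }'

# Start it and poll the proxied health endpoint until the model is ready.
curl -fsS -X POST "$BASE/$NAME/start" "${AUTH[@]}"
until [ "$(curl -s -o /dev/null -w '%{http_code}' "${AUTH[@]}" "$BASE/$NAME/proxy/health")" = "200" ]; do
  echo "waiting for $NAME to become ready..."
  sleep 2
done

# Tail the logs, then shut down and clean up (only stopped instances
# can be deleted, per the Delete Instance section).
curl -fsS "$BASE/$NAME/logs" "${AUTH[@]}" | tail -n 20
curl -fsS -X POST "$BASE/$NAME/stop" "${AUTH[@]}"
curl -fsS -X DELETE "$BASE/$NAME" "${AUTH[@]}"
```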

View File

@@ -1,210 +0,0 @@
# Web UI Guide
The Llamactl Web UI provides an intuitive interface for managing your Llama.cpp instances.
## Overview
The web interface is accessible at `http://localhost:8080` (or your configured host/port) and provides:
- Instance management dashboard
- Real-time status monitoring
- Configuration management
- Log viewing
- System information
## Dashboard
### Instance Cards
Each instance is displayed as a card showing:
- **Instance name** and status indicator
- **Model information** (name, size)
- **Current state** (stopped, starting, running, error)
- **Resource usage** (memory, CPU)
- **Action buttons** (start, stop, configure, logs)
### Status Indicators
- 🟢 **Green**: Instance is running and healthy
- 🟡 **Yellow**: Instance is starting or stopping
- 🔴 **Red**: Instance has encountered an error
- ⚪ **Gray**: Instance is stopped
## Creating Instances
### Add Instance Dialog
1. Click the **"Add Instance"** button
2. Fill in the required fields:
- **Name**: Unique identifier for your instance
- **Model Path**: Full path to your GGUF model file
- **Port**: Port number for the instance
3. Configure optional settings:
- **Threads**: Number of CPU threads
- **Context Size**: Context window size
- **GPU Layers**: Layers to offload to GPU
- **Additional Options**: Advanced Llama.cpp parameters
4. Click **"Create"** to save the instance
### Model Path Helper
Use the file browser to select model files:
- Navigate to your models directory
- Select the `.gguf` file
- Path is automatically filled in the form
## Managing Instances
### Starting Instances
1. Click the **"Start"** button on an instance card
2. Watch the status change to "Starting"
3. Monitor progress in the logs
4. Instance becomes "Running" when ready
### Stopping Instances
1. Click the **"Stop"** button
2. Instance gracefully shuts down
3. Status changes to "Stopped"
### Viewing Logs
1. Click the **"Logs"** button on any instance
2. Real-time log viewer opens
3. Filter by log level (Debug, Info, Warning, Error)
4. Search through log entries
5. Download logs for offline analysis
## Configuration Management
### Editing Instance Settings
1. Click the **"Configure"** button
2. Modify settings in the configuration dialog
3. Changes require instance restart to take effect
4. Click **"Save"** to apply changes
### Advanced Options
Access advanced Llama.cpp options:
```yaml
# Example advanced configuration
options:
rope_freq_base: 10000
rope_freq_scale: 1.0
yarn_ext_factor: -1.0
yarn_attn_factor: 1.0
yarn_beta_fast: 32.0
yarn_beta_slow: 1.0
```
## System Information
### Health Dashboard
Monitor overall system health:
- **System Resources**: CPU, memory, disk usage
- **Instance Summary**: Running/stopped instance counts
- **Performance Metrics**: Request rates, response times
### Resource Usage
Track resource consumption:
- Per-instance memory usage
- CPU utilization
- GPU memory (if applicable)
- Network I/O
## User Interface Features
### Theme Support
Switch between light and dark themes:
1. Click the theme toggle button
2. Setting is remembered across sessions
### Responsive Design
The UI adapts to different screen sizes:
- **Desktop**: Full-featured dashboard
- **Tablet**: Condensed layout
- **Mobile**: Stack-based navigation
### Keyboard Shortcuts
- `Ctrl+N`: Create new instance
- `Ctrl+R`: Refresh dashboard
- `Ctrl+L`: Open logs for selected instance
- `Esc`: Close dialogs
## Authentication
### Login
If authentication is enabled:
1. Navigate to the web UI
2. Enter your credentials
3. JWT token is stored for the session
4. Automatic logout on token expiry
### Session Management
- Sessions persist across browser restarts
- Logout clears authentication tokens
- Configurable session timeout
## Troubleshooting
### Common UI Issues
**Page won't load:**
- Check if Llamactl server is running
- Verify the correct URL and port
- Check browser console for errors
**Instance won't start from UI:**
- Verify model path is correct
- Check for port conflicts
- Review instance logs for errors
**Real-time updates not working:**
- Check WebSocket connection
- Verify firewall settings
- Try refreshing the page
### Browser Compatibility
Supported browsers:
- Chrome/Chromium 90+
- Firefox 88+
- Safari 14+
- Edge 90+
## Mobile Access
### Responsive Features
On mobile devices:
- Touch-friendly interface
- Swipe gestures for navigation
- Optimized button sizes
- Condensed information display
### Limitations
Some features may be limited on mobile:
- Log viewing (use horizontal scrolling)
- Complex configuration forms
- File browser functionality

View File

@@ -55,7 +55,6 @@ nav:
     - Configuration: getting-started/configuration.md
   - User Guide:
     - Managing Instances: user-guide/managing-instances.md
-    - Web UI: user-guide/web-ui.md
     - API Reference: user-guide/api-reference.md
     - Troubleshooting: user-guide/troubleshooting.md