Create initial documentation structure

2025-08-31 14:27:00 +02:00
parent 7675271370
commit bd31c03f4a
16 changed files with 3514 additions and 0 deletions

docs/user-guide/api-reference.md

@@ -0,0 +1,470 @@
# API Reference
Complete reference for the LlamaCtl REST API.
## Base URL
All API endpoints are relative to the base URL:
```
http://localhost:8080/api
```
## Authentication
If authentication is enabled, include the JWT token in the Authorization header:
```bash
curl -H "Authorization: Bearer <your-jwt-token>" \
  http://localhost:8080/api/instances
```
## Instances
### List All Instances
Get a list of all instances.
```http
GET /api/instances
```
**Response:**
```json
{
"instances": [
{
"name": "llama2-7b",
"status": "running",
"model_path": "/models/llama-2-7b.gguf",
"port": 8081,
"created_at": "2024-01-15T10:30:00Z",
"updated_at": "2024-01-15T12:45:00Z"
}
]
}
```
### Get Instance Details
Get detailed information about a specific instance.
```http
GET /api/instances/{name}
```
**Response:**
```json
{
"name": "llama2-7b",
"status": "running",
"model_path": "/models/llama-2-7b.gguf",
"port": 8081,
"pid": 12345,
"options": {
"threads": 4,
"context_size": 2048,
"gpu_layers": 0
},
"stats": {
"memory_usage": 4294967296,
"cpu_usage": 25.5,
"uptime": 3600
},
"created_at": "2024-01-15T10:30:00Z",
"updated_at": "2024-01-15T12:45:00Z"
}
```
### Create Instance
Create a new instance.
```http
POST /api/instances
```
**Request Body:**
```json
{
"name": "my-instance",
"model_path": "/path/to/model.gguf",
"port": 8081,
"options": {
"threads": 4,
"context_size": 2048,
"gpu_layers": 0
}
}
```
**Response:**
```json
{
"message": "Instance created successfully",
"instance": {
"name": "my-instance",
"status": "stopped",
"model_path": "/path/to/model.gguf",
"port": 8081,
"created_at": "2024-01-15T14:30:00Z"
}
}
```
### Update Instance
Update an existing instance configuration.
```http
PUT /api/instances/{name}
```
**Request Body:**
```json
{
"options": {
"threads": 8,
"context_size": 4096
}
}
```
### Delete Instance
Delete an instance (must be stopped first).
```http
DELETE /api/instances/{name}
```
**Response:**
```json
{
"message": "Instance deleted successfully"
}
```
## Instance Operations
### Start Instance
Start a stopped instance.
```http
POST /api/instances/{name}/start
```
**Response:**
```json
{
"message": "Instance start initiated",
"status": "starting"
}
```
### Stop Instance
Stop a running instance.
```http
POST /api/instances/{name}/stop
```
**Request Body (Optional):**
```json
{
"force": false,
"timeout": 30
}
```
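For example, to request a forced stop with a shorter timeout (the values here are illustrative):
```bash
curl -X POST http://localhost:8080/api/instances/llama2-7b/stop \
  -H "Content-Type: application/json" \
  -d '{"force": true, "timeout": 10}'
```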
**Response:**
```json
{
"message": "Instance stop initiated",
"status": "stopping"
}
```
### Restart Instance
Restart an instance (stop then start).
```http
POST /api/instances/{name}/restart
```
### Get Instance Health
Check instance health status.
```http
GET /api/instances/{name}/health
```
**Response:**
```json
{
"status": "healthy",
"checks": {
"process": "running",
"port": "open",
"response": "ok"
},
"last_check": "2024-01-15T14:30:00Z"
}
```
### Get Instance Logs
Retrieve instance logs.
```http
GET /api/instances/{name}/logs
```
**Query Parameters:**
- `lines`: Number of lines to return (default: 100)
- `follow`: Stream logs (boolean)
- `level`: Filter by log level (debug, info, warn, error)
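For example, to fetch the last 50 error-level entries (the instance name is illustrative):
```bash
curl "http://localhost:8080/api/instances/llama2-7b/logs?lines=50&level=error"
```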
**Response:**
```json
{
"logs": [
{
"timestamp": "2024-01-15T14:30:00Z",
"level": "info",
"message": "Model loaded successfully"
}
]
}
```
## Batch Operations
### Start All Instances
Start all stopped instances.
```http
POST /api/instances/start-all
```
### Stop All Instances
Stop all running instances.
```http
POST /api/instances/stop-all
```
## System Information
### Get System Status
Get overall system status and metrics.
```http
GET /api/system/status
```
**Response:**
```json
{
"version": "1.0.0",
"uptime": 86400,
"instances": {
"total": 5,
"running": 3,
"stopped": 2
},
"resources": {
"cpu_usage": 45.2,
"memory_usage": 8589934592,
"memory_total": 17179869184,
"disk_usage": 75.5
}
}
```
### Get System Information
Get detailed system information.
```http
GET /api/system/info
```
**Response:**
```json
{
"hostname": "server-01",
"os": "linux",
"arch": "amd64",
"cpu_count": 8,
"memory_total": 17179869184,
"version": "1.0.0",
"build_time": "2024-01-15T10:00:00Z"
}
```
## Configuration
### Get Configuration
Get current LlamaCtl configuration.
```http
GET /api/config
```
### Update Configuration
Update LlamaCtl configuration (requires restart).
```http
PUT /api/config
```
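A minimal sketch of both calls; the configuration schema isn't reproduced here, so the update payload is read from a placeholder file:
```bash
# Fetch the current configuration
curl http://localhost:8080/api/config

# Apply an updated configuration (config.json is a placeholder file)
curl -X PUT http://localhost:8080/api/config \
  -H "Content-Type: application/json" \
  -d @config.json
```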
## Authentication Endpoints
### Login
Authenticate and receive a JWT token.
```http
POST /api/auth/login
```
**Request Body:**
```json
{
"username": "admin",
"password": "password"
}
```
**Response:**
```json
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"expires_at": "2024-01-16T14:30:00Z"
}
```
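A typical login call, capturing the token for subsequent requests (assumes `jq` is installed):
```bash
TOKEN=$(curl -s -X POST http://localhost:8080/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "password"}' | jq -r '.token')
```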
### Refresh Token
Refresh an existing JWT token.
```http
POST /api/auth/refresh
```
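The request presumably carries the current token in the Authorization header, as with other authenticated endpoints:
```bash
curl -X POST http://localhost:8080/api/auth/refresh \
  -H "Authorization: Bearer $TOKEN"
```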
## Error Responses
All endpoints may return error responses in the following format:
```json
{
"error": "Error message",
"code": "ERROR_CODE",
"details": "Additional error details"
}
```
### Common HTTP Status Codes
- `200`: Success
- `201`: Created
- `400`: Bad Request
- `401`: Unauthorized
- `403`: Forbidden
- `404`: Not Found
- `409`: Conflict (e.g., instance already exists)
- `500`: Internal Server Error
## WebSocket API
### Real-time Updates
Connect to WebSocket for real-time updates:
```javascript
const ws = new WebSocket('ws://localhost:8080/api/ws');

ws.onmessage = function(event) {
  const data = JSON.parse(event.data);
  console.log('Update:', data);
};
```
**Message Types:**
- `instance_status_changed`: Instance status updates
- `instance_stats_updated`: Resource usage updates
- `system_alert`: System-level alerts
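The exact message schema isn't specified here, but a hypothetical `instance_status_changed` payload might look like:
```json
{
  "type": "instance_status_changed",
  "instance": "llama2-7b",
  "status": "running"
}
```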
## Rate Limiting
API requests are rate limited to:
- **100 requests per minute** for regular endpoints
- **10 requests per minute** for resource-intensive operations
Rate limit headers are included in responses:
- `X-RateLimit-Limit`: Request limit
- `X-RateLimit-Remaining`: Remaining requests
- `X-RateLimit-Reset`: Reset time (Unix timestamp)
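You can inspect these headers on any response with `curl -i`:
```bash
curl -si http://localhost:8080/api/instances | grep -i '^x-ratelimit'
```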
## SDKs and Libraries
### Go Client
```go
import "github.com/lordmathis/llamactl-go-client"
client := llamactl.NewClient("http://localhost:8080")
instances, err := client.ListInstances()
```
### Python Client
```python
from llamactl import Client

client = Client("http://localhost:8080")
instances = client.list_instances()
```
## Examples
### Complete Instance Lifecycle
```bash
# Create instance
curl -X POST http://localhost:8080/api/instances \
  -H "Content-Type: application/json" \
  -d '{
    "name": "example",
    "model_path": "/models/example.gguf",
    "port": 8081
  }'

# Start instance
curl -X POST http://localhost:8080/api/instances/example/start

# Check status
curl http://localhost:8080/api/instances/example

# Stop instance
curl -X POST http://localhost:8080/api/instances/example/stop

# Delete instance
curl -X DELETE http://localhost:8080/api/instances/example
```
## Next Steps
- Learn about [Managing Instances](managing-instances.md) in detail
- Explore [Advanced Configuration](../advanced/backends.md)
- Set up [Monitoring](../advanced/monitoring.md) for production use

docs/user-guide/managing-instances.md

@@ -0,0 +1,171 @@
# Managing Instances
Learn how to effectively manage your Llama.cpp instances with LlamaCtl.
## Instance Lifecycle
### Creating Instances
Instances can be created through the Web UI or API:
#### Via Web UI
1. Click "Add Instance" button
2. Fill in the configuration form
3. Click "Create"
#### Via API
```bash
curl -X POST http://localhost:8080/api/instances \
-H "Content-Type: application/json" \
-d '{
"name": "my-instance",
"model_path": "/path/to/model.gguf",
"port": 8081
}'
```
### Starting and Stopping
#### Start an Instance
```bash
# Via API
curl -X POST http://localhost:8080/api/instances/{name}/start
# The instance will begin loading the model
```
#### Stop an Instance
```bash
# Via API
curl -X POST http://localhost:8080/api/instances/{name}/stop
# Graceful shutdown with configurable timeout
```
### Monitoring Status
Check instance status in real time:
```bash
# Get instance details
curl http://localhost:8080/api/instances/{name}
# Get health status
curl http://localhost:8080/api/instances/{name}/health
```
## Instance States
Instances can be in one of several states:
- **Stopped**: Instance is not running
- **Starting**: Instance is initializing and loading the model
- **Running**: Instance is active and ready to serve requests
- **Stopping**: Instance is shutting down gracefully
- **Error**: Instance encountered an error
## Configuration Management
### Updating Instance Configuration
Modify instance settings:
```bash
curl -X PUT http://localhost:8080/api/instances/{name} \
-H "Content-Type: application/json" \
-d '{
"options": {
"threads": 8,
"context_size": 4096
}
}'
```
!!! note
    Configuration changes require restarting the instance to take effect.
### Viewing Configuration
```bash
# Get current configuration
curl http://localhost:8080/api/instances/{name}/config
```
## Resource Management
### Memory Usage
Monitor memory consumption:
```bash
# Get resource usage
curl http://localhost:8080/api/instances/{name}/stats
```
### CPU and GPU Usage
Track performance metrics:
- CPU thread utilization
- GPU memory usage (if applicable)
- Request processing times
## Troubleshooting Common Issues
### Instance Won't Start
1. **Check model path**: Ensure the model file exists and is readable
2. **Port conflicts**: Verify the port isn't already in use
3. **Resource limits**: Check available memory and CPU
4. **Permissions**: Ensure proper file system permissions
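The first two checks are easy to run from a shell (the path and port below are examples):
```bash
# Confirm the model file exists and is readable
ls -lh /models/llama-2-7b.gguf

# Check whether anything is already listening on the instance port
ss -ltn | grep 8081
```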
### Performance Issues
1. **Adjust thread count**: Match to your CPU cores
2. **Optimize context size**: Balance memory usage and capability
3. **GPU offloading**: Use `gpu_layers` for GPU acceleration
4. **Batch size tuning**: Optimize for your workload
### Memory Problems
1. **Reduce context size**: Lower memory requirements
2. **Disable memory mapping**: Use `no_mmap` option
3. **Enable memory locking**: Use `memory_lock` for performance
4. **Monitor system resources**: Check available RAM
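For example, the first two adjustments can be applied through the update endpoint, using the option names listed above (restart the instance for them to take effect):
```bash
curl -X PUT http://localhost:8080/api/instances/my-instance \
  -H "Content-Type: application/json" \
  -d '{"options": {"context_size": 1024, "no_mmap": true}}'
```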
## Best Practices
### Production Deployments
1. **Resource allocation**: Plan memory and CPU requirements
2. **Health monitoring**: Set up regular health checks (see the sketch after this list)
3. **Graceful shutdowns**: Use proper stop procedures
4. **Backup configurations**: Save instance configurations
5. **Log management**: Configure appropriate logging levels
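A minimal cron-friendly health check, assuming `jq` is installed (the instance name is a placeholder):
```bash
#!/usr/bin/env bash
# Exit non-zero if the instance is not reporting healthy
STATUS=$(curl -s http://localhost:8080/api/instances/my-instance/health | jq -r '.status')
if [ "$STATUS" != "healthy" ]; then
  echo "Instance unhealthy: $STATUS" >&2
  exit 1
fi
```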
### Development Environments
1. **Resource sharing**: Use smaller models for development
2. **Quick iterations**: Optimize for fast startup times
3. **Debug logging**: Enable detailed logging for troubleshooting
## Batch Operations
### Managing Multiple Instances
```bash
# Start all instances
curl -X POST http://localhost:8080/api/instances/start-all

# Stop all instances
curl -X POST http://localhost:8080/api/instances/stop-all

# Get status of all instances
curl http://localhost:8080/api/instances
```
## Next Steps
- Learn about the [Web UI](web-ui.md) interface
- Explore the complete [API Reference](api-reference.md)
- Set up [Monitoring](../advanced/monitoring.md) for production use

docs/user-guide/web-ui.md

@@ -0,0 +1,216 @@
# Web UI Guide
The LlamaCtl Web UI provides an intuitive interface for managing your Llama.cpp instances.
## Overview
The web interface is accessible at `http://localhost:8080` (or your configured host/port) and provides:
- Instance management dashboard
- Real-time status monitoring
- Configuration management
- Log viewing
- System information
## Dashboard
### Instance Cards
Each instance is displayed as a card showing:
- **Instance name** and status indicator
- **Model information** (name, size)
- **Current state** (stopped, starting, running, error)
- **Resource usage** (memory, CPU)
- **Action buttons** (start, stop, configure, logs)
### Status Indicators
- 🟢 **Green**: Instance is running and healthy
- 🟡 **Yellow**: Instance is starting or stopping
- 🔴 **Red**: Instance has encountered an error
- ⚪ **Gray**: Instance is stopped
## Creating Instances
### Add Instance Dialog
1. Click the **"Add Instance"** button
2. Fill in the required fields:
- **Name**: Unique identifier for your instance
- **Model Path**: Full path to your GGUF model file
- **Port**: Port number for the instance
3. Configure optional settings:
- **Threads**: Number of CPU threads
- **Context Size**: Context window size
- **GPU Layers**: Layers to offload to GPU
- **Additional Options**: Advanced Llama.cpp parameters
4. Click **"Create"** to save the instance
### Model Path Helper
Use the file browser to select model files:
- Navigate to your models directory
- Select the `.gguf` file
- Path is automatically filled in the form
## Managing Instances
### Starting Instances
1. Click the **"Start"** button on an instance card
2. Watch the status change to "Starting"
3. Monitor progress in the logs
4. Instance becomes "Running" when ready
### Stopping Instances
1. Click the **"Stop"** button
2. Instance gracefully shuts down
3. Status changes to "Stopped"
### Viewing Logs
1. Click the **"Logs"** button on any instance
2. Real-time log viewer opens
3. Filter by log level (Debug, Info, Warning, Error)
4. Search through log entries
5. Download logs for offline analysis
## Configuration Management
### Editing Instance Settings
1. Click the **"Configure"** button
2. Modify settings in the configuration dialog
3. Changes require instance restart to take effect
4. Click **"Save"** to apply changes
### Advanced Options
Access advanced Llama.cpp options:
```yaml
# Example advanced configuration
options:
  rope_freq_base: 10000
  rope_freq_scale: 1.0
  yarn_ext_factor: -1.0
  yarn_attn_factor: 1.0
  yarn_beta_fast: 32.0
  yarn_beta_slow: 1.0
```
## System Information
### Health Dashboard
Monitor overall system health:
- **System Resources**: CPU, memory, disk usage
- **Instance Summary**: Running/stopped instance counts
- **Performance Metrics**: Request rates, response times
### Resource Usage
Track resource consumption:
- Per-instance memory usage
- CPU utilization
- GPU memory (if applicable)
- Network I/O
## User Interface Features
### Theme Support
Switch between light and dark themes:
1. Click the theme toggle button
2. Setting is remembered across sessions
### Responsive Design
The UI adapts to different screen sizes:
- **Desktop**: Full-featured dashboard
- **Tablet**: Condensed layout
- **Mobile**: Stack-based navigation
### Keyboard Shortcuts
- `Ctrl+N`: Create new instance
- `Ctrl+R`: Refresh dashboard
- `Ctrl+L`: Open logs for selected instance
- `Esc`: Close dialogs
## Authentication
### Login
If authentication is enabled:
1. Navigate to the web UI
2. Enter your credentials
3. JWT token is stored for the session
4. Automatic logout on token expiry
### Session Management
- Sessions persist across browser restarts
- Logout clears authentication tokens
- Configurable session timeout
## Troubleshooting
### Common UI Issues
**Page won't load:**
- Check if the LlamaCtl server is running
- Verify the correct URL and port
- Check browser console for errors
**Instance won't start from UI:**
- Verify model path is correct
- Check for port conflicts
- Review instance logs for errors
**Real-time updates not working:**
- Check WebSocket connection
- Verify firewall settings
- Try refreshing the page
### Browser Compatibility
Supported browsers:
- Chrome/Chromium 90+
- Firefox 88+
- Safari 14+
- Edge 90+
## Mobile Access
### Responsive Features
On mobile devices:
- Touch-friendly interface
- Swipe gestures for navigation
- Optimized button sizes
- Condensed information display
### Limitations
Some features may be limited on mobile:
- Log viewing (use horizontal scrolling)
- Complex configuration forms
- File browser functionality
## Next Steps
- Learn about [API Reference](api-reference.md) for programmatic access
- Set up [Monitoring](../advanced/monitoring.md) for production use
- Explore [Advanced Configuration](../advanced/backends.md) options