Refactor installation and troubleshooting documentation for clarity and completeness

2025-09-03 21:11:26 +02:00
parent 56756192e3
commit 969b4b14e1
4 changed files with 99 additions and 488 deletions

View File

@@ -6,19 +6,17 @@ This guide will walk you through installing Llamactl on your system.
You need `llama-server` from [llama.cpp](https://github.com/ggml-org/llama.cpp) installed:

**Quick install methods:**

```bash
# Homebrew (macOS/Linux)
brew install llama.cpp

# Winget (Windows)
winget install llama.cpp
```

Or build from source - see llama.cpp docs
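To confirm the prerequisite is met, you can check that the binary is resolvable from your shell. This is only a sanity check; flag availability can vary between llama.cpp builds.

```bash
# Verify llama-server is on PATH
which llama-server

# Print build info (flag may vary by llama.cpp version)
llama-server --version
```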
## Installation Methods
@@ -40,6 +38,11 @@ sudo mv llamactl /usr/local/bin/
### Option 2: Build from Source
Requirements:
- Go 1.24 or later
- Node.js 22 or later
- Git
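Before building, you can confirm your toolchain meets the versions listed above (a quick check, assuming the tools are already on your PATH):

```bash
go version        # should report 1.24 or later
node --version    # should report 22 or later
git --version
```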
If you prefer to build from source:
```bash

View File

@@ -20,6 +20,8 @@ Open your web browser and navigate to:
http://localhost:8080
```
Log in with the management API key. By default it is generated during server startup; copy it from the terminal output.
You should see the Llamactl web interface.
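If the key is rejected, you can verify it from a terminal first. This is a sketch reusing the Authorization header and instances endpoint shown in the troubleshooting guide; substitute the key printed at startup.

```bash
# A successful JSON response confirms the management key works
curl -H "Authorization: Bearer your-management-key" \
  http://localhost:8080/api/v1/instances
```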
## Step 3: Create Your First Instance

View File

@@ -316,10 +316,10 @@ The server routes requests to the appropriate instance based on the `model` field
## Instance Status Values

Instances can have the following status values:

- `stopped`: Instance is not running
- `running`: Instance is running and ready to accept requests
- `failed`: Instance failed to start or crashed
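To check a status programmatically, you can query the instances endpoint and look at each entry's status field. A minimal sketch; it assumes the `/api/v1/instances` path and Bearer-style API key used elsewhere in these docs, and the exact response shape may differ.

```bash
# Pretty-print the instance list and inspect the "status" field of each entry
curl -s -H "Authorization: Bearer your-api-key" \
  http://localhost:8080/api/v1/instances | jq .
```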
## Error Responses

View File

@@ -1,103 +1,27 @@
# Troubleshooting

Issues specific to Llamactl deployment and operation.

## Configuration Issues

### Invalid Configuration

**Problem:** Invalid configuration preventing startup

**Solutions:**
1. Use minimal configuration:
    ```yaml
    server:
      host: "0.0.0.0"
      port: 8080
    ```
2. Check data directory permissions:
    ```bash
    # Ensure data directory is writable (default: ~/.local/share/llamactl)
    mkdir -p ~/.local/share/llamactl/{instances,logs}
    ```
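The configuration file must also be valid YAML before llamactl can read it. A generic linter catches syntax and indentation errors (yamllint is a separate tool, not part of llamactl; install it via pip or your package manager):

```bash
# Reports parse and indentation problems in the config file
yamllint config.yaml
```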
## Instance Management Issues
@@ -106,449 +30,131 @@ Common issues and solutions for Llamactl deployment and operation.
**Problem:** Instance fails to start with model loading errors

**Common Solutions:**
- **llama-server not found:** Ensure `llama-server` binary is in PATH
- **Wrong model format:** Ensure model is in GGUF format
- **Insufficient memory:** Use smaller model or reduce context size
- **Path issues:** Use absolute paths to model files
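A quick pre-flight check covers the first and last points above (the model path here is a placeholder; substitute your own):

```bash
# Is llama-server resolvable from the shell llamactl runs in?
command -v llama-server

# Does the model file exist at the exact absolute path you configured?
ls -lh /path/to/model.gguf
```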
### Memory Issues

**Problem:** Out of memory errors or system becomes unresponsive

**Solutions:**
1. **Reduce context size:**
    ```json
    {
      "n_ctx": 1024
    }
    ```
2. **Use quantized models:**
    - Try Q4_K_M instead of higher precision models
    - Use smaller model variants (7B instead of 13B)
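Before loading a model, it helps to compare the model file size against available memory; as a rough rule of thumb the weights alone need at least that much RAM (or VRAM), plus extra for the context.

```bash
# Available system memory
free -h

# Size of the model you are trying to load (placeholder path)
ls -lh /path/to/model.gguf
```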
### GPU Configuration

**Problem:** GPU not being used effectively

**Solutions:**
1. **Configure GPU layers:**
    ```json
    {
      "n_gpu_layers": 35
    }
    ```
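To confirm layers are actually offloaded, you can watch GPU memory and utilization while a request is running (NVIDIA example; use the equivalent vendor tool for other GPUs):

```bash
# Refresh GPU stats every second while sending a completion request
watch -n 1 nvidia-smi
```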
### Advanced Instance Issues

**Problem:** Complex model loading, performance, or compatibility issues

Since llamactl uses `llama-server` under the hood, many instance-related issues are actually llama.cpp issues. For advanced troubleshooting:

**Resources:**
- **llama.cpp Documentation:** [https://github.com/ggml-org/llama.cpp](https://github.com/ggml-org/llama.cpp)
- **llama.cpp Issues:** [https://github.com/ggml-org/llama.cpp/issues](https://github.com/ggml-org/llama.cpp/issues)
- **llama.cpp Discussions:** [https://github.com/ggml-org/llama.cpp/discussions](https://github.com/ggml-org/llama.cpp/discussions)

**Testing directly with llama-server:**

```bash
# Test your model and parameters directly with llama-server
llama-server --model /path/to/model.gguf --port 8081 --n-gpu-layers 35
```

This helps determine whether the issue is with llamactl or with the underlying llama.cpp/llama-server.
## API and Network Issues
### CORS Errors

**Problem:** Web UI shows CORS errors in browser console

**Solutions:**
1. **Configure allowed origins:**
    ```yaml
    server:
      allowed_origins:
        - "http://localhost:3000"
        - "https://yourdomain.com"
    ```
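You can check what the server actually returns for a cross-origin request with a preflight-style probe (a generic CORS check; the origin and endpoint here are examples):

```bash
# Look for Access-Control-Allow-Origin in the response headers
curl -i -X OPTIONS \
  -H "Origin: http://localhost:3000" \
  -H "Access-Control-Request-Method: GET" \
  http://localhost:8080/api/v1/instances
```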
## Authentication Issues

**Problem:** API requests failing with authentication errors

**Solutions:**
1. **Disable authentication temporarily:**
    ```yaml
    auth:
      require_management_auth: false
      require_inference_auth: false
    ```
2. **Configure API keys:**
    ```yaml
    auth:
      management_keys:
        - "your-management-key"
      inference_keys:
        - "your-inference-key"
    ```
3. **Use correct Authorization header:**
    ```bash
    curl -H "Authorization: Bearer your-api-key" \
      http://localhost:8080/api/v1/instances
    ```
## Debugging and Logs
### Viewing Instance Logs
```bash
# Get instance logs via API
curl http://localhost:8080/api/v1/instances/{name}/logs
# Or check log files directly
tail -f ~/.local/share/llamactl/logs/{instance-name}.log
```
### Enable Debug Logging

```bash
export LLAMACTL_LOG_LEVEL=debug
llamactl
```
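When filing an issue it is handy to keep a copy of that debug output. A small sketch using plain shell redirection and the log-level variable shown above:

```bash
# Run llamactl with debug logging and save the output for later
LLAMACTL_LOG_LEVEL=debug llamactl 2>&1 | tee llamactl-debug.log
```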
## Getting Help

When reporting issues, include:

1. **System information:**
    ```bash
    llamactl --version
    ```
2. **Configuration file** (remove sensitive keys)
3. **Relevant log output**
4. **Steps to reproduce the issue**