Refactor installation and troubleshooting documentation for clarity and completeness

2025-09-03 21:11:26 +02:00
parent 56756192e3
commit 969b4b14e1
4 changed files with 99 additions and 488 deletions

View File

@@ -6,19 +6,17 @@ This guide will walk you through installing Llamactl on your system.
You need `llama-server` from [llama.cpp](https://github.com/ggml-org/llama.cpp) installed:

**Quick install methods:**

```bash
# Homebrew (macOS/Linux)
brew install llama.cpp

# Winget (Windows)
winget install llama.cpp
```

Or build from source - see llama.cpp docs
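To confirm the prerequisite is met, you can check that the binary is resolvable from your shell. This is only a sanity check; flag availability can vary between llama.cpp builds.

```bash
# Verify llama-server is on PATH
which llama-server

# Print build info (flag may vary by llama.cpp version)
llama-server --version
```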
## Installation Methods
@@ -40,6 +38,11 @@ sudo mv llamactl /usr/local/bin/
### Option 2: Build from Source
Requirements:
- Go 1.24 or later
- Node.js 22 or later
- Git
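Before building, you can confirm your toolchain meets the versions listed above (a quick check, assuming the tools are already on your PATH):

```bash
go version        # should report 1.24 or later
node --version    # should report 22 or later
git --version
```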
If you prefer to build from source:
```bash

View File

@@ -20,6 +20,8 @@ Open your web browser and navigate to:
http://localhost:8080
```
Log in with the management API key. By default it is generated during server startup; copy it from the terminal output.
You should see the Llamactl web interface.
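If the key is rejected, you can verify it from a terminal first. This is a sketch reusing the Authorization header and instances endpoint shown in the troubleshooting guide; substitute the key printed at startup.

```bash
# A successful JSON response confirms the management key works
curl -H "Authorization: Bearer your-management-key" \
  http://localhost:8080/api/v1/instances
```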
## Step 3: Create Your First Instance

View File

@@ -316,10 +316,10 @@ The server routes requests to the appropriate instance based on the `model` field
## Instance Status Values

Instances can have the following status values:

- `stopped`: Instance is not running
- `running`: Instance is running and ready to accept requests
- `failed`: Instance failed to start or crashed
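To check a status programmatically, you can query the instances endpoint and look at each entry's status field. A minimal sketch; it assumes the `/api/v1/instances` path and Bearer-style API key used elsewhere in these docs, and the exact response shape may differ.

```bash
# Pretty-print the instance list and inspect the "status" field of each entry
curl -s -H "Authorization: Bearer your-api-key" \
  http://localhost:8080/api/v1/instances | jq .
```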
## Error Responses

View File

@@ -1,103 +1,27 @@
# Troubleshooting

Issues specific to Llamactl deployment and operation.

## Configuration Issues

### Invalid Configuration

**Problem:** Invalid configuration preventing startup

**Solutions:**
1. Use minimal configuration:
    ```yaml
    server:
      host: "0.0.0.0"
      port: 8080
    ```
2. Check data directory permissions:
    ```bash
    # Ensure data directory is writable (default: ~/.local/share/llamactl)
    mkdir -p ~/.local/share/llamactl/{instances,logs}
    ```
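The configuration file must also be valid YAML before llamactl can read it. A generic linter catches syntax and indentation errors (yamllint is a separate tool, not part of llamactl; install it via pip or your package manager):

```bash
# Reports parse and indentation problems in the config file
yamllint config.yaml
```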
## Instance Management Issues
@@ -106,449 +30,131 @@ Common issues and solutions for Llamactl deployment and operation.
**Problem:** Instance fails to start with model loading errors

**Common Solutions:**
- **llama-server not found:** Ensure `llama-server` binary is in PATH
- **Wrong model format:** Ensure model is in GGUF format
- **Insufficient memory:** Use smaller model or reduce context size
- **Path issues:** Use absolute paths to model files
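A quick pre-flight check covers the first and last points above (the model path here is a placeholder; substitute your own):

```bash
# Is llama-server resolvable from the shell llamactl runs in?
command -v llama-server

# Does the model file exist at the exact absolute path you configured?
ls -lh /path/to/model.gguf
```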
### Memory Issues

**Problem:** Out of memory errors or system becomes unresponsive

**Solutions:**
1. **Reduce context size:**
    ```json
    {
      "n_ctx": 1024
    }
    ```
2. **Use quantized models:**
    - Try Q4_K_M instead of higher precision models
    - Use smaller model variants (7B instead of 13B)
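Before loading a model, it helps to compare the model file size against available memory; as a rough rule of thumb the weights alone need at least that much RAM (or VRAM), plus extra for the context.

```bash
# Available system memory
free -h

# Size of the model you are trying to load (placeholder path)
ls -lh /path/to/model.gguf
```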
### GPU Configuration

**Problem:** GPU not being used effectively

**Solutions:**
1. **Configure GPU layers:**
    ```json
    {
      "n_gpu_layers": 35
    }
    ```
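To confirm layers are actually offloaded, you can watch GPU memory and utilization while a request is running (NVIDIA example; use the equivalent vendor tool for other GPUs):

```bash
# Refresh GPU stats every second while sending a completion request
watch -n 1 nvidia-smi
```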
### Advanced Instance Issues

**Problem:** Complex model loading, performance, or compatibility issues

Since llamactl uses `llama-server` under the hood, many instance-related issues are actually llama.cpp issues. For advanced troubleshooting:

**Resources:**
- **llama.cpp Documentation:** [https://github.com/ggml-org/llama.cpp](https://github.com/ggml-org/llama.cpp)
- **llama.cpp Issues:** [https://github.com/ggml-org/llama.cpp/issues](https://github.com/ggml-org/llama.cpp/issues)
- **llama.cpp Discussions:** [https://github.com/ggml-org/llama.cpp/discussions](https://github.com/ggml-org/llama.cpp/discussions)

**Testing directly with llama-server:**

```bash
# Test your model and parameters directly with llama-server
llama-server --model /path/to/model.gguf --port 8081 --n-gpu-layers 35
```

This helps determine whether the issue is with llamactl or with the underlying llama.cpp/llama-server.
## API and Network Issues
### CORS Errors

**Problem:** Web UI shows CORS errors in browser console

**Solutions:**
1. **Configure allowed origins:**
    ```yaml
    server:
      allowed_origins:
        - "http://localhost:3000"
        - "https://yourdomain.com"
    ```
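You can check what the server actually returns for a cross-origin request with a preflight-style probe (a generic CORS check; the origin and endpoint here are examples):

```bash
# Look for Access-Control-Allow-Origin in the response headers
curl -i -X OPTIONS \
  -H "Origin: http://localhost:3000" \
  -H "Access-Control-Request-Method: GET" \
  http://localhost:8080/api/v1/instances
```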
## Authentication Issues

**Problem:** API requests failing with authentication errors

**Solutions:**
1. **Disable authentication temporarily:**
    ```yaml
    auth:
      require_management_auth: false
      require_inference_auth: false
    ```
2. **Configure API keys:**
    ```yaml
    auth:
      management_keys:
        - "your-management-key"
      inference_keys:
        - "your-inference-key"
    ```
3. **Use correct Authorization header:**
    ```bash
    curl -H "Authorization: Bearer your-api-key" \
      http://localhost:8080/api/v1/instances
    ```
## Debugging and Logs
### Viewing Instance Logs
```bash
# Get instance logs via API
curl http://localhost:8080/api/v1/instances/{name}/logs
# Or check log files directly
tail -f ~/.local/share/llamactl/logs/{instance-name}.log
```
### Enable Debug Logging

```bash
export LLAMACTL_LOG_LEVEL=debug
llamactl
```
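When filing an issue it is handy to keep a copy of that debug output. A small sketch using plain shell redirection and the log-level variable shown above:

```bash
# Run llamactl with debug logging and save the output for later
LLAMACTL_LOG_LEVEL=debug llamactl 2>&1 | tee llamactl-debug.log
```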
## Getting Help

When reporting issues, include:

1. **System information:**
    ```bash
    llamactl --version
    ```
2. **Configuration file** (remove sensitive keys)
3. **Relevant log output**
4. **Steps to reproduce the issue**