mirror of
https://github.com/lordmathis/llamactl.git
synced 2025-11-05 16:44:22 +00:00
Refactor installation and troubleshooting documentation for clarity and completeness
@@ -6,19 +6,17 @@ This guide will walk you through installing Llamactl on your system.
You need `llama-server` from [llama.cpp](https://github.com/ggml-org/llama.cpp) installed:

**Quick install methods:**

```bash
# Homebrew (macOS/Linux)
brew install llama.cpp
# Winget (Windows)
winget install llama.cpp
```

Or build from source (see the llama.cpp docs). You will also need sufficient disk space for your models.
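Either way, confirm the binary is reachable before moving on (a quick hedged check; `--version` output varies by llama.cpp build):

```bash
# llama-server must be on your PATH for llamactl to launch instances
which llama-server
llama-server --version
```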
## Installation Methods
@@ -40,6 +38,11 @@ sudo mv llamactl /usr/local/bin/
### Option 2: Build from Source

Requirements:

- Go 1.24 or later
- Node.js 22 or later
- Git

If you prefer to build from source:
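A typical sequence for a Go-plus-Node project like this one (a sketch; the paths and build targets are assumptions, so check the repository README for the authoritative steps):

```bash
# Hypothetical build sequence: build the web UI first, then the Go binary
git clone https://github.com/lordmathis/llamactl.git
cd llamactl
cd webui && npm ci && npm run build && cd ..
go build -o llamactl ./cmd/server
```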
@@ -20,6 +20,8 @@ Open your web browser and navigate to:
http://localhost:8080
```

Log in with the management API key. By default it is generated during server startup; copy it from the terminal output.

You should see the Llamactl web interface.
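If you prefer to verify from the command line, a quick check that the key is accepted (a sketch; the endpoint matches the API examples later in these docs):

```bash
# Replace <your-management-key> with the key printed at startup
curl -H "Authorization: Bearer <your-management-key>" \
  http://localhost:8080/api/v1/instances
```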
## Step 3: Create Your First Instance
@@ -316,10 +316,10 @@ The server routes requests to the appropriate instance based on the `model` field
## Instance Status Values

Instances can have the following status values:

- `stopped`: Instance is not running
- `running`: Instance is running and ready to accept requests
- `failed`: Instance failed to start or crashed
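To read an instance's current status from the API (a sketch; the instance name, key, and use of `jq` are assumptions):

```bash
# Fetch one instance and extract its status field
curl -s -H "Authorization: Bearer <your-management-key>" \
  http://localhost:8080/api/v1/instances/my-instance | jq .status
```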
## Error Responses
@@ -1,103 +1,27 @@
# Troubleshooting

Common issues and solutions for Llamactl deployment and operation.

## Installation Issues
### Binary Not Found

**Problem:** `llamactl: command not found`

**Solutions:**

1. Verify the binary is in your PATH:
```bash
echo $PATH
which llamactl
```

2. Add to PATH or use the full path:
```bash
export PATH=$PATH:/path/to/llamactl
# or
/full/path/to/llamactl
```

3. Check binary permissions:
```bash
chmod +x llamactl
```
### Permission Denied

**Problem:** Permission errors when starting Llamactl

**Solutions:**

1. Check file permissions:
```bash
ls -la llamactl
chmod +x llamactl
```

2. Verify directory permissions:
```bash
# Check models directory
ls -la /path/to/models/

# Check logs directory
sudo mkdir -p /var/log/llamactl
sudo chown $USER:$USER /var/log/llamactl
```

3. Run as an appropriate user:
```bash
# Don't run as root unless necessary
sudo -u llamactl ./llamactl
```
## Startup Issues

### Port Already in Use

**Problem:** `bind: address already in use`

**Solutions:**

1. Find the process using the port:
```bash
sudo netstat -tulpn | grep :8080
# or
sudo lsof -i :8080
```

2. Kill the conflicting process:
```bash
sudo kill -9 <PID>
```

3. Use a different port:
```bash
llamactl --port 8081
```
### Invalid Configuration

**Problem:** Invalid configuration preventing startup

**Solutions:**

1. Validate the configuration file:
```bash
llamactl --config /path/to/config.yaml --validate
```

2. Check YAML syntax (a Python fallback follows this list):
```bash
yamllint config.yaml
```

3. Use a minimal configuration:
```yaml
server:
  host: "0.0.0.0"
  port: 8080
instances:
  port_range: [8000, 9000]
```

4. Check data directory permissions:
```bash
# Ensure data directory is writable (default: ~/.local/share/llamactl)
mkdir -p ~/.local/share/llamactl/{instances,logs}
```
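If `yamllint` isn't installed, a rough syntax check can be done with Python instead (assumes the PyYAML package is available):

```bash
# Parse the file; prints OK or raises a syntax error with line numbers
python3 -c "import yaml; yaml.safe_load(open('config.yaml')); print('OK')"
```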
## Instance Management Issues
@@ -106,449 +30,131 @@ Common issues and solutions for Llamactl deployment and operation.
**Problem:** Instance fails to start with model loading errors

**Diagnostic Steps:**

1. Check that the model file exists:
```bash
ls -la /path/to/model.gguf
file /path/to/model.gguf
```

2. Verify the model format:
```bash
# Check if it's a valid GGUF file
hexdump -C /path/to/model.gguf | head -5
```

3. Test with llama.cpp directly:
```bash
llama-server --model /path/to/model.gguf --port 8081
```

**Common Solutions:**

- **llama-server not found:** Ensure the `llama-server` binary is in your PATH
- **Corrupted model:** Re-download the model file
- **Wrong model format:** Ensure the model is in GGUF format
- **Insufficient memory:** Use a smaller model or reduce the context size
- **Path issues:** Use absolute paths to model files and check file permissions
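Relatedly, a valid GGUF file starts with the 4-byte ASCII magic `GGUF`, so a quick format check is possible without extra tools (a sketch):

```bash
# A GGUF model begins with the bytes "GGUF"
if [ "$(head -c 4 /path/to/model.gguf)" = "GGUF" ]; then
  echo "GGUF magic present"
else
  echo "not a GGUF file (or truncated download)"
fi
```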
### Memory Issues

**Problem:** Out-of-memory errors, or the system becomes unresponsive

**Diagnostic Steps:**

1. Check system memory:
```bash
free -h
cat /proc/meminfo
```

2. Monitor memory usage:
```bash
top -p $(pgrep llamactl)
```

3. Check instance memory requirements:
```bash
curl http://localhost:8080/api/instances/{name}/stats
```

**Solutions:**

1. **Reduce context size:**
```json
{
  "n_ctx": 1024
}
```

2. **Enable memory mapping:**
```json
{
  "no_mmap": false
}
```

3. **Use quantized models:**
- Try Q4_K_M instead of higher precision models
- Use smaller model variants (7B instead of 13B)
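As a rough rule, an instance needs at least the model's file size in RAM plus KV-cache overhead that grows with `n_ctx`; a quick comparison (a sketch):

```bash
# Compare model size against memory currently available
ls -lh /path/to/model.gguf
free -h | awk '/^Mem:/ {print "available:", $7}'
```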
### GPU Issues

**Problem:** GPU not detected or not being used effectively

**Diagnostic Steps:**

1. Check GPU availability:
```bash
nvidia-smi
```

2. Verify the CUDA installation:
```bash
nvcc --version
```

3. Check llama.cpp GPU support:
```bash
llama-server --help | grep -i gpu
```

**Solutions:**

1. **Install CUDA drivers:**
```bash
sudo apt update
sudo apt install nvidia-driver-470 nvidia-cuda-toolkit
```

2. **Rebuild llama.cpp with GPU support:**
```bash
cmake -DGGML_CUDA=ON ..   # older llama.cpp trees used -DLLAMA_CUBLAS=ON
make
```

3. **Configure GPU layers:**
```json
{
  "n_gpu_layers": 35
}
```
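To confirm that layers actually offloaded, watch VRAM usage while the instance loads; it should jump once `n_gpu_layers` takes effect:

```bash
# VRAM and GPU utilization refresh every second
watch -n 1 nvidia-smi
```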
### Advanced Instance Issues

**Problem:** Complex model loading, performance, or compatibility issues

Since llamactl uses `llama-server` under the hood, many instance-related issues are actually llama.cpp issues. For advanced troubleshooting:

**Resources:**

- **llama.cpp Documentation:** [https://github.com/ggml-org/llama.cpp](https://github.com/ggml-org/llama.cpp)
- **llama.cpp Issues:** [https://github.com/ggml-org/llama.cpp/issues](https://github.com/ggml-org/llama.cpp/issues)
- **llama.cpp Discussions:** [https://github.com/ggml-org/llama.cpp/discussions](https://github.com/ggml-org/llama.cpp/discussions)

**Testing directly with llama-server:**

```bash
# Test your model and parameters directly with llama-server
llama-server --model /path/to/model.gguf --port 8081 --n-gpu-layers 35
```

This helps determine whether the issue is with llamactl or with the underlying llama.cpp/llama-server.

## Performance Issues

### Slow Response Times

**Problem:** API responses are slow or timeouts occur

**Diagnostic Steps:**

1. Check API response times:
```bash
time curl http://localhost:8080/api/instances
```

2. Monitor system resources:
```bash
htop
iotop
```

3. Check instance logs:
```bash
curl http://localhost:8080/api/instances/{name}/logs
```

**Solutions:**

1. **Optimize thread count** (a starting-point heuristic follows this list):
```json
{
  "threads": 6
}
```

2. **Adjust batch size:**
```json
{
  "batch_size": 512
}
```

3. **Enable GPU acceleration:**
```json
{
  "n_gpu_layers": 35
}
```
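A reasonable starting point for `threads` is the number of physical cores; `nproc` reports logical CPUs, so halve it on SMT/hyper-threaded systems (a sketch):

```bash
# Suggest a thread count: logical CPUs divided by two
echo "suggested threads: $(( $(nproc) / 2 ))"
```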
### High CPU Usage

**Problem:** Llamactl consuming excessive CPU

**Diagnostic Steps:**

1. Identify CPU-intensive processes:
```bash
top -p $(pgrep -f llamactl)
```

2. Check thread allocation:
```bash
curl http://localhost:8080/api/instances/{name}/config
```

**Solutions:**

1. **Reduce thread count:**
```json
{
  "threads": 4
}
```

2. **Limit concurrent instances:**
```yaml
limits:
  max_instances: 3
```
## API and Network Issues

### Connection Refused

**Problem:** Cannot connect to the Llamactl web interface

**Diagnostic Steps:**

1. Check whether the service is running:
```bash
ps aux | grep llamactl
```

2. Verify port binding:
```bash
netstat -tulpn | grep :8080
```

3. Test local connectivity:
```bash
curl http://localhost:8080/api/health
```

**Solutions:**

1. **Check firewall settings:**
```bash
sudo ufw status
sudo ufw allow 8080
```

2. **Bind to the correct interface:**
```yaml
server:
  host: "0.0.0.0"  # Instead of "localhost"
  port: 8080
```
### CORS Errors

**Problem:** Web UI shows CORS errors in the browser console

**Solutions:**

1. **Configure allowed origins:**
```yaml
server:
  allowed_origins:
    - "http://localhost:3000"
    - "https://yourdomain.com"
```

2. **Use a reverse proxy:**
```nginx
server {
    listen 80;
    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```
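The browser's preflight can be reproduced from the CLI to confirm a fix without reloading the UI:

```bash
# Simulate a CORS preflight; look for Access-Control-Allow-* headers
curl -i -X OPTIONS http://localhost:8080/api/v1/instances \
  -H "Origin: http://localhost:3000" \
  -H "Access-Control-Request-Method: GET"
```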
## Database Issues

### Startup Database Errors

**Problem:** Database connection failures on startup

**Diagnostic Steps:**

1. Check the database service:
```bash
systemctl status postgresql
# or
systemctl status mysql
```

2. Test database connectivity:
```bash
psql -h localhost -U llamactl -d llamactl
```

**Solutions:**

1. **Start the database service:**
```bash
sudo systemctl start postgresql
sudo systemctl enable postgresql
```

2. **Create the database and user:**
```sql
CREATE DATABASE llamactl;
CREATE USER llamactl WITH PASSWORD 'password';
GRANT ALL PRIVILEGES ON DATABASE llamactl TO llamactl;
```
## Web UI Issues

### Blank Page or Loading Issues

**Problem:** Web UI doesn't load or shows a blank page

**Diagnostic Steps:**

1. Check the browser console for errors (F12)

2. Verify API connectivity:
```bash
curl http://localhost:8080/api/system/status
```

3. Check static file serving:
```bash
curl http://localhost:8080/
```

**Solutions:**

1. **Clear the browser cache**
2. **Try a different browser**
3. **Check for JavaScript errors in the console**
4. **Verify API endpoint accessibility**
## Authentication Issues

**Problem:** Unable to log in, or API requests failing with authentication errors

**Diagnostic Steps:**

1. Check the authentication configuration:
```bash
curl http://localhost:8080/api/config | jq .auth
```

2. Verify user credentials:
```bash
# Test the login endpoint
curl -X POST http://localhost:8080/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"password"}'
```

**Solutions:**

1. **Reset the admin password:**
```bash
llamactl --reset-admin-password
```

2. **Disable authentication temporarily:**
```yaml
auth:
  require_management_auth: false
  require_inference_auth: false
```

3. **Configure API keys:**
```yaml
auth:
  management_keys:
    - "your-management-key"
  inference_keys:
    - "your-inference-key"
```

4. **Use the correct Authorization header:**
```bash
curl -H "Authorization: Bearer your-api-key" \
  http://localhost:8080/api/v1/instances
```
## Debugging and Logs

### Viewing Instance Logs

```bash
# Get instance logs via API
curl http://localhost:8080/api/v1/instances/{name}/logs

# Or check log files directly
tail -f ~/.local/share/llamactl/logs/{instance-name}.log
```

### Enable Debug Logging

For detailed troubleshooting, enable debug logging via environment variable:

```bash
export LLAMACTL_LOG_LEVEL=debug
llamactl
```

or in the configuration file:

```yaml
logging:
  level: "debug"
  output: "/var/log/llamactl/debug.log"
```

### Key Log Patterns

Look for these patterns in logs:

**Startup issues:**
```
ERRO Failed to start server
ERRO Database connection failed
ERRO Port binding failed
```

**Instance issues:**
```
ERRO Failed to start instance
ERRO Model loading failed
ERRO Process crashed
```

**Performance issues:**
```
WARN High memory usage detected
WARN Request timeout
WARN Resource limit exceeded
```
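To pull these patterns out of long logs (a sketch; the log path matches the default noted above):

```bash
# Show the last 50 error/warning lines across all instance logs
grep -hE 'ERRO|WARN' ~/.local/share/llamactl/logs/*.log | tail -n 50
```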
## Getting Help

### Collecting Information

When reporting issues, include:

1. **System information:**
```bash
uname -a
llamactl --version
```

2. **Configuration file** (remove sensitive keys)

3. **Relevant log output**

4. **Steps to reproduce the issue**, along with exact error messages and environment details
### Support Channels

- **GitHub Issues:** Report bugs and feature requests
- **Documentation:** Check this documentation first
- **Community:** Join discussions in GitHub Discussions
## Preventive Measures

### Health Monitoring

Set up monitoring to catch issues early:

```bash
# Regular health checks (cron)
*/5 * * * * curl -f http://localhost:8080/api/health || alert
```
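A concrete stand-in for the `alert` placeholder, using `logger` so failures land in syslog (a sketch):

```bash
# Cron entry: log a syslog message when the health endpoint fails
*/5 * * * * curl -fsS http://localhost:8080/api/health > /dev/null || logger -t llamactl "health check failed"
```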
### Resource Monitoring

Monitor system resources:

```bash
# Disk space monitoring
df -h /var/log/llamactl/
df -h /path/to/models/

# Memory monitoring
free -h
```

### Backup Configuration

Regular configuration backups:

```bash
# Back up the configuration
cp ~/.llamactl/config.yaml ~/.llamactl/config.yaml.backup

# Back up instance configurations
curl http://localhost:8080/api/instances > instances-backup.json
```