From c0cd03c75d0891ab1006ca0afb2d87fdb8edc3dc Mon Sep 17 00:00:00 2001
From: LordMathis
Date: Sun, 26 Oct 2025 15:59:17 +0100
Subject: [PATCH] Refactor troubleshooting documentation for instance management issues

---
 docs/troubleshooting.md | 84 ++++++++++++++++++++++-------------------
 1 file changed, 46 insertions(+), 38 deletions(-)

diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 1186123..e7d9d80 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -26,59 +26,67 @@ Issues specific to Llamactl deployment and operation.
 
 ## Instance Management Issues
 
-### Model Loading Failures
+### Instance Fails to Start
 
-**Problem:** Instance fails to start with model loading errors
-
-**Common Solutions:**
-- **llama-server not found:** Ensure `llama-server` binary is in PATH
-- **Wrong model format:** Ensure model is in GGUF format
-- **Insufficient memory:** Use smaller model or reduce context size
-- **Path issues:** Use absolute paths to model files
-### Memory Issues
-
-**Problem:** Out of memory errors or system becomes unresponsive
+**Problem:** Instance fails to start or immediately stops
 
 **Solutions:**
-1. **Reduce context size:**
-   ```json
-   {
-     "n_ctx": 1024
-   }
+
+1. **Check instance logs** to see the actual error:
+   ```bash
+   curl http://localhost:8080/api/v1/instances/{name}/logs
+   # Or check log files directly
+   tail -f ~/.local/share/llamactl/logs/{instance-name}.log
    ```
 
-2. **Use quantized models:**
-   - Try Q4_K_M instead of higher precision models
-   - Use smaller model variants (7B instead of 13B)
+2. **Verify backend is installed:**
+   - **llama.cpp**: Ensure `llama-server` is in PATH
+   - **MLX**: Ensure `mlx-lm` Python package is installed
+   - **vLLM**: Ensure `vllm` Python package is installed
 
-### GPU Configuration
+3. **Check model path and format:**
+   - Use absolute paths to model files
+   - Verify model format matches backend (GGUF for llama.cpp, etc.)
 
-**Problem:** GPU not being used effectively
+4. **Verify backend command configuration:**
+   - Check that the backend `command` is correctly configured in the global config
+   - For virtual environments, specify the full path to the command (e.g., `/path/to/venv/bin/mlx_lm.server`)
+   - See the [Configuration Guide](configuration.md) for backend configuration details
+   - Test the backend directly (see [Backend-Specific Issues](#backend-specific-issues) below)
 
-**Solutions:**
-1. **Configure GPU layers:**
-   ```json
-   {
-     "n_gpu_layers": 35
-   }
-   ```
+### Backend-Specific Issues
 
-### Advanced Instance Issues
+**Problem:** Model loading, memory, GPU, or performance issues
 
-**Problem:** Complex model loading, performance, or compatibility issues
+Most model loading, memory, GPU, and performance issues are backend-specific and are best resolved by consulting the respective backend documentation:
 
-Since llamactl uses `llama-server` under the hood, many instance-related issues are actually llama.cpp issues. 
-For advanced troubleshooting check llama.cpp resources:
-- **llama.cpp Documentation:** [https://github.com/ggml-org/llama.cpp](https://github.com/ggml-org/llama.cpp)
+**llama.cpp:**
+- [llama.cpp GitHub](https://github.com/ggml-org/llama.cpp)
+- [llama-server README](https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md)
 
+**MLX:**
+- [MLX-LM GitHub](https://github.com/ml-explore/mlx-lm)
+- [MLX-LM Server Guide](https://github.com/ml-explore/mlx-lm/blob/main/mlx_lm/SERVER.md)
+
+**vLLM:**
+- [vLLM Documentation](https://docs.vllm.ai/en/stable/)
+- [OpenAI Compatible Server](https://docs.vllm.ai/en/stable/serving/openai_compatible_server.html)
+- [vllm serve Command](https://docs.vllm.ai/en/stable/cli/serve.html#vllm-serve)
+
+**Testing backends directly:**
+
+Testing your model and configuration directly with the backend helps determine if the issue is with llamactl or the backend itself:
 
-**Testing directly with llama-server:**
 ```bash
-# Test your model and parameters directly with llama-server
-llama-server --model /path/to/model.gguf --port 8081 --n-gpu-layers 35
-```
+# llama.cpp
+llama-server --model /path/to/model.gguf --port 8081
 
-This helps determine if the issue is with llamactl or with the underlying llama.cpp/llama-server.
+# MLX
+mlx_lm.server --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --port 8081
+
+# vLLM
+python -m vllm.entrypoints.openai.api_server --model microsoft/DialoGPT-medium --port 8081
+```
 
 ## API and Network Issues
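
One way to sanity-check steps 2 and 4 above is to exercise the backend command by hand, outside llamactl. The commands below are a rough sketch only; the paths and package names mirror the placeholders used above and should be replaced with whatever your config actually points at.

```bash
# Illustrative sanity checks for steps 2 and 4 above.
# Paths and package names are placeholders, not required install locations.

# Step 2: is the backend installed?
command -v llama-server      # llama.cpp: should print the resolved binary path
python -m pip show mlx-lm    # MLX: should list the installed package
python -m pip show vllm      # vLLM: should list the installed package

# Step 4: does the configured backend `command` exist and run?
ls -l /path/to/venv/bin/mlx_lm.server
/path/to/venv/bin/mlx_lm.server --help
```

If the command resolves and prints its usage text but the instance still fails, the problem is more likely in the llamactl configuration than in the backend installation.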