Document llama.cpp router mode

2025-12-22 21:20:42 +01:00
parent 67098d7801
commit 3cec850e74
2 changed files with 95 additions and 0 deletions


@@ -12,6 +12,7 @@
 **🚀 Easy Model Management**
 - **Multiple Models Simultaneously**: Run different models at the same time (7B for speed, 70B for quality)
+- **Dynamic Multi-Model Instances**: llama.cpp router mode - serve multiple models from a single instance with on-demand loading
 - **Smart Resource Management**: Automatic idle timeout, LRU eviction, and configurable instance limits
 - **Web Dashboard**: Modern React UI for managing instances, monitoring health, and viewing logs
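
For context on the added bullet, here is a minimal sketch of how a client might address several models through one router-mode instance. It assumes the instance exposes llama.cpp's OpenAI-compatible `/v1/chat/completions` endpoint on `localhost:8080`; the address and the model names are placeholders, not values taken from this commit.

```python
# Sketch: talking to a single llama.cpp router-mode instance and selecting the
# model per request. Assumptions (not from this commit): the instance listens
# on localhost:8080, speaks the OpenAI-compatible chat completions API, and the
# model names below stand in for whatever the router is configured to serve.
import json
import urllib.request

ROUTER_URL = "http://localhost:8080/v1/chat/completions"  # assumed address

def ask(model: str, prompt: str) -> str:
    payload = {
        "model": model,  # the router loads this model on demand if it is not resident
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Same instance, two different models: a small one for speed and a larger one
# for quality, mirroring the 7B/70B split in the feature list above.
print(ask("example-7b-instruct", "Summarize router mode in one sentence."))
print(ask("example-70b-instruct", "Explain LRU eviction of idle model instances."))
```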