Document llama.cpp router mode

2025-12-22 21:20:42 +01:00
parent 67098d7801
commit 3cec850e74
2 changed files with 95 additions and 0 deletions


@@ -12,6 +12,7 @@
 **🚀 Easy Model Management**
 - **Multiple Models Simultaneously**: Run different models at the same time (7B for speed, 70B for quality)
+- **Dynamic Multi-Model Instances**: llama.cpp router mode - serve multiple models from a single instance with on-demand loading
 - **Smart Resource Management**: Automatic idle timeout, LRU eviction, and configurable instance limits
 - **Web Dashboard**: Modern React UI for managing instances, monitoring health, and viewing logs
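
For context on the added bullet, here is a minimal sketch of how a client might address several models through one router-mode instance. It assumes the instance exposes llama.cpp's OpenAI-compatible `/v1/chat/completions` endpoint on `localhost:8080`; the address and the model names are placeholders, not values taken from this commit.

```python
# Sketch: talking to a single llama.cpp router-mode instance and selecting the
# model per request. Assumptions (not from this commit): the instance listens
# on localhost:8080, speaks the OpenAI-compatible chat completions API, and the
# model names below stand in for whatever the router is configured to serve.
import json
import urllib.request

ROUTER_URL = "http://localhost:8080/v1/chat/completions"  # assumed address

def ask(model: str, prompt: str) -> str:
    payload = {
        "model": model,  # the router loads this model on demand if it is not resident
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Same instance, two different models: a small one for speed and a larger one
# for quality, mirroring the 7B/70B split in the feature list above.
print(ask("example-7b-instruct", "Summarize router mode in one sentence."))
print(ask("example-70b-instruct", "Explain LRU eviction of idle model instances."))
```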