diff --git a/dev/__pycache__/readme_sync.cpython-311.pyc b/dev/__pycache__/readme_sync.cpython-311.pyc
new file mode 100644
index 0000000..75a2f88
Binary files /dev/null and b/dev/__pycache__/readme_sync.cpython-311.pyc differ
diff --git a/dev/index.html b/dev/index.html
index dd8ad2a..08f3ee7 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -389,6 +389,48 @@
@@ -692,6 +734,48 @@
@@ -740,20 +824,33 @@

 Llamactl Documentation

-Welcome to the Llamactl documentation! Management server and proxy for multiple llama.cpp and MLX instances with OpenAI-compatible API routing.
+Welcome to the Llamactl documentation!

 [Dashboard Screenshot]

 What is Llamactl?

-Llamactl is designed to simplify the deployment and management of llama-server and MLX instances. It provides a modern solution for running multiple large language models with centralized management and multi-backend support.
+Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard.

 Features

-🚀 Multiple Model Serving: Run different models simultaneously (7B for speed, 70B for quality)
-🔗 OpenAI API Compatible: Drop-in replacement - route requests by model name
-🍎 Multi-Backend Support: Native support for both llama.cpp and MLX (Apple Silicon optimized)
-🌐 Web Dashboard: Modern React UI for visual management (unlike CLI-only tools)
-🔐 API Key Authentication: Separate keys for management vs inference access
-📊 Instance Monitoring: Health checks, auto-restart, log management
-⚡ Smart Resource Management: Idle timeout, LRU eviction, and configurable instance limits
-💡 On-Demand Instance Start: Automatically launch instances upon receiving OpenAI-compatible API requests
-💾 State Persistence: Ensure instances remain intact across server restarts
+🚀 Easy Model Management
+
+🔗 Universal Compatibility
+
+🌐 User-Friendly Interface
+
+⚡ Smart Operations

 [Dashboard Screenshot]
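
For context on the routing behavior both versions of this page describe ("route requests by model name" over an OpenAI-compatible API, with separate inference API keys), here is a minimal client sketch. The base URL, port, instance name `llama2-7b`, and the key placeholder are illustrative assumptions, not values taken from this diff; the `/v1/chat/completions` path is assumed from the standard OpenAI API shape the docs claim compatibility with.

```python
import requests

# Assumed llamactl base URL; adjust host/port for your deployment.
BASE_URL = "http://localhost:8080"

# Per the docs, requests are routed to the instance whose name matches
# the "model" field; "llama2-7b" is a hypothetical instance name.
response = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_INFERENCE_API_KEY"},
    json={
        "model": "llama2-7b",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Because the endpoint is a drop-in replacement, any OpenAI client pointed at the llamactl base URL should work the same way; the removed "On-Demand Instance Start" bullet implies a stopped instance named in `model` may be launched automatically before the request is served.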