diff --git a/README.md b/README.md
index 59681b9..7b917e6 100644
--- a/README.md
+++ b/README.md
@@ -104,10 +104,13 @@ cd llamactl
 mkdir -p data/llamacpp data/vllm models
 
 # Build and start llamactl with llama.cpp CUDA backend
-docker-compose up llamactl-llamacpp -d
+docker-compose -f docker/docker-compose.yml up llamactl-llamacpp -d
 
 # Build and start llamactl with vLLM CUDA backend
-docker-compose up llamactl-vllm -d
+docker-compose -f docker/docker-compose.yml up llamactl-vllm -d
+
+# Build from source using multi-stage build
+docker build -f docker/Dockerfile.source -t llamactl:source .
 ```
 
 **Features:** CUDA support, automatic latest release installation, no backend dependencies.
diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md
index 96b9fce..f64146f 100644
--- a/docs/getting-started/installation.md
+++ b/docs/getting-started/installation.md
@@ -76,8 +76,9 @@ sudo mv llamactl /usr/local/bin/
 llamactl provides Dockerfiles for creating Docker images with backends pre-installed. The resulting images include the latest llamactl release with the respective backend.
 
 **Available Dockerfiles (CUDA):**
-- **llamactl with llama.cpp CUDA**: `Dockerfile.llamacpp` (based on `ghcr.io/ggml-org/llama.cpp:server-cuda`)
-- **llamactl with vLLM CUDA**: `Dockerfile.vllm` (based on `vllm/vllm-openai:latest`)
+- **llamactl with llama.cpp CUDA**: `docker/Dockerfile.llamacpp` (based on `ghcr.io/ggml-org/llama.cpp:server-cuda`)
+- **llamactl with vLLM CUDA**: `docker/Dockerfile.vllm` (based on `vllm/vllm-openai:latest`)
+- **llamactl built from source**: `docker/Dockerfile.source` (multi-stage build with webui)
 
 **Note:** These Dockerfiles are configured for CUDA. For other platforms (CPU, ROCm, Vulkan, etc.), adapt the base image. For llama.cpp, see available tags at [llama.cpp Docker docs](https://github.com/ggml-org/llama.cpp/blob/master/docs/docker.md). For vLLM, check [vLLM docs](https://docs.vllm.ai/en/v0.6.5/serving/deploying_with_docker.html).
 
@@ -92,10 +93,10 @@ cd llamactl
 mkdir -p data/llamacpp data/vllm models
 
 # Start llamactl with llama.cpp backend
-docker-compose up llamactl-llamacpp -d
+docker-compose -f docker/docker-compose.yml up llamactl-llamacpp -d
 
 # Or start llamactl with vLLM backend
-docker-compose up llamactl-vllm -d
+docker-compose -f docker/docker-compose.yml up llamactl-vllm -d
 ```
 
 Access the dashboard at:
@@ -106,7 +107,7 @@ Access the dashboard at:
 
 **llamactl with llama.cpp CUDA:**
 ```bash
-docker build -f Dockerfile.llamacpp -t llamactl:llamacpp-cuda .
+docker build -f docker/Dockerfile.llamacpp -t llamactl:llamacpp-cuda .
 docker run -d \
   --name llamactl-llamacpp \
   --gpus all \
@@ -117,7 +118,7 @@ docker run -d \
 
 **llamactl with vLLM CUDA:**
 ```bash
-docker build -f Dockerfile.vllm -t llamactl:vllm-cuda .
+docker build -f docker/Dockerfile.vllm -t llamactl:vllm-cuda .
 docker run -d \
   --name llamactl-vllm \
   --gpus all \
@@ -126,6 +127,15 @@ docker run -d \
   llamactl:vllm-cuda
 ```
 
+**llamactl built from source:**
+```bash
+docker build -f docker/Dockerfile.source -t llamactl:source .
+docker run -d \
+  --name llamactl \
+  -p 8080:8080 \
+  llamactl:source
+```
+
 ### Option 3: Build from Source
 
 Requirements:
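
The patch references `docker/Dockerfile.source` (a multi-stage build with the webui) without showing its contents. The sketch below is illustrative only: the stage layout, directory names (`webui/`, `cmd/llamactl`), base image tags, and the embed step are assumptions, not the actual file in the repository.

```dockerfile
# Illustrative sketch only; the real docker/Dockerfile.source may differ.

# Stage 1: build the web UI (assumes a Node-based frontend under webui/)
FROM node:20-alpine AS webui
WORKDIR /src/webui
COPY webui/ ./
RUN npm ci && npm run build

# Stage 2: build the llamactl binary (assumes a Go module at the repo root
# that embeds the built web UI assets)
FROM golang:1.23 AS builder
WORKDIR /src
COPY . .
COPY --from=webui /src/webui/dist ./webui/dist
RUN CGO_ENABLED=0 go build -o /out/llamactl ./cmd/llamactl

# Stage 3: small runtime image containing only the binary
FROM debian:bookworm-slim
COPY --from=builder /out/llamactl /usr/local/bin/llamactl
EXPOSE 8080
ENTRYPOINT ["llamactl"]
```

With a layout like this, the `docker build -f docker/Dockerfile.source -t llamactl:source .` command added in the diff would produce the `llamactl:source` image used in the run examples above.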