diff --git a/README.md b/README.md
index d9edfd5..3eed452 100644
--- a/README.md
+++ b/README.md
@@ -123,7 +123,6 @@ instances:
on_demand_start_timeout: 120 # Default on-demand start timeout in seconds
timeout_check_interval: 5 # Idle instance timeout check in minutes
-
auth:
require_inference_auth: true # Require auth for inference endpoints
inference_keys: [] # Keys for inference endpoints
@@ -131,107 +130,7 @@ auth:
management_keys: [] # Keys for management endpoints
```
-Full Configuration Guide
-
-llamactl can be configured via configuration files or environment variables. Configuration is loaded in the following order of precedence:
-
-```
-Defaults < Configuration file < Environment variables
-```
-
-### Configuration Files
-
-#### Configuration File Locations
-
-Configuration files are searched in the following locations (in order of precedence):
-
-**Linux/macOS:**
-- `./llamactl.yaml` or `./config.yaml` (current directory)
-- `$HOME/.config/llamactl/config.yaml`
-- `/etc/llamactl/config.yaml`
-
-**Windows:**
-- `./llamactl.yaml` or `./config.yaml` (current directory)
-- `%APPDATA%\llamactl\config.yaml`
-- `%USERPROFILE%\llamactl\config.yaml`
-- `%PROGRAMDATA%\llamactl\config.yaml`
-
-You can specify the path to config file with `LLAMACTL_CONFIG_PATH` environment variable.
-
-### Configuration Options
-
-#### Server Configuration
-
-```yaml
-server:
- host: "0.0.0.0" # Server host to bind to (default: "0.0.0.0")
- port: 8080 # Server port to bind to (default: 8080)
- allowed_origins: ["*"] # CORS allowed origins (default: ["*"])
- enable_swagger: false # Enable Swagger UI (default: false)
-```
-
-**Environment Variables:**
-- `LLAMACTL_HOST` - Server host
-- `LLAMACTL_PORT` - Server port
-- `LLAMACTL_ALLOWED_ORIGINS` - Comma-separated CORS origins
-- `LLAMACTL_ENABLE_SWAGGER` - Enable Swagger UI (true/false)
-
-#### Instance Configuration
-
-```yaml
-instances:
- port_range: [8000, 9000] # Port range for instances (default: [8000, 9000])
- data_dir: "~/.local/share/llamactl" # Directory for all llamactl data (default varies by OS)
- configs_dir: "~/.local/share/llamactl/instances" # Directory for instance configs (default: data_dir/instances)
- logs_dir: "~/.local/share/llamactl/logs" # Directory for instance logs (default: data_dir/logs)
- auto_create_dirs: true # Automatically create data/config/logs directories (default: true)
- max_instances: -1 # Maximum instances (-1 = unlimited)
- max_running_instances: -1 # Maximum running instances (-1 = unlimited)
- enable_lru_eviction: true # Enable LRU eviction for idle instances
- llama_executable: "llama-server" # Path to llama-server executable
- default_auto_restart: true # Default auto-restart setting
- default_max_restarts: 3 # Default maximum restart attempts
- default_restart_delay: 5 # Default restart delay in seconds
- default_on_demand_start: true # Default on-demand start setting
- on_demand_start_timeout: 120 # Default on-demand start timeout in seconds
- timeout_check_interval: 5 # Default instance timeout check interval in minutes
-```
-
-**Environment Variables:**
-- `LLAMACTL_INSTANCE_PORT_RANGE` - Port range (format: "8000-9000" or "8000,9000")
-- `LLAMACTL_DATA_DIRECTORY` - Data directory path
-- `LLAMACTL_INSTANCES_DIR` - Instance configs directory path
-- `LLAMACTL_LOGS_DIR` - Log directory path
-- `LLAMACTL_AUTO_CREATE_DATA_DIR` - Auto-create data/config/logs directories (true/false)
-- `LLAMACTL_MAX_INSTANCES` - Maximum number of instances
-- `LLAMACTL_MAX_RUNNING_INSTANCES` - Maximum number of running instances
-- `LLAMACTL_ENABLE_LRU_EVICTION` - Enable LRU eviction for idle instances
-- `LLAMACTL_LLAMA_EXECUTABLE` - Path to llama-server executable
-- `LLAMACTL_DEFAULT_AUTO_RESTART` - Default auto-restart setting (true/false)
-- `LLAMACTL_DEFAULT_MAX_RESTARTS` - Default maximum restarts
-- `LLAMACTL_DEFAULT_RESTART_DELAY` - Default restart delay in seconds
-- `LLAMACTL_DEFAULT_ON_DEMAND_START` - Default on-demand start setting (true/false)
-- `LLAMACTL_ON_DEMAND_START_TIMEOUT` - Default on-demand start timeout in seconds
-- `LLAMACTL_TIMEOUT_CHECK_INTERVAL` - Default instance timeout check interval in minutes
-
-
-#### Authentication Configuration
-
-```yaml
-auth:
- require_inference_auth: true # Require API key for OpenAI endpoints (default: true)
- inference_keys: [] # List of valid inference API keys
- require_management_auth: true # Require API key for management endpoints (default: true)
- management_keys: [] # List of valid management API keys
-```
-
-**Environment Variables:**
-- `LLAMACTL_REQUIRE_INFERENCE_AUTH` - Require auth for OpenAI endpoints (true/false)
-- `LLAMACTL_INFERENCE_KEYS` - Comma-separated inference API keys
-- `LLAMACTL_REQUIRE_MANAGEMENT_AUTH` - Require auth for management endpoints (true/false)
-- `LLAMACTL_MANAGEMENT_KEYS` - Comma-separated management API keys
-
-
+For detailed configuration options, including environment variables, file locations, and advanced settings, see the [Configuration Guide](docs/getting-started/configuration.md).
## License
diff --git a/docs/development/building.md b/docs/development/building.md
deleted file mode 100644
index a102915..0000000
--- a/docs/development/building.md
+++ /dev/null
@@ -1,464 +0,0 @@
-# Building from Source
-
-This guide covers building Llamactl from source code for development and production deployment.
-
-## Prerequisites
-
-### Required Tools
-
-- **Go 1.24+**: Download from [golang.org](https://golang.org/dl/)
-- **Node.js 22+**: Download from [nodejs.org](https://nodejs.org/)
-- **Git**: For cloning the repository
-- **Make**: For build automation (optional)
-
-### System Requirements
-
-- **Memory**: 4GB+ RAM for building
-- **Disk**: 2GB+ free space
-- **OS**: Linux, macOS, or Windows
-
-## Quick Build
-
-### Clone and Build
-
-```bash
-# Clone the repository
-git clone https://github.com/lordmathis/llamactl.git
-cd llamactl
-
-# Build the application
-go build -o llamactl cmd/server/main.go
-```
-
-### Run
-
-```bash
-./llamactl
-```
-
-## Development Build
-
-### Setup Development Environment
-
-```bash
-# Clone repository
-git clone https://github.com/lordmathis/llamactl.git
-cd llamactl
-
-# Install Go dependencies
-go mod download
-
-# Install frontend dependencies
-cd webui
-npm ci
-cd ..
-```
-
-### Build Components
-
-```bash
-# Build backend only
-go build -o llamactl cmd/server/main.go
-
-# Build frontend only
-cd webui
-npm run build
-cd ..
-
-# Build everything
-make build
-```
-
-### Development Server
-
-```bash
-# Run backend in development mode
-go run cmd/server/main.go --dev
-
-# Run frontend dev server (separate terminal)
-cd webui
-npm run dev
-```
-
-## Production Build
-
-### Optimized Build
-
-```bash
-# Build with optimizations
-go build -ldflags="-s -w" -o llamactl cmd/server/main.go
-
-# Or use the Makefile
-make build-prod
-```
-
-### Build Flags
-
-Common build flags for production:
-
-```bash
-go build \
- -ldflags="-s -w -X main.version=1.0.0 -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
- -trimpath \
- -o llamactl \
- cmd/server/main.go
-```
-
-**Flag explanations:**
-- `-s`: Strip symbol table
-- `-w`: Strip debug information
-- `-X`: Set variable values at build time
-- `-trimpath`: Remove absolute paths from binary
-
-## Cross-Platform Building
-
-### Build for Multiple Platforms
-
-```bash
-# Linux AMD64
-GOOS=linux GOARCH=amd64 go build -o llamactl-linux-amd64 cmd/server/main.go
-
-# Linux ARM64
-GOOS=linux GOARCH=arm64 go build -o llamactl-linux-arm64 cmd/server/main.go
-
-# macOS AMD64
-GOOS=darwin GOARCH=amd64 go build -o llamactl-darwin-amd64 cmd/server/main.go
-
-# macOS ARM64 (Apple Silicon)
-GOOS=darwin GOARCH=arm64 go build -o llamactl-darwin-arm64 cmd/server/main.go
-
-# Windows AMD64
-GOOS=windows GOARCH=amd64 go build -o llamactl-windows-amd64.exe cmd/server/main.go
-```
-
-### Automated Cross-Building
-
-Use the provided Makefile:
-
-```bash
-# Build all platforms
-make build-all
-
-# Build specific platform
-make build-linux
-make build-darwin
-make build-windows
-```
-
-## Build with Docker
-
-### Development Container
-
-```dockerfile
-# Dockerfile.dev
-FROM golang:1.24-alpine AS builder
-
-WORKDIR /app
-COPY go.mod go.sum ./
-RUN go mod download
-
-COPY . .
-RUN go build -o llamactl cmd/server/main.go
-
-FROM alpine:latest
-RUN apk --no-cache add ca-certificates
-WORKDIR /root/
-COPY --from=builder /app/llamactl .
-
-EXPOSE 8080
-CMD ["./llamactl"]
-```
-
-```bash
-# Build development image
-docker build -f Dockerfile.dev -t llamactl:dev .
-
-# Run container
-docker run -p 8080:8080 llamactl:dev
-```
-
-### Production Container
-
-```dockerfile
-# Dockerfile
-FROM node:22-alpine AS frontend-builder
-
-WORKDIR /app/webui
-COPY webui/package*.json ./
-RUN npm ci
-
-COPY webui/ ./
-RUN npm run build
-
-FROM golang:1.24-alpine AS backend-builder
-
-WORKDIR /app
-COPY go.mod go.sum ./
-RUN go mod download
-
-COPY . .
-COPY --from=frontend-builder /app/webui/dist ./webui/dist
-
-RUN CGO_ENABLED=0 GOOS=linux go build \
- -ldflags="-s -w" \
- -o llamactl \
- cmd/server/main.go
-
-FROM alpine:latest
-
-RUN apk --no-cache add ca-certificates tzdata
-RUN adduser -D -s /bin/sh llamactl
-
-WORKDIR /home/llamactl
-COPY --from=backend-builder /app/llamactl .
-RUN chown llamactl:llamactl llamactl
-
-USER llamactl
-EXPOSE 8080
-
-CMD ["./llamactl"]
-```
-
-## Advanced Build Options
-
-### Static Linking
-
-For deployments without external dependencies:
-
-```bash
-CGO_ENABLED=0 go build \
- -ldflags="-s -w -extldflags '-static'" \
- -o llamactl-static \
- cmd/server/main.go
-```
-
-### Debug Build
-
-Build with debug information:
-
-```bash
-go build -gcflags="all=-N -l" -o llamactl-debug cmd/server/main.go
-```
-
-### Race Detection Build
-
-Build with race detection (development only):
-
-```bash
-go build -race -o llamactl-race cmd/server/main.go
-```
-
-## Build Automation
-
-### Makefile
-
-```makefile
-# Makefile
-VERSION := $(shell git describe --tags --always --dirty)
-BUILD_TIME := $(shell date -u +%Y-%m-%dT%H:%M:%SZ)
-LDFLAGS := -s -w -X main.version=$(VERSION) -X main.buildTime=$(BUILD_TIME)
-
-.PHONY: build clean test install
-
-build:
- @echo "Building Llamactl..."
- @cd webui && npm run build
- @go build -ldflags="$(LDFLAGS)" -o llamactl cmd/server/main.go
-
-build-prod:
- @echo "Building production binary..."
- @cd webui && npm run build
- @CGO_ENABLED=0 go build -ldflags="$(LDFLAGS)" -trimpath -o llamactl cmd/server/main.go
-
-build-all: build-linux build-darwin build-windows
-
-build-linux:
- @GOOS=linux GOARCH=amd64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-linux-amd64 cmd/server/main.go
- @GOOS=linux GOARCH=arm64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-linux-arm64 cmd/server/main.go
-
-build-darwin:
- @GOOS=darwin GOARCH=amd64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-darwin-amd64 cmd/server/main.go
- @GOOS=darwin GOARCH=arm64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-darwin-arm64 cmd/server/main.go
-
-build-windows:
- @GOOS=windows GOARCH=amd64 go build -ldflags="$(LDFLAGS)" -o dist/llamactl-windows-amd64.exe cmd/server/main.go
-
-test:
- @go test ./...
-
-clean:
- @rm -f llamactl llamactl-*
- @rm -rf dist/
-
-install: build
- @cp llamactl $(GOPATH)/bin/llamactl
-```
-
-### GitHub Actions
-
-```yaml
-# .github/workflows/build.yml
-name: Build
-
-on:
- push:
- branches: [ main ]
- pull_request:
- branches: [ main ]
-
-jobs:
- test:
- runs-on: ubuntu-latest
- steps:
- - uses: actions/checkout@v4
-
- - name: Set up Go
- uses: actions/setup-go@v4
- with:
- go-version: '1.24'
-
- - name: Set up Node.js
- uses: actions/setup-node@v4
- with:
- node-version: '22'
-
- - name: Install dependencies
- run: |
- go mod download
- cd webui && npm ci
-
- - name: Run tests
- run: |
- go test ./...
- cd webui && npm test
-
- - name: Build
- run: make build
-
- build:
- needs: test
- runs-on: ubuntu-latest
- if: github.ref == 'refs/heads/main'
-
- steps:
- - uses: actions/checkout@v4
-
- - name: Set up Go
- uses: actions/setup-go@v4
- with:
- go-version: '1.24'
-
- - name: Set up Node.js
- uses: actions/setup-node@v4
- with:
- node-version: '22'
-
- - name: Build all platforms
- run: make build-all
-
- - name: Upload artifacts
- uses: actions/upload-artifact@v4
- with:
- name: binaries
- path: dist/
-```
-
-## Build Troubleshooting
-
-### Common Issues
-
-**Go version mismatch:**
-```bash
-# Check Go version
-go version
-
-# Update Go
-# Download from https://golang.org/dl/
-```
-
-**Node.js issues:**
-```bash
-# Clear npm cache
-npm cache clean --force
-
-# Remove node_modules and reinstall
-rm -rf webui/node_modules
-cd webui && npm ci
-```
-
-**Build failures:**
-```bash
-# Clean and rebuild
-make clean
-go mod tidy
-make build
-```
-
-### Performance Issues
-
-**Slow builds:**
-```bash
-# Use build cache
-export GOCACHE=$(go env GOCACHE)
-
-# Parallel builds
-export GOMAXPROCS=$(nproc)
-```
-
-**Large binary size:**
-```bash
-# Use UPX compression
-upx --best llamactl
-
-# Analyze binary size
-go tool nm -size llamactl | head -20
-```
-
-## Deployment
-
-### System Service
-
-Create a systemd service:
-
-```ini
-# /etc/systemd/system/llamactl.service
-[Unit]
-Description=Llamactl Server
-After=network.target
-
-[Service]
-Type=simple
-User=llamactl
-Group=llamactl
-ExecStart=/usr/local/bin/llamactl
-Restart=always
-RestartSec=5
-
-[Install]
-WantedBy=multi-user.target
-```
-
-```bash
-# Enable and start service
-sudo systemctl enable llamactl
-sudo systemctl start llamactl
-```
-
-### Configuration
-
-```bash
-# Create configuration directory
-sudo mkdir -p /etc/llamactl
-
-# Copy configuration
-sudo cp config.yaml /etc/llamactl/
-
-# Set permissions
-sudo chown -R llamactl:llamactl /etc/llamactl
-```
-
-## Next Steps
-
-- Configure [Installation](../getting-started/installation.md)
-- Set up [Configuration](../getting-started/configuration.md)
-- Learn about [Contributing](contributing.md)
diff --git a/docs/development/contributing.md b/docs/development/contributing.md
deleted file mode 100644
index 3b27d90..0000000
--- a/docs/development/contributing.md
+++ /dev/null
@@ -1,373 +0,0 @@
-# Contributing
-
-Thank you for your interest in contributing to Llamactl! This guide will help you get started with development and contribution.
-
-## Development Setup
-
-### Prerequisites
-
-- Go 1.24 or later
-- Node.js 22 or later
-- `llama-server` executable (from [llama.cpp](https://github.com/ggml-org/llama.cpp))
-- Git
-
-### Getting Started
-
-1. **Fork and Clone**
- ```bash
- # Fork the repository on GitHub, then clone your fork
- git clone https://github.com/yourusername/llamactl.git
- cd llamactl
-
- # Add upstream remote
- git remote add upstream https://github.com/lordmathis/llamactl.git
- ```
-
-2. **Install Dependencies**
- ```bash
- # Go dependencies
- go mod download
-
- # Frontend dependencies
- cd webui && npm ci && cd ..
- ```
-
-3. **Run Development Environment**
- ```bash
- # Start backend server
- go run ./cmd/server
- ```
-
- In a separate terminal:
- ```bash
- # Start frontend dev server
- cd webui && npm run dev
- ```
-
-## Development Workflow
-
-### Setting Up Your Environment
-
-1. **Configuration**
- Create a development configuration file:
- ```yaml
- # dev-config.yaml
- server:
- host: "localhost"
- port: 8080
- logging:
- level: "debug"
- ```
-
-2. **Test Data**
- Set up test models and instances for development.
-
-### Making Changes
-
-1. **Create a Branch**
- ```bash
- git checkout -b feature/your-feature-name
- ```
-
-2. **Development Commands**
- ```bash
- # Backend
- go test ./... -v # Run tests
- go test -race ./... -v # Run with race detector
- go fmt ./... && go vet ./... # Format and vet code
- go build ./cmd/server # Build binary
-
- # Frontend (from webui/ directory)
- npm run test # Run tests
- npm run lint # Lint code
- npm run type-check # TypeScript check
- npm run build # Build for production
- ```
-
-3. **Code Quality**
- ```bash
- # Run all checks before committing
- make lint
- make test
- make build
- ```
-
-## Project Structure
-
-### Backend (Go)
-
-```
-cmd/
-├── server/ # Main application entry point
-pkg/
-├── backends/ # Model backend implementations
-├── config/ # Configuration management
-├── instance/ # Instance lifecycle management
-├── manager/ # Instance manager
-├── server/ # HTTP server and routes
-├── testutil/ # Test utilities
-└── validation/ # Input validation
-```
-
-### Frontend (React/TypeScript)
-
-```
-webui/src/
-├── components/ # React components
-├── contexts/ # React contexts
-├── hooks/ # Custom hooks
-├── lib/ # Utility libraries
-├── schemas/ # Zod schemas
-└── types/ # TypeScript types
-```
-
-## Coding Standards
-
-### Go Code
-
-- Follow standard Go formatting (`gofmt`)
-- Use `go vet` and address all warnings
-- Write comprehensive tests for new functionality
-- Include documentation comments for exported functions
-- Use meaningful variable and function names
-
-Example:
-```go
-// CreateInstance creates a new model instance with the given configuration.
-// It validates the configuration and ensures the instance name is unique.
-func (m *Manager) CreateInstance(ctx context.Context, config InstanceConfig) (*Instance, error) {
- if err := config.Validate(); err != nil {
- return nil, fmt.Errorf("invalid configuration: %w", err)
- }
-
- // Implementation...
-}
-```
-
-### TypeScript/React Code
-
-- Use TypeScript strict mode
-- Follow React best practices
-- Use functional components with hooks
-- Implement proper error boundaries
-- Write unit tests for components
-
-Example:
-```typescript
-interface InstanceCardProps {
- instance: Instance;
-  onStart: (name: string) => Promise<void>;
-  onStop: (name: string) => Promise<void>;
-}
-
-export const InstanceCard: React.FC<InstanceCardProps> = ({
- instance,
- onStart,
- onStop,
-}) => {
- // Implementation...
-};
-```
-
-## Testing
-
-### Backend Tests
-
-```bash
-# Run all tests
-go test ./...
-
-# Run tests with coverage
-go test ./... -coverprofile=coverage.out
-go tool cover -html=coverage.out
-
-# Run specific package tests
-go test ./pkg/manager -v
-
-# Run with race detection
-go test -race ./...
-```
-
-### Frontend Tests
-
-```bash
-cd webui
-
-# Run unit tests
-npm run test
-
-# Run tests with coverage
-npm run test:coverage
-
-# Run E2E tests
-npm run test:e2e
-```
-
-### Integration Tests
-
-```bash
-# Run integration tests (requires llama-server)
-go test ./... -tags=integration
-```
-
-## Pull Request Process
-
-### Before Submitting
-
-1. **Update your branch**
- ```bash
- git fetch upstream
- git rebase upstream/main
- ```
-
-2. **Run all tests**
- ```bash
- make test-all
- ```
-
-3. **Update documentation** if needed
-
-4. **Write clear commit messages**
- ```
- feat: add instance health monitoring
-
- - Implement health check endpoint
- - Add periodic health monitoring
- - Update API documentation
-
- Fixes #123
- ```
-
-### Submitting a PR
-
-1. **Push your branch**
- ```bash
- git push origin feature/your-feature-name
- ```
-
-2. **Create Pull Request**
- - Use the PR template
- - Provide clear description
- - Link related issues
- - Add screenshots for UI changes
-
-3. **PR Review Process**
- - Automated checks must pass
- - Code review by maintainers
- - Address feedback promptly
- - Keep PR scope focused
-
-## Issue Guidelines
-
-### Reporting Bugs
-
-Use the bug report template and include:
-
-- Steps to reproduce
-- Expected vs actual behavior
-- Environment details (OS, Go version, etc.)
-- Relevant logs or error messages
-- Minimal reproduction case
-
-### Feature Requests
-
-Use the feature request template and include:
-
-- Clear description of the problem
-- Proposed solution
-- Alternative solutions considered
-- Implementation complexity estimate
-
-### Security Issues
-
-For security vulnerabilities:
-- Do NOT create public issues
-- Email security@llamactl.dev
-- Provide detailed description
-- Allow time for fix before disclosure
-
-## Development Best Practices
-
-### API Design
-
-- Follow REST principles
-- Use consistent naming conventions
-- Provide comprehensive error messages
-- Include proper HTTP status codes
-- Document all endpoints
-
-### Error Handling
-
-```go
-// Wrap errors with context
-if err := instance.Start(); err != nil {
- return fmt.Errorf("failed to start instance %s: %w", instance.Name, err)
-}
-
-// Use structured logging
-log.WithFields(log.Fields{
- "instance": instance.Name,
- "error": err,
-}).Error("Failed to start instance")
-```
-
-### Configuration
-
-- Use environment variables for deployment
-- Provide sensible defaults
-- Validate configuration on startup
-- Support configuration file reloading
-
-### Performance
-
-- Profile code for bottlenecks
-- Use efficient data structures
-- Implement proper caching
-- Monitor resource usage
-
-## Release Process
-
-### Version Management
-
-- Use semantic versioning (SemVer)
-- Tag releases properly
-- Maintain CHANGELOG.md
-- Create release notes
-
-### Building Releases
-
-```bash
-# Build all platforms
-make build-all
-
-# Create release package
-make package
-```
-
-## Getting Help
-
-### Communication Channels
-
-- **GitHub Issues**: Bug reports and feature requests
-- **GitHub Discussions**: General questions and ideas
-- **Code Review**: PR comments and feedback
-
-### Development Questions
-
-When asking for help:
-
-1. Check existing documentation
-2. Search previous issues
-3. Provide minimal reproduction case
-4. Include relevant environment details
-
-## Recognition
-
-Contributors are recognized in:
-
-- CONTRIBUTORS.md file
-- Release notes
-- Documentation credits
-- Annual contributor highlights
-
-Thank you for contributing to Llamactl!
diff --git a/docs/getting-started/configuration.md b/docs/getting-started/configuration.md
index e9ba2d3..3a859ee 100644
--- a/docs/getting-started/configuration.md
+++ b/docs/getting-started/configuration.md
@@ -1,59 +1,144 @@
# Configuration
-Llamactl can be configured through various methods to suit your needs.
+llamactl can be configured via configuration files or environment variables. Configuration is loaded in the following order of precedence:
-## Configuration File
+```
+Defaults < Configuration file < Environment variables
+```
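+
+As a quick illustration of this precedence (the file name and port values here are just examples), a value in a config file overrides the built-in default, and an environment variable overrides both:
+
+```bash
+# ./llamactl.yaml is picked up from the current directory and sets the port to 9090...
+echo 'server: {port: 9090}' > llamactl.yaml
+# ...but the environment variable takes precedence, so the server binds to 10080
+LLAMACTL_PORT=10080 llamactl
+```
+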
-Create a configuration file at `~/.llamactl/config.yaml`:
+llamactl works out of the box with sensible defaults, but you can customize the behavior to suit your needs.
+
+## Default Configuration
+
+Here's the default configuration with all available options:
```yaml
-# Server configuration
server:
- host: "0.0.0.0"
- port: 8080
- cors_enabled: true
+ host: "0.0.0.0" # Server host to bind to
+ port: 8080 # Server port to bind to
+ allowed_origins: ["*"] # Allowed CORS origins (default: all)
+ enable_swagger: false # Enable Swagger UI for API docs
+
+instances:
+ port_range: [8000, 9000] # Port range for instances
+ data_dir: ~/.local/share/llamactl # Data directory (platform-specific, see below)
+ configs_dir: ~/.local/share/llamactl/instances # Instance configs directory
+ logs_dir: ~/.local/share/llamactl/logs # Logs directory
+ auto_create_dirs: true # Auto-create data/config/logs dirs if missing
+ max_instances: -1 # Max instances (-1 = unlimited)
+ max_running_instances: -1 # Max running instances (-1 = unlimited)
+ enable_lru_eviction: true # Enable LRU eviction for idle instances
+ llama_executable: llama-server # Path to llama-server executable
+ default_auto_restart: true # Auto-restart new instances by default
+ default_max_restarts: 3 # Max restarts for new instances
+ default_restart_delay: 5 # Restart delay (seconds) for new instances
+ default_on_demand_start: true # Default on-demand start setting
+ on_demand_start_timeout: 120 # Default on-demand start timeout in seconds
+ timeout_check_interval: 5 # Idle instance timeout check in minutes
-# Authentication (optional)
auth:
- enabled: false
- # When enabled, configure your authentication method
- # jwt_secret: "your-secret-key"
-
-# Default instance settings
-defaults:
- backend: "llamacpp"
- timeout: 300
- log_level: "info"
-
-# Paths
-paths:
- models_dir: "/path/to/your/models"
- logs_dir: "/var/log/llamactl"
- data_dir: "/var/lib/llamactl"
-
-# Instance limits
-limits:
- max_instances: 10
- max_memory_per_instance: "8GB"
+ require_inference_auth: true # Require auth for inference endpoints
+ inference_keys: [] # Keys for inference endpoints
+ require_management_auth: true # Require auth for management endpoints
+ management_keys: [] # Keys for management endpoints
```
-## Environment Variables
+## Configuration Files
-You can also configure Llamactl using environment variables:
+### Configuration File Locations
-```bash
-# Server settings
-export LLAMACTL_HOST=0.0.0.0
-export LLAMACTL_PORT=8080
+Configuration files are searched in the following locations (in order of precedence):
-# Paths
-export LLAMACTL_MODELS_DIR=/path/to/models
-export LLAMACTL_LOGS_DIR=/var/log/llamactl
+**Linux:**
+- `./llamactl.yaml` or `./config.yaml` (current directory)
+- `$HOME/.config/llamactl/config.yaml`
+- `/etc/llamactl/config.yaml`
-# Limits
-export LLAMACTL_MAX_INSTANCES=5
+**macOS:**
+- `./llamactl.yaml` or `./config.yaml` (current directory)
+- `$HOME/Library/Application Support/llamactl/config.yaml`
+- `/Library/Application Support/llamactl/config.yaml`
+
+**Windows:**
+- `./llamactl.yaml` or `./config.yaml` (current directory)
+- `%APPDATA%\llamactl\config.yaml`
+- `%USERPROFILE%\llamactl\config.yaml`
+- `%PROGRAMDATA%\llamactl\config.yaml`
+
+You can specify the path to the config file with the `LLAMACTL_CONFIG_PATH` environment variable.
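+
+For example, to start llamactl with a config file outside the default search paths (the path below is illustrative):
+
+```bash
+LLAMACTL_CONFIG_PATH=/opt/llamactl/config.yaml llamactl
+```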
+
+## Configuration Options
+
+### Server Configuration
+
+```yaml
+server:
+ host: "0.0.0.0" # Server host to bind to (default: "0.0.0.0")
+ port: 8080 # Server port to bind to (default: 8080)
+ allowed_origins: ["*"] # CORS allowed origins (default: ["*"])
+ enable_swagger: false # Enable Swagger UI (default: false)
```
+**Environment Variables:**
+- `LLAMACTL_HOST` - Server host
+- `LLAMACTL_PORT` - Server port
+- `LLAMACTL_ALLOWED_ORIGINS` - Comma-separated CORS origins
+- `LLAMACTL_ENABLE_SWAGGER` - Enable Swagger UI (true/false)
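+
+For example, you might tighten CORS and enable the Swagger UI via the environment before starting llamactl (values are illustrative):
+
+```bash
+export LLAMACTL_ALLOWED_ORIGINS="https://example.com,https://app.example.com"
+export LLAMACTL_ENABLE_SWAGGER=true
+llamactl
+```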
+
+### Instance Configuration
+
+```yaml
+instances:
+ port_range: [8000, 9000] # Port range for instances (default: [8000, 9000])
+ data_dir: "~/.local/share/llamactl" # Directory for all llamactl data (default varies by OS)
+ configs_dir: "~/.local/share/llamactl/instances" # Directory for instance configs (default: data_dir/instances)
+ logs_dir: "~/.local/share/llamactl/logs" # Directory for instance logs (default: data_dir/logs)
+ auto_create_dirs: true # Automatically create data/config/logs directories (default: true)
+ max_instances: -1 # Maximum instances (-1 = unlimited)
+ max_running_instances: -1 # Maximum running instances (-1 = unlimited)
+ enable_lru_eviction: true # Enable LRU eviction for idle instances
+ llama_executable: "llama-server" # Path to llama-server executable
+ default_auto_restart: true # Default auto-restart setting
+ default_max_restarts: 3 # Default maximum restart attempts
+ default_restart_delay: 5 # Default restart delay in seconds
+ default_on_demand_start: true # Default on-demand start setting
+ on_demand_start_timeout: 120 # Default on-demand start timeout in seconds
+ timeout_check_interval: 5 # Default instance timeout check interval in minutes
+```
+
+**Environment Variables:**
+- `LLAMACTL_INSTANCE_PORT_RANGE` - Port range (format: "8000-9000" or "8000,9000")
+- `LLAMACTL_DATA_DIRECTORY` - Data directory path
+- `LLAMACTL_INSTANCES_DIR` - Instance configs directory path
+- `LLAMACTL_LOGS_DIR` - Log directory path
+- `LLAMACTL_AUTO_CREATE_DATA_DIR` - Auto-create data/config/logs directories (true/false)
+- `LLAMACTL_MAX_INSTANCES` - Maximum number of instances
+- `LLAMACTL_MAX_RUNNING_INSTANCES` - Maximum number of running instances
+- `LLAMACTL_ENABLE_LRU_EVICTION` - Enable LRU eviction for idle instances
+- `LLAMACTL_LLAMA_EXECUTABLE` - Path to llama-server executable
+- `LLAMACTL_DEFAULT_AUTO_RESTART` - Default auto-restart setting (true/false)
+- `LLAMACTL_DEFAULT_MAX_RESTARTS` - Default maximum restarts
+- `LLAMACTL_DEFAULT_RESTART_DELAY` - Default restart delay in seconds
+- `LLAMACTL_DEFAULT_ON_DEMAND_START` - Default on-demand start setting (true/false)
+- `LLAMACTL_ON_DEMAND_START_TIMEOUT` - Default on-demand start timeout in seconds
+- `LLAMACTL_TIMEOUT_CHECK_INTERVAL` - Default instance timeout check interval in minutes
+
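+As a sketch, the same options can be supplied through the environment when launching llamactl (the limits and executable path below are placeholders):
+
+```bash
+export LLAMACTL_INSTANCE_PORT_RANGE="8000-9000"   # "8000,9000" also works
+export LLAMACTL_MAX_RUNNING_INSTANCES=2
+export LLAMACTL_LLAMA_EXECUTABLE="/opt/llama.cpp/bin/llama-server"  # placeholder path
+llamactl
+```
+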
+### Authentication Configuration
+
+```yaml
+auth:
+ require_inference_auth: true # Require API key for OpenAI endpoints (default: true)
+ inference_keys: [] # List of valid inference API keys
+ require_management_auth: true # Require API key for management endpoints (default: true)
+ management_keys: [] # List of valid management API keys
+```
+
+**Environment Variables:**
+- `LLAMACTL_REQUIRE_INFERENCE_AUTH` - Require auth for OpenAI endpoints (true/false)
+- `LLAMACTL_INFERENCE_KEYS` - Comma-separated inference API keys
+- `LLAMACTL_REQUIRE_MANAGEMENT_AUTH` - Require auth for management endpoints (true/false)
+- `LLAMACTL_MANAGEMENT_KEYS` - Comma-separated management API keys
+
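+For example, a minimal sketch that generates random keys and passes them in via the environment (manage real keys with a proper secrets store):
+
+```bash
+export LLAMACTL_INFERENCE_KEYS="$(openssl rand -hex 32)"
+export LLAMACTL_MANAGEMENT_KEYS="$(openssl rand -hex 32)"
+llamactl
+```
+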
## Command Line Options
View all available command line options:
@@ -62,90 +147,13 @@ View all available command line options:
llamactl --help
```
-Common options:
-
-```bash
-# Specify config file
-llamactl --config /path/to/config.yaml
-
-# Set log level
-llamactl --log-level debug
-
-# Run on different port
-llamactl --port 9090
-```
-
-## Instance Configuration
-
-When creating instances, you can specify various options:
-
-### Basic Options
-
-- `name`: Unique identifier for the instance
-- `model_path`: Path to the GGUF model file
-- `port`: Port for the instance to listen on
-
-### Advanced Options
-
-- `threads`: Number of CPU threads to use
-- `context_size`: Context window size
-- `batch_size`: Batch size for processing
-- `gpu_layers`: Number of layers to offload to GPU
-- `memory_lock`: Lock model in memory
-- `no_mmap`: Disable memory mapping
-
-### Example Instance Configuration
-
-```json
-{
- "name": "production-model",
- "model_path": "/models/llama-2-13b-chat.gguf",
- "port": 8081,
- "options": {
- "threads": 8,
- "context_size": 4096,
- "batch_size": 512,
- "gpu_layers": 35,
- "memory_lock": true
- }
-}
-```
-
-## Security Configuration
-
-### Enable Authentication
-
-To enable authentication, update your config file:
-
-```yaml
-auth:
- enabled: true
- jwt_secret: "your-very-secure-secret-key"
- token_expiry: "24h"
-```
-
-### HTTPS Configuration
-
-For production deployments, configure HTTPS:
-
-```yaml
-server:
- tls:
- enabled: true
- cert_file: "/path/to/cert.pem"
- key_file: "/path/to/key.pem"
-```
-
-## Logging Configuration
-
-Configure logging levels and outputs:
-
-```yaml
-logging:
- level: "info" # debug, info, warn, error
- format: "json" # json or text
- output: "/var/log/llamactl/app.log"
-```
+You can also override configuration using command line flags when starting llamactl.
+
## Next Steps
diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md
index 9be575e..9ae35ed 100644
--- a/docs/getting-started/installation.md
+++ b/docs/getting-started/installation.md
@@ -4,9 +4,19 @@ This guide will walk you through installing Llamactl on your system.
## Prerequisites
-Before installing Llamactl, ensure you have:
+You need `llama-server` from [llama.cpp](https://github.com/ggml-org/llama.cpp) installed:
-- Go 1.19 or later
+```bash
+# Quick install methods:
+# Homebrew (macOS)
+brew install llama.cpp
+
+# Or build from source - see llama.cpp docs
+```
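+
+To confirm that `llama-server` is available before continuing, check that it is on your `PATH`:
+
+```bash
+command -v llama-server
+```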
+
+Additional requirements for building from source:
+- Go 1.24 or later
+- Node.js 22 or later
- Git
- Sufficient disk space for your models
@@ -14,17 +24,18 @@ Before installing Llamactl, ensure you have:
### Option 1: Download Binary (Recommended)
-Download the latest release from our [GitHub releases page](https://github.com/lordmathis/llamactl/releases):
+Download the latest release from the [GitHub releases page](https://github.com/lordmathis/llamactl/releases):
```bash
-# Download for Linux
-curl -L https://github.com/lordmathis/llamactl/releases/latest/download/llamactl-linux-amd64 -o llamactl
-
-# Make executable
-chmod +x llamactl
-
-# Move to PATH (optional)
+# Linux/macOS - Get latest version and download
+LATEST_VERSION=$(curl -s https://api.github.com/repos/lordmathis/llamactl/releases/latest | grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/')
+curl -L https://github.com/lordmathis/llamactl/releases/download/${LATEST_VERSION}/llamactl-${LATEST_VERSION}-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m).tar.gz | tar -xz
sudo mv llamactl /usr/local/bin/
+
+# Or download manually from:
+# https://github.com/lordmathis/llamactl/releases/latest
+
+# Windows - Download from releases page
```
### Option 2: Build from Source
@@ -36,11 +47,12 @@ If you prefer to build from source:
git clone https://github.com/lordmathis/llamactl.git
cd llamactl
-# Build the application
-go build -o llamactl cmd/server/main.go
-```
+# Build the web UI
+cd webui && npm ci && npm run build && cd ..
-For detailed build instructions, see the [Building from Source](../development/building.md) guide.
+# Build the application
+go build -o llamactl ./cmd/server
+```
## Verification
diff --git a/docs/getting-started/quick-start.md b/docs/getting-started/quick-start.md
index a882b10..11751c0 100644
--- a/docs/getting-started/quick-start.md
+++ b/docs/getting-started/quick-start.md
@@ -28,7 +28,6 @@ You should see the Llamactl web interface.
2. Fill in the instance configuration:
- **Name**: Give your instance a descriptive name
- **Model Path**: Path to your Llama.cpp model file
- - **Port**: Port for the instance to run on
- **Additional Options**: Any extra Llama.cpp parameters
3. Click "Create Instance"
@@ -50,7 +49,6 @@ Here's a basic example configuration for a Llama 2 model:
{
"name": "llama2-7b",
"model_path": "/path/to/llama-2-7b-chat.gguf",
- "port": 8081,
"options": {
"threads": 4,
"context_size": 2048
@@ -72,13 +70,70 @@ curl -X POST http://localhost:8080/api/instances \
-d '{
"name": "my-model",
"model_path": "/path/to/model.gguf",
- "port": 8081
}'
# Start an instance
curl -X POST http://localhost:8080/api/instances/my-model/start
```
+## OpenAI Compatible API
+
+Llamactl provides OpenAI-compatible endpoints, making it easy to integrate with existing OpenAI client libraries and tools.
+
+### Chat Completions
+
+Once you have an instance running, you can use it with the OpenAI-compatible chat completions endpoint:
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "my-model",
+ "messages": [
+ {
+ "role": "user",
+ "content": "Hello! Can you help me write a Python function?"
+ }
+ ],
+ "max_tokens": 150,
+ "temperature": 0.7
+ }'
+```
+
+### Using with Python OpenAI Client
+
+You can also use the official OpenAI Python client:
+
+```python
+from openai import OpenAI
+
+# Point the client to your Llamactl server
+client = OpenAI(
+ base_url="http://localhost:8080/v1",
+    api_key="not-needed" # Replace with an inference API key if authentication is enabled
+)
+
+# Create a chat completion
+response = client.chat.completions.create(
+ model="my-model", # Use the name of your instance
+ messages=[
+ {"role": "user", "content": "Explain quantum computing in simple terms"}
+ ],
+ max_tokens=200,
+ temperature=0.7
+)
+
+print(response.choices[0].message.content)
+```
+
+### List Available Models
+
+Get a list of running instances (models) in OpenAI-compatible format:
+
+```bash
+curl http://localhost:8080/v1/models
+```
+
## Next Steps
- Learn more about the [Web UI](../user-guide/web-ui.md)
diff --git a/docs/index.md b/docs/index.md
index b45cae2..19f7508 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,12 +1,12 @@
# Llamactl Documentation
-Welcome to the Llamactl documentation! Llamactl is a powerful management tool for Llama.cpp instances that provides both a web interface and REST API for managing large language models.
+Welcome to the Llamactl documentation! Llamactl is a powerful management tool for llama-server instances that provides both a web interface and REST API for managing large language models.
## What is Llamactl?
-Llamactl is designed to simplify the deployment and management of Llama.cpp instances. It provides:
+Llamactl is designed to simplify the deployment and management of llama-server instances. It provides:
-- **Instance Management**: Start, stop, and monitor multiple Llama.cpp instances
+- **Instance Management**: Start, stop, and monitor multiple llama-server instances
- **Web UI**: User-friendly interface for managing your models
- **REST API**: Programmatic access to all functionality
- **Health Monitoring**: Real-time status and health checks
@@ -33,8 +33,7 @@ Llamactl is designed to simplify the deployment and management of Llama.cpp inst
If you need help or have questions:
- Check the [Troubleshooting](advanced/troubleshooting.md) guide
-- Visit our [GitHub repository](https://github.com/lordmathis/llamactl)
-- Read the [Contributing guide](development/contributing.md) to help improve Llamactl
+- Visit the [GitHub repository](https://github.com/lordmathis/llamactl)
---
diff --git a/mkdocs.yml b/mkdocs.yml
index f23c70e..4e7e107 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,6 +1,6 @@
-site_name: LlamaCtl Documentation
-site_description: User documentation for LlamaCtl - A management tool for Llama.cpp instances
-site_author: LlamaCtl Team
+site_name: Llamactl Documentation
+site_description: User documentation for Llamactl - A management tool for Llama.cpp instances
+site_author: Llamactl Team
site_url: https://llamactl.org
repo_name: lordmathis/llamactl
@@ -61,9 +61,6 @@ nav:
- Backends: advanced/backends.md
- Monitoring: advanced/monitoring.md
- Troubleshooting: advanced/troubleshooting.md
- - Development:
- - Contributing: development/contributing.md
- - Building from Source: development/building.md
plugins:
- search