Homelab AI Infrastructure
Docker Compose orchestration for local-first applications
A containerized AI stack that runs 8 services under tight resource constraints on a Mac mini. It is the runtime layer for local-first applications and was designed specifically to support the Local Personal AI System. The project demonstrates cost-efficient orchestration, privacy-first design, and systems thinking for production services on modest hardware, and it covers DevOps fundamentals: service orchestration, memory constraints, data persistence, and graceful degradation.
The Challenge
Run a production-grade AI stack locally with privacy guarantees, no cloud lock-in, and minimal hardware cost. Orchestrate multiple services (LLM inference, workflow orchestration, vector search, Git hosting, file management) within constrained memory while maintaining reliability and data persistence.
Architecture
┌──────────────────────────────────────────────────────┐
│               Homelab Stack (Mac mini)               │
├──────────────────────────────────────────────────────┤
│                                                      │
│  AI/LLM Services                                     │
│    Ollama (11434)        Open WebUI (3000)           │
│    Pipelines (9099)      n8n (5678)                  │
│    ChromaDB (8000)       SearXNG (8888)              │
│                                                      │
│    ▲ AI inference                                    │
│    ▲ Vector embeddings   ▲ Workflow                  │
│                                                      │
│  Infrastructure & APIs                               │
│    Gitea (3002)          FileBrowser (8081)          │
│                                                      │
│    ▲ Git hosting         ▲ File mgmt                 │
│                                                      │
└──────────────────────────────────────────────────────┘
Services & Resource Allocation
Ollama
LLM inference engine
Local language models: Llama 3.2 (1B) for fast inference, Qwen2.5-Coder (7B) for code generation, and Nomic Embed Text for embeddings. No external API calls.
Open WebUI + Pipelines
Chat interface & custom LLM pipelines
Web UI for chat interactions; Pipelines for custom request processing
n8n
Workflow orchestration
Coordinates all service integrations; automation and data routing
ChromaDB
Vector database
Stores embeddings for semantic search and RAG (Retrieval-Augmented Generation)
SearXNG
Private search engine
Meta-search across multiple engines; no tracking, all processing local
Gitea
Self-hosted Git server
Local repository hosting for code and project management
FileBrowser
Web-based file manager
Browse projects and data without CLI
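How the chat layer reaches the inference layer can be sketched as a Compose fragment. This is a minimal illustration under stated assumptions, not the project's actual file: the image tags, the volume name `ollama_data`, and the container-side port 8080 for Open WebUI are assumptions. `OLLAMA_BASE_URL` is the environment variable Open WebUI reads to locate an Ollama server.

```yaml
# Sketch only: image tags and the volume name are assumptions, not the author's file.
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"               # host:container; host port as listed in the diagram
    volumes:
      - ollama_data:/root/.ollama   # model storage path used by the official image

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                 # UI on host port 3000; 8080 is the assumed container port
    environment:
      # Services on the same Compose network resolve each other by service name.
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama

volumes:
  ollama_data:
```

Because the containers talk over the Compose network by service name, the published host ports are only needed for access from the Mac mini itself or from other machines on the LAN.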
Infrastructure Philosophy
- Local-first by default: All processing happens on your hardware. External APIs (when used) are optional integrations, not core dependencies.
- Composable services: Each container is independent. Upgrade Ollama, replace n8n, or add new services without affecting the others.
- Resource-aware design: Every service has explicit memory limits, which shows that production workloads can run on modest hardware without waste.
- Docker Compose as infrastructure code: One YAML file is the single source of truth. No manual config, no lost setup knowledge.
Key Design Decisions
- Memory limits enforced per-service: Each service has strict `mem_limit` + `memswap_limit` to prevent resource starvation
- Restart policies tuned: Critical services use `always` restart; non-critical use `unless-stopped`
- Named external volumes: Data persists across container restarts; enables safe updates without data loss
- Docker Compose single source of truth: All configuration lives in one file; no manual steps, no lost setup knowledge
- No cloud dependencies: All processing stays local; privacy by architecture, not policy
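The memory, restart, and volume decisions above might look like this in a Compose file. This is a hedged sketch, not the project's real configuration: the limit values, the volume names (`ollama_data`, `projects_data`), the `/srv` mount path, and the choice of which service counts as critical are all illustrative assumptions.

```yaml
# Sketch only: limit values, volume names, and the critical/non-critical split
# are illustrative assumptions.
services:
  ollama:
    image: ollama/ollama
    restart: always              # treated here as a critical service
    mem_limit: 6g                # hard cap on container memory (assumed value)
    memswap_limit: 6g            # memory + swap total; equal to mem_limit means no swap
    volumes:
      - ollama_data:/root/.ollama

  filebrowser:
    image: filebrowser/filebrowser
    restart: unless-stopped      # treated here as non-critical
    mem_limit: 128m              # assumed value
    memswap_limit: 128m
    volumes:
      - projects_data:/srv       # assumed served directory inside the container

volumes:
  ollama_data:
    external: true               # must already exist; Compose will not create or remove it
  projects_data:
    external: true
```

External volumes have to be created once before the first `docker compose up`, for example with `docker volume create ollama_data`. In exchange, `docker compose down -v` leaves them untouched.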
Lessons Learned
- Memory limits are hard constraints: Without them, a runaway process can exhaust host memory and trigger the OOM killer against unrelated services. Setting `memswap_limit` caps swap use and prevents disk thrashing.
- Named volumes are essential: Ephemeral containers without durable storage lose their data when recreated, and bind mounts tie that data to fragile host paths. External named volumes survive the container lifecycle.
- Restart policies need thought: `always` creates restart loops on bad config; `unless-stopped` respects manual stops.
- Port allocation matters: Avoid conflicts with host services. Document all port mappings in one place.
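Keeping every port mapping in one place could look like the fragment below. The host ports are taken from the architecture diagram above; the image names and container-side ports are assumed defaults and should be checked against each image's documentation.

```yaml
# Sketch only: host ports come from the architecture diagram; image names and
# container-side ports are assumptions, not the author's file.
services:
  ollama:
    image: ollama/ollama
    ports: ["11434:11434"]
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports: ["3000:8080"]
  pipelines:
    image: ghcr.io/open-webui/pipelines:main
    ports: ["9099:9099"]
  n8n:
    image: n8nio/n8n
    ports: ["5678:5678"]
  chromadb:
    image: chromadb/chroma
    ports: ["8000:8000"]
  searxng:
    image: searxng/searxng
    ports: ["8888:8080"]
  gitea:
    image: gitea/gitea
    ports: ["3002:3000"]
  filebrowser:
    image: filebrowser/filebrowser
    ports: ["8081:80"]
```

With all mappings in a single file, a duplicate host port shows up as soon as the stack starts, because Docker refuses to bind the same host port twice.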
See Also
Local Personal AI System: See how this infrastructure is used to build a coherent, unified system with feedback loops and emergent intelligence.
DevOps + Infrastructure: This page showcases Docker orchestration, memory management, service coordination, and deployment patterns on constrained hardware. It's the foundation that enables higher-level systems thinking.
Built and deployed on a Mac mini. All services local; all data stays local unless explicitly shared.