
Homelab AI Infrastructure

Docker Compose orchestration for local-first applications

A containerized AI stack running 8 services under tight memory limits on a Mac mini. It is the runtime layer for local-first applications and was designed specifically to support the Local Personal AI System. The project demonstrates cost-efficient orchestration, privacy-first design, and the DevOps fundamentals needed to run production-style services on modest hardware: service orchestration, memory constraints, data persistence, and graceful degradation.

The Challenge

Run a production-grade AI stack locally with privacy guarantees, no cloud lock-in, and minimal hardware cost. Orchestrate multiple services (LLM inference, workflow orchestration, vector search, Git hosting, file management) within constrained memory while maintaining reliability and data persistence.

Architecture

┌──────────────────────────────────────────────────────┐
│         Homelab Stack (Mac mini)                     │
├──────────────────────────────────────────────────────┤
│                                                      │
│         AI/LLM Services                              │
│      Ollama (11434)  Open WebUI (3000)               │
│      Pipelines (9099)  n8n (5678)                    │
│      ChromaDB (8000)  SearXNG (8888)                 │
│                                                      │
│              ▲ AI inference                          │
│      ▲ Vector embeddings  ▲ Workflow                 │
│                                                      │
│      Infrastructure & APIs                           │
│      Gitea (3002)  FileBrowser (8081)                │
│                                                      │
│              ▲ Git hosting  ▲ File mgmt              │
│                                                      │
└──────────────────────────────────────────────────────┘

Services & Resource Allocation

Ollama

MEMORY 2GB

LLM inference engine

Local language models: Llama 3.2 (1b) for fast inference, Qwen 2.5-Coder (7b) for code generation, Nomic Embed Text for embeddings. No external API calls.

Open WebUI + Pipelines

MEMORY 1GB + 512MB

Chat interface & custom LLM pipelines

Web UI for chat interactions; Pipelines for custom request processing

n8n

MEMORY 1GB

Workflow orchestration

Coordinates all service integrations; automation and data routing

ChromaDB

MEMORY 512MB

Vector database

Stores embeddings for semantic search and RAG (Retrieval-Augmented Generation)

SearXNG

MEMORY 512MB

Private search engine

Meta-search across multiple engines; no tracking, all processing local

Gitea

MEMORY 512MB

Self-hosted Git server

Local repository hosting for code and project management

FileBrowser

MEMORY 256MB

Web-based file manager

Browse projects and data without CLI
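
The allocations above could be expressed in a Compose file roughly as follows. This is a sketch, not the project's actual file: the host ports and memory limits come from the list above, while the service names, image names, and container-side ports are assumptions based on each project's commonly published image and should be checked against its documentation.

```yaml
# Sketch only. Host ports and mem_limit values are taken from the service
# list; image names and container-side ports are ASSUMPTIONS.
services:
  ollama:
    image: ollama/ollama
    ports: ["11434:11434"]
    mem_limit: 2g

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports: ["3000:8080"]      # container port 8080 is assumed
    mem_limit: 1g

  pipelines:
    image: ghcr.io/open-webui/pipelines:main
    ports: ["9099:9099"]
    mem_limit: 512m

  n8n:
    image: n8nio/n8n
    ports: ["5678:5678"]
    mem_limit: 1g

  chromadb:
    image: chromadb/chroma
    ports: ["8000:8000"]
    mem_limit: 512m

  searxng:
    image: searxng/searxng
    ports: ["8888:8080"]      # container port 8080 is assumed
    mem_limit: 512m

  gitea:
    image: gitea/gitea
    ports: ["3002:3000"]      # container port 3000 is assumed
    mem_limit: 512m

  filebrowser:
    image: filebrowser/filebrowser
    ports: ["8081:80"]        # container port 80 is assumed
    mem_limit: 256m
```

Summed, the limits come to 6.25 GB (2 GB + 1 GB + 512 MB + 1 GB + 4 × 512 MB... counted per service: 2048 + 1024 + 512 + 1024 + 512 + 512 + 512 + 256 = 6400 MB), which is the ceiling the containers can claim from the host.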

Infrastructure Philosophy

  • Local-first by default: All processing happens on your hardware. External APIs (when used) are optional integrations, not core dependencies.
  • Composable services: Each container is independent. Upgrade Ollama, replace n8n, add new services—without affecting others.
  • Resource-aware design: Every service has explicit memory limits. Proves you can run production workloads on modest hardware without waste.
  • Docker Compose as infrastructure code: One YAML file is the single source of truth. No manual config, no lost setup knowledge.
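
As an illustration of the composability point, one service can reach another by its Compose service name, so swapping or upgrading a container only touches its own entry. The fragment below is a minimal sketch under assumptions: the service names and images are illustrative, and OLLAMA_BASE_URL is the Open WebUI setting assumed here for pointing the UI at Ollama.

```yaml
# Sketch only: service names and images are assumptions, not the
# author's actual configuration.
services:
  ollama:
    image: ollama/ollama

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      # "ollama" resolves through the default Compose network,
      # so no host IP or published port is needed for this link.
      OLLAMA_BASE_URL: http://ollama:11434
    depends_on:
      - ollama
```

Replacing the ollama image tag, or pointing OLLAMA_BASE_URL at a different backend, leaves every other service definition unchanged.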

Key Design Decisions

  • Memory limits enforced per-service: Each service has strict mem_limit + memswap_limit to prevent resource starvation
  • Restart policies tuned: Critical services use restart: always; non-critical services use restart: unless-stopped
  • Named external volumes: Data persists across container restarts; enables safe updates without data loss
  • Docker Compose single source of truth: All configuration lives in one file; no manual steps, no lost setup knowledge
  • No cloud dependencies: All processing stays local; privacy by architecture, not policy
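
The first three decisions can be sketched for two services as shown below. Which services count as critical is not stated in the text, so the choice of Ollama and FileBrowser, the volume names, and the container data paths are all assumptions made for illustration.

```yaml
# Sketch only: volume names, mount paths, and the critical/non-critical
# split are ASSUMPTIONS.
services:
  ollama:
    image: ollama/ollama
    restart: always            # treated as critical in this sketch
    mem_limit: 2g
    memswap_limit: 2g          # equal to mem_limit, so the container gets no swap
    volumes:
      - ollama_data:/root/.ollama   # assumed model storage path

  filebrowser:
    image: filebrowser/filebrowser
    restart: unless-stopped    # treated as non-critical in this sketch
    mem_limit: 256m
    memswap_limit: 256m
    volumes:
      - projects:/srv          # assumed browse root

volumes:
  ollama_data:
    external: true             # Compose will not create this volume
  projects:
    external: true
```

Because the volumes are marked external, they have to exist before the stack starts, for example via docker volume create ollama_data and docker volume create projects, followed by docker compose up -d.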

Lessons Learned

  • Memory limits are hard constraints: Without them, a runaway process can exhaust host memory and get unrelated services OOM-killed. Setting memswap_limit prevents disk thrashing.
  • Named volumes are essential: Data written only inside an ephemeral container is lost when the container is recreated. External named volumes survive the container lifecycle.
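  • Restart policies need thought: always keeps restarting a container that crashes on a bad config, and it also brings back manually stopped containers when the Docker daemon restarts; unless-stopped respects manual stops.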
  • Port allocation matters: Avoid conflicts with host services. Document all port mappings in one place.

Tech Stack

Docker, Docker Compose, Ollama (Llama 3.2, Qwen 2.5-Coder), Open WebUI, n8n, ChromaDB, SearXNG, Gitea, FileBrowser

See Also

Local Personal AI System: See how this infrastructure is used to build a coherent, unified system with feedback loops and emergent intelligence. Read about the architecture →

DevOps + Infrastructure: This page showcases Docker orchestration, memory management, service coordination, and deployment patterns on constrained hardware. It's the foundation that enables higher-level systems thinking.

Built and deployed on a Mac mini. All services local; all data stays local unless explicitly shared.