Local Personal AI System
Architecture for privacy-first adaptation & autonomous improvement
A Flask + Discord bot system that orchestrates multiple services (Ollama, n8n, external APIs) around a unified SQLite schema. Demonstrates patterns for cross-domain data correlation, heterogeneous input normalization, agentic self-improvement with safety guardrails, and privacy-by-architecture (all processing local, swappable backends, optional cloud integrations).
The Problem
Personal productivity & context tools are fragmented: activity scattered across Slack, GitHub, Spotify; environmental data siloed in separate APIs; insights buried in unconnected data sources. Most AI systems require cloud lock-in and expose personal data. The goal: a single coherent system where data converges on one schema, patterns emerge from correlation, and the system can improve itself without vendor dependency.
System Architecture
User Input (Discord Bot)
↓
Flask API
├── route: /activity/log (normalize commits, messages, events)
├── route: /briefing (compose state → adaptive output)
├── route: /agent/request (queue self-improvement tasks)
└── route: /correlation (cross-domain pattern analysis)
↓
Processing Layer
├── Ollama (local LLM inference)
├── Data Access (SQLite row_factory)
└── External APIs (optional: Stormglass, PubMed, Spotify)
↓
Unified SQLite Schema
├── Event Tables (activity, observations, reflections, intentions)
├── Computed Tables (trends, forecasts, correlations)
└── Agent State (staging dir, task queue, deployment history)
↓
Composition Layer
├── n8n (orchestrate cron workflows)
├── Discord Bot (user interaction)
└── Dashboard (visualization)
The key insight: data convergence enables correlation without hardcoding rules. Instead of "if pressure drops, send alert," the system correlates pressure observations with other event types and detects patterns statistically. New data types integrate into the same schema; existing analysis applies automatically.
Key Design Patterns
Unified Schema for Cross-Domain Correlation
Multiple event types (commits, messages, calendar events, API observations) feed a single database. Computed columns (trends, baselines, correlations) emerge from statistical analysis, not explicit rules. Adding a new data source requires only a schema change; the existing analysis pipeline then applies to it automatically.
Example: Commits (0–10 scale), messages (0–50), calendar events (0–5) → unified intensity scores (1–10). The baseline is calculated from 14-day history; an alert triggers when the current value stays more than 20% above baseline for 3 or more consecutive days.
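A minimal sketch of what a unified event table could look like, using Python's built-in sqlite3 module with row_factory as in the architecture diagram. The table name `events`, its columns, and the helpers `open_db`, `log_event`, and `daily_intensity` are hypothetical illustrations, not the project's actual schema.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open a connection whose rows can be read by column name."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS events (
            id        INTEGER PRIMARY KEY,
            day       TEXT NOT NULL,   -- ISO date
            source    TEXT NOT NULL,   -- e.g. 'github', 'slack', 'calendar'
            metric    TEXT NOT NULL,   -- e.g. 'commits', 'messages'
            raw_value REAL NOT NULL,   -- value in the source's own units
            intensity REAL NOT NULL    -- normalized 1-10 score
        )
        """
    )
    return conn

def log_event(conn, day, source, metric, raw_value, intensity):
    """Store one observation; every source uses the same columns."""
    conn.execute(
        "INSERT INTO events (day, source, metric, raw_value, intensity) "
        "VALUES (?, ?, ?, ?, ?)",
        (day, source, metric, raw_value, intensity),
    )
    conn.commit()

def daily_intensity(conn, day):
    """Average intensity per source for one day, with no per-source code."""
    rows = conn.execute(
        "SELECT source, AVG(intensity) AS avg_intensity "
        "FROM events WHERE day = ? GROUP BY source ORDER BY source",
        (day,),
    ).fetchall()
    return {row["source"]: row["avg_intensity"] for row in rows}
```

Because the query groups by `source` instead of naming sources, a newly added source shows up in the result without any change to the analysis code.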
Heterogeneous Input Normalization
Different sources produce different scales. Normalization logic in the API ensures all inputs map to comparable units before storage. Enables apples-to-apples comparison across sources.
Example: Stormglass (pressure in hPa), Spotify (play count), GitHub (commits/day) → all normalized to 0–1 relative to baseline before correlation analysis.
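As an illustration of the step from normalized values to correlation, here is a small sketch in plain Python. It assumes min-max scaling as the 0–1 normalization and Pearson's r as the correlation measure; the text does not say which methods the system actually uses, and the function names are hypothetical.

```python
from math import sqrt

def scale_01(values):
    """Min-max scale a series to the 0-1 range (one possible normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series carries no variation; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # Correlation is undefined for a constant series; report none.
        return 0.0
    return cov / sqrt(vx * vy)

# Made-up sample data: daily pressure readings and commit counts.
pressure_hpa = [1012, 1008, 1001, 995, 1003]
commits = [6, 5, 3, 1, 4]
r = pearson(scale_01(pressure_hpa), scale_01(commits))
```

Pearson's r is itself unaffected by linear rescaling, so in this sketch the 0–1 step matters mainly for side-by-side display and for aggregates such as averages across sources.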
Context-Aware Composition
Same underlying data, different output based on system state. Flask routes decide output format (brief vs. detailed, confidence thresholds, alert frequency) based on current capacity/stress.
Example: High-stress state: briefing shows only top-3 alerts, skips recommendations. Low-stress: full analysis, forecasts, suggestions.
Agentic Loop with Safety Guardrails
Agent writes code → stages in isolated directory → human reviews diff → deploy triggers integration. Prevents production breakage; enables autonomous improvement.
Example: Agent decides system needs correlation analysis for new data type: writes SQL migration + API endpoint + test → stages → human reviews + approves → deploys → integrated into live system.
Privacy-First by Architecture
All processing happens locally. External APIs (Stormglass, PubMed, Spotify) are optional integrations; removing them doesn't break the system. Backends are swappable (Ollama → Claude API, SQLite → PostgreSQL, n8n → custom orchestrator).
Example: Stormglass integration is one optional route. If API goes down, system degrades gracefully; local analysis continues. Different user could swap Ollama for OpenAI without changing schema.
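One way to get this kind of graceful degradation is to treat each external integration as an optional callable and record failures instead of propagating them. The sketch below is an assumption about how that could be structured; `gather_context` and its parameters are hypothetical names.

```python
def gather_context(local_reader, optional_fetchers):
    """Combine local data with whatever optional integrations respond.

    local_reader: callable returning a dict of locally computed values.
    optional_fetchers: dict of name -> callable; each may raise on failure.
    """
    context = dict(local_reader())
    unavailable = []
    for name, fetch in optional_fetchers.items():
        try:
            context[name] = fetch()
        except Exception:
            # Network error, timeout, bad payload: skip this integration
            # and keep going with the data that is available.
            unavailable.append(name)
    context["unavailable"] = unavailable
    return context
```

The local reader is called outside the try block on purpose: a failure in local data is a real error, while a failure in an optional integration only shortens the briefing.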
Feedback Loops (What Makes It Smart)
Environmental Context Loop
External APIs → computed table (pressure, air quality, geomagnetic) → briefing logic reads computed values → user sees contextualized alerts
Trend Detection Loop
Event logged → baseline calculated from 14-day history → current vs. baseline → alert if > threshold for 3+ days → user acknowledges alert → trend marked resolved
Outcome Tracking Loop
User sets intention → system provides context → user logs reflection + outcome → system learns what actually happened vs. predicted → briefing accuracy improves
Self-Improvement Loop (Cipher Agent)
System detects missing feature → agent proposes code → stages in isolation → human reviews → deploy on approval → agent integrates into own codebase → system gains capability
Case Studies
Intensity Normalization & Comparative Analysis
Activity sources emit different scales: commits (0–10), messages (0–50), calendar events (0–5). Can't compare directly.
Each source has normalization thresholds (e.g., 3 commits = low, 10+ = high). API endpoint /activity/log accepts source + metric + value, computes normalized 1–10 score. All stored as "intensity" in unified table.
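A rough sketch of the scoring step that a handler for /activity/log might call. It assumes a simple linear mapping with a per-metric ceiling taken from the scales quoted above; the real system uses per-source thresholds whose exact values are not given, and the names `SCALE_MAX` and `intensity` are hypothetical.

```python
# Per-metric ceilings, taken from the scales quoted in the text.
SCALE_MAX = {"commits": 10, "messages": 50, "calendar_events": 5}

def intensity(metric, value):
    """Map a raw value to a 1-10 intensity score (linear, clamped)."""
    if metric not in SCALE_MAX:
        raise ValueError(f"unknown metric: {metric}")
    # Fraction of the ceiling, clamped so outliers cannot exceed 10.
    fraction = min(max(value / SCALE_MAX[metric], 0.0), 1.0)
    return round(1 + 9 * fraction)
```

Under this mapping, 3 commits land near the low end of the scale and anything at or above 10 commits scores 10, which matches the low/high examples in spirit. Swapping in tuned thresholds would only change the body of `intensity`.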
Dashboard shows daily activity across sources in one view. User can see "Tuesday was 7/10 intensity (mostly commits), Wednesday was 3/10 (light Slack)"—apples-to-apples comparison.
Skills demonstrated: Data normalization, threshold tuning, aggregation logic
Trend Detection from Unified Schema
Multiple event types feed the database. How do you detect when any type is trending up/down without hardcoding each?
A computed table stores the baseline (14-day average, excluding the most recent 5 days), the current value, the % change, and the direction (improving/stable/worsening). Alert logic: flag when the value is more than 20% above baseline for 3+ consecutive days. Works for any numeric column.
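The baseline and alert rule can be sketched as a single function over a list of daily values. This reads "14-day average excluding recent 5" as the 14 days that precede the most recent 5, which is one possible interpretation. The function name, parameters, and the neutral "up"/"stable"/"down" labels are assumptions; mapping them to improving or worsening depends on whether higher is better for a given metric.

```python
def trend(values, baseline_days=14, exclude_recent=5,
          threshold=0.20, min_streak=3):
    """Summarize a daily series (oldest first) against its baseline."""
    if exclude_recent < min_streak:
        raise ValueError("exclude_recent must be >= min_streak")
    if len(values) < baseline_days + exclude_recent:
        raise ValueError("not enough history")

    # Baseline: the days just before the recent window.
    window = values[-(baseline_days + exclude_recent):-exclude_recent]
    baseline = sum(window) / len(window)
    current = values[-1]
    pct_change = (current - baseline) / baseline if baseline else 0.0

    # Count trailing consecutive days above baseline + threshold.
    limit = baseline * (1 + threshold)
    streak = 0
    for v in reversed(values[-exclude_recent:]):
        if v <= limit:
            break
        streak += 1

    if pct_change > threshold:
        direction = "up"
    elif pct_change < -threshold:
        direction = "down"
    else:
        direction = "stable"

    return {"baseline": baseline, "current": current,
            "pct_change": pct_change, "direction": direction,
            "alert": streak >= min_streak}
```

Nothing in the function refers to a particular source, so the same call works for any numeric column that has enough history.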
Adding a new data source (e.g., an external API) takes a schema change; the compute pipeline then applies automatically. No new hardcoded rules.
Skills demonstrated: Statistical baselines, generic alerting patterns, schema extensibility
Agentic Self-Improvement with Safety
Want autonomous agents to write code, but can't let them ship directly to production.
Agent writes code → stages in data/pending/{task_id}/ → human reviews staged diff → /deploy endpoint copies staged files → git commit → service restart. Staging dir is isolated; nothing hits real files until approval.
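The staging and copy steps might look roughly like the sketch below, which uses only the standard library. The function names, the `approved` flag, and the directory arguments are hypothetical; the git commit and service restart that follow the copy are left out.

```python
import shutil
from pathlib import Path

def stage_file(pending_root, task_id, relative_path, content):
    """Write agent output under <pending_root>/<task_id>/, never to live files."""
    target = Path(pending_root) / task_id / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return target

def deploy(pending_root, task_id, live_root, approved):
    """Copy a task's staged files into the live tree after human approval."""
    if not approved:
        raise PermissionError("task has not been approved by a human reviewer")
    staged = Path(pending_root) / task_id
    if not staged.is_dir():
        raise FileNotFoundError(f"no staged files for task {task_id}")
    copied = []
    for src in sorted(p for p in staged.rglob("*") if p.is_file()):
        rel = src.relative_to(staged)
        dest = Path(live_root) / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(rel.as_posix())
    return copied
```

Returning the list of copied paths gives the caller something concrete to put in a commit message or deployment history entry.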
Agent can write features (API routes, database migrations, Discord commands) safely. Staging architecture enables rollback; git history tracks all deployments.
Skills demonstrated: Staging architectures, human-in-the-loop validation, safe deployment patterns
Adaptive Briefing Based on System State
Same data, but users need different detail levels depending on capacity/stress.
Briefing route reads system state (pacing score, recent events) and adjusts output: high stress → top-3 alerts only; low stress → full analysis + forecasts + recommendations.
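A minimal sketch of the state-driven selection, assuming a numeric pacing score where higher means more stress and alerts that carry a `severity` field. The threshold value, the field names, and `compose_briefing` itself are hypothetical.

```python
def compose_briefing(alerts, forecasts, recommendations,
                     pacing_score, stress_threshold=7):
    """Choose briefing detail from the current pacing score."""
    ranked = sorted(alerts, key=lambda a: a["severity"], reverse=True)
    if pacing_score >= stress_threshold:
        # High stress: only the three most severe alerts, nothing else.
        return {"mode": "brief", "alerts": ranked[:3]}
    # Low stress: everything, including forecasts and suggestions.
    return {
        "mode": "full",
        "alerts": ranked,
        "forecasts": forecasts,
        "recommendations": recommendations,
    }
```

The underlying data passed in is identical in both cases; only the selection applied at composition time differs.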
The briefing stays useful under all conditions. The adjustment lives in one place, the briefing route, so individual alerts and reports need no "if stressed, show less" rules of their own.
Skills demonstrated: State-driven logic, contextual presentation, adaptive UX
Tech Stack
Why Local-First
Privacy by Design
Journaling, health tracking, mood logs, calendar data—sensitive personal information stays on your hardware. No cloud sync, no terms of service, no data harvesting. You own your data completely.
Customization Over Convention
Generic productivity tools overwhelm and go unused. A system built exactly for how you think, with UI you styled and workflows you designed, actually gets used. Custom tools feel like extensions of yourself.
Better AI Context
Local models let you control context deeply. Instead of a general-purpose LLM answering from generic training data, the model is given your specific patterns, your language, your priorities. The system adapts to you, not the other way around.
Environmental Responsibility
Hyperscale data centers consume enormous energy and water. Running complex, personalized systems locally on modest hardware means less environmental impact. Miniaturization instead of hyperscaling.
Integration Without Fragmentation
Instead of juggling 10 different apps with different UIs and data silos, one coherent system where everything converges on a single schema you control.
Key Insights
- Unified schema enables emergent patterns. You don't hardcode "if A and B, then alert." Instead, you store observations, compute baselines, and let statistical correlations appear.
- Normalization is the foundation. Heterogeneous inputs (commits, messages, API readings) need common units before analysis. Invest in mapping functions early.
- Staging architecture enables safe autonomy. Agents (or junior devs) can propose code safely when it lands in an isolated directory first. Human review + approval before production.
- Privacy-first beats privacy-later. If you design for local-first + swappable backends from day one, you're not retrofitting later. Optional cloud integrations stay optional.
- Feedback loops make systems smart. Not the LLM prompt, but the data flow. Intent → context → outcome → learning loop. That's what enables adaptation.
This is a personal project—a testbed for system design patterns, agentic loops, and privacy-first architecture. The architectural principles apply to any domain: activity tracking, health monitoring, research tools, personal knowledge management, autonomous systems that improve themselves safely.
Built and deployed on a Mac mini. All code local; all data stays local unless explicitly shared.