Local Personal AI System

Architecture for privacy-first adaptation & autonomous improvement

A Flask + Discord bot system that orchestrates multiple services (Ollama, n8n, external APIs) around a unified SQLite schema. Demonstrates patterns for cross-domain data correlation, heterogeneous input normalization, agentic self-improvement with safety guardrails, and privacy-by-architecture (all processing local, swappable backends, optional cloud integrations).

The Problem

Personal productivity & context tools are fragmented: activity scattered across Slack, GitHub, Spotify; environmental data siloed in separate APIs; insights buried in unconnected data sources. Most AI systems require cloud lock-in and expose personal data. The goal: a single coherent system where data converges on one schema, patterns emerge from correlation, and the system can improve itself without vendor dependency.

System Architecture

User Input (Discord Bot)
        ↓
    Flask API
    ├── route: /activity/log     (normalize commits, messages, events)
    ├── route: /briefing         (compose state → adaptive output)
    ├── route: /agent/request    (queue self-improvement tasks)
    └── route: /correlation      (cross-domain pattern analysis)
        ↓
    Processing Layer
    ├── Ollama (local LLM inference)
    ├── Data Access (SQLite row_factory)
    └── External APIs (optional: Stormglass, PubMed, Spotify)
        ↓
    Unified SQLite Schema
    ├── Event Tables (activity, observations, reflections, intentions)
    ├── Computed Tables (trends, forecasts, correlations)
    └── Agent State (staging dir, task queue, deployment history)
        ↓
    Composition Layer
    ├── n8n (orchestrate cron workflows)
    ├── Discord Bot (user interaction)
    └── Dashboard (visualization)

The key insight: data convergence enables correlation without hardcoded rules. Instead of "if pressure drops, send alert," the system correlates pressure observations with other event types and detects patterns statistically. New data types integrate into the same schema; existing analysis applies automatically.
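A minimal sketch of what "detects patterns statistically" could look like: a Pearson correlation between two daily series read from the unified schema. The source does not say which statistic the system uses, so the choice of Pearson, the function name, and the sample values are all assumptions.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


# Hypothetical daily values: pressure (hPa) and activity intensity (1-10).
pressure = [1015.0, 1012.0, 1008.0, 1004.0, 1001.0, 1006.0, 1011.0]
intensity = [7.0, 6.0, 5.0, 3.0, 2.0, 4.0, 6.0]

r = pearson(pressure, intensity)
```

Because both series live in the same schema, the same function can be pointed at any pair of numeric columns without a rule written for that pair.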

Key Design Patterns

Unified Schema for Cross-Domain Correlation

Multiple event types (commits, messages, calendar events, API observations) feed a single database. Computed columns (trends, baselines, correlations) emerge from statistical analysis, not explicit rules. Adding a new data source requires only a schema change; the analysis pipeline then applies automatically.

Example: Commits (0–10 scale), messages (0–50), calendar events (0–5) → unified intensity scores (1–10). Baseline calculated from 14-day history; alerts trigger when current > baseline + 20% for 3+ days.
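One way the unified table could be laid out, as a sketch rather than the project's actual schema. The table name, column names, and helper functions are hypothetical; only the 1–10 intensity range and the use of sqlite3.Row come from the text.

```python
import sqlite3

# Hypothetical unified event table; column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id        INTEGER PRIMARY KEY,
    logged_on TEXT    NOT NULL,  -- ISO date, e.g. 2024-05-01
    source    TEXT    NOT NULL,  -- e.g. 'github', 'slack', 'calendar'
    metric    TEXT    NOT NULL,  -- e.g. 'commits', 'messages'
    raw_value REAL    NOT NULL,  -- value on the source's own scale
    intensity INTEGER NOT NULL CHECK (intensity BETWEEN 1 AND 10)
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and make rows addressable by column name."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.executescript(SCHEMA)
    return conn


def log_event(conn, logged_on, source, metric, raw_value, intensity):
    """Store one event from any source in the shared table."""
    conn.execute(
        "INSERT INTO events (logged_on, source, metric, raw_value, intensity) "
        "VALUES (?, ?, ?, ?, ?)",
        (logged_on, source, metric, raw_value, intensity),
    )
    conn.commit()


def daily_intensity(conn):
    """Average intensity per day across all sources."""
    rows = conn.execute(
        "SELECT logged_on, AVG(intensity) AS avg_intensity, "
        "COUNT(DISTINCT source) AS sources "
        "FROM events GROUP BY logged_on ORDER BY logged_on"
    ).fetchall()
    return [dict(row) for row in rows]
```

Every source writes to the same columns, so the aggregate query needs no per-source branch.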

Heterogeneous Input Normalization

Different sources produce different scales. Normalization logic in the API maps all inputs to comparable units before storage, which enables apples-to-apples comparison across sources.

Example: Stormglass (pressure in hPa), Spotify (play count), GitHub (commits/day) → all normalized to 0–1 relative to baseline before correlation analysis.
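The text does not spell out how "0–1 relative to baseline" is computed. One plausible reading, shown here purely as an assumption, is min-max scaling against each source's own recent history, with readings outside that window clamped. The function name and the 0.5 fallback for a flat history are illustrative choices.

```python
from typing import Sequence


def scale_to_unit(value: float, history: Sequence[float]) -> float:
    """Min-max scale `value` into [0, 1] against the source's own history."""
    if not history:
        raise ValueError("history must not be empty")
    low, high = min(history), max(history)
    if high == low:
        return 0.5  # flat history: no spread to scale against
    scaled = (value - low) / (high - low)
    return max(0.0, min(1.0, scaled))  # clamp readings outside the window
```

After scaling, a pressure reading in hPa and a play count are both unitless numbers between 0 and 1, which is what makes them comparable in a correlation.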

Context-Aware Composition

Same underlying data, different output based on system state. Flask routes decide output format (brief vs. detailed, confidence thresholds, alert frequency) based on current capacity/stress.

Example: High-stress state: briefing shows only top-3 alerts, skips recommendations. Low-stress: full analysis, forecasts, suggestions.

Agentic Loop with Safety Guardrails

Agent writes code → stages in isolated directory → human reviews diff → deploy triggers integration. Prevents production breakage; enables autonomous improvement.

Example: Agent decides system needs correlation analysis for new data type: writes SQL migration + API endpoint + test → stages → human reviews + approves → deploys → integrated into live system.

Privacy-First by Architecture

All processing happens locally. External APIs (Stormglass, PubMed, Spotify) are optional integrations; removing them doesn't break the system. Backends are swappable (Ollama → Claude API, SQLite → PostgreSQL, n8n → custom orchestrator).

Example: Stormglass integration is one optional route. If API goes down, system degrades gracefully; local analysis continues. Different user could swap Ollama for OpenAI without changing schema.
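A sketch of how swappable backends and optional integrations might be wired, under stated assumptions: TextBackend, EchoBackend, fetch_optional, build_briefing, and the pressure_hpa key are hypothetical names, and EchoBackend only stands in for a real call to Ollama or another model.

```python
from typing import Callable, Optional, Protocol


class TextBackend(Protocol):
    """Anything with a generate() method can serve as the LLM backend."""

    def generate(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in for a local model; a real one would call Ollama here."""

    def generate(self, prompt: str) -> str:
        return f"[local] {prompt}"


def fetch_optional(fetch: Callable[[], dict]) -> Optional[dict]:
    """Run an optional integration; return None instead of raising."""
    try:
        return fetch()
    except Exception:
        return None


def build_briefing(backend: TextBackend, local_summary: str,
                   weather: Optional[dict]) -> str:
    """Compose from local data; add external context only if present."""
    parts = [local_summary]
    if weather is not None:
        parts.append(f"pressure {weather['pressure_hpa']} hPa")
    return backend.generate("; ".join(parts))


def broken_weather_api() -> dict:
    """Simulates an external API outage."""
    raise ConnectionError("API unreachable")
```

The briefing code depends only on the generate() method, so swapping the model means passing a different object, and a failed external fetch removes one sentence instead of the whole response.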

Feedback Loops (What Makes It Smart)

Environmental Context Loop

External APIs → computed table (pressure, air quality, geomagnetic) → briefing logic reads computed values → user sees contextualized alerts

Trend Detection Loop

Event logged → baseline calculated from 14-day history → current vs. baseline → alert if > threshold for 3+ days → user acknowledges alert → trend marked resolved

Outcome Tracking Loop

User sets intention → system provides context → user logs reflection + outcome → system learns what actually happened vs. predicted → briefing accuracy improves

Self-Improvement Loop (Cipher Agent)

System detects missing feature → agent proposes code → stages in isolation → human reviews → deploy on approval → agent integrates into own codebase → system gains capability

Case Studies

Intensity Normalization & Comparative Analysis

PROBLEM

Activity sources emit different scales: commits (0–10), messages (0–50), calendar events (0–5). Can't compare directly.

SOLUTION

Each source has normalization thresholds (e.g., 3 commits = low, 10+ = high). API endpoint /activity/log accepts source + metric + value, computes normalized 1–10 score. All stored as "intensity" in unified table.
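The scoring step could be sketched as below. The per-source ceilings come from the scales quoted in the problem statement; the linear mapping, the half-up rounding, and the names SOURCE_MAX and normalize_intensity are assumptions, since the text only says that thresholds exist. The Flask route itself is left out to keep the example dependency-free.

```python
from math import floor

# Upper end of each source's scale, taken from the ranges quoted above.
SOURCE_MAX = {"commits": 10, "messages": 50, "calendar_events": 5}


def normalize_intensity(metric: str, value: float) -> int:
    """Map a raw value onto the shared 1-10 intensity scale."""
    if metric not in SOURCE_MAX:
        raise KeyError(f"unknown metric: {metric}")
    if value < 0:
        raise ValueError("value must be non-negative")
    ceiling = SOURCE_MAX[metric]
    fraction = min(value, ceiling) / ceiling  # values above the ceiling saturate
    return 1 + floor(fraction * 9 + 0.5)      # half-up rounding onto 1..10
```

A handler for /activity/log would call this with the posted metric and value, then store the result in the intensity column.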

RESULT

Dashboard shows daily activity across sources in one view. User can see "Tuesday was 7/10 intensity (mostly commits), Wednesday was 3/10 (light Slack)"—apples-to-apples comparison.

Skills demonstrated: Data normalization, threshold tuning, aggregation logic

Trend Detection from Unified Schema

PROBLEM

Multiple event types feed the database. How do you detect when any type is trending up/down without hardcoding each?

SOLUTION

A computed table stores the baseline (14-day average excluding the most recent 5 days), the current value, the % change, and a direction (improving/stable/worsening). Alert logic: if the current value stays above baseline + 20% for 3+ consecutive days, flag it. This works for any numeric column.
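The baseline and alert rule can be sketched generically. Two assumptions are made here: the baseline window is read as the 14 days that precede the most recent 5, and "stable" is taken to mean a change within the same 20% band. The function name and the higher_is_better flag are hypothetical.

```python
from statistics import mean
from typing import Sequence

BASELINE_DAYS = 14   # length of the baseline window
RECENT_DAYS = 5      # most recent days kept out of the baseline
THRESHOLD = 0.20     # 20% above baseline
STREAK_DAYS = 3      # consecutive days required before alerting


def detect_trend(daily: Sequence[float], higher_is_better: bool = True) -> dict:
    """Evaluate one numeric column; `daily` holds one value per day, oldest first."""
    if len(daily) < BASELINE_DAYS + RECENT_DAYS:
        raise ValueError("need at least 19 daily values")
    baseline = mean(daily[-(BASELINE_DAYS + RECENT_DAYS):-RECENT_DAYS])
    current = daily[-1]
    pct_change = (current - baseline) / baseline * 100 if baseline else 0.0
    limit = baseline * (1 + THRESHOLD)
    alert = all(v > limit for v in daily[-STREAK_DAYS:])
    if abs(pct_change) <= THRESHOLD * 100:
        direction = "stable"
    elif (pct_change > 0) == higher_is_better:
        direction = "improving"
    else:
        direction = "worsening"
    return {"baseline": baseline, "current": current,
            "pct_change": pct_change, "direction": direction, "alert": alert}
```

The function knows nothing about what the column means, which is why a new data source gets trend detection without new rules.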

RESULT

Adding a new data source (e.g., an external API) takes a schema change; the compute pipeline then applies automatically. No new hardcoded rules.

Skills demonstrated: Statistical baselines, generic alerting patterns, schema extensibility

Agentic Self-Improvement with Safety

PROBLEM

Want autonomous agents to write code, but can't let them ship directly to production.

SOLUTION

Agent writes code → stages in data/pending/{task_id}/ → human reviews staged diff → /deploy endpoint copies staged files → git commit → service restart. Staging dir is isolated; nothing hits real files until approval.
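A minimal sketch of the stage, review, deploy sequence, assuming plain file copies. The function names are hypothetical, and the git commit and service restart steps are omitted; only the data/pending/{task_id}/ layout and the approval gate come from the text.

```python
import difflib
import shutil
from pathlib import Path


def _staged_files(task_dir: Path):
    """All files staged for one task, in a stable order."""
    return sorted(p for p in task_dir.rglob("*") if p.is_file())


def stage_file(pending_root: Path, task_id: str, rel_path: str, content: str) -> Path:
    """Write agent output under pending_root/task_id; live files are untouched."""
    target = pending_root / task_id / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return target


def staged_diff(project_root: Path, pending_root: Path, task_id: str) -> str:
    """Unified diff of staged files against their live counterparts."""
    task_dir = pending_root / task_id
    chunks = []
    for staged in _staged_files(task_dir):
        rel = staged.relative_to(task_dir).as_posix()
        live = project_root / rel
        old = live.read_text().splitlines(keepends=True) if live.exists() else []
        new = staged.read_text().splitlines(keepends=True)
        chunks.extend(difflib.unified_diff(old, new, f"live/{rel}", f"staged/{rel}"))
    return "".join(chunks)


def deploy(project_root: Path, pending_root: Path, task_id: str, approved: bool) -> list:
    """Copy staged files into the project, but only after human approval."""
    if not approved:
        raise PermissionError("human approval required before deploy")
    task_dir = pending_root / task_id
    deployed = []
    for staged in _staged_files(task_dir):
        rel = staged.relative_to(task_dir).as_posix()
        dest = project_root / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(staged, dest)
        deployed.append(rel)
    return deployed
```

The approval flag is the only path from the staging directory to live files, so an agent that can call stage_file but not deploy cannot change the running system.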

RESULT

Agent can write features (API routes, database migrations, Discord commands) safely. Staging architecture enables rollback; git history tracks all deployments.

Skills demonstrated: Staging architectures, human-in-the-loop validation, safe deployment patterns

Adaptive Briefing Based on System State

PROBLEM

Same data, but users need different detail levels depending on capacity/stress.

SOLUTION

Briefing route reads system state (pacing score, recent events) and adjusts output: high stress → top-3 alerts only; low stress → full analysis + forecasts + recommendations.
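The state-driven branch might look like the sketch below. The pacing cutoff of 7, the severity field on alerts, and the function name are assumptions; the text only specifies "top-3 alerts" under high stress and the full output otherwise.

```python
from typing import Dict, List

HIGH_STRESS_PACING = 7  # hypothetical cutoff on a 1-10 pacing score


def compose_briefing(pacing_score: int, alerts: List[Dict],
                     forecasts: List[str], recommendations: List[str]) -> Dict:
    """Pick the briefing format from the current system state."""
    ranked = sorted(alerts, key=lambda a: a["severity"], reverse=True)
    if pacing_score >= HIGH_STRESS_PACING:
        # High stress: only the three most severe alerts, nothing else.
        return {"mode": "brief", "alerts": ranked[:3]}
    # Low stress: everything the system has computed.
    return {
        "mode": "full",
        "alerts": ranked,
        "forecasts": forecasts,
        "recommendations": recommendations,
    }
```

Both branches read the same stored data; only the selection differs.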

RESULT

The briefing stays useful under all conditions. The detail level follows from the stored system state, so the user never has to ask for a shorter version.

Skills demonstrated: State-driven logic, contextual presentation, adaptive UX

Tech Stack

▸ Flask
▸ Discord.py
▸ Python 3.9
▸ SQLite
▸ sqlite3.Row
▸ Ollama: Llama 3.2 (1b), Qwen 2.5-Coder (7b), Nomic Embed Text
▸ n8n (orchestration)
▸ Kerykeion (astrology)
▸ Claude API (Cipher planning)
▸ Gemini (bug review)
▸ PubMed API
▸ Spotify API
▸ Stormglass API

Why Local-First

Privacy by Design

Journaling, health tracking, mood logs, calendar data—sensitive personal information stays on your hardware. No cloud sync, no terms of service, no data harvesting. You own your data completely.

Customization Over Convention

Generic productivity tools overwhelm and go unused. A system built exactly for how you think, with UI you styled and workflows you designed, actually gets used. Custom tools feel like extensions of yourself.

Better AI Context

Local models let you control context deeply. Instead of a general-purpose LLM guessing from generic training data, the model is given your specific patterns, your language, your priorities. The system adapts to you, not the other way around.

Environmental Responsibility

Hyperscale data centers consume enormous energy and water. Running complex, personalized systems locally on modest hardware means less environmental impact. Miniaturization instead of hyperscaling.

Integration Without Fragmentation

Instead of juggling 10 different apps with different UIs and data silos, one coherent system where everything converges on a single schema you control.

Key Insights

▸ Unified schema enables emergent patterns. You don't hardcode "if A and B, then alert." Instead, you store observations, compute baselines, and let statistical correlations appear.
▸ Normalization is the foundation. Heterogeneous inputs (commits, messages, API readings) need common units before analysis. Invest in mapping functions early.
▸ Staging architecture enables safe autonomy. Agents (or junior devs) can propose code safely when it lands in an isolated directory first. Human review + approval before production.
▸ Privacy-first beats privacy-later. If you design for local-first + swappable backends from day one, you're not retrofitting later. Optional cloud integrations stay optional.
▸ Feedback loops make systems smart. What matters is the data flow, not the LLM prompt: intent → context → outcome → learning. That loop is what enables adaptation.

This is a personal project—a testbed for system design patterns, agentic loops, and privacy-first architecture. The architectural principles apply to any domain: activity tracking, health monitoring, research tools, personal knowledge management, autonomous systems that improve themselves safely.

Built and deployed on a Mac mini. All code local; all data stays local unless explicitly shared.