Eric AI Infrastructure — System Map

Autonomous Coding
Pipeline

A fully autonomous AI coding system that takes GitHub issues, writes code, opens PRs, and self-repairs — with persistent memory across every session.

3 models
Qwen 7b / 14b / 32b
8 workers
Memory Cron Jobs
3 phases
Fine-tune Pipeline
loop
Self-Repair
Orchestrator :8766 running
Laptop Worker :8765 running
Memory DB :5434 running
8/8 cron workers active
Tailscale VPN connected

Job Pipeline Flow

end-to-end autonomous job lifecycle
👤
Submit
POST /submit
POST /submit to Orchestrator
with repo + task details
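A submission might be shaped like the sketch below. The field names are assumptions for illustration, not taken from orchestrator/main.py:

```python
# Hypothetical payload for POST /submit. Field names ("repo", "task",
# "scope") are assumptions, not the orchestrator's actual schema.
import json

def build_job_payload(repo: str, task: str, scope: str = "medium") -> str:
    """Serialize a job submission for the orchestrator."""
    payload = {
        "repo": repo,    # e.g. "eric/eric-erica-dating"
        "task": task,    # natural-language task description
        "scope": scope,  # small | medium | large, drives model routing
    }
    return json.dumps(payload)

body = build_job_payload("eric/eric-erica-dating", "Fix profile photo upload")
```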
⚙️
Orchestrator
Hetzner :8766
FastAPI orchestrator on server
manages jobs, state, dispatch
🗺️
Pre-Plan
POST /plan
Laptop generates a scoped plan:
target_files, estimated_scope
🤖
Validate
Copilot CLI
Copilot (gpt-4.1) validates scope and
target_files, rejecting scope creep
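The scope check at this step reduces to a simple set comparison: every file the coder wants to touch must appear in the plan's target_files. A minimal sketch (function name and shape are assumptions, not the actual Copilot CLI integration):

```python
# Sketch of the scope-creep check: any touched file outside the approved
# plan's target_files is flagged. Stand-in code, not the real validator.
def validate_scope(target_files: list[str], touched_files: list[str]) -> list[str]:
    """Return files outside the approved plan; an empty list means OK."""
    allowed = set(target_files)
    return [f for f in touched_files if f not in allowed]

creep = validate_scope(["app/models.py"], ["app/models.py", "app/admin.py"])
```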
🦾
Qwen Coder
Laptop :8765
Qwen writes PATCH blocks
using full file context from plan.
Destruction detection active.
🐙
Open PR
GitHub
Push branch, open PR;
security scan runs before acceptance
🔍
Review
Copilot + Human
Copilot auto-reviews PR diff.
Changes requested → repair loop
Approved
Done → Memory
Approved PR saved as
training example + project memory
Repair Loop — up to 3 attempts when changes requested
Security scan — secrets + forbidden paths blocked
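The security scan described above can be sketched as two checks: pattern-match the diff for secrets and reject writes to forbidden paths. The patterns and path list here are illustrative assumptions, not the orchestrator's actual rule set:

```python
import re

# Illustrative pre-merge security scan. Patterns and forbidden paths are
# examples only, not the pipeline's real configuration.
SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),  # GitHub personal access token
    re.compile(r"AKIA[0-9A-Z]{16}"),     # AWS access key id
]
FORBIDDEN_PATHS = (".github/workflows/", ".env", "secrets/")

def scan_diff(path: str, diff_text: str) -> list[str]:
    """Return a list of problems; an empty list means the diff passes."""
    problems = []
    if path.startswith(FORBIDDEN_PATHS):
        problems.append(f"forbidden path: {path}")
    for pat in SECRET_PATTERNS:
        if pat.search(diff_text):
            problems.append(f"secret matched {pat.pattern} in {path}")
    return problems
```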

Core Components

⚙️
Orchestrator
orchestrator/main.py — :8766
FastAPI server on Hetzner. Receives job submissions, dispatches to laptop, polls GitHub for PR reviews, runs repair loops, tracks all state in PostgreSQL.
FastAPI PostgreSQL plan validation repair loop ×3 security scan webhook handler
ai-orchestrator.service running
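The repair loop the orchestrator runs can be sketched as a bounded retry: re-dispatch the job with the reviewer's feedback until approval or three repair attempts are exhausted. The callables below are stand-ins, not the real dispatch or review code:

```python
# Minimal sketch of the repair loop: initial attempt plus up to 3 repairs.
# dispatch(feedback) returns a diff; review(diff) returns feedback text,
# or None when the PR is approved. Both are stand-ins.
MAX_REPAIRS = 3

def run_with_repairs(dispatch, review) -> tuple[str, int]:
    feedback = None
    for attempt in range(1 + MAX_REPAIRS):  # initial try + 3 repairs
        diff = dispatch(feedback)
        feedback = review(diff)
        if feedback is None:
            return "approved", attempt
    return "failed", MAX_REPAIRS
```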
🦾
Laptop Worker (Qwen Executor)
laptop/worker.py — :8765
FastAPI worker on the GPU laptop. Clones repos, runs Qwen via Ollama, applies PATCH-format diffs with destruction detection, pushes branches and opens PRs.
Qwen 7b/14b/32b Ollama PATCH format destruction check circuit breaker DPO collection
ai-worker.service running
🗺️
LLM Runner + Planner
laptop/llm_runner.py
Two-mode execution: if an approved_plan is passed, runs _run_coder_only() with full file context. Otherwise runs _run_with_debate(), which loops planner → critic → coder.
plan-driven context few-shot examples planner / critic / coder target_files scope
✍️
Code Writer
laptop/code_writer.py
Applies Qwen's output to disk. Parses PATCH (search/replace) blocks and FILE blocks. Enforces allowed_files scope. Blocks writes that destroy >70% of original file lines.
PATCH blocks FILE blocks destruction detection scope enforcement
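The two guards above can be illustrated with a toy search/replace application and a line-survival destruction check. The block syntax and the exact destruction metric are assumptions about what code_writer.py does:

```python
# Illustrative PATCH (search/replace) application plus the >70% destruction
# guard. Both are sketches of the behavior described above, not the real
# code_writer.py logic.
def apply_patch(original: str, search: str, replace: str) -> str:
    if search not in original:
        raise ValueError("search block not found in file")
    return original.replace(search, replace, 1)

def is_destructive(original: str, new: str, threshold: float = 0.7) -> bool:
    """True if the write removes more than `threshold` of original lines."""
    old_lines = original.splitlines()
    if not old_lines:
        return False
    new_lines = set(new.splitlines())
    kept = sum(1 for line in old_lines if line in new_lines)
    return (len(old_lines) - kept) / len(old_lines) > threshold
```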

Memory System v2

🧠
Claude Memory — PostgreSQL + pgvector
supabase-memory/ — claude-memory-db :5434 — Tailscale 100.82.65.23
Persistent memory across all Claude sessions. Stores project state, feedback, incidents, training examples. Embeddings via Ollama nomic-embed-text for semantic search. Accessible from both server and laptop via Tailscale.
pgvector embeddings MCP server semantic search review_status gating worker_health tracking
memories · kind, project, subject, body, embedding, importance
training_examples · repo, task_id, prompt, approved_diff, DPO pairs
worker_health · worker_name, last_status, run_duration_ms, updated_at
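Semantic search over these rows works by ranking stored embeddings against a query embedding; pgvector does this in SQL, but the core operation is cosine similarity. A toy in-memory version (real embeddings come from Ollama's nomic-embed-text; these vectors are made up):

```python
import math

# Toy version of the semantic search pgvector performs over the memories
# table: rank stored embeddings by cosine similarity to a query embedding.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query: list[float], memories: dict[str, list[float]], k: int = 3):
    """Return the k memory keys most similar to the query embedding."""
    ranked = sorted(memories, key=lambda m: cosine(query, memories[m]), reverse=True)
    return ranked[:k]
```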
8 Cron Workers
supabase-memory/*.py
pipeline_memory_worker
*/15 * * * *
deployment_memory_worker
*/10 * * * *
health_memory_worker
*/5 * * * *
embedding_worker
*/30 * * * *
git_memory_worker
0 * * * *
memory_consolidation
0 2 * * *
decay_worker
0 3 * * *
finetune_worker
0 4 * * 0
🔌
Memory MCP Server
memory_mcp_server.py
Exposes memory operations as MCP tools to Claude Code. Available in every session via .claude.json config.
search_memory save_memory get_project_state recall_memories update_memory log_action
Available via claude-memory MCP
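MCP tools are invoked over JSON-RPC 2.0 with a `tools/call` request naming the tool and its arguments. The request shape below follows the MCP spec; the argument names for search_memory are assumptions about this particular server:

```python
import json

# Shape of an MCP tools/call request as a client like Claude Code would
# issue it for the search_memory tool. The "arguments" keys are assumed,
# not taken from memory_mcp_server.py.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_memory",
        "arguments": {"query": "worker health incidents", "limit": 5},
    },
}
wire = json.dumps(request)
```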
🎓
Fine-tune Pipeline
3 phases — finetune_worker.py
1
Prompt Learning — RAG on training_examples, few-shot examples injected at inference
2
DPO Pairs — repair loop saves (prompt, good_output) pairs for preference learning
3
QLoRA Weekly — exports JSONL, triggers GPU fine-tune on laptop every Sunday 04:00
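Phase 2's DPO pairs are typically exported one JSON record per line, pairing a prompt with a chosen (approved) and rejected output. A sketch of that export; the record fields are assumptions about finetune_worker.py's format:

```python
import json

# Sketch of the weekly JSONL export for DPO fine-tuning: each repair-loop
# rejection yields a (prompt, chosen, rejected) record. Field names are
# assumed, not finetune_worker.py's actual schema.
def to_dpo_jsonl(pairs: list[dict]) -> str:
    lines = []
    for p in pairs:
        lines.append(json.dumps({
            "prompt": p["prompt"],
            "chosen": p["approved_diff"],
            "rejected": p["rejected_diff"],
        }))
    return "\n".join(lines)
```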

Infrastructure

🖥️
Hetzner Server
100.82.65.23 (Tailscale)
Primary compute. Runs the orchestrator, memory database, GitHub Actions runners, and all cron workers. Accessible via Tailscale VPN.
ai-orchestrator.service claude-memory-db GitHub Actions ×2 Docker
💻
GPU Laptop (ai-coder)
100.78.186.38 (Tailscale)
GPU executor. Runs Qwen models via Ollama, handles all LLM inference. Connects back to server DB for training data and health reporting.
ai-worker.service Ollama :11434 Qwen 7b/14b/32b nomic-embed-text
🌐 Tailscale VPN
🐳 Docker (pgvector)
🔐 GitHub Actions
📡 GitHub Webhooks
🤖 Copilot CLI gpt-4.1
systemd services
📋 crontab workers
🔑 GitHub Token auth

Model Routing

Qwen 2.5 Coder 7B
scope: small
Fastest model. Used for small-scope tasks — single file edits, minor fixes, simple additions.
qwen2.5-coder:7b · small tasks
🔧
Qwen 2.5 Coder 14B
scope: medium + repairs
Default model. Used for medium-scope tasks and all repair loop iterations regardless of original scope.
qwen2.5-coder:14b · medium + repairs
🧠
Qwen 2.5 Coder 32B
scope: large
Largest model. Reserved for large-scope tasks spanning multiple files or complex refactors. Higher quality, slower.
qwen2.5-coder:32b · large tasks
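The routing rule implied by the three cards above is a small lookup: scope picks the model, and repair iterations always fall back to the 14B. A sketch (the fallback default is an assumption):

```python
# Routing table implied by the model cards: scope selects the model, and
# repair-loop iterations always use the 14B regardless of original scope.
MODEL_BY_SCOPE = {
    "small": "qwen2.5-coder:7b",
    "medium": "qwen2.5-coder:14b",
    "large": "qwen2.5-coder:32b",
}

def pick_model(scope: str, is_repair: bool = False) -> str:
    if is_repair:
        return "qwen2.5-coder:14b"
    # defaulting unknown scopes to the 14B is an assumption
    return MODEL_BY_SCOPE.get(scope, "qwen2.5-coder:14b")
```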

Managed Projects

🔗
MCP Hub
/opt/mcp-platform
Central MCP server platform. Hosts tool integrations and Claude Code extensions.
MCP · tool integrations
💘
Eric & Erica Dating
/opt/eric-erica-dating
AI-powered dating application. Primary target for autonomous Qwen coding tasks.
AI dating · primary target
🔄
AI Worker
~/ai-worker
This system — the autonomous coding pipeline itself. Self-improving via fine-tune feedback loop.
meta · self-improving