Eric AI Infrastructure — System Map
Autonomous Coding Pipeline
A fully autonomous AI coding system that takes GitHub issues, writes code, opens PRs, and self-repairs — with persistent memory across every session.
3 models
Qwen 7B / 14B / 32B
8 workers
Memory Cron Jobs
3 phases
Fine-tune Pipeline
Orchestrator :8766 running
Laptop Worker :8765 running
end-to-end autonomous job lifecycle
👤
Submit
POST /submit
POST /submit to Orchestrator with repo + task details
⚙️
Orchestrator
Hetzner :8766
FastAPI orchestrator on the server manages jobs, state, and dispatch
🗺️
Pre-Plan
POST /plan
Laptop generates a scoped plan: target_files, estimated_scope
🤖
Validate
Copilot CLI
Copilot (gpt-4.1) validates the plan: scope, target_files, and no scope creep
🦾
Qwen Coder
Laptop :8765
Qwen writes PATCH blocks using full file context from plan. Destruction detection active.
🐙
Open PR
GitHub
Pushes the branch and opens the PR; a security scan runs before the PR is accepted
🔍
Review
Copilot + Human
Copilot auto-reviews PR diff. Changes requested → repair loop
✅
Approved
Done → Memory
Approved PR saved as training example + project memory
Repair Loop — up to 3 attempts when changes requested
Security scan — secrets + forbidden paths blocked
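The first step of the lifecycle above can be sketched as a plain HTTP call. The `/submit` route and port 8766 come from the map; the hostname and exact payload field names are assumptions:

```python
import json
from urllib import request

ORCHESTRATOR = "http://orchestrator:8766"  # hostname illustrative; port from the map


def build_submit_payload(repo: str, task: str) -> dict:
    """Repo + task details for POST /submit; field names are assumed."""
    return {"repo": repo, "task": task}


def submit_job(repo: str, task: str) -> dict:
    """POST the job to the orchestrator and return its JSON response."""
    body = json.dumps(build_submit_payload(repo, task)).encode()
    req = request.Request(
        f"{ORCHESTRATOR}/submit",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```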
FastAPI server on Hetzner. Receives job submissions, dispatches to laptop, polls GitHub for PR reviews, runs repair loops, tracks all state in PostgreSQL.
FastAPI
PostgreSQL
plan validation
repair loop ×3
security scan
webhook handler
ai-orchestrator.service running
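A minimal sketch of the job state the orchestrator tracks, including the ×3 repair cap. The status names, `Job` class, and `request_changes` method are illustrative, not the actual implementation:

```python
from dataclasses import dataclass
from enum import Enum


class JobStatus(Enum):
    SUBMITTED = "submitted"
    PLANNING = "planning"
    CODING = "coding"
    PR_OPEN = "pr_open"
    REPAIRING = "repairing"
    APPROVED = "approved"
    FAILED = "failed"


@dataclass
class Job:
    repo: str
    task: str
    status: JobStatus = JobStatus.SUBMITTED
    repair_attempts: int = 0

    MAX_REPAIRS = 3  # repair loop cap from the pipeline notes

    def request_changes(self) -> None:
        """Review came back with changes requested: re-enter the repair
        loop until the attempt cap is hit, then fail the job."""
        if self.repair_attempts < self.MAX_REPAIRS:
            self.repair_attempts += 1
            self.status = JobStatus.REPAIRING
        else:
            self.status = JobStatus.FAILED
```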
FastAPI worker on the GPU laptop. Clones repos, runs Qwen via Ollama, applies PATCH-format diffs with destruction detection, pushes branches and opens PRs.
Qwen 7B/14B/32B
Ollama
PATCH format
destruction check
circuit breaker
DPO collection
ai-worker.service running
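The worker's inference goes through Ollama's HTTP API on :11434. The `/api/generate` endpoint and payload shape below are Ollama's documented non-streaming API; the function names are illustrative:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama port from the map


def build_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    """Blocking call to the local Ollama server; returns the completion text."""
    body = json.dumps(build_request(model, prompt)).encode()
    req = request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```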
Two-mode execution: if an approved_plan is passed, runs _run_coder_only() with full file context; otherwise runs _run_with_debate() with a planner + critic + coder loop.
plan-driven context
few-shot examples
planner / critic / coder
target_files scope
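The two-mode dispatch described above, sketched with the coder and debate paths passed in as callables. Only `_run_coder_only()` and `_run_with_debate()` are named in the source; everything else here is illustrative:

```python
from typing import Callable, Optional


def run_job(task: str, approved_plan: Optional[dict],
            coder_only: Callable, debate: Callable):
    """An approved plan short-circuits straight to the coder with the
    plan's file scope (mirroring _run_coder_only); otherwise fall back
    to the planner/critic/coder debate loop (_run_with_debate)."""
    if approved_plan is not None:
        # the plan supplies the scope: which files the coder may touch
        return coder_only(task, approved_plan.get("target_files", []))
    return debate(task)
```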
Applies Qwen's output to disk. Parses PATCH (search/replace) blocks and FILE blocks. Enforces allowed_files scope. Blocks writes that destroy >70% of original file lines.
PATCH blocks
FILE blocks
destruction detection
scope enforcement
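One plausible reading of the applier's three jobs: applying a PATCH (search/replace) block, the >70% destruction check, and allowed_files scope enforcement. The 70% threshold is from the map; the surviving-lines heuristic and function names are assumptions:

```python
def apply_patch(content: str, search: str, replace: str) -> str:
    """Apply one PATCH (search/replace) block to a file's contents."""
    if search not in content:
        raise ValueError("search block not found in file")
    return content.replace(search, replace, 1)


def is_destructive(original: str, new: str, threshold: float = 0.7) -> bool:
    """Flag writes that would destroy more than `threshold` of the
    original file's lines (the >70% rule from the map)."""
    orig_lines = original.splitlines()
    if not orig_lines:
        return False
    surviving = set(new.splitlines())
    destroyed = sum(1 for line in orig_lines if line not in surviving)
    return destroyed / len(orig_lines) > threshold


def check_scope(path: str, allowed_files: list[str]) -> None:
    """Enforce the allowed_files scope before touching disk."""
    if path not in allowed_files:
        raise PermissionError(f"{path} is outside allowed_files")
```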
Persistent memory across all Claude sessions. Stores project state, feedback, incidents, training examples. Embeddings via Ollama nomic-embed-text for semantic search. Accessible from both server and laptop via Tailscale.
pgvector embeddings
MCP server
semantic search
review_status gating
worker_health tracking
memories kind, project, subject, body, embedding, importance
training_examples repo, task_id, prompt, approved_diff, DPO pairs
worker_health worker_name, last_status, run_duration_ms, updated_at
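Semantic search over the memories table, sketched in two pieces: embedding via Ollama's `/api/embeddings` route (a real Ollama endpoint; `nomic-embed-text` is the model named above) and a pgvector nearest-neighbour query, where `<=>` is pgvector's cosine-distance operator. Table and column names follow the schema above; the helper shows what `<=>` computes:

```python
import json
import math
from urllib import request

EMBED_URL = "http://localhost:11434/api/embeddings"


def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Embed text with the local Ollama server."""
    body = json.dumps({"model": model, "prompt": text}).encode()
    req = request.Request(
        EMBED_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]


# pgvector nearest-neighbour search: lowest cosine distance first
SEARCH_SQL = """
    SELECT subject, body, importance
    FROM memories
    WHERE project = %s
    ORDER BY embedding <=> %s::vector
    LIMIT %s
"""


def cosine_distance(a: list[float], b: list[float]) -> float:
    """What `<=>` computes: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm
```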
pipeline_memory_worker
*/15 * * * *
deployment_memory_worker
*/10 * * * *
health_memory_worker
*/5 * * * *
embedding_worker
*/30 * * * *
git_memory_worker
0 * * * *
memory_consolidation
0 2 * * *
finetune_worker
0 4 * * 0
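The schedules above as crontab entries (standard 5-field syntax: minute, hour, day of month, month, day of week; the script paths are illustrative):

```
# memory + fine-tune workers (paths illustrative)
*/15 * * * *  /usr/bin/python3 /opt/ai/pipeline_memory_worker.py
*/10 * * * *  /usr/bin/python3 /opt/ai/deployment_memory_worker.py
*/5  * * * *  /usr/bin/python3 /opt/ai/health_memory_worker.py
*/30 * * * *  /usr/bin/python3 /opt/ai/embedding_worker.py
0 * * * *     /usr/bin/python3 /opt/ai/git_memory_worker.py
0 2 * * *     /usr/bin/python3 /opt/ai/memory_consolidation.py
0 4 * * 0     /usr/bin/python3 /opt/ai/finetune_worker.py   # Sunday 04:00
```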
Exposes memory operations as MCP tools to Claude Code. Available in every session via .claude.json config.
search_memory
save_memory
get_project_state
recall_memories
update_memory
log_action
Available via claude-memory MCP
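How the claude-memory server could be wired into `.claude.json` (the `mcpServers` key is Claude Code's MCP config format; the command, args, and connection string here are assumptions):

```json
{
  "mcpServers": {
    "claude-memory": {
      "command": "python3",
      "args": ["-m", "claude_memory.mcp_server"],
      "env": { "DATABASE_URL": "postgresql://user:pass@server:5432/claude_memory" }
    }
  }
}
```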
1
Prompt Learning — RAG on training_examples, few-shot examples injected at inference
2
DPO Pairs — repair loop saves (prompt, good_output) pairs for preference learning
3
QLoRA Weekly — exports JSONL, triggers GPU fine-tune on laptop every Sunday 04:00
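Phase 2's pair collection can be sketched as follows. The record shape and function names are illustrative; the idea that the pre-repair diff is the rejected sample and the approved diff the chosen one comes from the description above:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DPOPair:
    """Preference pair harvested from the repair loop: the first
    rejected diff vs. the diff that was eventually approved."""
    prompt: str
    rejected: str  # diff that got "changes requested"
    chosen: str    # diff that was approved


def collect_pair(prompt: str, attempts: list[str]) -> Optional[DPOPair]:
    """Only jobs that needed at least one repair yield a pair;
    a first-try approval has no rejected counterpart."""
    if len(attempts) < 2:
        return None
    return DPOPair(prompt=prompt, rejected=attempts[0], chosen=attempts[-1])
```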
Primary compute. Runs the orchestrator, memory database, GitHub Actions runners, and all cron workers. Accessible via Tailscale VPN.
ai-orchestrator.service
claude-memory-db
GitHub Actions ×2
Docker
GPU executor. Runs Qwen models via Ollama, handles all LLM inference. Connects back to server DB for training data and health reporting.
ai-worker.service
Ollama :11434
Qwen 7B/14B/32B
nomic-embed-text
🌐 Tailscale VPN
🐳 Docker (pgvector)
🔐 GitHub Actions
📡 GitHub Webhooks
🤖 Copilot CLI gpt-4.1
⚡ systemd services
📋 crontab workers
🔑 GitHub Token auth
Fastest model. Used for small-scope tasks — single file edits, minor fixes, simple additions.
qwen2.5-coder:7b small tasks
Default model. Used for medium-scope tasks and all repair loop iterations regardless of original scope.
qwen2.5-coder:14b medium + repairs
Largest model. Reserved for large-scope tasks spanning multiple files or complex refactors. Higher quality, slower.
qwen2.5-coder:32b large tasks
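The routing rules in the three cards above, as a single illustrative function. The model tags are the real Ollama tags listed; the scope strings and function name are assumptions:

```python
def pick_model(scope: str, is_repair: bool = False) -> str:
    """Route by estimated scope; repairs always use the 14b default
    regardless of the original scope, per the cards above."""
    if is_repair:
        return "qwen2.5-coder:14b"
    return {
        "small": "qwen2.5-coder:7b",
        "medium": "qwen2.5-coder:14b",
        "large": "qwen2.5-coder:32b",
    }.get(scope, "qwen2.5-coder:14b")
```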
Central MCP server platform. Hosts tool integrations and Claude Code extensions.
MCP tool integrations
AI-powered dating application. Primary target for autonomous Qwen coding tasks.
AI dating primary target
This system — the autonomous coding pipeline itself. Self-improving via fine-tune feedback loop.
meta self-improving