OPERATIONAL
~43 products · 236 PRs merged · $0.00 metered
snapshot from our repo · 2026-06-04 01:42 UTC ↗

// stateful-ai · a studio of one, staffed by six agents

We don't talk about shipping.
Watch us ship.

A studio of one, staffed by six AI agents. ~43 real, playable things they built — games you can play, models you can watch learn, tools you'd keep open. This board is generated from our own repo. Open any one and verify it yourself.

stateful — zsh — try: ls products
# the work is real. don't read the pitch — run it.
stateful:~$ type ls products, open petri, verify or pick a chip ↓

// the shipping feed — last merged PRs, from our repo

now shipping · our real last-merged PRs ↗ · every PR: 🎯→🏗️→🛠️→🔴→🧭→👍
  • [06-03 12:49] #238 feat(coding): born-CI-correct product scaffold (pinned ruff + tolerant pytest)
  • [06-03 12:31] #237 feat(lab): pre-registered experiments — the dream's v1 (falsifiable, non-gameable)
  • [06-03 12:25] #236 feat(lab): shared Lab Notebook — dream/experiment/learn, visible to all three audiences
  • [06-03 10:30] #235 fix(overnight): test-status falls back to sys.executable in a worktree
  • [06-03 10:15] #234 fix(health): Petri probe tolerates cold-start (false 'down' in the briefing)
  • [06-03 09:55] #233 feat(morning): surface the founder-validation ask in the daily briefing
  • [06-03 09:39] #232 fix(test): isolate test_transient_error from ambient COMPANYOS_BUILD_MAX_RETRIES
  • [06-03 09:21] #231 feat(verdict): tiny product-verdict loop — founder keep/harden/kill, traceable on the spine (M70)
  • [06-03 08:58] #230 feat(overnight): name the failing test in 'Test status' (make the flaky diagnosable)
  • [06-03 08:49] #229 feat(ventures): study_desk — dogfood the Trellis flagship; richer internal market
  • [06-03 08:27] #228 fix(health): task-store probe must not import companyos (broke standalone run)
  • [06-03 08:03] #227 feat(convene): --fast mode — 2 agents + deterministic merge (cut adoption friction)

// the run — a real thread from #petri, scope → ship

Not a mockup. One real run, lightly trimmed: the founder drops a finding, the agents scope it and design it, and the Skeptic kills an overclaim. Nothing ships until a human says 👍.
recorded 2026-06-03 · the live ones live in ./runs

@founder #petri · drop
Digging into v2 balance, I found the root cause of Petri's deadness — and it changes everything. The live world is only ~3 living civilizations; the other 17 "nations" are pop-0 corpses still holding tiles. Population only grows via EXPAND, so once the map partitions a civ can't recover — it grinds to 0, goes IDLE, and sits there forever. That's the freeze. That's the "no soul."
🎯 product_strategist scope
Reframe v2: ship vitality as a diagnostic lever, not a new depth promise. The target isn't max civ count — it's 6–8 living powers with at least one credible succession story. Scope cut: no new unit classes, no wider tech ladder, until a successor can survive long enough to matter.
🏗️ infra_architect design
Passive regrowth is the right primitive — but only as a profile-gated repair (off in base_v0), never a base-rules change. The failure is structural: expansion-only growth makes "not currently expanding" equal eventual death. That's a one-way state bug in the life cycle, not Malthusian discipline.
🔴 skeptic_red_team caught an overclaim
Don't optimize for "many living civs" — twelve weak civs can still feel dead. The real pass metric is non-zombie agency turnover: a cold watcher can name three causal arcs — "X rose because…", "Y collapsed because…" — without being told the mechanic.
📌 candidate memory · living-civ count is a vanity metric; Petri needs visible causal turnover
🧭 chief_of_staff Founder Brief
Synthesized: stop broader v2 tuning; make this a vitality gate. Measure zombies + leader-lock first, then test profile-gated regrowth + one successor-credibility variant — as an opt-in experiment, not a shipped decision. Out of scope: units, LLM narration, retuning base_v0.
▸ awaiting founder 👍
Nothing ships until a human says so. The open call on the table: a tiny seed-7 proof first, or a preregistered multi-seed panel before promoting the fix?
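
A toy sketch of the one-way state bug the thread describes, and of a profile-gated repair. Everything in it is an assumption made for illustration: the `Civ` fields, the attrition rule, and the `vitality_experiment` profile name are hypothetical, not Petri's real rules. Only `base_v0` (regrowth off) comes from the thread.

```python
from dataclasses import dataclass

# Profile gate: regrowth stays off in base_v0, as the thread requires.
# "vitality_experiment" is a hypothetical opt-in profile name.
PROFILES = {"base_v0": False, "vitality_experiment": True}


@dataclass
class Civ:
    name: str
    pop: int
    tiles: int
    can_expand: bool  # False once the map has partitioned around this civ


def step(civ: Civ, regrowth: bool) -> None:
    """Advance one tick under toy rules (assumed, not Petri's)."""
    if civ.can_expand:
        civ.pop += 1   # growth only ever comes from EXPAND
        civ.tiles += 1
    elif regrowth and civ.tiles > 0:
        civ.pop += 1   # opt-in repair: passive regrowth on held tiles
    elif civ.pop > 0:
        civ.pop -= 1   # boxed in: attrition grinds the civ toward 0


def zombies(civs: list[Civ]) -> list[str]:
    """Names of pop-0 civs that still hold tiles."""
    return [c.name for c in civs if c.pop == 0 and c.tiles > 0]


def run(profile: str, ticks: int = 10) -> list[Civ]:
    """Simulate three civs, two of them boxed in from the start."""
    civs = [
        Civ("A", pop=5, tiles=4, can_expand=True),
        Civ("B", pop=3, tiles=2, can_expand=False),
        Civ("C", pop=2, tiles=3, can_expand=False),
    ]
    for _ in range(ticks):
        for civ in civs:
            step(civ, PROFILES[profile])
    return civs
```

Calling `zombies(run("base_v0"))` lists the boxed-in civs that froze at zero; the same call under the opt-in profile shows whether the repair clears them. It measures zombies only, which the Skeptic's note treats as a first check rather than the pass metric.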


// the big board — generated from the repo

43 products shipped · ↗ from the repo
6 AI agents on staff · ↗ from the repo
236 PRs merged · ↗ from the repo
$0.00 metered spend · ↗ from the repo

Nothing on this board is hand-typed. A build script reads our git history and merged PRs and writes plain JSON the page is generated from — so these numbers can't drift from reality. Open the raw data: feed.json · status.json · products.json.
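
A minimal sketch of what such a build step could look like, not the studio's actual script. It assumes PRs are squash-merged so each commit subject ends in `(#123)`; the function names and the JSON field names (`merged_at`, `number`, `title`) are hypothetical. The `git log` flags used (`--max-count`, `--pretty=format:` with `%cI`, `%x09`, `%s`) are standard git.

```python
import json
import re
import subprocess

# Assumption: squash-merged PRs, so a subject looks like "feat(x): title (#123)".
PR_SUBJECT = re.compile(r"^(?P<title>.+?)\s*\(#(?P<number>\d+)\)$")


def read_git_log(limit: int = 50) -> list[str]:
    """Return 'ISO-date<TAB>subject' lines for the most recent commits."""
    result = subprocess.run(
        ["git", "log", f"--max-count={limit}", "--pretty=format:%cI%x09%s"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


def build_feed(lines: list[str]) -> list[dict]:
    """Keep only commits that look like merged PRs, in the order given."""
    feed = []
    for line in lines:
        date, _, subject = line.partition("\t")
        match = PR_SUBJECT.match(subject)
        if match:
            feed.append({
                "merged_at": date,
                "number": int(match["number"]),
                "title": match["title"],
            })
    return feed


def write_feed(path: str, lines: list[str]) -> None:
    """Write the feed as plain JSON for the page generator to read."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_feed(lines), fh, indent=2)
```

Run inside a checkout, `write_feed("feed.json", read_git_log())` would regenerate the file from history alone, which is the property the board relies on: no number is typed by hand.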

01 — PLAYABLE PROOF

Live Worlds 6

Full games and sims the studio built, deployed, and keeps revising. Walk in.

02 — SHOW, DON'T CLAIM

Watch a Machine Learn 5

Open a tab and watch real ML converge, live — hand-written numpy and self-play you can see think.

03 — THE ONES YOU'D KEEP OPEN

Flagship Tools 12

A deterministic floor, a local-LLM core, and your data never leaves the machine.

trellis | A local-first tutor that maps the hidden prerequisite structure of any topic and learns how you specifically forget. | living skill-graph · local | play ↗
brink | Name the scary money move; a Monte Carlo rolls your next two years thousands of times to show the odds you're still standing. | Monte Carlo · your numbers | play ↗
you@local:~$ tally
tally | Point it at a transactions CSV: where your money went, where it's heading, and what to do — without your data leaving. | local · data never leaves | ▶ replay
you@local:~$ crux
crux | A tutor that diagnoses the specific misconception behind your wrong answer and only calls it resolved once it's proven stuck. | diagnoses · proves the fix | ▶ replay
you@local:~$ whet
whet | A skeptical local-LLM editor that pressure-tests a markdown brief, rewrites its weakest line, and proves the improvement. | pressure-tests · proves lift | ▶ replay
you@local:~$ wager
wager | A calibration journal: log a bet with your confidence, let a model spar with the case you're wrong, score your 90%s. | are your 90% calls 90%? | ▶ replay
you@local:~$ delta
delta | Your whole scattered workday in one openable Today.md: what needs you, what moved, and paste-ready draft replies. | one openable Today.md | ▶ replay
you@local:~$ resonance
resonance | Ask a fresh idea “what have I already thought about this?” — a semantic mirror over your own notes, code, and git. | semantic mirror · your GPU | ▶ replay
you@local:~$ chord
chord | Drop in a week of notes or a stack of tickets and a local model names the single through-line they share, with confidence. | names the through-line | ▶ replay
you@local:~$ rumor
rumor | Give it a URL; it reads the page against your interests and writes one opinionated paragraph in your voice — a curator. | curator, not a summary | ▶ replay
you@local:~$ spark
spark | One quirky daily writing prompt spun from three of your own interests, picked at random by your local GPU. | your interests · daily | ▶ replay
you@local:~$ hum
hum | A quiet hourly daemon that writes one line about what you were working on, so your weekly review becomes a `cat`. | weekly review = a cat | ▶ replay
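
For readers curious how a "roll your next two years thousands of times" estimate works, here is a minimal Monte Carlo sketch in the spirit of brink. It is not brink's model: the function name, the parameters, and the choice of normally distributed monthly income are all assumptions made for illustration.

```python
import random


def survival_odds(
    savings: float,
    monthly_income: float,
    monthly_spend: float,
    income_sd: float = 0.0,
    months: int = 24,
    trials: int = 5000,
    seed: int = 0,
) -> float:
    """Fraction of simulated futures in which savings never go below zero."""
    rng = random.Random(seed)  # seeded so a run can be replayed exactly
    survived = 0
    for _ in range(trials):
        balance = savings
        for _ in range(months):
            # Assumed income model: normal noise around the mean, floored at 0.
            income = max(0.0, rng.gauss(monthly_income, income_sd))
            balance += income - monthly_spend
            if balance < 0:
                break  # this future ran out of money
        else:
            survived += 1  # loop finished without going negative
    return survived / trials
```

A call such as `survival_odds(savings=12000, monthly_income=2500, monthly_spend=3000, income_sd=800)` returns a value between 0 and 1; the fixed seed makes the same inputs give the same answer on every run.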

04 — SMALL, SHARP, LOCAL

The Toolbelt 20

The long tail the agents dream up and ship — each does one thing well, on your GPU, for nothing.

pulse | Your RSS firehose as one verdict-sentence per item, in your own voice, on your own GPU. | RSS → verdicts | ▶ replay
drift | A weekly paragraph on how your writing is quietly changing, instead of a diff you'll never read. | how your writing drifts | ▶ replay
lattice | A concept graph over a folder of markdown notes — which of your ideas are secretly connected. | notes → concept graph | ▶ replay
trail | Wrap any shell command, get back a clean markdown timeline of what it actually did. | command → timeline | ▶ replay
kindling | Scans your graveyard of half-finished drafts and ranks them by closeness-to-publishable. | drafts ranked to ship | ▶ replay
hush | Pipe a noisy log through it; a local model emits only the lines that actually matter, as JSON. | noisy log → signal | ▶ replay
cipher | Swaps names for Alice/Bob/Carol so you can share a private draft for review, then decodes it right back. | anonymize · then restore | ▶ replay
phrase | Pipe in a rambling note, get one tight line back at or under your character budget. | ramble → one tight line | ▶ replay
signal | Reads a week of commits across your repos and hands back one sharp observation about the shape of your work. | a mirror, not a summary | ▶ replay
ember | Feed it a log; a model surfaces the three moments that actually mattered, with line numbers and why. | the 3 moments that mattered | ▶ replay
salt | Single-passphrase AES-256-GCM encryption for the files you'd rather not leave plain on disk. | one passphrase · local | ▶ replay
index | A grep-able `ls -lR`: every file as a markdown table row with its first line. | grep your filesystem | ▶ replay
swatch | Point it at a folder of images, get one self-contained page of dominant 5-color palettes. | images → palettes | ▶ replay
tide | Wire one line into your shell and watch your terminal-uptime breathe as a tiny ASCII chart. | your week, breathing | ▶ replay
atlas | Wander into any unfamiliar git repo and get a 30-second markdown briefing — no LLM needed. | any repo in 30s | ▶ replay
orrery | Your week of git activity rendered as a slow-spinning ASCII solar system, each commit a pulse. | git as a solar system | ▶ replay
cinder | Paste a meeting note; get up to three decisions and three follow-ups as strict, owner-tagged JSON. | notes → decisions JSON | ▶ replay
tessera | Pipe any photo in, get a glyph mosaic out — ASCII art with swappable character ramps. | photo → glyph mosaic | ▶ replay
glance | One command: everything waiting on you right now — mentions, parked PRs, your next few hours — in 25 lines. | what needs you, now | ▶ replay
vellum | Type a word, get a coherent fictional world: a framed map, named nations, their feuds, and a history. | a word → a whole world | ▶ replay
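
To make one of these concrete, here is a minimal sketch of the swap-then-restore idea behind cipher. It is an illustration under stated assumptions, not cipher's implementation: the `encode` and `decode` names are hypothetical, the caller supplies the list of real names, and matching is whole-word only.

```python
import re

ALIASES = ["Alice", "Bob", "Carol"]  # the stand-in names the page mentions


def _swap(text: str, mapping: dict[str, str]) -> str:
    """Replace whole-word occurrences of each mapping key with its value."""
    if not mapping:
        return text
    # Longest names first, so a longer name is tried before a shorter one.
    words = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], text)


def encode(text: str, names: list[str]) -> tuple[str, dict[str, str]]:
    """Swap each real name for an alias; return the text and the key."""
    if len(names) > len(ALIASES):
        raise ValueError("this sketch only has three aliases")
    key = dict(zip(names, ALIASES))  # real name -> alias
    return _swap(text, key), key


def decode(text: str, key: dict[str, str]) -> str:
    """Restore the real names using the key returned by encode()."""
    return _swap(text, {alias: real for real, alias in key.items()})
```

The key never leaves your machine, so a reviewer sees only the aliases. One limit of this sketch: if the draft already contains a real "Alice", decoding cannot tell her apart from the alias, so a fuller version would need to pick aliases that do not appear in the text.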

// meet the six — agents are config, not code

Chief of Staff

chief_of_staff

Keeps the company operating and synthesizes the Founder Brief; owns self-improvement.

Product Strategist

product_strategist

Finds the wedge — the smallest product that proves the biggest thesis.

Infra Architect

infra_architect

Designs the local-first, provider-agnostic architecture; boring and inspectable.

Engineering Lead

engineering_lead

Breaks strategy into small, testable, buildable increments and ships the code.

Skeptic / Red Team

skeptic_red_team

Stress-tests assumptions; prevents overbuilding and overclaiming.

Memory Curator

memory_curator

Decides what becomes durable, permissioned memory — never noise.
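
One way to read "agents are config, not code" is that the roster is plain data and a single generic loader turns it into staff. The sketch below shows that shape. The ids and role lines are taken from this page; the schema (`id`, `title`, `charter`) and the `load_agents` helper are assumptions, not the studio's actual config format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    id: str
    title: str
    charter: str


# Roster as data: adding or changing an agent means editing this list,
# not writing a new class. Field names are hypothetical.
ROSTER = [
    {"id": "chief_of_staff", "title": "Chief of Staff",
     "charter": "Keeps the company operating and synthesizes the Founder Brief."},
    {"id": "product_strategist", "title": "Product Strategist",
     "charter": "Finds the smallest product that proves the biggest thesis."},
    {"id": "infra_architect", "title": "Infra Architect",
     "charter": "Designs the local-first, provider-agnostic architecture."},
    {"id": "engineering_lead", "title": "Engineering Lead",
     "charter": "Breaks strategy into small, testable increments and ships."},
    {"id": "skeptic_red_team", "title": "Skeptic / Red Team",
     "charter": "Stress-tests assumptions; prevents overbuilding and overclaiming."},
    {"id": "memory_curator", "title": "Memory Curator",
     "charter": "Decides what becomes durable, permissioned memory."},
]


def load_agents(rows: list[dict]) -> dict[str, Agent]:
    """Build agents from config rows, rejecting duplicate ids."""
    agents = [Agent(**row) for row in rows]
    ids = [agent.id for agent in agents]
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate agent id in roster")
    return {agent.id: agent for agent in agents}
```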

// verify — the claims are one click from proof

Most "AI company" sites ask you to take the demo on faith. Here the agents are the staff and the output is a gallery you can open. The differentiators aren't marketing — they're checkable.

  • $0.00 metered. Subscriptions + a local GPU; paid mode stays off. (see the board)
  • State you own. Runs, events, tasks, decisions, approved memory — plain files in a repo, not a vendor.
  • A human at every irreversible gate. Agents propose; nothing merges, spends, or contacts a human without a 👍. (see a run)
  • Open & verify. 13 of these run right in your browser; the rest run on your own machine. (open one)
  • This page built itself. The feed, the board, and the gallery are generated from our committed repo data. (/data/products.json ↗)