github.com/akaieuan/inertial-moderation-tool · Read the README
Repo, install instructions, the architecture diagram, the policy DSL, and the honest capability matrix all live in the README.
Reference architecture · portfolio work · MIT
Inertial
A reference architecture for auditable AI content review. Not a deployable moderation service — a working demonstration of one architectural thesis, end-to-end through real code.
Status
Reference architecture, not deployable. Schemas, audit chain, eval harness, skill registry, and reviewer dashboard are real and tested. Connectors are stubbed. Action dispatch is unimplemented. No auth. The 31-event gold set is too small for statistical claims; it's there to demonstrate the calibration math, not to certify any skill's accuracy.
The thesis
AI classification outputs and human review actions should both land in a hash-chained audit log, with typed structured signals as the unit of evidence and per-instance YAML as the unit of policy. Inertial is that thesis demonstrated end-to-end through real code.
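A minimal sketch of what "hash-chained" means here, assuming SHA-256 and a simplified entry shape. The `AuditEntry` fields, the `append` / `verify` helpers, and the event kind names below are hypothetical illustrations, not the shapes in `@inertial/schemas`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical, simplified entry shape for illustration only.
interface AuditEntry {
  seq: number;
  kind: string; // e.g. "signal.emitted" or "review.decided" (illustrative names)
  payload: unknown;
  prevHash: string; // hash of the previous entry, or GENESIS for the first one
  hash: string; // SHA-256 over seq, kind, payload, and prevHash
}

const GENESIS = "0".repeat(64);

function hashEntry(seq: number, kind: string, payload: unknown, prevHash: string): string {
  // JSON.stringify keeps the sketch short; a real log would need canonical serialization.
  const body = JSON.stringify([seq, kind, payload, prevHash]);
  return createHash("sha256").update(body).digest("hex");
}

function append(log: AuditEntry[], kind: string, payload: unknown): AuditEntry {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  const seq = log.length;
  const entry: AuditEntry = { seq, kind, payload, prevHash, hash: hashEntry(seq, kind, payload, prevHash) };
  log.push(entry);
  return entry;
}

// Returns the index of the first entry that breaks the chain, or -1 if the chain is intact.
function verify(log: AuditEntry[]): number {
  let prevHash = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const e = log[i];
    const recomputed = hashEntry(e.seq, e.kind, e.payload, e.prevHash);
    if (e.seq !== i || e.prevHash !== prevHash || e.hash !== recomputed) return i;
    prevHash = e.hash;
  }
  return -1;
}

const log: AuditEntry[] = [];
append(log, "signal.emitted", { channel: "toxicity", probability: 0.91 });
append(log, "review.decided", { decision: "remove" });
console.log(verify(log)); // -1: chain intact

log[0].payload = { channel: "toxicity", probability: 0.1 }; // tamper with history
console.log(verify(log)); // 0: the first entry no longer matches its hash
```

Because each entry's hash covers the previous entry's hash, editing any past entry invalidates it and everything after it, which is what lets AI signals and human decisions share one tamper-evident record.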
The system is two products in one monorepo: @inertial/* — a toolkit of orchestration, persistence, policy, and HITL primitives, sibling to eval-kit and HITL-KIT — and @inertial/app, an Electron + React + Tailwind reference dashboard for moderators, built on HITL-KIT.
This is portfolio work, not a maintained OSS project. The point is the architecture choices and where they hold up — not feature completeness.
What's real
- Schema-first Zod contracts across 33 typed shapes (the README's inventory centers on 12+ primary schemas in `@inertial/schemas`): `ContentEvent`, `StructuredSignal`, `AgentTrace`, `ReviewItem`, `ReviewDecision`, `Policy`, `AuditEntry`, `SkillRegistration`, `GoldEvent`, `EvalRun`, `SkillCalibration`, `ReviewerTag` + scope, `TagAgreement`.
- A skill / tool registry with a catalog plus a per-instance registration table. Adding a skill is a registration row, not a code change. Reviewers wire Voyage / Anthropic / etc. without touching YAML.
- A YAML policy evaluator with hash-chained `/verify`. Per-instance, structured AST (no string eval). Leaves are `channel + op + value` or `entity + present`; nodes compose with `all` / `any`; first match wins. The AST is preserved next to the rule id in the audit log so any decision can be traced back to the exact configuration that produced it.
- An eval harness scoring per-(skill, channel) Brier / ECE / agreement. `pnpm eval` boots an in-memory pipeline, dispatches the 31-event gold set (`config/evals/gold-set-v1.jsonl`: 27 text + 3 image + 1 video) against the live skill registry, and prints calibration as a hash-chained artifact, not vibes. Reviewer commits auto-promote via `signalFeedback` + `reviewerTags` into `gold_events` (source `reviewer-derived`), so the corpus grows on every decision.
- A reviewer-tag layer with per-modality / per-segment scope. `TAG_CATALOG` ships ~18 starter tags; `reviewer_tags` stores them with scope; the "good video, bad audio" mixed-validity case gets a precise label, not a whole-asset verdict.
- A reviewer dashboard wired to all of it. Three-deck queue (Quick / Deep / Escalation), inline review session (not a modal), per-channel evidence chips, video keyframe strip with timestamps + per-frame top-channel score, author history, similar events via Voyage embeddings + pgvector, reviewer-tag picker filtered to the event's modalities, side panels (Chat / Notes / Agent activity) docked edge-to-edge.
- Hash-chained audit, in code, with tests. `@inertial/db` is 14 tables on Postgres + pgvector with a pglite dev factory and 68 hermetic integration tests. Every state transition writes one entry per instance with `prevHash → hash` linkage. "No remote API touched my instance over the last 30 days" becomes a SQL query, not a vendor promise.
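The policy evaluator described in the list above can be sketched as a small recursive walk over an already-parsed AST. This is an illustration under stated assumptions: the `Op` names, the boolean `present` flag, the `Rule` shape, and the route names are hypothetical, and YAML parsing is left out entirely.

```typescript
// Hypothetical comparison operators; the real DSL's op names may differ.
type Op = "gt" | "gte" | "lt" | "lte";

// Leaves are channel + op + value or entity + present; nodes compose with all / any.
type Condition =
  | { channel: string; op: Op; value: number }
  | { entity: string; present: boolean }
  | { all: Condition[] }
  | { any: Condition[] };

interface Rule {
  id: string;
  when: Condition;
  route: string; // illustrative queue name
}

interface Signals {
  channels: Record<string, number>; // channel name -> probability
  entities: string[];
}

function matches(c: Condition, s: Signals): boolean {
  if ("all" in c) return c.all.every((child) => matches(child, s));
  if ("any" in c) return c.any.some((child) => matches(child, s));
  if ("entity" in c) return s.entities.includes(c.entity) === c.present;
  const p = s.channels[c.channel];
  if (p === undefined) return false; // a missing channel never matches
  switch (c.op) {
    case "gt": return p > c.value;
    case "gte": return p >= c.value;
    case "lt": return p < c.value;
    case "lte": return p <= c.value;
  }
}

// First match wins. The matched AST is returned next to the rule id so a caller
// could record both in the audit log.
function evaluate(rules: Rule[], s: Signals): { ruleId: string; route: string; ast: Condition } | null {
  for (const rule of rules) {
    if (matches(rule.when, s)) return { ruleId: rule.id, route: rule.route, ast: rule.when };
  }
  return null;
}

const rules: Rule[] = [
  {
    id: "escalate",
    when: { all: [{ channel: "toxicity", op: "gte", value: 0.9 }, { entity: "named-person", present: true }] },
    route: "escalation",
  },
  {
    id: "deep-review",
    when: { any: [{ channel: "toxicity", op: "gte", value: 0.5 }, { channel: "spam", op: "gte", value: 0.8 }] },
    route: "deep",
  },
];

const result = evaluate(rules, { channels: { toxicity: 0.95, spam: 0.1 }, entities: [] });
console.log(result?.ruleId); // "deep-review": the first rule fails on the missing entity
```

Keeping conditions as data instead of evaluated strings is what makes it possible to store the exact rule that fired alongside the decision.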
What's stubbed (deliberately)
- Every source connector: Mastodon (ActivityPub), Bluesky (AT Protocol), Lemmy, Discord, Slack, and the generic webhook package (`sdk-webhook`). All four connector packages are interface stubs with no real ingestion. Without these, no real platform's events ever reach the runciter; the system can only process events posted directly to `POST /v1/events` by a script. This is the single biggest gap between this project and a moderation tool.
- The action dispatcher that pushes decisions back to source platforms. A moderation system that can't act on its decisions is a logging system.
- `@inertial/agents-audio` and `@inertial/agents-identity`: each is a single stub class whose `analyze` returns `[]`. Image still flows through `image-classify@anthropic`; video is local ffmpeg keyframe extract → that classifier. Package-level `vision-*` inertials are empty stubs in the README capability table.
- Gateway media download + perceptual hashing: not implemented (README architecture diagram: media download + phash TODO). Honest framing: "multimodal" is text + image + frame-grabbed video; audio has no transcription or classifier path yet.
- Auth and observability layers. Anyone who can reach `localhost:4001` can register skills, kick off eval runs, or delete review items. Don't run this in front of any real instance.
The architecture diagram in the README documents the target shape. What runs today is a verification substrate with a reference UI on top.
Why I built this
Commercial moderation APIs claim accuracy without proof and ship verdicts without evidence. Federated mods distrust them because they can't audit them; centralized compliance teams need defensible records they can show regulators. Both want decomposed, evidence-rich decisions; neither has them.
I wanted to know: what would a substrate for that look like — schemas, audit log, skill registry, eval harness, reviewer surface — wired together end-to-end with real code rather than a slide deck. So I built it. Four claims:
- Inertials (sub-agents) emit typed structured signals, not verdicts. Probability + confidence + evidence pointers. The policy layer turns signals into routing; humans turn routing into actions.
- Per-instance YAML policy so federation is a first-class case, not an afterthought. The same code serves a wide-open community and a high-compliance enterprise because the operator brings their own rules.
- Per-skill privacy posture lives in the schema. A skill is either `dataLeavesMachine: true` or `false`. The audit chain records which model saw which event, so privacy claims become hash-chained artifacts.
- Reviewer decisions auto-promote into the eval gold set. Every commit grows the calibration corpus by one structured row, so the system improves at measuring itself.
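To illustrate how a per-skill `dataLeavesMachine` flag plus a record of which skill saw which event can back a privacy claim, here is a rough in-memory sketch. The trimmed `SkillRegistration` and `SkillRunEntry` shapes, the `remoteExposures` helper, and the sample dates are hypothetical; the project describes doing this as a SQL query over the audit tables.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface SkillRegistration {
  skillId: string;
  dataLeavesMachine: boolean;
}

interface SkillRunEntry {
  eventId: string;
  skillId: string;
  at: string; // ISO-8601 timestamp
}

// Returns every run at or after `since` by a skill whose data leaves the machine.
// Unregistered skills are counted as remote, which is the conservative reading.
function remoteExposures(entries: SkillRunEntry[], skills: SkillRegistration[], since: Date): SkillRunEntry[] {
  const leaves = new Map<string, boolean>();
  for (const s of skills) leaves.set(s.skillId, s.dataLeavesMachine);
  return entries.filter(
    (e) => new Date(e.at).getTime() >= since.getTime() && (leaves.get(e.skillId) ?? true),
  );
}

const skills: SkillRegistration[] = [
  { skillId: "text-classify-toxicity@local", dataLeavesMachine: false },
  { skillId: "image-classify@anthropic", dataLeavesMachine: true },
];

const entries: SkillRunEntry[] = [
  { eventId: "evt-1", skillId: "text-classify-toxicity@local", at: "2025-01-10T12:00:00Z" },
  { eventId: "evt-2", skillId: "image-classify@anthropic", at: "2025-01-05T09:30:00Z" },
  { eventId: "evt-3", skillId: "image-classify@anthropic", at: "2024-11-01T08:00:00Z" },
];

const exposures = remoteExposures(entries, skills, new Date("2024-12-15T00:00:00Z"));
console.log(exposures.map((e) => e.eventId)); // only evt-2 qualifies
```

An empty result over a verified chain is the "no remote API touched my instance" claim; a non-empty one lists exactly which events left the machine.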
What this is NOT
- Not a deployable moderation service. No connectors. No action dispatcher. No auth. Don't put it in front of any real instance.
- Not a model. Composes existing classifiers (toxic-bert, Claude, Voyage) under typed contracts. Trains nothing.
- Not statistically validated. The 31-event gold set demonstrates the calibration math is correct; per-channel sample sizes (1–15) are too small to make any actual claim about any skill's real-world accuracy.
- Not multimodal in the way that phrase usually means. Audio is fully unimplemented. Video is keyframe extraction plus per-frame image classification — no temporal reasoning, no audio track, no scene-change detection.
- Not a complete moderation toolkit. The dashboard, audit log, eval harness, and skill registry are real. Two of the seven `@inertial/agents-*` packages (audio, identity) are pure stubs that return `[]`; the remaining five (text, vision, video, context, cloud) ship real composition logic. Connector packages are placeholders; none ingest from a real source platform.
- Not a maintained OSS project. No CONTRIBUTING.md, no issue templates, no triage commitment. If you want to use any of this, fork it.
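For readers unfamiliar with the calibration math mentioned above, the two headline metrics can be sketched as follows for a single (skill, channel) group. This assumes binary labels and equal-width bins for ECE; the `Scored` shape and the sample rows are made up for illustration and are not taken from the gold set or from `@eval-kit/core`.

```typescript
// One scored prediction: the skill's probability and the gold label.
interface Scored {
  probability: number; // in [0, 1]
  label: 0 | 1;
}

// Brier score: mean squared difference between probability and outcome (lower is better).
function brier(rows: Scored[]): number {
  if (rows.length === 0) throw new Error("need at least one row");
  return rows.reduce((sum, r) => sum + (r.probability - r.label) ** 2, 0) / rows.length;
}

// Expected calibration error with equal-width bins: the size-weighted average gap
// between mean predicted probability and observed positive rate in each bin.
function ece(rows: Scored[], bins = 10): number {
  if (rows.length === 0) throw new Error("need at least one row");
  const sumP = new Array<number>(bins).fill(0);
  const sumY = new Array<number>(bins).fill(0);
  const count = new Array<number>(bins).fill(0);
  for (const r of rows) {
    const b = Math.min(bins - 1, Math.floor(r.probability * bins)); // probability 1.0 goes in the last bin
    sumP[b] += r.probability;
    sumY[b] += r.label;
    count[b] += 1;
  }
  let total = 0;
  for (let b = 0; b < bins; b++) {
    if (count[b] === 0) continue;
    total += (count[b] / rows.length) * Math.abs(sumP[b] / count[b] - sumY[b] / count[b]);
  }
  return total;
}

// Made-up sample rows.
const rows: Scored[] = [
  { probability: 0.9, label: 1 },
  { probability: 0.8, label: 1 },
  { probability: 0.3, label: 0 },
  { probability: 0.6, label: 0 },
];

console.log(brier(rows).toFixed(3)); // "0.125"
console.log(ece(rows).toFixed(3)); // "0.300"
```

With only four rows every bin holds a single prediction, so ECE here is just the mean absolute error. That is the same reason per-channel sample sizes of 1 to 15 cannot support accuracy claims.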
Skill tiers — what's actually demonstrated
The architecture supports four execution tiers. Three of the four are exercised today — Tier 2 (local server / Ollama) has no shipped skill yet. Honest mapping:
- Tier 0: `text-detect-spam-link` (regex URL detection), `text-context-author@local` (DB-backed author-history lookup)
- Tier 1: `text-classify-toxicity@local` (toxic-bert)
- Tier 2 (local server / Ollama, `:11434`): nothing yet; planned for the in-flight `vision-ollama` work
- Tier 3: `text-classify-toxicity@anthropic`, `image-classify@anthropic`, `text-embed@voyage`, video frame-by-frame (ffmpeg → `image-classify@anthropic` per keyframe)

Privacy posture is per-skill: Tier 0 / 1 never leave the machine; Tier 3 always does. The audit log records which model saw which event, so a federated mod can prove "no remote API touched my instance over the last 30 days", not as a promise but as a hash-chained artifact.
Local-first is not a magic bullet. For high-stakes content (minor detection, video understanding, audio harassment, coordinated attacks), cloud is currently the only adequate tier — and audio is unimplemented entirely. The point of inertial isn't to replace cloud — it's to make the routing legible and the data flow auditable.
Naming
The vocabulary comes from Philip K. Dick's Ubik (1969).
- inertial: in Ubik, "inertials" are anti-telepaths whose function is to neutralize harmful psychic intrusion on behalf of clients. The toolkit's sub-agents are inertials; each one neutralizes a class of harmful signal (toxicity, spam, NSFW, identity hate, brigading…) for the communities it serves.
- Runciter: Glen Runciter, the operator who runs the prudence organization that dispatches inertials. The orchestrator class in `@inertial/core` is `Runciter`; the host process is `apps/runciter`. Code reads as `runciter.dispatch(event)` → inertials emit `StructuredSignal`s.
- structured signals: what inertials emit. Probability + confidence + evidence pointers. Never verdicts. Policy turns signals into routing; humans turn routing into actions.
One rule: inertials emit signals; the Runciter dispatches them; humans decide.
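The one rule above can be sketched in a few lines. Everything here is an illustrative assumption rather than the actual `@inertial/core` API: the `Inertial` interface, the trimmed `ContentEvent` and `StructuredSignal` shapes, and the placeholder scores are all hypothetical.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface ContentEvent {
  id: string;
  text: string;
}

interface StructuredSignal {
  channel: string;
  probability: number;
  confidence: number;
  evidence: string[]; // pointers to what triggered the signal
}

interface Inertial {
  name: string;
  analyze(event: ContentEvent): Promise<StructuredSignal[]>;
}

// Regex URL detection, in the spirit of text-detect-spam-link. Scores are placeholders.
const spamLink: Inertial = {
  name: "text-detect-spam-link",
  async analyze(event) {
    const urls = event.text.match(/https?:\/\/\S+/g) ?? [];
    return urls.map((url) => ({ channel: "spam-link", probability: 1, confidence: 1, evidence: [url] }));
  },
};

// A stub inertial: emits nothing, like the audio and identity stubs.
const audioStub: Inertial = {
  name: "audio-stub",
  async analyze() {
    return [];
  },
};

class Runciter {
  private readonly inertials: Inertial[];

  constructor(inertials: Inertial[]) {
    this.inertials = inertials;
  }

  // Fan the event out to every inertial and collect signals. No verdict is produced here.
  async dispatch(event: ContentEvent): Promise<StructuredSignal[]> {
    const results = await Promise.all(this.inertials.map((i) => i.analyze(event)));
    return results.flat();
  }
}

const runciter = new Runciter([spamLink, audioStub]);
runciter
  .dispatch({ id: "evt-1", text: "limited offer http://spam.example/x act now" })
  .then((signals) => console.log(signals));
```

The orchestrator returns evidence only; turning those signals into a queue assignment is the policy layer's job, and turning that into an action is the reviewer's.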
Sibling kits
- eval-kit: evaluation framework for collaborative-task agents. Inertial uses `@eval-kit/ui` primitives in its eval cockpit; calibration scoring runs through `@eval-kit/core` (README: `@inertial/eval` wraps it).
- HITL-KIT: human-in-the-loop UI primitives. `@inertial/app`'s queue and review screens are built on `MiniTrace`, `HitlCard`, `BatchQueue`, `AiGenerationScale`, and `ApproveRejectRow` from the hitlkit.dev shadcn registry.
- tag-kit: domain-agnostic structured tagging primitives (catalog + scope-aware matching + PRF scoring + headless React `TagPicker` / `TagChip`). Inertial's reviewer-tag layer was extracted into tag-kit so other HITL annotation workflows (medical, legal, ML training data) can reuse the same substrate.
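As a rough illustration of scope-aware matching with PRF scoring, assuming "PRF" means precision / recall / F1 and that two tags match only when both tag id and scope agree. The `ScopedTag` shape, the `prf` helper, and the tag names are hypothetical, not tag-kit's API.

```typescript
// Hypothetical scoped tag: a tag id plus the modality or segment it applies to.
interface ScopedTag {
  tag: string;
  scope: string; // e.g. "video", "audio", or a segment id
}

function key(t: ScopedTag): string {
  return `${t.tag}|${t.scope}`;
}

// Precision, recall, and F1 of predicted tags against reference tags.
// A tag counts as a hit only when both its id and its scope match.
function prf(predicted: ScopedTag[], reference: ScopedTag[]): { precision: number; recall: number; f1: number } {
  const pred = new Set(predicted.map(key));
  const ref = new Set(reference.map(key));
  const truePositives = Array.from(pred).filter((k) => ref.has(k)).length;
  const precision = pred.size === 0 ? 0 : truePositives / pred.size;
  const recall = ref.size === 0 ? 0 : truePositives / ref.size;
  const f1 = precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

// Same tag id, different scope: "harassment" on audio does not match "harassment" on video.
const predicted: ScopedTag[] = [
  { tag: "violence", scope: "video" },
  { tag: "harassment", scope: "audio" },
];
const reference: ScopedTag[] = [
  { tag: "violence", scope: "video" },
  { tag: "harassment", scope: "video" },
];

console.log(prf(predicted, reference)); // { precision: 0.5, recall: 0.5, f1: 0.5 }
```

Scoring on the (tag, scope) pair is what keeps a mixed-validity asset from being graded as a whole-asset hit or miss.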
Inertials emit signals. The Runciter dispatches them. Humans decide. Every routing decision is a policy rule you can read, every signal is evidence you can inspect, and every state transition is an entry in a hash-chained log. The architecture is the argument — and the point of the project is to demonstrate it end-to-end through real code, not to ship a moderation service.