inertial

A queue review session: choosing Approve, Remove, or Escalate commits the decision and every applied tag to the hash-chained audit log

github.com/akaieuan/inertial-moderation-tool · Read the README

MIT · Node ≥20 · pnpm 10 · TypeScript

Repo, install instructions, the architecture diagram, the policy DSL, and the honest capability matrix all live in the README.

Reference architecture · portfolio work · MIT

Inertial

A reference architecture for auditable AI content review. Not a deployable moderation service — a working demonstration of one architectural thesis, end-to-end through real code.

Status

Reference architecture, not deployable. Schemas, audit chain, eval harness, skill registry, and reviewer dashboard are real and tested. Connectors are stubbed. Action dispatch is unimplemented. No auth. The 31-event gold set is too small for statistical claims; it's there to demonstrate the calibration math, not to certify any skill's accuracy.

The thesis

AI classification outputs and human review actions should both land in a hash-chained audit log, with typed structured signals as the unit of evidence and per-instance YAML as the unit of policy. Inertial is that thesis demonstrated end-to-end through real code.

The system is two products in one monorepo. @inertial/* is a toolkit of orchestration, persistence, policy, and HITL primitives, and a sibling to eval-kit and HITL-KIT. @inertial/app is an Electron + React + Tailwind reference dashboard for moderators, built on HITL-KIT.

This is portfolio work, not a maintained OSS project. The point is the architecture choices and where they hold up — not feature completeness.

What's real

Schema-first Zod contracts across 33 typed shapes (README's inventory centers on 12+ primary schemas in @inertial/schemas): ContentEvent, StructuredSignal, AgentTrace, ReviewItem, ReviewDecision, Policy, AuditEntry, SkillRegistration, GoldEvent, EvalRun, SkillCalibration, ReviewerTag + scope, TagAgreement.
A skill / tool registry with a catalog plus a per-instance registration table. Adding a skill is a registration row, not a code change. Reviewers wire Voyage / Anthropic / etc. without touching YAML.
A YAML policy evaluator with hash-chained /verify. Per-instance, structured AST (no string eval). Leaves are channel + op + value or entity + present; nodes compose with all / any; first match wins. The AST is preserved next to the rule id in the audit log so any decision can be traced back to the exact configuration that produced it.
An eval harness scoring per-(skill, channel) Brier / ECE / agreement. pnpm eval boots an in-memory pipeline, dispatches the 31-event gold set (config/evals/gold-set-v1.jsonl: 27 text + 3 image + 1 video) against the live skill registry, and prints calibration as a hash-chained artifact, not vibes. Reviewer commits auto-promote via signalFeedback + reviewerTags into gold_events (source: reviewer-derived), so the corpus grows on every decision.
A reviewer-tag layer with per-modality / per-segment scope. TAG_CATALOG ships ~18 starter tags; reviewer_tags stores them with scope; the "good video, bad audio" mixed-validity case gets a precise label, not a whole-asset verdict.
A reviewer dashboard wired to all of it. Three-deck queue (Quick / Deep / Escalation), inline review session (not a modal), per-channel evidence chips, video keyframe strip with timestamps + per-frame top-channel score, author history, similar events via Voyage embeddings + pgvector, reviewer-tag picker filtered to the event's modalities, side panels (Chat / Notes / Agent activity) docked edge-to-edge.
Hash-chained audit, in code, with tests. @inertial/db is 14 tables on Postgres + pgvector with a pglite dev factory and 68 hermetic integration tests. Every state transition writes one entry per instance with prevHash → hash linkage. "No remote API touched my instance over the last 30 days" becomes a SQL query, not a vendor promise.
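
The prevHash → hash linkage described in the last item can be sketched in a few lines. This is a minimal illustration, not the @inertial/db implementation: the entry fields other than prevHash and hash, the function names, and the use of SHA-256 over a JSON encoding are all assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical, simplified entry shape. The real AuditEntry schema lives in
// @inertial/schemas and is not reproduced here.
interface AuditEntry {
  instanceId: string;
  kind: string; // e.g. "review.decided" (illustrative name)
  payload: unknown;
  prevHash: string | null; // null only for the first entry of a chain
  hash: string;
}

// The hash covers the previous hash plus this entry's content, so editing
// any earlier entry invalidates every hash after it. JSON.stringify is
// key-order sensitive; a real implementation would canonicalize the payload.
function entryHash(prevHash: string | null, instanceId: string, kind: string, payload: unknown): string {
  return createHash("sha256")
    .update(JSON.stringify([prevHash, instanceId, kind, payload]))
    .digest("hex");
}

// Returns a new chain with one more entry linked to the current tail.
function appendEntry(chain: AuditEntry[], instanceId: string, kind: string, payload: unknown): AuditEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : null;
  const hash = entryHash(prevHash, instanceId, kind, payload);
  return [...chain, { instanceId, kind, payload, prevHash, hash }];
}

// Returns the index of the first broken entry, or -1 if the chain is intact.
function verifyChain(chain: AuditEntry[]): number {
  let expectedPrev: string | null = null;
  let index = 0;
  for (const entry of chain) {
    if (entry.prevHash !== expectedPrev) return index;
    if (entry.hash !== entryHash(entry.prevHash, entry.instanceId, entry.kind, entry.payload)) return index;
    expectedPrev = entry.hash;
    index += 1;
  }
  return -1;
}
```

A verifier of this kind walks the chain once per instance; changing the payload of any stored entry makes verifyChain report that entry's index, because its stored hash no longer matches its content.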

What's stubbed (deliberately)

Every source connector — Mastodon (ActivityPub), Bluesky (AT Protocol), Lemmy, Discord, Slack, and the generic webhook package (sdk-webhook). All four connector packages are interface stubs with no real ingestion. Without these, no real platform's events ever reach the runciter; the system can only process events posted directly to POST /v1/events by a script. This is the single biggest gap between this project and a moderation tool.
The action dispatcher that pushes decisions back to source platforms. A moderation system that can't act on its decisions is a logging system.
@inertial/agents-audio and @inertial/agents-identity: each is a single stub class whose analyze returns []. Image still flows through image-classify@anthropic; video is handled by local ffmpeg keyframe extraction, with each keyframe passed to that same classifier. The package-level vision-* inertials are listed as empty stubs in the README capability table.
Gateway media download + perceptual hashing — not implemented (README architecture diagram: media download + phash TODO). Honest framing: "multimodal" is text + image + frame-grabbed video; audio has no transcription or classifier path yet.
Auth and observability layers. Anyone who can reach localhost:4001 can register skills, kick off eval runs, or delete review items. Don't run this in front of any real instance.

The architecture diagram in the README documents the target shape. What runs today is a verification substrate with a reference UI on top.

Why I built this

Commercial moderation APIs claim accuracy without proof and ship verdicts without evidence. Federated mods distrust them because they can't audit them; centralized compliance teams need defensible records they can show regulators. Both want decomposed, evidence-rich decisions; neither has them.

I wanted to know: what would a substrate for that look like — schemas, audit log, skill registry, eval harness, reviewer surface — wired together end-to-end with real code rather than a slide deck. So I built it. Four claims:

Inertials (sub-agents) emit typed structured signals, not verdicts. Probability + confidence + evidence pointers. The policy layer turns signals into routing; humans turn routing into actions.
Per-instance YAML policy so federation is a first-class case, not an afterthought. The same code serves a wide-open community and a high-compliance enterprise because the operator brings their own rules.
Per-skill privacy posture lives in the schema. A skill is either dataLeavesMachine: true or false. The audit chain records which model saw which event, so privacy claims become hash-chained artifacts.
Reviewer decisions auto-promote into the eval gold set. Every commit grows the calibration corpus by one structured row, so the system improves at measuring itself.
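
To make the policy claim concrete, here is a minimal sketch of a first-match-wins evaluator over the kind of structured AST described earlier (leaves are channel + op + value or entity + present; nodes compose with all / any). The rule fields (id, when, route), the operator names, and the SignalSummary shape are assumptions for illustration; the YAML is shown already parsed into plain objects, and the real Policy schema may differ.

```typescript
// Hypothetical AST shapes inferred from the prose description.
type Op = "gt" | "gte" | "lt" | "lte";

type Condition =
  | { channel: string; op: Op; value: number }
  | { entity: string; present: boolean }
  | { all: Condition[] }
  | { any: Condition[] };

interface Rule {
  id: string;
  when: Condition;
  route: string; // e.g. a review deck name (illustrative)
}

interface SignalSummary {
  channels: Record<string, number>; // channel name -> probability
  entities: string[];
}

// Pure structural evaluation: no string eval, only typed comparisons.
function matches(cond: Condition, signals: SignalSummary): boolean {
  if ("all" in cond) return cond.all.every((c) => matches(c, signals));
  if ("any" in cond) return cond.any.some((c) => matches(c, signals));
  if ("entity" in cond) {
    const { entity, present } = cond;
    return signals.entities.includes(entity) === present;
  }
  const p = signals.channels[cond.channel];
  if (p === undefined) return false; // a missing channel never matches
  switch (cond.op) {
    case "gt": return p > cond.value;
    case "gte": return p >= cond.value;
    case "lt": return p < cond.value;
    case "lte": return p <= cond.value;
  }
}

// First match wins: rules are tried in order and evaluation stops at the
// first rule whose condition holds.
function evaluatePolicy(rules: Rule[], signals: SignalSummary): { ruleId: string; route: string } | null {
  for (const rule of rules) {
    if (matches(rule.when, signals)) return { ruleId: rule.id, route: rule.route };
  }
  return null;
}
```

Returning the rule id alongside the route is what lets a decision be traced back to the rule that produced it; storing the rule's AST next to that id, as the audit log is described as doing, would be a small extension of the return value.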

What this is NOT

Not a deployable moderation service. No connectors. No action dispatcher. No auth. Don't put it in front of any real instance.
Not a model. Composes existing classifiers (toxic-bert, Claude, Voyage) under typed contracts. Trains nothing.
Not statistically validated. The 31-event gold set demonstrates the calibration math is correct; per-channel sample sizes (1–15) are too small to make any actual claim about any skill's real-world accuracy.
Not multimodal in the way that phrase usually means. Audio is fully unimplemented. Video is keyframe extraction plus per-frame image classification — no temporal reasoning, no audio track, no scene-change detection.
Not a complete moderation toolkit. The dashboard, audit log, eval harness, and skill registry are real. Two of the seven @inertial/agents-* packages (audio, identity) are pure stubs that return []; the remaining five (text, vision, video, context, cloud) ship real composition logic. Connector packages are placeholders — none ingest from a real source platform.
Not a maintained OSS project. No CONTRIBUTING.md, no issue templates, no triage commitment. If you want to use any of this, fork it.
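
For readers unfamiliar with the calibration math mentioned above, the following sketch computes a Brier score and an expected calibration error (ECE) for one (skill, channel) pair. It uses the standard textbook definitions with equal-width probability bins; the row shape, function names, and the choice of ten bins are assumptions, not the eval harness's actual code.

```typescript
// One scored gold-set row: the skill's predicted probability for a channel
// and the reviewer-derived ground truth. Field names are hypothetical.
interface ScoredRow {
  p: number; // predicted probability in [0, 1]
  label: 0 | 1; // 1 if the channel truly applies
}

// Brier score: mean squared difference between prediction and outcome.
// 0 is perfect; always predicting 0.5 scores 0.25.
function brierScore(rows: ScoredRow[]): number {
  if (rows.length === 0) return NaN;
  const total = rows.reduce((sum, r) => sum + (r.p - r.label) ** 2, 0);
  return total / rows.length;
}

// ECE: bucket predictions into equal-width bins, then take the
// size-weighted average of |observed positive rate - mean predicted p|.
function expectedCalibrationError(rows: ScoredRow[], binCount = 10): number {
  if (rows.length === 0) return NaN;
  const bins = Array.from({ length: binCount }, () => ({ n: 0, sumP: 0, sumY: 0 }));
  for (const r of rows) {
    // Clamp so p = 1 lands in the last bin and out-of-range inputs stay in bounds.
    const idx = Math.max(0, Math.min(binCount - 1, Math.floor(r.p * binCount)));
    const bin = bins[idx]!;
    bin.n += 1;
    bin.sumP += r.p;
    bin.sumY += r.label;
  }
  let ece = 0;
  for (const bin of bins) {
    if (bin.n === 0) continue; // empty bins contribute nothing
    ece += (bin.n / rows.length) * Math.abs(bin.sumY / bin.n - bin.sumP / bin.n);
  }
  return ece;
}
```

With per-channel sample sizes of 1 to 15, most of the ten bins are empty and each occupied bin holds only a handful of rows, which is why the numbers demonstrate the arithmetic rather than any skill's accuracy.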

Skill tiers — what's actually demonstrated

The architecture supports four execution tiers. Three of the four are exercised today — Tier 2 (local server / Ollama) has no shipped skill yet. Honest mapping:

Tier 0 · In-process JS · text-detect-spam-link (regex URL detection), text-context-author@local (DB-backed author-history lookup)

Tier 1 · Local WASM (transformers.js / ONNX) · text-classify-toxicity@local (toxic-bert)

Tier 2 · Local server (Ollama @ :11434) · nothing yet; planned for the in-flight vision-ollama work

Tier 3 · Cloud · text-classify-toxicity@anthropic, image-classify@anthropic, text-embed@voyage, video frame-by-frame (ffmpeg → image-classify@anthropic per keyframe)

Privacy posture is per-skill: Tier 0 and Tier 1 skills never send data off the machine; Tier 3 skills always do. The audit log records which model saw which event, so a federated mod can prove "no remote API touched my instance over the last 30 days" as a hash-chained artifact rather than as a promise.
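
The "no remote API touched my instance" check can be illustrated as a filter over audit entries joined against skill registrations. The project describes this as a SQL query; the sketch below expresses the same idea in memory so it stays self-contained. Only dataLeavesMachine comes from the text above; the other field names, the function name, and the choice to treat unregistered skills as remote are assumptions.

```typescript
// Hypothetical, trimmed-down shapes for illustration.
interface SkillRegistration {
  skillId: string;
  dataLeavesMachine: boolean;
}

interface SkillRunEntry {
  instanceId: string;
  skillId: string;
  eventId: string;
  at: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns every audit entry in the window where a skill that sends data off
// the machine ran for this instance. An empty result supports the claim,
// provided the audit chain itself verifies.
function remoteTouches(
  entries: SkillRunEntry[],
  skills: SkillRegistration[],
  instanceId: string,
  now: Date,
  windowDays = 30,
): SkillRunEntry[] {
  const cutoff = now.getTime() - windowDays * DAY_MS;
  // Conservative choice: only skills explicitly registered as local count as
  // local, so an unknown skill id is flagged rather than silently trusted.
  const localSkills = new Set(skills.filter((s) => !s.dataLeavesMachine).map((s) => s.skillId));
  return entries.filter(
    (e) => e.instanceId === instanceId && e.at.getTime() >= cutoff && !localSkills.has(e.skillId),
  );
}
```

The result is a list of offending entries rather than a boolean, so a failed check points directly at the events and skills involved.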

Local-first is not a magic bullet. For high-stakes content (minor detection, video understanding, audio harassment, coordinated attacks), cloud is currently the only adequate tier — and audio is unimplemented entirely. The point of inertial isn't to replace cloud — it's to make the routing legible and the data flow auditable.

Naming

The vocabulary comes from Philip K. Dick's Ubik (1969).

inertial — in Ubik, "inertials" are anti-telepaths whose function is to neutralize harmful psychic intrusion on behalf of clients. The toolkit's sub-agents are inertials — each one neutralizes a class of harmful signal (toxicity, spam, NSFW, identity hate, brigading…) for the communities it serves.
Runciter — Glen Runciter, the operator who runs the prudence organization that dispatches inertials. The orchestrator class in @inertial/core is Runciter; the host process is apps/runciter. Code reads as runciter.dispatch(event) → inertials emit StructuredSignals.
structured signals — what inertials emit. Probability + confidence + evidence pointers. Never verdicts. Policy turns signals into routing; humans turn routing into actions.

One rule: inertials emit signals; the Runciter dispatches them; humans decide.
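
The rule above maps onto a small amount of code. The sketch below is an assumption-laden illustration rather than the @inertial/core API: Runciter, dispatch, analyze, and StructuredSignal are names from the text, while every field name, the channel name, and the probability value are hypothetical.

```typescript
// Hypothetical minimal shapes; the real schemas live in @inertial/schemas.
interface ContentEvent {
  id: string;
  modalities: string[];
  text?: string;
}

interface StructuredSignal {
  skillId: string;
  channel: string;
  probability: number; // how likely the channel applies
  confidence: number; // how much the skill trusts its own estimate
  evidence: string[]; // pointers back into the content
}

interface Inertial {
  readonly skillId: string;
  readonly modalities: string[];
  analyze(event: ContentEvent): Promise<StructuredSignal[]>;
}

// A Tier 0 style inertial: regex URL detection. It reports what it saw and
// never says "remove" or "approve".
const spamLink: Inertial = {
  skillId: "text-detect-spam-link",
  modalities: ["text"],
  async analyze(event) {
    const urls: string[] = (event.text ?? "").match(/https?:\/\/\S+/g) ?? [];
    if (urls.length === 0) return [];
    // The probability here is illustrative, not a calibrated number.
    return [{ skillId: "text-detect-spam-link", channel: "spam-link", probability: 0.6, confidence: 1, evidence: urls }];
  },
};

// Mirrors the stubbed packages: analyze returns no signals.
const audioStub: Inertial = {
  skillId: "audio-stub",
  modalities: ["audio"],
  async analyze() {
    return [];
  },
};

class Runciter {
  private readonly inertials: Inertial[];

  constructor(inertials: Inertial[]) {
    this.inertials = inertials;
  }

  // Fan the event out to every inertial that handles one of its modalities
  // and collect the signals. No verdict is produced at this layer.
  async dispatch(event: ContentEvent): Promise<StructuredSignal[]> {
    const applicable = this.inertials.filter((i) => i.modalities.some((m) => event.modalities.includes(m)));
    const batches = await Promise.all(applicable.map((i) => i.analyze(event)));
    return batches.reduce<StructuredSignal[]>((all, batch) => all.concat(batch), []);
  }
}
```

Keeping the return type a flat list of signals is the design point: routing is left to the policy layer and the final action to a human reviewer.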

Sibling kits

eval-kit — evaluation framework for collaborative-task agents. Inertial uses @eval-kit/ui primitives in its eval cockpit; calibration scoring runs through @eval-kit/core (README: @inertial/eval wraps it).
HITL-KIT — human-in-the-loop UI primitives. @inertial/app's queue and review screens are built on MiniTrace, HitlCard, BatchQueue, AiGenerationScale, and ApproveRejectRow from the hitlkit.dev shadcn registry.
tag-kit — domain-agnostic structured tagging primitives (catalog + scope-aware matching + PRF scoring + headless React TagPicker / TagChip). Inertial's reviewer-tag layer was extracted into tag-kit so other HITL annotation workflows (medical, legal, ML training data) can reuse the same substrate.
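
As a rough illustration of what scope-aware matching plus PRF (precision / recall / F1) scoring can mean, the sketch below compares one annotator's scoped tags against a reference set. It is not tag-kit's API: the ScopedTag fields, the exact-scope matching rule, and the function names are assumptions.

```typescript
// Hypothetical scoped tag: a catalog tag applied to one modality and,
// optionally, one segment of the asset.
interface ScopedTag {
  tagId: string;
  modality: "text" | "image" | "video" | "audio";
  segment?: string; // e.g. a time range; absent means the whole modality
}

// Two tags match only if tag id, modality, and segment all agree.
function scopeKey(tag: ScopedTag): string {
  return [tag.tagId, tag.modality, tag.segment ?? "*"].join("|");
}

interface Prf {
  precision: number;
  recall: number;
  f1: number;
}

// Set-based precision / recall / F1 of predicted tags against reference tags.
function scoreTags(predicted: ScopedTag[], reference: ScopedTag[]): Prf {
  const pred = new Set(predicted.map(scopeKey));
  const ref = new Set(reference.map(scopeKey));
  let truePositives = 0;
  pred.forEach((k) => {
    if (ref.has(k)) truePositives += 1;
  });
  const precision = pred.size === 0 ? 0 : truePositives / pred.size;
  const recall = ref.size === 0 ? 0 : truePositives / ref.size;
  const f1 = precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}
```

Under this matching rule, tagging the video track when the reference tags the audio track counts as a miss, which is the "good video, bad audio" distinction a whole-asset verdict cannot express.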

The reviewer surface, in frames

The queue: three decks, click to review inline; organized as decks rather than an infinite list

Pipelines — wire up the dispatch flow: a visual canvas of the routing graph beside the active per-instance configs

Skills — what the Runciter is allowed to do: catalog + per-instance registration, with per-skill status

The create sheet — registering a new classifier with its typed signal contract

Compliance — shadow agreement between AI and human decisions, over the hash-chained audit feed

Insights — per-skill calibration (Brier / ECE / agreement) against the gold set, the reviewer-tag corpus, and eval-run history; Run eval fires a live calibration pass

Side panels — chat, notes, and agent activity docked edge-to-edge, so the dashboard reads as one app instead of seven views

Inertials emit signals. The Runciter dispatches them. Humans decide. Every routing decision is a policy rule you can read, every signal is evidence you can inspect, and every state transition is an entry in a hash-chained log. The architecture is the argument — and the point of the project is to demonstrate it end-to-end through real code, not to ship a moderation service.