Run your first loop in 15 minutes
By the end of this tutorial you’ll have stood up a sovereign knowledge base, served it to an agent over MCP, and watched a query come back with the exact edges it traversed — the explainable retrieval that makes Dossier’s memory trustworthy. The whole path runs from this repository, offline.
We’ll build it in four moves: provision a tenant, serve a real OKF knowledge base, query it back, and curate what you find. Then you’ll learn how to ingest your own sources.
Before you start
You’ll need:

- Node.js ≥ 22 and git on your PATH. Check with node --version and git --version.
- pnpm (this is a pnpm workspace).
- A clone of the Dossier repository, built once:

      git clone https://github.com/twofoldtech-dakota/dossier.git
      cd dossier
      pnpm install
      pnpm build

  pnpm build compiles the CLIs you’ll run below (dossier-runtime, dossier-mcp). Run every command from the repository root.
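The prerequisite checks above can be bundled into one small script. This is a minimal sketch and not part of Dossier: the function names node_major and check_prereqs are invented for illustration, and it assumes node --version prints a string shaped like v22.3.0.

```shell
# Hypothetical helper, not shipped with Dossier: verifies this tutorial's prerequisites.

# Extract the major version from a string like "v22.3.0".
node_major() {
  local version="${1#v}"            # drop the leading "v"
  printf '%s\n' "${version%%.*}"    # keep everything before the first dot
}

# Return 0 when node, git, and pnpm are on PATH and Node.js is 22 or newer; 1 otherwise.
check_prereqs() {
  local ok=0 tool major
  for tool in node git pnpm; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool is not on your PATH" >&2
      ok=1
    fi
  done
  if command -v node >/dev/null 2>&1; then
    major="$(node_major "$(node --version)")"
    if [ "$major" -lt 22 ]; then
      echo "too old: Node.js $major found, 22 or newer required" >&2
      ok=1
    fi
  fi
  return "$ok"
}
```

Source the file and run check_prereqs before pnpm install. A zero exit status means all three tools were found and Node.js is new enough.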
1. Provision a tenant.

   A tenant is one client’s siloed workspace: its own OKF repo, a manifest, and, by default, its own git history. This runs fully offline.

       node ./packages/runtime/dist/cli.js provision \
         --root ./clients \
         --client acme-co

   You’ll see the silo created and a JSON record on stdout:

       dossier-runtime: provisioned "acme-co" at ./clients/acme-co (vcs:git).

   Confirm it landed:

       node ./packages/runtime/dist/cli.js list --root ./clients

2. Serve a real knowledge base over MCP.

   To see retrieval work right now, serve the DXA reference knowledge base that ships with the repo: 53 real OKF atoms with 174 typed edges, no network, no key.

       node ./packages/mcp/dist/cli.js \
         --repo ./verticals/digital-experience-agency \
         --client dxa \
         --known-external-ids knowledge-model

   The server announces what it loaded, then waits for MCP requests on stdio:

       dossier-mcp: serving tenant "dxa" — 53 atom(s), 174 edge(s), 0 load error(s), 0 graph error(s).
       dossier-mcp: connected on stdio — awaiting MCP requests.

   This is a live Model Context Protocol server. It exposes five tools over one tenant’s repo: search_concepts, get_concept, get_related, list_concepts, and kb_health.

3. Query your knowledge back, explainably.

   Point any MCP client at the server. The fastest way to see it is to add it to Claude Code and ask a question in plain language:

       claude mcp add dossier-dxa -- \
         node ./packages/mcp/dist/cli.js \
         --repo ./verticals/digital-experience-agency \
         --client dxa \
         --known-external-ids knowledge-model

   Then, in Claude Code, ask: “Search the dxa knowledge base for the discovery process, then show me what it’s related to.” Claude calls search_concepts, then get_related, and the answer comes back with the typed edges traversed, for example:

       search_concepts("discovery process") →
         dxa-discovery (process) score 3.04
         dxa-discovery-report (artifact) score 2.99
       get_related("dxa-discovery", depth 1) → 9 neighbours, via edges:
         dxa-discovery —[owner]→ dxa-solution-architect
         dxa-discovery —[uses]→ dxa-work-management
         dxa-discovery —[produces]→ dxa-discovery-report

   That —[produces]→ edge is the point: the answer isn’t a similarity guess, it’s a walk of the real graph, and you can see why each neighbour came back. That’s explainable GraphRAG.

4. Curate: keep the human in the loop.

   The atoms you just queried are plain files in verticals/digital-experience-agency/. Open one, say processes/dxa-discovery.md, and you’ll see its frontmatter: confidence, source, and the typed edges (owner, uses, produces). To curate is to edit the file and commit it: change a fact, promote an atom from inferred to verified, fix an edge. The repo is the system of record, so curation is just a git commit. There’s no separate database to keep in sync.
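Since curation is an edit plus a commit, the cycle can be rehearsed in a throwaway repository before touching real atoms. This is a sketch under loud assumptions: the frontmatter layout and the source value below are invented for illustration (the tutorial only says atoms carry confidence, source, and typed edges), and the temporary repo merely stands in for verticals/digital-experience-agency/.

```shell
# Rehearse "curation is just a git commit" in a scratch repo.
# ASSUMPTION: this frontmatter shape is illustrative, not Dossier's actual OKF schema.
demo_repo="$(mktemp -d)"
atom="processes/dxa-discovery.md"

# Inline identity flags keep the demo independent of your global git config.
git_demo() {
  git -C "$demo_repo" -c user.name="Curator" -c user.email="curator@example.com" "$@"
}

git_demo init --quiet
mkdir -p "$demo_repo/processes"
cat > "$demo_repo/$atom" <<'EOF'
---
confidence: inferred
source: kickoff-notes
produces: dxa-discovery-report
---
Discovery process (placeholder body).
EOF
git_demo add "$atom"
git_demo commit --quiet -m "Add dxa-discovery atom"

# Curate: promote the atom from inferred to verified, review the change, commit it.
sed 's/^confidence: inferred$/confidence: verified/' "$demo_repo/$atom" > "$demo_repo/$atom.tmp"
mv "$demo_repo/$atom.tmp" "$demo_repo/$atom"
git_demo diff --stat
git_demo commit --quiet -am "Promote dxa-discovery to verified"

# Two commits: the original atom and the curation.
git_demo log --oneline
```

Against a real knowledge base the same three moves apply: edit the file, read the diff, commit. The log then doubles as the audit trail for every curated fact.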
Now point it at your own knowledge
You’ve proven the serve-and-query half of the loop. The other half — turning your raw documents into atoms — is the extract step, and it’s the one place a model runs.
The run verb does the whole loop in one command: ingest a source → extract OKF →
emit atoms → commit. Because extraction calls a model, it needs a transport. Pick the
one you have:
If you have the claude CLI on your PATH and an active subscription, extract with --subscription, no API key needed:

    node ./packages/runtime/dist/cli.js run \
      --root ./clients \
      --client acme-co \
      --source-dir ./my-docs \
      --subscription

Otherwise, set ANTHROPIC_API_KEY and run the same command without --subscription:

    ANTHROPIC_API_KEY=sk-ant-... \
      node ./packages/runtime/dist/cli.js run \
      --root ./clients \
      --client acme-co \
      --source-dir ./my-docs

Here, ./my-docs is a directory of clean markdown or HTML. (Prefer to learn a public website instead? Swap --source-dir ./my-docs for --url https://example.com; the default web crawler is keyless.)
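The choice between the two transports can be scripted. The sketch below is a hypothetical wrapper, not a Dossier feature: pick_transport and run_extract are invented names, and finding claude on your PATH does not prove the subscription is active, so treat the result as a first guess.

```shell
# Hypothetical helper: decide which extraction transport this shell can use.
# Prints "subscription", "api-key", or "none".
pick_transport() {
  if command -v claude >/dev/null 2>&1; then
    echo "subscription"   # claude CLI found; an active subscription is still assumed
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "api-key"        # key is present in the environment
  else
    echo "none"
  fi
}

# Run the tutorial's extract command with the flag that matches the transport.
# Only flags shown in this tutorial are used; the tenant name is the tutorial's acme-co.
run_extract() {
  local source_dir="$1"
  set -- node ./packages/runtime/dist/cli.js run \
    --root ./clients --client acme-co --source-dir "$source_dir"
  case "$(pick_transport)" in
    subscription) "$@" --subscription ;;
    api-key)      "$@" ;;
    none)
      echo "no transport: install the claude CLI or set ANTHROPIC_API_KEY" >&2
      return 1
      ;;
  esac
}
```

From the repository root, run_extract ./my-docs then behaves like whichever of the two commands above applies to your machine.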
When it finishes, your tenant’s OKF repo holds freshly extracted atoms, committed as one diff in the git history of ./clients/acme-co/. Serve it the same way you served the reference KB in step 2; just point --repo at your tenant’s OKF repo:

    node ./packages/mcp/dist/cli.js \
      --repo ./clients/acme-co/okf \
      --client acme-co

That’s the full loop: ingest → extract → OKF → serve → query → curate, and every pass is a diff in your own git history.
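Because the server speaks MCP over stdio, you can also probe it without an MCP client. The sketch below prints the three JSON-RPC messages that open an MCP session and request the tool list. The message shapes follow the public MCP specification; the function name, the protocol version string, and the client name are choices made for this example. How dossier-mcp behaves when stdin closes is not covered in this tutorial, so a real client that keeps the pipe open remains the reliable path.

```shell
# Emit a minimal MCP session as newline-delimited JSON-RPC:
# initialize, the initialized notification, then tools/list.
# ASSUMPTION: protocolVersion and clientInfo are illustrative values; adjust them
# to whatever the server accepts.
mcp_list_tools_messages() {
  printf '%s\n' \
    '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"tutorial-probe","version":"0.0.1"}}}' \
    '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
    '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
}

# Usage (from the repository root):
#   mcp_list_tools_messages | node ./packages/mcp/dist/cli.js \
#     --repo ./clients/acme-co/okf --client acme-co
```

If the handshake succeeds, the reply to id 2 should describe the server's tools together with their input schemas, which is the place to look up argument names before calling a tool such as search_concepts.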
What you built
A clients/acme-co/okf/ directory of OKF atoms in your own git — cat-able,
git clone-able, yours forever — and a live, explainable retrieval server over it. The
indexes are caches; the files are the truth.
Next steps
- Install the Claude Code plugin — run this whole loop for a client from inside Claude Code, with the expert agent team and the decision-capture hook.
- The dossier-runtime verb set — every verb, flag, and guarantee, in full.
- What is Dossier? — the loop and the three faces, explained.