Run your first loop in 15 minutes

By the end of this tutorial you’ll have stood up a sovereign knowledge base, served it to an agent over MCP, and watched a query come back with the exact edges it traversed — the explainable retrieval that makes Dossier’s memory trustworthy. The whole path runs from this repository, offline.

We’ll build it in four moves: provision a tenant, serve a real OKF knowledge base, query it back, and curate it. Then you’ll learn how to ingest your own sources.

Before you start

You’ll need:

Node.js ≥ 22 and git on your PATH. Check with node --version and git --version.
pnpm (this is a pnpm workspace).
A clone of the Dossier repository, built once:
```
git clone https://github.com/twofoldtech-dakota/dossier.git
cd dossier
pnpm install
pnpm build
```
pnpm build compiles the CLIs you’ll run below (dossier-runtime, dossier-mcp). Run every command from the repository root.
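
Before provisioning, it can help to confirm the prerequisites in one pass. The following is a minimal sketch, not something shipped with Dossier: node_major and preflight are hypothetical helper names, and the only facts it relies on are the requirements listed above (Node.js 22 or newer, plus git and pnpm on your PATH).

```shell
# Hypothetical preflight helpers; not part of the Dossier CLIs.

# Print the major version from a string such as "v22.3.0".
node_major() {
  version="${1#v}"               # drop the leading "v"
  printf '%s\n' "${version%%.*}" # keep everything before the first dot
}

# Return 0 when node >= 22, git, and pnpm are all available.
preflight() {
  for tool in node git pnpm; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      return 1
    fi
  done
  if [ "$(node_major "$(node --version)")" -lt 22 ]; then
    echo "Node.js >= 22 required, found $(node --version)" >&2
    return 1
  fi
  echo "prerequisites look fine"
}
```

Call preflight from the repository root before you run the steps below; it names the first missing tool instead of letting a later command fail.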

Steps

Provision a tenant.

A tenant is one client’s siloed workspace: its own OKF repo, a manifest, and — by default — its own git history. This runs fully offline.
```
node ./packages/runtime/dist/cli.js provision \
  --root ./clients \
  --client acme-co
```
You’ll see the silo created and a JSON record on stdout:
```
dossier-runtime: provisioned "acme-co" at ./clients/acme-co (vcs:git).
```
Confirm it landed:
```
node ./packages/runtime/dist/cli.js list --root ./clients
```
One client = one repo = one tenant. The isolation boundary is the file boundary, not a query filter: a server only ever reads the one repo it’s pointed at. That’s how a single agency can run many clients’ loops without ever crossing the streams.
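
To see the file boundary concretely, you can ask git where a tenant's repository starts. This is a hedged sketch: tenant_repo_root is a hypothetical helper, and it assumes the default vcs:git mode placed a git repository at ./clients/acme-co, as the provision output above suggests.

```shell
# Hypothetical helper: print the top-level directory of the git repository
# that contains the given tenant directory.
tenant_repo_root() {
  git -C "$1" rev-parse --show-toplevel
}

# Only runs once the tutorial's tenant exists.
if [ -d ./clients/acme-co ]; then
  tenant_repo_root ./clients/acme-co
fi
```

If a second tenant provisioned under the same --root prints a different path, the two silos share no git history. If both print the Dossier checkout itself, the assumption about where the tenant repository lives does not hold for your setup.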
Serve a real knowledge base over MCP.

To see retrieval work right now, serve the DXA reference knowledge base that ships with the repo — 53 real OKF atoms with 174 typed edges, no network, no key:
```
node ./packages/mcp/dist/cli.js \
  --repo ./verticals/digital-experience-agency \
  --client dxa \
  --known-external-ids knowledge-model
```
The server announces what it loaded, then waits for MCP requests on stdio:
```
dossier-mcp: serving tenant "dxa" — 53 atom(s), 174 edge(s), 0 load error(s), 0 graph error(s).
dossier-mcp: connected on stdio — awaiting MCP requests.
```
This is a live Model Context Protocol server. It exposes five tools over one tenant’s repo: search_concepts, get_concept, get_related, list_concepts, and kb_health.
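
Because the server speaks MCP over stdio, you can probe it without a full client by piping JSON-RPC messages to it. This is a minimal sketch under stated assumptions: the server reads newline-delimited JSON-RPC as the MCP stdio transport describes, it answers pending requests before exiting when stdin closes, and it accepts the protocolVersion string used here. mcp_probe_messages is a hypothetical helper name.

```shell
# Hypothetical helper: emit the three messages an MCP client sends to list
# a server's tools (initialize, the initialized notification, tools/list).
mcp_probe_messages() {
  printf '%s\n' \
    '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"probe","version":"0.0.0"}}}' \
    '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
    '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
}

# Only runs from a built checkout of the repository.
if [ -f ./packages/mcp/dist/cli.js ]; then
  mcp_probe_messages | node ./packages/mcp/dist/cli.js \
    --repo ./verticals/digital-experience-agency \
    --client dxa \
    --known-external-ids knowledge-model
fi
```

If those assumptions hold, the reply to request id 2 should list the five tools named above.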
Query your knowledge back — explainably.

Point any MCP client at the server. The fastest way to see it is to add it to Claude Code and ask a question in plain language:
```
claude mcp add dossier-dxa -- \
  node ./packages/mcp/dist/cli.js \
    --repo ./verticals/digital-experience-agency \
    --client dxa \
    --known-external-ids knowledge-model
```
Then, in Claude Code, ask: “Search the dxa knowledge base for the discovery process, then show me what it’s related to.” Claude calls search_concepts, then get_related, and the answer comes back with the typed edges traversed — for example:
```
search_concepts("discovery process") →
  dxa-discovery (process)        score 3.04
  dxa-discovery-report (artifact) score 2.99

get_related("dxa-discovery", depth 1) → 9 neighbours, via edges:
  dxa-discovery —[owner]→    dxa-solution-architect
  dxa-discovery —[uses]→     dxa-work-management
  dxa-discovery —[produces]→ dxa-discovery-report
```
That —[produces]→ edge is the point: the answer isn’t a similarity guess, it’s a walk of the real graph, and you can see why each neighbour came back. That’s explainable GraphRAG.
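
To make "a walk of the real graph" concrete, here is a toy depth-1 traversal over the three edges shown above. It is an illustration only, not Dossier's implementation: the edges and neighbours functions and the whitespace-separated edge list are assumptions made for this sketch.

```shell
# Toy edge list: source, edge type, target (copied from the example output).
edges() {
  cat <<'EOF'
dxa-discovery owner dxa-solution-architect
dxa-discovery uses dxa-work-management
dxa-discovery produces dxa-discovery-report
EOF
}

# Hypothetical helper: print each depth-1 neighbour of a node together with
# the typed edge that reached it.
neighbours() {
  edges | awk -v node="$1" '$1 == node { printf "%s via [%s]\n", $3, $2 }'
}

neighbours dxa-discovery
```

Every line of output carries the edge type that produced it, which is what distinguishes a traversal from a similarity ranking: the reason for each neighbour is part of the result.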
Curate — keep the human in the loop.

The atoms you just queried are plain files in verticals/digital-experience-agency/. Open one — say processes/dxa-discovery.md — and you’ll see its frontmatter: confidence, source, and the typed edges (owner, uses, produces). To curate is to edit the file and commit it: change a fact, promote an atom from inferred to verified, fix an edge. The repo is the system of record, so curation is just a git commit. There’s no separate database to keep in sync.
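
A curation pass can be as small as one edited line plus a commit. The sketch below is hedged: promote_atom is a hypothetical helper, and it assumes the frontmatter stores confidence as a line that reads exactly confidence: inferred. Check that against a real atom before relying on it.

```shell
# Hypothetical helper: flip an atom's confidence from inferred to verified.
# Assumes a frontmatter line that reads exactly "confidence: inferred".
promote_atom() {
  atom_file="$1"
  sed 's/^confidence: inferred$/confidence: verified/' "$atom_file" > "$atom_file.tmp" &&
    mv "$atom_file.tmp" "$atom_file"
}

# Only runs from a checkout of the repository.
atom=./verticals/digital-experience-agency/processes/dxa-discovery.md
if [ -f "$atom" ]; then
  promote_atom "$atom"
  git diff -- "$atom"   # review the change before committing it
fi
```

The sketch stops at the diff on purpose. Once you have reviewed it, git add and git commit record the curation, and the commit message is where the reason for the promotion belongs.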

Now point it at your own knowledge

You’ve proven the serve-and-query half of the loop. The other half — turning your raw documents into atoms — is the extract step, and it’s the one place a model runs.

The run verb does the whole loop in one command: ingest a source → extract OKF → emit atoms → commit. Because extraction calls a model, it needs a transport. Pick the one you have:

With your Claude subscription (no key)

If you have the claude CLI on your PATH and an active subscription, extract with --subscription (no API key needed):

```
node ./packages/runtime/dist/cli.js run \
  --root ./clients \
  --client acme-co \
  --source-dir ./my-docs \
  --subscription
```

With an API key

Set ANTHROPIC_API_KEY and run the same command without --subscription:

```
ANTHROPIC_API_KEY=sk-ant-... \
  node ./packages/runtime/dist/cli.js run \
    --root ./clients \
    --client acme-co \
    --source-dir ./my-docs
```

In both commands, ./my-docs is a directory of clean markdown or HTML. (Prefer to ingest a public website instead? Swap --source-dir ./my-docs for --url https://example.com; the default web crawler is keyless.)
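
The choice between the two transports can be scripted. This is a minimal sketch with its assumptions flagged: pick_transport is a hypothetical helper, and finding claude on your PATH does not prove the subscription is active, so treat the result as a best guess.

```shell
# Hypothetical helper: print which transport the run verb could use.
#   subscription -> claude CLI found on PATH (subscription status unchecked)
#   api-key      -> ANTHROPIC_API_KEY is set
pick_transport() {
  if command -v claude >/dev/null 2>&1; then
    echo subscription
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo api-key
  else
    echo "no transport: install the claude CLI or set ANTHROPIC_API_KEY" >&2
    return 1
  fi
}

# Only runs from a built checkout with a ./my-docs directory present.
if [ -f ./packages/runtime/dist/cli.js ] && [ -d ./my-docs ] && transport=$(pick_transport); then
  if [ "$transport" = subscription ]; then extra="--subscription"; else extra=""; fi
  # $extra is intentionally unquoted so an empty value adds no argument.
  node ./packages/runtime/dist/cli.js run \
    --root ./clients --client acme-co --source-dir ./my-docs $extra
fi
```

The sketch prefers the subscription when both transports are available; swap the order of the two checks if you would rather use the key.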

When it finishes, your tenant’s OKF repo holds freshly extracted atoms, committed as one diff in ./clients/acme-co/’s git history. Serve it the same way you served the reference KB in step 2 — just point --repo at your tenant’s OKF repo:

```
node ./packages/mcp/dist/cli.js \
  --repo ./clients/acme-co/okf \
  --client acme-co
```

That’s the full loop: ingest → extract → OKF → serve → query → curate, and every pass is a diff in your own git history.

What you built

A clients/acme-co/okf/ directory of OKF atoms in your own git — cat-able, git clone-able, yours forever — and a live, explainable retrieval server over it. The indexes are caches; the files are the truth.

Next steps

Install the Claude Code plugin — run this whole loop for a client from inside Claude Code, with the expert agent team and the decision-capture hook.
The dossier-runtime verb set — every verb, flag, and guarantee, in full.
What is Dossier? — the loop and the three faces, explained.