Post 0 of the Nexus series: the conceptual companion to Installing Nexus. Code is knowledge work, and the agent team needs the same knowledge as the developer to do its job.
Software development is knowledge work. The developer team writing the code curates and contributes to that knowledge as the work proceeds; the LLM agent team helping write the code does the same. Nexus lets the two teams work against the same knowledge base, and lets agents coordinate without passing files or prose between contexts. One knowledge base, every project the developer has open at once.
What is in front of the developer is the work, not the system that supports it. The agent team, the storage tiers, the plan library, the typed-link graph: all of it stays under the surface. Nexus fills gaps in working memory, deconstructs intent into structured queries, and dispatches the agent team behind the conversation. The interface is conversational, cooperative, iterative. No forms, no checkboxes, no rigid pipeline. Nexus adapts to the projects the developer is building, the teams around them, and the systems they integrate with — the work drives the shape of the knowledge base, not the other way around.
Nexus is designed to disappear into the work, so describing it means surfacing things that are, day to day, deliberately under the surface. The pages ahead make them visible long enough to explain.
Modern computers use a memory hierarchy (registers, cache, RAM, disk) to let processes coordinate at the speed and durability each layer is built for. Nexus borrows that shape. Three storage tiers (T1 ephemeral, T2 SQLite, T3 ChromaDB) form a semantic-memory hierarchy that lets agents cooperate the same way: shared reads and writes against the same store, with no two agents needing to know about each other directly. Each tier is tuned for a different lifetime and access pattern (within-session, within-project, across every project the catalog tracks), but the cooperation contract is the same at every level.
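To make the analogy concrete, here is a minimal sketch of that cooperation contract in Python. Nothing here is Nexus's actual API: the tier names, the put/get signatures, and the stand-in backends (a dict for T1, in-memory SQLite for T2, a plain list where ChromaDB would sit for T3) are illustrative assumptions.

```python
# A sketch of the tier contract, not Nexus's real API: one shared read/write
# surface, three lifetimes. The backends below are stand-ins.
from __future__ import annotations
import sqlite3

class SharedStore:
    def __init__(self) -> None:
        self.t1: dict[str, str] = {}            # T1: ephemeral, dies with the session
        self.t2 = sqlite3.connect(":memory:")   # T2: project-scoped (SQLite in Nexus)
        self.t2.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
        self.t3: list[tuple[str, str]] = []     # T3: stand-in for the ChromaDB semantic index

    def put(self, tier: str, key: str, value: str) -> None:
        if tier == "t1":
            self.t1[key] = value
        elif tier == "t2":
            self.t2.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))
        else:
            self.t3.append((key, value))        # a real T3 would embed and index the value

    def get(self, tier: str, key: str) -> str | None:
        if tier == "t1":
            return self.t1.get(key)
        if tier == "t2":
            row = self.t2.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
            return row[0] if row else None
        return next((v for k, v in self.t3 if k == key), None)

# Two agents cooperating without knowing about each other: one leaves scratch
# state in T1, the next reads it back from the same store.
store = SharedStore()
store.put("t1", "hypothesis", "plan-match-first retrieval beats cold planning")
print(store.get("t1", "hypothesis"))
```

The point the sketch is making is the same one the hierarchy analogy makes: agents never message each other directly; they only read and write the store, so adding an agent never changes any other agent.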
In Nexus, decisions live in research-design reviews. Prior art lives in research papers and code repositories. Architecture lives in long-form designs. Work lives in plan templates and bead chains. Corrections from past sessions live on as standing rules. Managing an evolving design is curating knowledge across all of those domains. Agentic coding does not change that discipline; it makes the discipline of knowledge curation indispensable. Nexus gives every agent on the team what the developer would otherwise reconstruct each session: relevant knowledge on demand, filtered for distractions, cross-linked across specifications, architecture, plans, and post-mortems — a working map across every domain the project actually touches.
If you read knowledge management and software engineering as separate disciplines, Nexus looks out of place. But software engineering was always about more than code. Building complicated systems generates documents, decisions, prior art, plans, and reviews, and that material is at least as load-bearing as the source tree. The code is almost the least interesting part of the system.
None of the ideas behind Nexus are new. Software engineering as knowledge work, document management, hypertext and link-graph navigation, decision-record practice: each is an established discipline with decades of literature behind it. What is new is the integration. Strong NLP, capable embedding models, LLM agents that coordinate through structured calls, and vector databases that scale to a single developer’s workspace are recent enough and good enough to compose end-to-end. The last few years made something this integrated practical for one developer rather than a team of librarians and infrastructure engineers. Nexus stands on a lot of shoulders.
The surface

What the developer, the team, and the LLM agents do on Nexus (repeatedly, in any combination, often interleaved) collapses to five activities:
- Gather. A question lands and Nexus returns the right slice of the corpus: source files, decision records, prior art, plan templates, contributing notes, post-mortems. Tens of thousands of chunks across half a dozen indexed projects narrow to the hundred or so that actually apply. Filtering is Nexus’s job, not the developer’s.
- Hypothesize. A working idea is scratched into session memory before it is fully formed. The agent team picks it up, looks for evidence and contradictions, brings it back sharpened or replaced. The conversation acquires structure without form-filling.
- Track facts. From any document, typed edges show what depends on it and what it depends on: cites to prior art, implements to source files, supersedes to the older version of a decision the newer one replaces. Provenance is one hop away, not five searches and a guess.
- Synthesize new ones. Research findings, design rationales, post-mortems, plan templates: each is written back the moment it crystallizes. The next session reads it on first call. New facts join the same graph everything else lives on.
- Integrate. The catalog densifies, the plan library learns, RDRs supersede prior RDRs, the topic taxonomy reshapes itself. The store evolves with the project. Nothing has to be migrated, re-explained, or rebuilt.
These five are not solo activities. The developer initiates them; the agent team carries them out alongside; Nexus keeps the state visible to both. Everything else in this post (and most of the rest of the series) is what makes the five activities usable in practice.
The cross-domain reality

Take the working catalog I keep across these projects as an example. A typical day moves through several kinds of indexed content, in no particular order:
- Code: 6,229 source files, AST-aware, language-classified, embedded for code-aware retrieval.
- Decision records (RDRs): 607 RDRs across the active projects. Each RDR cites prior RDRs, links to its implementing code, and is tracked by a bead with a status.
- Research papers: 162 indexed papers (Xanadu, AgenticScholar, BFT-SMR, schema-evolution literature, ChromaDB internals, and others).
- Architecture docs: 10 long-form designs that knit cross-cutting concerns together.
- Prose: 700 documents covering contributing guides, READMEs, RDR templates, and blog drafts (this post included).
- Plans: 50 saved templates in the T2 plan library, alongside the bead graph that tracks ongoing work.
- Standing rules: 19 feedback notes that distill prior corrections into permanent guidance loaded at the start of every session.
Working on the code is not “look at the code.” It is navigating across all of these at once. A query like “how does plan-match-first retrieval work” pulls from RDR-078 (the decision), src/nexus/operators/dispatch.py (the implementation), the AgenticScholar paper (the prior art), and the test suite (the contract). A code review needs to know which design was being implemented, which alternatives were considered, and which requirements are in force. A plan template should know which paper its strategy came from. A critique should know whether a proposed change quietly supersedes a prior decision.
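To make the one-hop claim concrete, here is a hedged sketch of typed-edge lookup. The node IDs and edge types mirror the example above (RDR-078, implements, cites, supersedes), but the graph representation, the helper names, and RDR-041 as the superseded decision are illustrative assumptions, not Nexus's schema.

```python
# Illustrative typed-link graph: provenance is one hop along a named edge,
# not a filesystem walk. Structure and names are assumptions, not Nexus's schema.
from collections import defaultdict

edges = defaultdict(list)

def link(src: str, edge_type: str, dst: str) -> None:
    edges[src].append((edge_type, dst))

link("RDR-078", "implements", "src/nexus/operators/dispatch.py")
link("RDR-078", "cites", "paper:AgenticScholar")
link("RDR-078", "supersedes", "RDR-041")   # RDR-041 is a hypothetical prior decision

def follow(node: str, edge_type: str) -> list:
    """One hop along a single edge type from one document."""
    return [dst for etype, dst in edges[node] if etype == edge_type]

print(follow("RDR-078", "implements"))     # ['src/nexus/operators/dispatch.py']
print(follow("RDR-078", "cites"))          # ['paper:AgenticScholar']
```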
This is exactly what knowledge-management systems are usually for. The perhaps non-obvious bit (a point a tool like Sourcegraph already takes as given) is that it is also what code work is, once you stop pretending the source tree is the whole picture.
Finding the signal
Storage and lookup is not the hard part. Organization is. A few indexed projects produce tens of thousands of chunks across code, prose, RDRs, papers, and the residue of failed experiments. Signal-to-noise drops as the corpus grows. Nexus layers a knowledge graph over the semantic store and uses it to filter aggressively, recovering signal as the corpus scales instead of losing it.
- Typed links knit decisions to code to research. A query that lands on RDR-078 can follow implements edges to the source files that satisfy it, cites edges to the papers that informed it, and supersedes edges to the prior RDR it replaces.
- Taxonomy assignments cluster topics across collections, so "membership churn in BFT systems" surfaces the relevant papers regardless of which corpus they were originally indexed under.
- Catalog metadata routes queries by author, content-type, subtree, or follow-link, pre-filtering the search space before any embedding work runs.
- The plan library learns which retrieval shapes work for which kinds of question, so paraphrases of “what tradeoffs are called out across Arcaneum’s RDRs” all land on the same operator chain.
- Plan DAGs (the multi-step operator chains covered in detail in Post 5) run as pipelines that search, cleanse, amplify, and winnow as they go. A single query can fan into retrieval, follow typed edges to related material, extract structured facts, rank by relevance, then summarize. Each step shrinks the working set and sharpens it; signal accumulates across the chain, noise does not.
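Here is a sketch of what one of those chains looks like in code. The operator names and toy scores are mine, not the Post 5 primitives; the shape is what matters: each step takes the current working set and hands back a smaller, sharper one.

```python
# An illustrative operator chain: retrieve, winnow, summarize. Each step
# shrinks the working set; names and scores are assumptions, not Nexus APIs.
from typing import Callable

Operator = Callable[[list], list]

def run_chain(seed: list, chain: list) -> list:
    working = seed
    for op in chain:
        working = op(working)      # signal accumulates across steps; noise is dropped
    return working

def retrieve(_: list) -> list:
    # stand-in for semantic retrieval against the T3 store
    return [{"id": "RDR-078", "score": 0.91},
            {"id": "README.md", "score": 0.34},
            {"id": "dispatch.py", "score": 0.88}]

def winnow(hits: list) -> list:
    # drop weak matches so later steps work on less
    return [h for h in hits if h["score"] > 0.5]

def summarize(hits: list) -> list:
    # collapse the surviving set into one answer-shaped record
    return [{"summary": ", ".join(h["id"] for h in hits)}]

print(run_chain([], [retrieve, winnow, summarize]))
# -> [{'summary': 'RDR-078, dispatch.py'}]
```

A real chain would also follow typed edges and extract structured facts along the way; the pipeline shape stays the same.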
The point is to surface the right slice on demand, not return everything that vaguely matches. A handle on the knowledge, structure to model it, tools to find what is needed when it is needed: that is what makes the cross-domain corpus useful instead of overwhelming.
The experience
The asymmetry between the developer and the agent is most easily seen on the agent’s side. An LLM agent has a finite context window and no privileged view of the codebase. To find anything, it has to read files into context, and naive file-reading wastes most of that budget on material the question did not need. Without indexing, there is no good way to ask what code implements RDR-078: the agent walks the filesystem, opens files, scans for patterns, and burns context on material it then has to discard. The agent has to sip from the corpus, not gulp.
The developer has the same problem in slower motion. IDEs and LSPs offer real tools (go-to-definition, find-references, language-aware refactoring) but with their own friction: cold-start cost, language coverage, cross-repository limits. For prior art outside the open project (papers, RDRs from a sibling repo, a half-remembered design conversation), there is no LSP equivalent. Web search returns the wrong granularity; grep returns syntactic matches without meaning.
Nexus closes the asymmetry by giving developer and agent the same access. Semantic search across code, prose, RDRs, and papers, by intent rather than exact strings. Typed-edge traversal across all of those: find what implements this, what cites it, what supersedes the older version. Symbol-aware navigation and AST-aware chunking for the structural precision LSPs already do well, layered alongside the semantic side rather than competing with it. The agent sips precisely from the slice that matters; the developer skips the mechanical scavenger hunt and gets to the question they were trying to answer.
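As an illustration of the semantic side, here is what intent-level search with metadata pre-filtering looks like against ChromaDB directly, the T3 backend. The collection name, document IDs, and the content_type field are illustrative assumptions, not the actual Nexus catalog schema, and ChromaDB's default embedding model is downloaded on first use.

```python
# A hedged sketch of search by intent, pre-filtered by catalog-style metadata
# before any ranking happens. Schema and IDs are illustrative, not Nexus's.
import chromadb

client = chromadb.Client()                 # ephemeral, in-memory client
col = client.create_collection("t3_demo")

col.add(
    ids=["dispatch.py", "RDR-078", "agenticscholar"],
    documents=[
        "def dispatch(plan): route the query through plan-match-first retrieval",
        "RDR-078: adopt plan-match-first retrieval over cold planning",
        "AgenticScholar: reusing retrieval plans across related questions",
    ],
    metadatas=[
        {"content_type": "code"},
        {"content_type": "rdr"},
        {"content_type": "paper"},
    ],
)

hits = col.query(
    query_texts=["what implements plan-match-first retrieval"],
    n_results=1,
    where={"content_type": "code"},        # route to code before embedding search ranks anything
)
print(hits["ids"][0])                      # -> ['dispatch.py']
```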
The agent team

The LLM agents that help with the work are not solo assistants. They are a team, and their roles map to the five activities. Six agents do most of the lifting on this project: deep-research-synthesizer (gather and synthesize from papers and prior decisions), strategic-planner (turn intent into phased execution), plan-auditor (test the plan against the live codebase), developer (implement with TDD discipline), code-review-expert (critique the implementation), and substantive-critic (argue with the design at the gate).
The team is dispatched through skills, not summoned one at a time. Across 2,308 recorded sessions, 820 Agent dispatches were logged, with roughly 2.6 critique-or-review dispatches for every generation dispatch. The team does not produce first drafts. It argues over them.
The agents do not coordinate with each other in plain English. They leave state in the store (a hypothesis in T1 scratch, a finding in T2, a linked document in T3) and the next agent picks it up from there. The store is how the agents talk to each other; the conversation is how the developer talks to the whole arrangement.
The tiers, briefly
The three storage tiers carry the five activities (gather, hypothesize, track, synthesize, integrate) across different time horizons. Different lifetimes, one cooperation rule: every agent and the developer read from and write to the same store.
- T1 is within-session coordination state: the working hypothesis, the active context, the operator’s intermediate output. Lasts as long as the conversation does.
- T2 is project-scoped persistent state: standing rules, plan templates, research findings against open RDRs, relevance logs.
- T3 is cross-project semantic memory: code, prose, RDRs, papers, knowledge entries, architecture docs. The catalog and its typed links live here; semantic search runs against it.
The shape is intentional. T1 being ephemeral is what makes within-session coordination cheap: agents write freely, knowing the scratch disappears with the session. T2 being project-scoped is what keeps standing rules and plan templates persistent without polluting the cross-project semantic index. T3 being long-lived is what lets a paper indexed for one project answer a question on the next. Each tier is tuned for the work it carries.
Installing Nexus showed the CLI for these tiers (nx scratch, nx memory, nx store). The agent team’s MCP calls reach the same store from the other side. The five activities work because both interfaces land at the same place: nothing is passed between agents in chat messages; everything is left in the store for the next reader to pick up.
A vector store on its own returns documents. With the catalog, the typed-link graph, and the plan library on top, you get the right document at the right point in the work, not just a list of matches.
What the rest of this series does
The series opens up Nexus one piece at a time, each piece tied back to how it gets used in actual work:
- Post 00: Installing Nexus: the install walkthrough and short tour, the practical companion to this post.
- Post 1: Nexus, by example: a worked session against the Delos corpus showing the moving parts cooperating on a real research question.
- Post 2: Typed links and the catalog: the graph that knits the corpora together.
- Post 3: Decisions as indexed data: RDRs as first-class graph nodes, not static markdown.
- Post 4: Plans, not replanning: the plan library as a growing asset, not a maintenance burden.
- Post 5: Operators as building blocks: the DAG primitives that compose into reusable analytical flows.
- Post 6: Nexus, measured: what 2,308 sessions show the system actually doing.
Each piece in the series exists because real work would otherwise drop something on the floor: a decision uncited by code, a paper unlinked from a plan, a critique that runs without knowing which prior critique it is contradicting, a rule the agent team has to relearn every session.
Up next
Post 1: Nexus, by example: a session through the Delos corpus, with the agent team and the knowledge tiers visible at every step.
Previous in the series
Post 00: Installing Nexus: the install walkthrough that put Nexus on your machine.

