Architecture
The Precision Bundle: orthogonal tags + typed windows for small-model retrieval.
A 7B model can’t hold a 40,000-character context artifact in working memory and still reason well. It needs a precision bundle — the smallest set of typed slices that answer the question. FullStackVibes makes that bundle retrievable by composing two independent tag layers over every artifact, then decomposing each artifact into typed context windows that carry their own tags.
Two independent layers of tags
Every artifact carries tags at two levels. They are orthogonal: a small model can filter on one without losing the other.
- Pattern tags describe the artifact as a whole. Is this a recipe, a boilerplate, a mental model, an anti-pattern document? Who is it written for? How authoritative is it? These never describe content slices — only the artifact’s overall shape.
- Window tags describe individual typed slices inside the artifact. A SCHEMA window can be tagged `drop-in-ready`; an ANTI_PATTERN window can be tagged `gotcha`; a code block can be tagged `reusable-skeleton`. Window tags travel with the slice, not with the parent.
An agent retrieving "show me drop-in-ready JSON schemas in the fintech space" composes a Space filter (fintech), a window-type filter (SCHEMA), and a window-tag filter (drop-in-ready). Each axis was assigned by inference independently, so the join is sharp.
Strict tag kinds
Pattern tags are not free-form keywords. Each tag belongs to one of five kinds, and the kind constrains what the tag can mean:
- FORM — What kind of artifact is this? Examples: `recipe`, `boilerplate`, `mental-model`, `anti-pattern-doc`, `reference`.
- AUDIENCE — Who is this written for? Examples: `agent`, `human-developer`, `operator`, `reviewer`.
- AUTHORITY — How load-bearing is this? Examples: `opinionated`, `vendor-spec`, `community-derived`, `experimental`.
- LIFECYCLE — Where in the workflow does this fire? Examples: `setup`, `build`, `ops`, `incident`, `retro`.
- RISK — What can go wrong if applied wrong? Examples: `destructive`, `cost-sensitive`, `privacy-sensitive`, `safe-default`.
The kinds are intentionally short and stable. Inference can’t invent a sixth kind. That stability is what makes downstream retrieval predictable: a frontier-small model knows it can always ask for FORM:recipe + RISK:safe-default and get a coherent set.
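The closed set of kinds can be sketched as an enum plus a strict parser. This is an illustration, not the platform's actual API; the class and function names here are assumptions:

```python
from enum import Enum

# Hypothetical sketch: the five fixed pattern-tag kinds as a closed enum.
# The kind names mirror the text; everything else is an assumed shape.
class TagKind(Enum):
    FORM = "form"            # what kind of artifact this is
    AUDIENCE = "audience"    # who it is written for
    AUTHORITY = "authority"  # how load-bearing it is
    LIFECYCLE = "lifecycle"  # where in the workflow it fires
    RISK = "risk"            # what can go wrong if applied wrong

def parse_pattern_tag(raw: str) -> tuple[TagKind, str]:
    """Parse 'FORM:recipe' into (TagKind.FORM, 'recipe').

    Raises KeyError if inference tries to invent a sixth kind,
    which is exactly the stability guarantee the text describes.
    """
    kind_name, _, value = raw.partition(":")
    return TagKind[kind_name], value
```

Because the enum is closed, a malformed or invented kind fails loudly at parse time instead of silently polluting a filter axis.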
Typed context windows
An artifact is decomposed into a sequence of typed windows by the WINDOW_INDEX inference job. Each window has a single type drawn from a closed enum:
- `GOAL` — What the artifact is trying to enable.
- `BACKGROUND` — Context the reader needs before applying the rest.
- `CONSTRAINT` — Hard rules the system must obey.
- `INSTRUCTION` — A step or directive to follow.
- `SCHEMA` — A concrete data shape (JSON, types, headers).
- `TOOL_SPEC` — Definition or call signature for a tool the agent uses.
- `EXPECTED_OUTPUT` — What success looks like.
- `ANTI_PATTERN` — A specific failure mode to avoid.
This is the slicing that matters. A retrieving agent doesn’t want the whole 40,000-character pattern — it wants the three SCHEMA windows and the four ANTI_PATTERN windows.
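A typed window and its type enum could look like the following minimal sketch. The class and field names are assumptions for illustration, not the WINDOW_INDEX job's real output format:

```python
from dataclasses import dataclass, field
from enum import Enum

# Illustrative sketch of the closed window-type enum from the text.
class WindowType(Enum):
    GOAL = "GOAL"
    BACKGROUND = "BACKGROUND"
    CONSTRAINT = "CONSTRAINT"
    INSTRUCTION = "INSTRUCTION"
    SCHEMA = "SCHEMA"
    TOOL_SPEC = "TOOL_SPEC"
    EXPECTED_OUTPUT = "EXPECTED_OUTPUT"
    ANTI_PATTERN = "ANTI_PATTERN"

@dataclass
class Window:
    type: WindowType          # exactly one type per slice
    text: str
    tags: list[str] = field(default_factory=list)  # window tags travel with the slice

def slices_of(windows: list[Window], wanted: set[WindowType]) -> list[Window]:
    """Keep only the window types the retrieving agent asked for."""
    return [w for w in windows if w.type in wanted]
```

A retrieving agent that wants only the schemas calls `slices_of(windows, {WindowType.SCHEMA})` and never touches the framing prose.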
The Handshake
A frontier-small model talking to FullStackVibes uses a Handshake: a structured retrieval call that names the axes it cares about, in priority order.
```json
{
  "space": "fintech",
  "window_types": ["SCHEMA", "ANTI_PATTERN"],
  "window_tags": ["drop-in-ready", "gotcha"],
  "pattern_tags": { "FORM": ["recipe"], "RISK": ["safe-default"] },
  "max_chars": 6000
}
```
The platform returns only the windows that satisfy the join, packed into a budgeted bundle. The model never sees the irrelevant prose, the off-axis examples, or the framing material. A 7B inference call stays under-budget and on-task; a 70B call spends its working memory reasoning about the right slices instead of skimming a wall of markdown.
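The join-and-pack step can be sketched over plain dicts. This is a hedged illustration under assumed field names (and it omits the pattern-tag axis for brevity), not the platform's implementation:

```python
# Sketch of serving a Handshake: apply each filter axis independently,
# then greedily pack matching windows under the character budget.
# Function name and dict keys are illustrative assumptions.
def serve_handshake(windows: list[dict], handshake: dict) -> list[dict]:
    selected = [
        w for w in windows
        if w["space"] == handshake["space"]
        and w["type"] in handshake["window_types"]
        and set(w["tags"]) & set(handshake["window_tags"])
    ]
    bundle, used = [], 0
    for w in selected:
        if used + len(w["text"]) > handshake["max_chars"]:
            break  # budget exhausted: the model only ever sees what fits
        bundle.append(w)
        used += len(w["text"])
    return bundle
```

Because every axis filters independently before packing, the bundle is precise first and budgeted second, which is what keeps a 7B call on-task.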
Why orthogonal matters
Most context platforms collapse classification into a single flat tag list. That breaks for two reasons:
- The axes interfere. A flat list mixes "what kind of artifact" with "what risk profile" with "what audience." A small model retrieving on one axis pulls noise from the others.
- The axes have different cardinality. RISK has five real values; FORM has dozens. Flattening forces both into the same surface and pollutes both.
By keeping pattern tags partitioned by kind and window tags scoped to slices, every retrieval call composes cleanly. A new RISK value doesn’t accidentally appear in a FORM filter; a new FORM doesn’t flood the AUDIENCE space. Composability is the lever — not breadth of vocabulary.
Worked example
An artifact titled Autonomous Agentic Trade Management for Futures Execution is ingested. The inference pipeline produces:
- Spaces: `automation`, `fintech`, `security`
- Pattern tags: `FORM:recipe`, `FORM:boilerplate`, `FORM:anti-pattern-doc`, `FORM:mental-model`
- Windows: 16 typed slices — 1 GOAL, 2 BACKGROUND, 1 CONSTRAINT, 2 INSTRUCTION, 3 SCHEMA, 1 EXPECTED_OUTPUT, 4 ANTI_PATTERN, 2 TOOL_SPEC
- Window tags on those slices: `drop-in-ready`, `gotcha`, `config-template`, `reusable-skeleton`, `gold-standard-example`
An agent that needs "the JSON contracts for trade execution and what not to do with them" sends a Handshake with space=fintech, window_types=[SCHEMA, ANTI_PATTERN], window_tags=[drop-in-ready, gotcha]. The platform returns 7 windows totalling roughly 1,500 characters — under any small-model budget — and the model never has to read the 5,000-character source pattern.
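The arithmetic of that join can be checked directly. Assuming `drop-in-ready` lands on the three SCHEMA windows and `gotcha` on the four ANTI_PATTERN windows (an illustrative tag assignment chosen to match the stated result, not ingested data):

```python
# Toy census of the 16 typed windows from the worked example.
# Tag assignments are assumptions for illustration.
census = (
    [("GOAL", [])] * 1
    + [("BACKGROUND", [])] * 2
    + [("CONSTRAINT", [])] * 1
    + [("INSTRUCTION", [])] * 2
    + [("SCHEMA", ["drop-in-ready"])] * 3
    + [("EXPECTED_OUTPUT", [])] * 1
    + [("ANTI_PATTERN", ["gotcha"])] * 4
    + [("TOOL_SPEC", [])] * 2
)

wanted_types = {"SCHEMA", "ANTI_PATTERN"}
wanted_tags = {"drop-in-ready", "gotcha"}
hits = [w for w in census if w[0] in wanted_types and wanted_tags & set(w[1])]
print(len(hits))  # 7 windows survive the join
```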
That is the Precision Bundle.