Cognitive Lenses for Claude Code: Markdown Files That Change How the Model Reasons — RodSpeed
Essay

Cognitive Lenses for Claude Code

March 2026

Tell Claude “you cannot use any form of the verb ‘to be’” (E-Prime) and the analysis comes back different. Not different like a rephrased sentence. Different like a different person wrote it. Built around actions and relationships instead of labels and categories. It catches things the default output misses. This isn’t prompt engineering. You’re not telling the model what to say. You’re changing how it’s able to think.

Prompt engineering ("Be creative," "Think step by step") controls what the model says. A cognitive lens ("You cannot use 'to be,'" "Tag every claim with its source") changes how the model thinks.

Here’s what I mean by “changes how it thinks.” When a model can say “X is Y,” it classifies and moves on. Fast, but shallow. Remove “is” and it can’t classify. It has to describe what X does, how X relates to Y, what happens when they interact. The rule doesn’t tell the model what to conclude. It blocks the cheap path and forces the model to route through more expensive, more revealing territory. Different rules block different paths. Each one exposes structure the default response skips over.

E-Prime is one such rule. I found eight more. A rule that requires tagging every claim as “observed,” “inferred,” or “assumed” forces the model to separate what it knows from what it’s guessing. A rule that inverts every assumption exposes hidden dependencies. Each rule I call a “lens,” and each one pushes the model down a different reasoning path.

Each lens lives in a markdown file. A skill reads the files at runtime and spins up parallel agents, each one reasoning under a different constraint, none of them aware of the others. A synthesizer reads all the outputs and maps where they agree, where they diverge, and what none of them thought to look at.

Eleven lenses, five functional roles, a set of skills that wire them into teams on demand. It runs inside Claude Code every day. Everything here is open source and the whole thing is built from markdown files you can copy in five minutes.

github.com/rodspeed/claude-cognitive-architecture

The problem with one mind

When you ask an AI to review your architecture, critique your writing, or debug your code, you get one perspective. It’s usually competent. It’s also always the same kind of competent. Same reasoning patterns, same default assumptions, same blind spots.

Humans solve this by putting different people in a room. The pessimist catches risks. The first-principles thinker questions assumptions everyone else takes for granted. The person who thinks in analogies connects the problem to something from a completely different domain. Each person’s cognitive style reveals things the others miss.

You can’t put different AIs in a room. But you can put rules on a single AI that force it to reason differently each time.

A rule that bans all forms of “to be” (is, are, was, were) doesn’t just change vocabulary. It reliably produces output built around actions, relationships, and processes instead of static labels. A rule that requires every claim to be tagged with where it came from (“observed,” “inferred,” “assumed”) produces analysis that separates what you actually know from what you’re guessing.

These aren’t prompting tricks. They’re rules that reshape reasoning. And they combine.


Lenses: constraints that change reasoning

A lens is a markdown file with a name, a type declaration, and a body that describes the cognitive constraint. Simplified example:

---
name: counterfactual
type: lens
description: Inverts assumptions to find hidden dependencies
---

For every claim or observation, systematically explore
what would break if the opposite were true.

Structure: observation → inversion → what breaks.

The file lives in .claude/agents/. When a skill runs, it reads this file and adds the constraint to the beginning of the agent’s instructions. The model never sees the word “lens.” It just receives a rule that shapes how it approaches the task.
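The runtime step is simple enough to sketch. Here's a minimal illustration of parsing a lens file's frontmatter and prepending its constraint to an agent's task; `parse_lens` and `lensed_instructions` are hypothetical names for illustration, not the project's actual implementation:

```python
from pathlib import Path

def parse_lens(text):
    """Split a lens file into frontmatter metadata and constraint body.
    Frontmatter sits between the first two '---' markers."""
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()

def lensed_instructions(lens_path, task):
    """Read a lens file and prepend its constraint to the task prompt,
    so the rule governs everything the agent does."""
    _, constraint = parse_lens(Path(lens_path).read_text())
    return f"{constraint}\n\n{task}"
```

The ordering matters: the constraint comes first, before the task, so the model reads the rule before it reads what it's being asked to do.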

I have eleven lenses. Each one forces a different mode of reasoning:

Counterfactual. Invert every assumption. What breaks if this premise were false?

Analogical. Map findings to cross-domain parallels before stating them directly.

Minimal. Absolute minimum words. Every word earns its place or gets cut.

Evidential. Tag every claim: [observed], [inferred], [assumed], [uncertain].

First Principles. Derive from axioms only. No best practices, no conventions.

Steel-Man. Build the strongest defense of the status quo before finding where it fails.

E-Prime. Ban all forms of "to be." Forces operational language.

Process-Only. Everything as flow and transformation. Nothing "is"; things happen.

No-Possession. Strip "to have." Describe through relationships, not ownership.

Temporal. Trace how things got here and where they're heading. Genealogy over snapshot.

Regularizer. Ban filler words. Auto-applied to unlensed agents. +7.3pp accuracy in testing.

Why these eleven? Three (E-Prime, No-Possession, Process-Only) come from linguistics. They restrict which words the model can use, which changes what it can express. Temporal and Regularizer each target a specific failure mode (static analysis and filler-padded hedging). The rest restrict how the model approaches a problem: what it looks for, what it questions, what it takes for granted. Together they cover a wide range of thinking styles.

A lens doesn’t tell the model what to conclude. It tells the model how to look. Two agents with different lenses examining the same code will see different things: different risks, different assumptions, different structural properties.


Roles: agents that do specific jobs

Lenses change how an agent thinks. Roles define what an agent does. I have five:

Researcher. Given a question, it gathers evidence on its own: searching the web, reading files, tracking where each finding came from. Its output is structured (key findings, supporting evidence, gaps it noticed, confidence per claim). It doesn’t speculate beyond what it found.

Critic. Reads a document cold. No author context, no conversation history, just the document and instructions to find what’s wrong. Produces a structured review: what’s sound, what’s shaky, what’s missing. The cold read is the point. Leak context and you’ve defeated the purpose.

Synthesizer. Takes output from multiple agents and produces a combined analysis. It looks for four things: where the agents agree, where they disagree, what none of them mentioned, and what emerges from reading them together that no single output contains.

Fact-Checker. Traces every number in a document back to its source data file. Checks arithmetic, verifies citations say what the paper claims, and flags internal inconsistencies. Built after writing a research paper where manual line-by-line number audits proved necessary but tedious.

Editor. Line-level prose improvement without changing argument or structure. Tightens sentences, fixes register drift, ensures logical flow between paragraphs. Distinct from the critic (which finds structural problems) — the editor handles craft.

Roles and lenses combine. A researcher with the evidential lens tags every finding with its source. A researcher with the counterfactual lens probes what would change if key assumptions were wrong. Same job, different thinking style, different output.
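The combination rule is mechanical: if a lens is present, its constraint goes in front of the role's instructions. A sketch, with one-line stand-ins for the real role and lens files (`compose_agent_prompt` is a hypothetical name):

```python
def compose_agent_prompt(role_body, lens_body=None):
    """Combine a role (what the agent does) with an optional lens
    (how it thinks). The lens goes first so it governs everything after."""
    return "\n\n".join(p for p in (lens_body, role_body) if p)

# Hypothetical one-line stand-ins for the real role and lens files:
researcher = "Gather evidence for the question. Report findings, gaps, and confidence."
evidential = "Tag every claim: [observed], [inferred], [assumed], [uncertain]."

prompt = compose_agent_prompt(researcher, evidential)
```

Swap `evidential` for a counterfactual constraint and the same researcher role produces a structurally different report.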


How skills wire it together

No single lens or role does much on its own. The leverage comes from combining them. A skill reads lens files at runtime and feeds each constraint into a separate parallel agent. Three skills show the pattern:

Parallax: your question goes to three agents in parallel (for example, counterfactual, first-principles, and steel-man lenses), and a synthesizer then maps consensus, divergence, and blind spots.

Parallax takes a question and runs it through three agents at once, each with a different lens. The synthesizer reads all three outputs and maps where they agree, where they disagree, and what none of them mentioned. It ships with preset combinations: debug (counterfactual + analogical + minimal), architecture (first-principles + process-only + steel-man), decision (counterfactual + first-principles + steel-man). You can also pass any combination you want.
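The dispatch pattern behind Parallax fits in a few lines. This sketch stubs out the actual sub-agent call (`call_agent` is a placeholder, not a real Claude Code API) but shows the shape: fan out over the profile's lenses, fan in through a synthesizer:

```python
from concurrent.futures import ThreadPoolExecutor

PROFILES = {  # preset lens combinations from the post
    "debug": ["counterfactual", "analogical", "minimal"],
    "architecture": ["first-principles", "process-only", "steel-man"],
    "decision": ["counterfactual", "first-principles", "steel-man"],
}

def call_agent(lens, question):
    """Stand-in for spawning a Claude Code sub-agent under one lens."""
    return f"[{lens}] analysis of: {question}"

def parallax(question, profile="decision"):
    """Run one question through each lens in parallel, unaware of each
    other, then hand all outputs to a synthesizer that maps consensus,
    divergence, and blind spots."""
    lenses = PROFILES[profile]
    with ThreadPoolExecutor(max_workers=len(lenses)) as pool:
        outputs = list(pool.map(lambda l: call_agent(l, question), lenses))
    return call_agent("synthesizer", "\n\n".join(outputs))
```

The agents share nothing but the question; only the synthesizer sees all three outputs at once.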

Research takes a broad question and breaks it into two to five sub-questions depending on how wide the topic is. Each sub-question gets its own researcher agent, running in parallel. The synthesis step maps which answers overlap, which contradict, and where coverage is thin. Works with any lens: /research --lens evidential produces researchers that tag every finding with where it came from.

Scrutinize runs a structured debate. A fresh-context critic reads a document cold and writes a structured critique. An advocate responds, conceding valid points, defending where the critic is wrong. Then a new fresh critic rebuts. Two rounds, then synthesis: agreed changes versus contested points. What makes this work is the cold read. The critic gets zero author context, zero conversation history, zero “here’s what we were thinking.” The entire value is in the fresh eyes.
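The debate structure can be sketched as a loop over fresh-context calls. `call_fresh_agent` here is a placeholder for spawning a sub-agent with a clean context window; the shape, not the names, is the point:

```python
def call_fresh_agent(role, document, extra=""):
    """Stand-in for a sub-agent with a clean context window: it sees
    only the document plus whatever is passed to it explicitly."""
    suffix = f" + {len(extra)} chars of debate" if extra else " (cold read)"
    return f"[{role}] on {len(document)} chars{suffix}"

def scrutinize(document, rounds=2):
    """Critic reads cold; advocate responds; a new fresh critic rebuts.
    After the configured rounds, a synthesizer separates agreed changes
    from contested points."""
    critique = call_fresh_agent("critic", document)  # round 1: cold read
    debate = [critique]
    for _ in range(rounds - 1):
        defense = call_fresh_agent("advocate", document, critique)
        critique = call_fresh_agent("critic", document, defense)  # fresh eyes
        debate += [defense, critique]
    return call_fresh_agent("synthesizer", document, "\n".join(debate))
```

Note what never gets passed to a critic: author intent, conversation history, or a prior critic's full reasoning. Each rebuttal critic starts from the document plus only the advocate's defense.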

These three compose into a paper-writing pipeline. Draft takes a thesis, evidence files, and an audience (academic, blog, general) and produces structured prose. It composes with lenses: /draft --lens evidential produces prose where every claim is tagged with its epistemic status. After drafting, Scrutinize reviews. Then Revise takes the scrutiny output and applies changes systematically, producing diffs for approval. The full chain is: research → draft → scrutinize → revise. Each step has a dedicated skill.

Watch sits outside the production pipeline. It scans arXiv and Semantic Scholar for recent work in your research areas and surfaces what's relevant. Not for answering questions — for situational awareness. Knowing when someone publishes in your space before you hear about it secondhand.


The directory is the interface

Eleven lenses, five roles, and the skills that compose them all live in two directories:

.claude/
  agents/
    counterfactual.md      # lens
    analogical.md          # lens
    minimal.md             # lens
    evidential.md          # lens
    first-principles.md    # lens
    steel-man.md           # lens
    eprime.md              # lens
    process-only.md        # lens
    no-possession.md       # lens
    temporal.md            # lens
    regularizer.md         # lens (auto-applied)
    critic.md              # role
    researcher.md          # role
    synthesizer.md         # role
    fact-checker.md        # role
    editor.md              # role
  skills/
    parallax/SKILL.md      # 3-lens ensemble
    research/SKILL.md      # parallel deep research
    scrutinize/SKILL.md    # adversarial review
    draft/SKILL.md         # structured prose production
    revise/SKILL.md        # apply editorial feedback
    watch/SKILL.md         # literature monitor
    orient/SKILL.md        # toolkit advisor

Adding a new lens means adding a markdown file. No skill edits, no configuration changes, no deployment. Skills discover agents at runtime by reading the directory. The filesystem is the registry.
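Runtime discovery can be as small as a directory scan that buckets files by the `type:` declared in their frontmatter. A sketch (the function name and return shape are illustrative, not the project's actual code):

```python
from pathlib import Path

def discover_agents(agents_dir=".claude/agents"):
    """Scan the agents directory and bucket each .md file by the 'type:'
    in its frontmatter. The directory itself is the registry: dropping
    in a new file is the entire registration step."""
    registry = {"lens": [], "role": []}
    for path in sorted(Path(agents_dir).glob("*.md")):
        for line in path.read_text().splitlines():
            if line.startswith("type:"):
                kind = line.split(":", 1)[1].strip()
                registry.setdefault(kind, []).append(path.stem)
                break
    return registry
```

Any skill that calls this at startup picks up new lenses and roles with no other change.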

Iteration speed follows directly. I built the initial nine lenses in an afternoon. When I realized I needed a lens that forces analogical reasoning, I wrote analogical.md and it was immediately available to every skill that reads from agents/. Zero wiring.


What this actually looks like in practice

Say I’m reviewing an architecture decision: knowledge graph or relational database for a personal wiki.

Running /parallax --profile architecture "Knowledge graph vs relational DB for personal wiki with 200+ interconnected notes" spawns three agents, one per lens in the profile: first-principles, process-only, and steel-man.

The synthesizer then reads all three analyses (it sees which lens produced which) and maps what they agree on, where they diverge, and what none of them thought to examine.

One run. Three structurally different analyses. Thirty seconds. The alternative is sitting with a question for hours and hoping your own cognitive habits don’t create blind spots. They always do.

A note on cost: running Parallax means four agent calls (three lensed agents plus the synthesizer). Research means N+1, where N is the number of sub-questions. Scrutinize means multiple rounds. Each parallel agent uses its own context window. For quick factual questions or straightforward code generation, just ask Claude directly. The ensemble overhead buys you nothing there. This architecture earns its cost on consequential questions where missing a blind spot is more expensive than spending extra tokens.


The feedback loop

The cognitive architecture doesn’t run in isolation. It feeds into a learning system.

The learning loop: work produces knowledge, and knowledge feeds the next session. Harvest (automatic) extracts insights, observe (automatic) logs behaviors, reason finds connections, discover (prompted) surfaces new questions, and everything feeds back into the next work session.

At the end of every meaningful conversation, a harvest skill runs automatically. It scans the exchange and pulls out insights as structured notes, each with a title, tags, and links to related notes in the knowledge graph. An observation step runs silently alongside it, logging what I did: which judgment calls I made, where I pushed back, what worked.

Periodically, a reasoning engine reviews the full graph and finds connections nobody explicitly drew, gaps where knowledge is thin, and clusters where several notes circle the same unresolved question.

The loop is: work → harvest → observe → reason → discover → work. Each conversation feeds the next one. Richer analysis produces richer notes, which produce richer context for the next run.

While building the agents directory, the harvest captured a note about the design decision to separate lenses (how an agent thinks) from roles (what an agent does). Two sessions later, the reasoning engine flagged a connection between that note and an older note about linguistic constraints in my research. That connection (linguistic constraints as a special case of cognitive lenses) became the organizing principle for a research paper I’m now writing. No single session produced that insight. It emerged from the graph.


An advisor that knows the toolkit

With 27 skills, 16 agents, and multiple composition patterns, the question “which tool should I use?” becomes non-trivial. So I built an advisor.

The /orient skill scans the full toolkit by reading the filesystem, discovering every skill, lens, and role. You describe your problem. It classifies the problem (broad vs. deep, analysis vs. production, high-stakes vs. exploratory) and recommends which tools to use, in what order, with which lens combinations, and why. It also names what to skip and flags gaps the current toolkit doesn’t cover.

It doesn’t execute anything. It just recommends. Over time, it teaches you your own toolkit so you develop instincts for which tool fits which problem without needing the advisor.


What you can build today

You don’t need all of this. The pattern works at any scale.

Step 1: Create one lens. Pick counterfactual. It’s the highest-leverage single lens. Create .claude/agents/counterfactual.md with a name, type, and one paragraph describing the constraint. That’s a complete lens.

Step 2: Use it manually. Before you invest in a skill, just tell Claude: “Read the constraint in .claude/agents/counterfactual.md and apply it to this question: [your question].” See if the constrained analysis reveals something the default response missed. If it does, the pattern earns its place.

Step 3: Add a second lens. Pick one that contrasts. Steel-man is a good complement to counterfactual. Now you can run two agents with different constraints on the same question and compare what each one catches.

Step 4: Build a skill. When you find a lens combination you keep reaching for, wrap it in a skill (.claude/skills/yourskill/SKILL.md) that reads the agent files and spawns parallel agents automatically. This is where it goes from useful to frictionless.

Each step is independently useful. You don’t need the full architecture to get value from a single lens. The architecture emerged from noticing which lenses I reached for repeatedly and automating those patterns.


What didn’t work

Not every combination produces useful output. The no-possession lens (strips “to have,” forces relational language) paired with process-only (expresses everything as flow and transformation) produces dense, nearly unreadable prose. Takes longer to decode than just reading a normal analysis. Two heavy linguistic constraints at once don’t add up. They compound into noise. The model spends so much effort satisfying both rules that the actual analysis gets thin.

The whole ensemble is also slower and worse than a direct query for certain problems. Simple factual questions, straightforward debugging with an obvious stack trace, anything where the answer is one thing and you just need to find it. Three parallel agents become overhead, not insight. The system adds value when a question has hidden structure that benefits from multiple angles. No hidden structure means you’re paying three times the tokens for the same answer said three ways.

I also tried a 5-lens ensemble early on. Synthesis quality collapsed. The synthesizer can hold three threads and find meaningful divergences. At five, it starts summarizing instead of synthesizing. You get a list of what each lens said, not an analysis of where they disagree and why. Three is the ceiling that works reliably.


Why this works

The deeper question: can you change how an AI thinks, or only what it says?

If you tell a model “be creative,” not much changes. It’s performing a label, not following a process. But tell a model “you cannot use any form of the verb ‘to be’ in your response” and something different happens. A hard rule, not an aspiration. The output reliably comes back built around verbs, relationships, and processes because the rule removes the easy path of labeling things as fixed categories. The model has to find another way.

I can see that this happens. I can’t tell you exactly why. Maybe the rule changes how the model reasons internally. Maybe it just forces different word choices that happen to surface different information. What I can say is that different rules produce noticeably different analysis. Analysis that catches things the default model misses, and that catches different things depending on the rule. That’s enough to be useful. The diversity in the outputs is real even if the mechanism stays open.

The system that puts it all together (lenses as files, roles as files, skills that combine them at runtime, a directory as the registry) is simple enough that anyone using Claude Code can build their own version in an afternoon. Start with one lens. See what it reveals. Add another. Let it grow from what you actually use.

This is the third post about Claude Code infrastructure. The first covered skills, hooks, and automation patterns. The second covered epistemic memory. The cognitive architecture described here sits on top of both.