I ran an fMRI on LLMs: a concept is a direction, not a region

TL;DR I've been running an "fMRI for LLMs": capturing the full internal activations of dense open models (Qwen2.5-7B, Gemma-2-9B, Gemma-4-12B) and applying neuroscience methods to map how meaning is organized. The headline result, confirmed causally and across all three models: a concept is not stored in a region of neurons. It is a single direction in activation space.

Direction, not a region

In the brain, categories live in localized regions (faces → fusiform face area). LLMs are the opposite.

Distributed, superposed code. A 10-way category linear probe decodes far above chance (Gemma-2 0.97, Qwen 0.80), yet the "most selective" units do not replicate across two random halves of the stimuli (overlap ≈ 0.00–0.05). There is no findable "animal neuron."

Causal proof. Ablating the 20 most selective units changes downstream category accuracy by ~0 (same as removing 20 random units). But ablating one distributed direction collapses it: mean ΔAUC up to +0.52 (Qwen). True in all 3 models.

So category is localized to one direction, but that direction is spread across ~2000 of 3584 neurons, and which neurons carry it is non-reproducible. Localization is in vector space, not anatomy.

The residual stream is a shared additive bus. Injecting a concept direction at N consecutive layers equals injecting N× the magnitude at one layer (ratio = 1.00 for every N). The stream literally sums contributions across layers.

Only relative magnitude codes. Scaling the whole residual 0.25×–4× → zero output change (RMSNorm divides it out). Scaling only the component along the concept direction → a clean monotonic concept shift. Meaning = the projection along a direction, not the vector's length.

Under strict controls (120 stimuli/category, an architecture-matched untrained twin, word-grouped splits so no frame leaks across train/test):

• A concept is essentially rank-1: one direction, present at every depth (decodable layer-span: trained 1.0 vs untrained 0.0). Narrow in width, broad in depth.
• Concepts coexist additively. One shared probe reads each category as well as a dedicated probe (retention 1.00); they're linearly superposed and read in parallel.
• Direction is the whole code. A nonlinear MLP probe fails to beat a single linear direction (gap ≤ 0 in all models), even with 1200 stimuli. "Meaning = direction" isn't an approximation; it's the code.

| Property | Brain | Dense LLM | Verdict |
| --- | --- | --- | --- |
| Small-worldness / rich-club hubs | yes | yes (σ up to 12.8) | match |
| Network modularity Q | 0.30–0.50 | 0.09–0.23, rising each generation | partial |
| Category-selective regions | yes (FFA/PPA) | no (distributed direction) | differ |
| Topographic maps (retinotopy etc.) | yes | no (~20–40× below cortex) | differ |
| Cross-model universality (CKA) | n/a | 0.69–0.77, cross-family | Platonic convergence |

Two bonus results worth flagging:

• Steerability is predicted by encoding dimensionality (r ≈ −0.83): concepts packed into ~1 direction (numbers, colors) steer cleanly; high-dimensional concepts resist.
• A wiring-cost penalty makes a small transformer more modular (ΔQ > 0 in 4/4 seeds, with a non-monotonic sweet spot). This is direct evidence that the brain's modularity is partly a consequence of physical embedding constraints that transformers normally lack.

The harness has an adversarial verification gate, and several appealing hypotheses died in it: "abstraction velocity predicts capability" was rejected on a clean 5-point Qwen ladder; the flashy "60× more localized in SAE features" shrank to a modest 2.4× under a gold-standard pretrained Gemma Scope SAE; cross-model feature-level universality is only partial. Reported as nulls, not spun.

Method: dense models scanned on Apple Silicon (MPS), neuroscience-style analysis pipeline (linear probes, RSA/CKA, functional connectome graphs, causal patching, SAEs, steering). Every number is traceable to a data file. Feedback welcome.
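The unit-versus-direction ablation contrast from the "Causal proof" result can be sketched on synthetic data. This is a toy illustration, not the harness behind the numbers above, and everything in it is an assumption: the generator `make_data`, the difference-of-means readout standing in for the linear probe, plain accuracy in place of ΔAUC, and the sizes (256 units, k = 20). Real activations would normally be handled with NumPy or PyTorch; the standard library is used here only to keep the sketch self-contained.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_data(n, d, strength, rng):
    """Hypothetical toy activations: unit-variance noise on every unit plus
    +/- strength along one dense random unit direction (the 'concept')."""
    u = [rng.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(dot(u, u))
    u = [v / norm for v in u]
    X, y = [], []
    for i in range(n):
        label = i % 2                      # alternating labels, so classes are balanced
        sign = 1.0 if label else -1.0
        X.append([rng.gauss(0, 1) + sign * strength * ui for ui in u])
        y.append(label)
    return X, y

def class_means(X, y):
    d = len(X[0])
    sums, counts = [[0.0] * d, [0.0] * d], [0, 0]
    for row, label in zip(X, y):
        counts[label] += 1
        for j, v in enumerate(row):
            sums[label][j] += v
    return [[v / counts[c] for v in sums[c]] for c in (0, 1)]

def accuracy(X, y, w, b):
    """Accuracy of the fixed linear readout: predict class 1 when x.w > b."""
    hits = sum((dot(row, w) > b) == bool(label) for row, label in zip(X, y))
    return hits / len(y)

def ablate_units(X, units):
    """Zero out a set of individual units (coordinates)."""
    units = set(units)
    return [[0.0 if j in units else v for j, v in enumerate(row)] for row in X]

def ablate_direction(X, w):
    """Project out the single direction w from every activation vector."""
    norm = math.sqrt(dot(w, w))
    u_hat = [v / norm for v in w]
    out = []
    for row in X:
        p = dot(row, u_hat)
        out.append([v - p * uj for v, uj in zip(row, u_hat)])
    return out

def run(n=2000, d=256, k=20, strength=3.0, seed=0):
    rng = random.Random(seed)
    X, y = make_data(n, d, strength, rng)
    half = n // 2
    Xtr, ytr, Xte, yte = X[:half], y[:half], X[half:], y[half:]
    m0, m1 = class_means(Xtr, ytr)
    w = [a - c for a, c in zip(m1, m0)]    # difference-of-means readout direction
    b = 0.5 * (dot(m0, w) + dot(m1, w))    # threshold at the midpoint of the class means
    # "Most selective" units: largest absolute class-mean difference.
    top = sorted(range(d), key=lambda j: abs(w[j]), reverse=True)[:k]
    rand = rng.sample(range(d), k)
    return {
        "baseline": accuracy(Xte, yte, w, b),
        "top_units": accuracy(ablate_units(Xte, top), yte, w, b),
        "random_units": accuracy(ablate_units(Xte, rand), yte, w, b),
        "direction": accuracy(ablate_direction(Xte, w), yte, w, b),
    }

if __name__ == "__main__":
    for name, acc in run().items():
        print(f"{name:>12}: {acc:.3f}")
```

In this toy setup, zeroing 20 units (selective or random) should leave the readout close to its baseline, while projecting out the one direction drops it to chance. The readout and the ablated direction are the same vector here, so the collapse is by construction. The informative part is the contrast: removing 20 of 256 coordinates barely dents a signal that is spread thinly over all of them.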

This story was reported by Dev.to.
