>_CONSTELLATION://BENCHMARK
Benchmark Report · Q2 2026

Smaller, precise context.

Less spend.

More signal.

We benchmarked nine code-intelligence operations with and without Constellation's MCP tools. Same model, same tasks, same codebase. Constellation carries the load, not your context window.

Token Spend Reduction
75% reduction
Token spend reduction on code querying and navigation operations. 10.7M tokens spent on baseline operations collapsed to 2.6M.
Average Per-Tool Savings
72% savings
Mean cost reduction across all operations. The suite that costs $5.43 against a raw codebase costs $1.52 using Constellation. The more complex the analysis, the wider the gap.
A note on metrics alignment: token consumption and cost diverge because of how LLM inference providers price usage. Input, output, and cached tokens are typically billed at different rates, so a 50% reduction in total tokens does not translate into a 50% reduction in total cost.
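To make that divergence concrete, here is a minimal sketch. The per-million-token rates, the `Usage` shape, and both usage profiles are made-up assumptions for illustration; they are not the benchmark's data or any provider's actual pricing.

```typescript
// Hypothetical per-million-token prices in USD; real provider rates differ.
interface Rates { input: number; output: number; cached: number }
// Token counts for one run, split by billing category (assumed shape).
interface Usage { input: number; output: number; cached: number }

const rates: Rates = { input: 3.0, output: 15.0, cached: 0.3 };

// Cost is a weighted sum, so it depends on the token mix, not just the total.
function costUsd(u: Usage, r: Rates): number {
  return (u.input * r.input + u.output * r.output + u.cached * r.cached) / 1_000_000;
}

function totalTokens(u: Usage): number {
  return u.input + u.output + u.cached;
}

// Illustrative runs: the second uses half the tokens,
// but all of the savings come from cheap cached reads.
const before: Usage = { input: 200_000, output: 50_000, cached: 750_000 };
const after: Usage = { input: 200_000, output: 50_000, cached: 250_000 };

const tokenReduction = 1 - totalTokens(after) / totalTokens(before); // 0.5
const costReduction = 1 - costUsd(after, rates) / costUsd(before, rates); // about 0.095

console.log(`tokens: -${(tokenReduction * 100).toFixed(0)}%, cost: -${(costReduction * 100).toFixed(1)}%`);
```

With these assumed rates, halving the token count cuts cost by under 10%. The reverse can also happen: trimming expensive output tokens can cut cost by more than the token count falls.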
// Headline Wins

Where Constellation decisively beats grep-and-guess.

The three analyses below are where engineering hours actually get burned. A pre-indexed knowledge graph turns them from long, expensive exploration runs into deterministic single queries.

Finding Orphaned Code
16x fewer tokens to find orphaned code. Baseline has to navigate the entire dependency tree manually. Constellation already has this context.
Baseline: 5.13M tokens · Constellation: 318K tokens

Change Impact Analysis
10x more economical impact analysis. What used to require chasing every reference now resolves from the indexed graph in one call.
Baseline: 2.32M tokens · Constellation: 230K tokens

Find Circular Dependencies
7x more efficient cycle detection without combinatorial pain. A graph problem solved with a graph instead of grep, file reads, and educated guesses.
Baseline: 1.44M tokens · Constellation: 197K tokens
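To illustrate why these analyses become single queries once a dependency graph exists, here is a minimal sketch on a toy graph. The module names, the `Graph` type, and both functions are hypothetical; this shows the general graph techniques (reachability and depth-first search) and is not Constellation's implementation.

```typescript
// A toy dependency graph: each module maps to the modules it imports.
type Graph = Map<string, string[]>;

const graph: Graph = new Map<string, string[]>([
  ["app", ["auth", "db"]],
  ["auth", ["session"]],
  ["session", ["auth"]], // auth -> session -> auth is a cycle
  ["db", []],
  ["legacy", ["db"]], // nothing reachable from "app" imports legacy
]);

// Orphans: modules that cannot be reached from any entry point.
function findOrphans(g: Graph, entryPoints: string[]): string[] {
  const seen = new Set<string>();
  const stack = [...entryPoints];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (seen.has(node)) continue;
    seen.add(node);
    stack.push(...(g.get(node) ?? []));
  }
  return Array.from(g.keys()).filter((m) => !seen.has(m));
}

// Cycles: depth-first search that reports the first edge
// pointing back at a node on the current path.
function findCycle(g: Graph): string[] | null {
  const done = new Set<string>();
  const path: string[] = [];
  const onPath = new Set<string>();

  function visit(node: string): string[] | null {
    if (onPath.has(node)) return [...path.slice(path.indexOf(node)), node];
    if (done.has(node)) return null;
    onPath.add(node);
    path.push(node);
    for (const dep of g.get(node) ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    path.pop();
    onPath.delete(node);
    done.add(node);
    return null;
  }

  for (const node of Array.from(g.keys())) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

const orphans = findOrphans(graph, ["app"]); // ["legacy"]
const firstCycle = findCycle(graph); // ["auth", "session", "auth"]
```

Both functions visit each node and edge at most once, which is the point of the comparison: with the edges already indexed, the work is a traversal rather than a search through file contents.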
// Tool-by-Tool

The full breakdown. Honest data.

All nine operations, sorted by cost savings. The pattern is clean: the harder the analysis, the more decisively the graph wins. Simple symbol lookups are roughly a wash — but those aren't where engineering hours live.

Operation                 Category      Baseline  Constellation  Cost change  Token ratio
findOrphanedCode          Impact        5.13M     318K           −95%         16.1x fewer tokens
findCircularDependencies  Dependencies  1.44M     197K           −92%         7.3x fewer tokens
getDependents             Dependencies  182K      65K            −84%         2.8x fewer tokens
impactAnalysis            Impact        2.32M     230K           −83%         10.1x fewer tokens
getCallGraph              Tracing       572K      592K           −38%         1.0x fewer tokens
getDependencies           Dependencies  86K       101K           −18%         0.9x fewer tokens
traceSymbolUsage          Tracing       528K      605K           −18%         0.9x fewer tokens
searchSymbols             Discovery     371K      345K           −18%         1.1x fewer tokens
getSymbolDetails          Discovery     58K       183K           +164%        3.2x more tokens
A note on getSymbolDetails metrics: the baseline wins this one because the operation is trivial (grab a single file, read a function). Constellation's overhead exceeds its benefit for this operation, especially at scale, although its response does include non-trivial context such as cyclomatic complexity and exported state. We're showing the result anyway. Constellation doesn't win on tokens everywhere; it wins where the work is hard, and that's where token budgets break.
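The headline token figures can be re-derived from the per-operation numbers in the breakdown above. The sketch below only sums the published token counts (in thousands) and computes ratios; the variable and function names are illustrative.

```typescript
// [operation, baseline tokens (K), Constellation tokens (K)], copied from the breakdown.
const runs: Array<[string, number, number]> = [
  ["findOrphanedCode", 5130, 318],
  ["findCircularDependencies", 1440, 197],
  ["getDependents", 182, 65],
  ["impactAnalysis", 2320, 230],
  ["getCallGraph", 572, 592],
  ["getDependencies", 86, 101],
  ["traceSymbolUsage", 528, 605],
  ["searchSymbols", 371, 345],
  ["getSymbolDetails", 58, 183],
];

// Suite totals: 10,687K baseline vs 2,636K with Constellation.
const baseTotal = runs.reduce((sum, run) => sum + run[1], 0);
const constTotal = runs.reduce((sum, run) => sum + run[2], 0);

// Overall token reduction, as a fraction of the baseline total.
const suiteReduction = 1 - constTotal / baseTotal;

// Per-operation ratio: values above 1 mean Constellation used fewer tokens.
function tokenRatio(baseline: number, constellation: number): number {
  return baseline / constellation;
}

for (const [name, base, con] of runs) {
  console.log(`${name}: ${tokenRatio(base, con).toFixed(1)}x`);
}
console.log(`suite: -${Math.round(suiteReduction * 100)}% tokens`);
```

The totals round to the 10.7M and 2.6M quoted at the top of the report, and the suite-level reduction rounds to 75%. The four graph-heavy operations account for most of the saved tokens.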
// Why The Gap Exists

Two ways to answer "what depends on this?"

The difference isn't model capability. It's information architecture. Baseline rediscovers your codebase on every query. Constellation already knows it.

Baseline: repetitive rediscovery

The model has tools — grep, read_file, glob — and no map of the territory. Every query reconstructs structural knowledge that gets thrown away the moment the session ends.

  • Wide context loads. Reading whole files to find a few relevant lines.
  • Speculative greps. "Maybe it's called X? No, try Y." Each miss bills tokens.
  • No persistence. Yesterday's discoveries don't carry into today's session.
01 grep for symbol name
↓ 47 matches, ambiguous
02 read 12 candidate files
↓ partial context
03 grep for callers
↓ noisy, misses imports
04 read import chains
↓ still incomplete
05 guess, summarize, ship

Constellation: code understanding

The codebase is indexed into a knowledge graph. Symbols, references, dependencies, and call edges live in structured form. The model writes one piece of code to query the graph.

  • Code Mode. A single code_intel call composes multiple operations in parallel.
  • Structured returns. Real symbol IDs, not regex hits. No re-parsing.
  • Persistent. The graph survives sessions. Every engineer shares the same map.
01 api.searchSymbols(query)
↓ exact symbol, one match
02 Promise.all([dependencies, dependents, impact])
↓ parallel graph traversals
03 return structured result
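The three steps above can be sketched as follows. Everything here is an assumption made for illustration: the `GraphApi` interface, its method signatures, the `whatDependsOn` helper, and the in-memory `mockApi` stand in for the real code_intel API, which may look different. Only the operation names come from the report.

```typescript
// Hypothetical shapes; the real API may differ.
interface SymbolRef { id: string; name: string }

interface GraphApi {
  searchSymbols(query: string): Promise<SymbolRef[]>;
  getDependencies(id: string): Promise<string[]>;
  getDependents(id: string): Promise<string[]>;
  impactAnalysis(id: string): Promise<string[]>;
}

// In-memory stand-in so the sketch runs without a server.
// Each symbol ID maps to the symbol IDs it depends on.
const edges: Record<string, string[]> = {
  "sym:main": ["sym:loadApp"],
  "sym:loadApp": ["sym:parseConfig"],
  "sym:parseConfig": ["sym:readFile"],
  "sym:readFile": [],
};
const symbols: SymbolRef[] = Object.keys(edges).map((id) => ({ id, name: id.slice(4) }));

const mockApi: GraphApi = {
  async searchSymbols(query) {
    return symbols.filter((s) => s.name === query);
  },
  async getDependencies(id) {
    return edges[id] ?? [];
  },
  async getDependents(id) {
    return Object.keys(edges).filter((k) => edges[k].includes(id));
  },
  // Transitive dependents, found breadth-first.
  async impactAnalysis(id) {
    const seen = new Set<string>();
    const queue = [id];
    while (queue.length > 0) {
      const current = queue.shift()!;
      for (const k of Object.keys(edges)) {
        if (edges[k].includes(current) && !seen.has(k)) {
          seen.add(k);
          queue.push(k);
        }
      }
    }
    return Array.from(seen);
  },
};

// Step 01: resolve the symbol. Step 02: run three traversals in parallel.
// Step 03: return one structured result.
async function whatDependsOn(api: GraphApi, query: string) {
  const [symbol] = await api.searchSymbols(query);
  if (!symbol) return null;
  const [dependencies, dependents, impact] = await Promise.all([
    api.getDependencies(symbol.id),
    api.getDependents(symbol.id),
    api.impactAnalysis(symbol.id),
  ]);
  return { symbol, dependencies, dependents, impact };
}
```

Calling `whatDependsOn(mockApi, "parseConfig")` resolves to one object keyed by symbol ID, so the model reads a single structured answer instead of reconciling grep output across several turns.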
// Methodology
Subject: Claude Code (Sonnet 4.6) with @constellationdev/mcp
Operations: 9 code-intelligence tools (searchSymbols → impactAnalysis)
Sample size: 3 iterations per tool, n = 54 total runs
Measured: billable tokens, USD cost, turn count, duration