
code-review-graph

Local knowledge graph for Claude Code. Builds a persistent map of your codebase so Claude reads only what matters — 6.8× fewer tokens on reviews and up to 49× on daily coding tasks.

tirth8205/code-review-graph 8.5k stars 978 forks MIT

Install

$ pip install code-review-graph && code-review-graph install


About

AI coding tools re-read your entire codebase on every task. code-review-graph fixes that. It builds a structural map of your code with Tree-sitter, tracks changes incrementally, and gives your AI assistant precise context via MCP so it reads only what matters.

The tool parses your repository with Tree-sitter and stores the result as a graph of nodes (functions, classes, imports) and edges (calls, inheritance, test coverage). At review time it queries that graph to compute the minimal set of files your AI assistant needs to read.

Benchmarked against 6 real open-source repositories (Express, FastAPI, Flask, Gin, httpx, Next.js), it achieves an average 8.2× token reduction and up to 49× on large monorepos — while maintaining 100% recall on impact analysis (never missing an actually affected file).

Getting Started

Install via pip or pipx, then run the one-time setup command:

pip install code-review-graph
code-review-graph install    # auto-detects Claude Code, Cursor, Windsurf, and more
code-review-graph build      # parse your codebase (~10 seconds for 500 files)

Restart your editor or AI tool after installing. Then open your project and ask:

Build the code review graph for this project

After the initial build, the graph updates automatically on every file edit and git commit. No manual steps required.

Platform-specific install

code-review-graph install --platform claude-code   # Claude Code only
code-review-graph install --platform cursor          # Cursor only
code-review-graph install --platform codex           # Codex only

Optional dependency groups

pip install "code-review-graph[embeddings]"          # Local vector embeddings
pip install "code-review-graph[google-embeddings]"   # Google Gemini embeddings
pip install "code-review-graph[communities]"         # Community detection (igraph)
pip install "code-review-graph[all]"                 # Everything (quotes keep shells such as zsh from expanding the brackets)

How It Works

When a file changes, the graph traces every caller, dependent, and test that could be affected — the "blast radius" of the change. Your AI reads only these files instead of scanning the whole project.

Pipeline

Repository files are parsed into an AST with Tree-sitter
Nodes (functions, classes, imports) and edges (calls, inheritance, test coverage) are stored in a local SQLite database
On every git commit or file save, a hook fires and diffs the changed files
Only files with changed SHA-256 hashes are re-parsed — a 2,900-file project re-indexes in under 2 seconds
At review time, the graph computes the minimal set of files to pass to your AI assistant

Blast-radius analysis

When you ask for a code review, the MCP tool get_impact_radius_tool traces through the dependency graph from the changed files outward — finding every function that calls the changed code, every class that inherits from it, and every test that exercises it. This precise set replaces the naive approach of sending the entire repository.
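
The traversal described above can be sketched as a breadth-first search over reversed dependency edges. This is a minimal illustration, not the tool's implementation: the edge list, the node names, and the impact_radius function with its max_depth parameter are all hypothetical.

```python
from collections import deque

# Hypothetical edges: (source, kind, target) means "source depends on target".
EDGES = [
    ("api.handler", "calls", "db.query"),
    ("db.query", "calls", "db.connect"),
    ("AdminUser", "inherits", "User"),
    ("test_query", "tests", "db.query"),
    ("test_handler", "tests", "api.handler"),
    ("cli.main", "calls", "utils.fmt"),
]

def impact_radius(changed, edges, max_depth=None):
    """Return every node that transitively depends on a changed node."""
    # Reverse index: target -> nodes that call, inherit from, or test it.
    dependents = {}
    for source, _kind, target in edges:
        dependents.setdefault(target, set()).add(source)

    seen = set(changed)
    queue = deque((node, 0) for node in changed)
    while queue:
        node, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand past the requested depth
        for dependent in dependents.get(node, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append((dependent, depth + 1))
    return seen - set(changed)

print(sorted(impact_radius({"db.connect"}, EDGES)))
# ['api.handler', 'db.query', 'test_handler', 'test_query']
```

Walking the edges in reverse is what makes the result conservative: anything that can reach the changed node is included, whether or not the change actually breaks it.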

Monorepo handling

Large monorepos are where token waste is most painful. On a Next.js monorepo with 27,732 files, the graph reduces the review context to approximately 15 files, a 49× reduction in tokens, while still covering the full blast radius of the change.

Semantic search (optional)

With the embeddings optional dependency, nodes are indexed with vector embeddings (sentence-transformers locally, or Google Gemini / MiniMax via API). This enables semantic_search_nodes_tool which finds code entities by meaning rather than just keyword matching.
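
To illustrate what searching "by meaning" involves, here is a toy nearest-neighbour ranking by cosine similarity. It is only a sketch under loud assumptions: the three-dimensional vectors are hand-written stand-ins for real embeddings, and the node names and the search function are hypothetical, not part of the tool.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hand-written toy vectors; a real setup would get these from an embedding model.
NODE_VECTORS = {
    "parse_config": [0.9, 0.1, 0.0],
    "load_settings": [0.8, 0.3, 0.1],
    "render_chart": [0.0, 0.2, 0.9],
}

def search(query_vector, vectors, top_k=2):
    """Return the top_k node names ranked by similarity to the query."""
    ranked = sorted(
        vectors,
        key=lambda name: cosine(query_vector, vectors[name]),
        reverse=True,
    )
    return ranked[:top_k]

print(search([1.0, 0.2, 0.0], NODE_VECTORS))
# ['parse_config', 'load_settings']
```

The two configuration-related nodes rank first even though neither name shares a keyword with the other, which is the behaviour keyword matching alone cannot provide.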

How to Use

Slash commands (Claude Code)

/code-review-graph:build-graph — build or rebuild the code graph

/code-review-graph:review-delta — review changes since last commit

/code-review-graph:review-pr — full PR review with blast-radius analysis

CLI reference

code-review-graph install          # Auto-detect and configure all platforms
code-review-graph build            # Parse entire codebase
code-review-graph update           # Incremental update (changed files only)
code-review-graph status           # Graph statistics
code-review-graph watch            # Auto-update on file changes
code-review-graph visualize        # Generate interactive HTML graph
code-review-graph wiki             # Generate markdown wiki from communities
code-review-graph detect-changes   # Risk-scored change impact analysis
code-review-graph register <path>  # Register repo in multi-repo registry
code-review-graph eval             # Run evaluation benchmarks
code-review-graph serve            # Start MCP server

Excluding paths

Create a .code-review-graphignore file in your repository root to exclude paths from indexing:

.code-review-graphignore

generated/**
*.generated.ts
vendor/**
node_modules/**

In git repos, only tracked files are indexed (git ls-files), so gitignored files are skipped automatically.
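
The exact matching rules for .code-review-graphignore are not spelled out here, so the following is only an approximation using Python's fnmatch module, in which * also matches path separators. The is_ignored helper is hypothetical; the patterns are the ones from the example file above.

```python
from fnmatch import fnmatchcase

# Patterns copied from the example ignore file.
IGNORE_PATTERNS = ["generated/**", "*.generated.ts", "vendor/**", "node_modules/**"]

def is_ignored(path, patterns=IGNORE_PATTERNS):
    """True if a repo-relative POSIX path matches any ignore pattern.

    Assumption: fnmatch-style matching, which may differ from the tool's rules.
    """
    return any(fnmatchcase(path, pattern) for pattern in patterns)

for candidate in [
    "generated/api/client.py",
    "src/models/user.generated.ts",
    "src/app.ts",
]:
    print(candidate, is_ignored(candidate))
```

Under these assumed semantics the first two paths are excluded and src/app.ts is indexed. A nested directory such as src/vendor/ would not match vendor/**, because the pattern is anchored at the repository root.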

Technical Proposal

code-review-graph is a GraphRAG (Graph Retrieval-Augmented Generation) system purpose-built for code. Unlike document RAG which chunks text, it builds a typed property graph where nodes represent code entities and edges represent structural relationships.

Graph model

Node types: function, class, method, import, module, test
Edge types: calls, inherits, imports, tests, defines, exports
Each node carries: name, file path, line range, language, docstring, SHA-256 hash
Storage: SQLite with FTS5 for full-text search + optional vector columns for semantic search
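
One way to picture this model is as two SQLite tables, one for nodes and one for typed edges. The schema below is a guess for illustration only: the table names, column names, and the incoming helper are assumptions, and the FTS5 index and vector columns are left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,        -- function, class, method, import, module, test
    name TEXT NOT NULL,
    file_path TEXT NOT NULL,
    line_start INTEGER,
    line_end INTEGER,
    language TEXT,
    docstring TEXT,
    sha256 TEXT
);
CREATE TABLE edges (
    source_id INTEGER NOT NULL REFERENCES nodes(id),
    target_id INTEGER NOT NULL REFERENCES nodes(id),
    kind TEXT NOT NULL         -- calls, inherits, imports, tests, defines, exports
);
CREATE INDEX idx_edges_target ON edges(target_id);
""")

conn.executemany(
    "INSERT INTO nodes (id, kind, name, file_path, language) VALUES (?, ?, ?, ?, ?)",
    [
        (1, "function", "query", "db.py", "python"),
        (2, "function", "handler", "api.py", "python"),
        (3, "test", "test_query", "tests/test_db.py", "python"),
    ],
)
conn.executemany(
    "INSERT INTO edges (source_id, target_id, kind) VALUES (?, ?, ?)",
    [(2, 1, "calls"), (3, 1, "tests")],
)

def incoming(conn, name):
    """List (source name, edge kind) for every edge pointing at the named node."""
    rows = conn.execute(
        """
        SELECT src.name, e.kind
        FROM edges AS e
        JOIN nodes AS src ON src.id = e.source_id
        JOIN nodes AS dst ON dst.id = e.target_id
        WHERE dst.name = ?
        ORDER BY src.name
        """,
        (name,),
    )
    return rows.fetchall()

print(incoming(conn, "query"))
# [('handler', 'calls'), ('test_query', 'tests')]
```

Indexing edges by target keeps the "who depends on this node" lookup cheap, and that lookup is the one impact analysis performs most often.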

Parsing layer

Tree-sitter grammars are used for all 19 supported languages. Parsing yields a full concrete syntax tree rather than regex-based pattern matches. Each language has explicit node type mappings for functions, classes, imports, call sites, and inheritance, all defined in parser.py.
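
To show the kind of entity extraction such mappings drive, the sketch below uses Python's built-in ast module as a stand-in for Tree-sitter, so it runs without extra dependencies. It handles Python source only and produces an abstract rather than concrete syntax tree. The extract_entities function and the sample source are hypothetical and are not taken from parser.py.

```python
import ast

SOURCE = '''
import os
from pathlib import Path

class Base:
    pass

class Child(Base):
    def run(self):
        return helper(os.getcwd())

def helper(path):
    return Path(path)
'''

def extract_entities(source):
    """Collect functions, classes, imports, call sites, and inheritance pairs."""
    found = {"functions": [], "classes": [], "imports": [], "calls": [], "inherits": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            found["classes"].append(node.name)
            for base in node.bases:
                if isinstance(base, ast.Name):
                    found["inherits"].append((node.name, base.id))
        elif isinstance(node, ast.Import):
            found["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            found["imports"].append(node.module or ".")
        elif isinstance(node, ast.Call):
            # Record the called name: plain name or the last attribute segment.
            if isinstance(node.func, ast.Name):
                found["calls"].append(node.func.id)
            elif isinstance(node.func, ast.Attribute):
                found["calls"].append(node.func.attr)
    return {key: sorted(values) for key, values in found.items()}

for key, values in extract_entities(SOURCE).items():
    print(key, values)
```

Each branch of the isinstance chain plays the role of one node type mapping; a multi-language parser would need an equivalent table per grammar.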

Incremental update algorithm

On file change, compute SHA-256 of the new content
Compare against stored hash — skip re-parse if unchanged
For changed files: remove all outgoing edges from their nodes, re-parse, insert new nodes and edges
Find all nodes that had edges into the changed nodes (dependents) — their impact metadata is marked stale
A 2,900-file re-index takes under 2 seconds because only the 5–10 changed files are processed
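
The hash comparison and stale-marking steps above can be sketched as follows. This is a simplified illustration that works at file level rather than node level; plan_update, stale_dependents, and the sample data are hypothetical names, not the tool's API.

```python
import hashlib

def sha256_of(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

def plan_update(stored_hashes, current_files):
    """Split files into re-parse, unchanged, and removed sets.

    stored_hashes: path -> hex digest recorded at the last index
    current_files: path -> current file content as bytes
    """
    to_parse, unchanged = [], []
    for path, content in current_files.items():
        if stored_hashes.get(path) == sha256_of(content):
            unchanged.append(path)   # same hash: skip the re-parse
        else:
            to_parse.append(path)    # new or modified file
    removed = sorted(set(stored_hashes) - set(current_files))
    return sorted(to_parse), sorted(unchanged), removed

def stale_dependents(changed, file_edges):
    """Files with an edge into a changed file; their impact data is stale."""
    changed = set(changed)
    return sorted({src for src, dst in file_edges if dst in changed} - changed)

stored = {
    "a.py": sha256_of(b"x = 1\n"),
    "b.py": sha256_of(b"y = 2\n"),
    "old.py": sha256_of(b""),
}
current = {"a.py": b"x = 1\n", "b.py": b"y = 3\n", "c.py": b"z = 0\n"}

to_parse, unchanged, removed = plan_update(stored, current)
print(to_parse, unchanged, removed)
# ['b.py', 'c.py'] ['a.py'] ['old.py']
print(stale_dependents(to_parse, [("a.py", "b.py"), ("c.py", "a.py")]))
# ['a.py']
```

Because the cost scales with the number of changed files rather than the size of the repository, a large project can re-index quickly when only a handful of files differ.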

MCP integration

22 MCP tools are registered with the model context. When an AI assistant receives a task, it calls build_or_update_graph_tool to ensure the graph is fresh, then get_review_context_tool to retrieve the minimal token-efficient context for the change set. The context returned includes: blast radius file list, dependency chain summary, test coverage gaps, and architectural overview.
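
At the protocol level, an MCP tool invocation is a JSON-RPC 2.0 request with the method tools/call. The sketch below builds the two requests in the order described above. The tool names come from this page, but the tool_call helper is hypothetical, and the arguments are left empty because the tools' input schemas are not documented here.

```python
import json

def tool_call(request_id, tool_name, arguments=None):
    """Build a JSON-RPC 2.0 request for an MCP tools/call invocation."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # Arguments are an assumption: real tools define their own input schema.
        "params": {"name": tool_name, "arguments": arguments or {}},
    }

requests = [
    tool_call(1, "build_or_update_graph_tool"),   # refresh the graph first
    tool_call(2, "get_review_context_tool"),      # then fetch review context
]

for request in requests:
    print(json.dumps(request))
```

In practice the AI assistant's MCP client issues these requests itself; the sketch only makes the message shape visible.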

Impact analysis precision vs recall tradeoff

The blast-radius analysis is deliberately conservative — it achieves 100% recall (never misses an affected file) at the cost of some false positives (precision averages 0.38). This is the right tradeoff for code review: missing a broken dependency is worse than reviewing one extra file. The F1 score averages 0.54 across 13 commits in 6 repositories.
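
To make the tradeoff concrete, the sketch below scores one invented review in which the graph flags 8 files and 3 of them are actually affected. The file names and the score function are hypothetical and are not drawn from the benchmark data.

```python
def score(predicted, actual):
    """Return (precision, recall, F1) for a predicted vs. actual file set."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical commit: 3 files truly affected, 8 files flagged by the graph.
actual = {"a.py", "b.py", "c.py"}
predicted = actual | {"d.py", "e.py", "f.py", "g.py", "h.py"}

precision, recall, f1 = score(predicted, actual)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# precision=0.375 recall=1.000 f1=0.545
```

With recall pinned at 1.0, F1 depends only on precision, so every extra file flagged lowers the score while no affected file is ever missed. An F1 averaged per commit need not equal the F1 computed from the averaged precision and recall.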

Reproducible benchmarks

All benchmark numbers come from the automated eval harness (evals/), run against 6 real open-source repositories. Each benchmark compares three arms: a verbose response, a terse response (no skill), and a graph-assisted response. To reproduce the numbers yourself, run: code-review-graph eval --all

Features

Incremental updates — re-parses only changed files, subsequent updates under 2 seconds

19 languages + Jupyter notebooks (Python, TypeScript/TSX, JavaScript, Vue, Go, Rust, Java, C#, Ruby, Kotlin, Swift, PHP, Solidity, C/C++, Dart, R, Perl, Lua)

Blast-radius analysis — shows exactly which functions, classes, and files are affected by any change

Auto-update hooks — graph updates on every file edit and git commit without manual intervention

Semantic search — optional vector embeddings via sentence-transformers, Google Gemini, or MiniMax

Interactive visualisation — D3.js force-directed graph with edge-type toggles and search

Local SQLite storage — no external database, no cloud dependency

22 MCP tools — automatically used by Claude Code, Cursor, and other AI assistants

5 MCP workflow prompts — review, architecture, debug, onboard, pre-merge

Community detection — cluster related code via Leiden algorithm

Architecture overview — auto-generated with coupling warnings

Risk-scored reviews — detect_changes maps diffs to affected functions, flows, and test gaps

Wiki generation — auto-generate markdown wiki from community structure

Multi-repo registry — register and search across multiple repositories

Full-text search — FTS5-powered hybrid search combining keyword and vector similarity

Specifications

Average token reduction	8.2× vs naive full-repo reads
Monorepo reduction	Up to 49× (27,700+ files → ~15)
Impact analysis recall	100% — never misses an affected file
Incremental update speed	Under 2 seconds for 2,900-file repos
Languages supported	19 + Jupyter/Databricks notebooks
MCP tools	22 tools + 5 workflow prompts
Storage	Local SQLite (.code-review-graph/)
Python version	3.10+
License	MIT
Platform support	Claude Code, Cursor, Windsurf, Zed, Continue, Codex, OpenCode, Antigravity