Problem
Axon's 12-phase ingestion pipeline produces false positives in dead code detection and incomplete call graphs due to 4 limitations:
- Module-level calls silently dropped — calls such as setup() or Table("users", ...) at file top level have no containing symbol and are skipped. The ±10-entry scan window in find_containing_symbol is also too narrow for dense files (50+ symbols).
- No framework awareness — FastAPI @app.get/@app.post, Alembic upgrade()/downgrade(), Next.js getServerSideProps/generateStaticParams, and Pydantic @computed_field/@field_serializer are not recognized as entry points or dead code exemptions.
- No type inference for receivers — user.save() never resolves to User.save because _resolve_receiver_method compares class_name == receiver literally ("user" != "User").
- First-class function references ignored — handler = my_func is not tracked, so my_func appears dead.
Benchmarked on a real FastAPI + Alembic + Pydantic project (95 Python files): 30 false dead code symbols, 527 CALLS edges, 18 files flagged.
Proposed Solution
Step 1 — Same-file usage: Replace the fixed ±10 scan window with a dynamic backward scan (capped at 500 lines). Attribute module-level calls to the File node instead of dropping them.
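A minimal sketch of how Step 1 could look, assuming each file's symbols are available as records sorted by start line. The Symbol class, the file_node argument, and the function signature are illustrative assumptions, not Axon's actual API.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    # Hypothetical record; Axon's real symbol type may differ.
    name: str
    start_line: int
    end_line: int


def find_containing_symbol(symbols, call_line, file_node, max_lines=500):
    """Return the innermost symbol containing call_line, else file_node.

    `symbols` must be sorted by start_line. The scan walks backward from
    the last symbol that starts at or before the call and stops once a
    symbol starts more than `max_lines` above it.
    """
    starts = [s.start_line for s in symbols]
    i = bisect_right(starts, call_line) - 1
    while i >= 0 and call_line - symbols[i].start_line <= max_lines:
        sym = symbols[i]
        if sym.start_line <= call_line <= sym.end_line:
            return sym.name
        i -= 1
    # Module-level call: attribute it to the file instead of dropping it.
    return file_node
```

Because the scan starts at the nearest preceding symbol, a method nested in a class is found before the class itself, and a call outside every symbol falls through to the file node.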
Step 2 — Framework awareness: Recognize FastAPI verbs (@app.get/post/put/delete/patch), Alembic upgrade/downgrade in migration paths, Next.js data fetching functions, and Pydantic serializer decorators as entry points / dead code exemptions.
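The pattern matching in Step 2 might be sketched as below. The function name, the decorator strings passed in, the POSIX-style paths, and the choice of "versions" and "migrations" as migration directories are all assumptions made for illustration; only the framework names themselves come from the proposal.

```python
import re
from pathlib import PurePosixPath

# Matches e.g. "app.get" or "api.v1.router.post" (receiver name is not checked).
HTTP_VERB_DECORATOR = re.compile(r"^\w+(\.\w+)*\.(get|post|put|delete|patch)$")
PYDANTIC_DECORATORS = {"computed_field", "field_serializer"}
NEXTJS_FUNCTIONS = {"getServerSideProps", "generateStaticParams"}
ALEMBIC_FUNCTIONS = {"upgrade", "downgrade"}
MIGRATION_DIRS = {"versions", "migrations"}  # assumed directory names
JS_SUFFIXES = {".js", ".jsx", ".ts", ".tsx"}


def is_framework_exempt(name, decorators, file_path):
    """Return True if a symbol should count as an entry point / exemption."""
    for dec in decorators:
        # "@app.get('/users')" -> "app.get"
        head = dec.lstrip("@").split("(", 1)[0].strip()
        if HTTP_VERB_DECORATOR.match(head):
            return True
        if head.rsplit(".", 1)[-1] in PYDANTIC_DECORATORS:
            return True
    path = PurePosixPath(file_path)
    if name in ALEMBIC_FUNCTIONS and MIGRATION_DIRS & set(path.parts[:-1]):
        return True
    if name in NEXTJS_FUNCTIONS and path.suffix in JS_SUFFIXES:
        return True
    return False
```

A heuristic like this trades precision for simplicity: any decorator ending in .get or .post is treated as a route, so an unrelated @cache.get would also be exempted.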
Step 3 — Type inference: Add assignment_target to CallInfo and variable_name to TypeRef. Build a per-file type table {var_name → class_name} from annotations, params, and constructor calls. Use it in _resolve_receiver_method to resolve user.save() → User.save.
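To illustrate the type table in Step 3, here is a rough sketch that uses Python's standard ast module as a stand-in for Axon's own parser output. The function names, the known_classes and methods_by_class arguments, and the flat (scope-unaware) table are assumptions for illustration only.

```python
import ast


def build_type_table(source, known_classes):
    """Map variable names to class names for one file.

    Sources of evidence: annotated assignments, annotated parameters,
    and direct constructor calls such as `user = User()`.
    """
    table = {}
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.AnnAssign)
                and isinstance(node.target, ast.Name)
                and isinstance(node.annotation, ast.Name)
                and node.annotation.id in known_classes):
            table[node.target.id] = node.annotation.id
        elif (isinstance(node, ast.arg)
                and isinstance(node.annotation, ast.Name)
                and node.annotation.id in known_classes):
            table[node.arg] = node.annotation.id
        elif (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Name)
                and node.value.func.id in known_classes):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    table[target.id] = node.value.func.id
    return table


def resolve_receiver_method(receiver, method, type_table, methods_by_class):
    """Resolve `receiver.method()` to "Class.method", or None if unknown."""
    # Fall back to the literal receiver so `User.save()` still resolves.
    class_name = type_table.get(receiver, receiver)
    if method in methods_by_class.get(class_name, ()):
        return f"{class_name}.{method}"
    return None
```

One limitation of this sketch is that the table is flat per file, so two functions that reuse the same variable name for different classes would overwrite each other's entry.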
Step 4 — First-class function tracking: New FuncRef dataclass for handler = my_func patterns (Python & TypeScript). Create CALLS edges with confidence × 0.7.
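For Step 4, the Python side could look roughly like the following. The FuncRef field names, the helper functions, and the use of the standard ast module are assumptions for illustration; only the FuncRef name and the 0.7 confidence factor come from the proposal.

```python
import ast
from dataclasses import dataclass

FUNC_REF_CONFIDENCE_FACTOR = 0.7  # factor stated in the proposal


@dataclass(frozen=True)
class FuncRef:
    # Hypothetical fields; the real dataclass may carry different ones.
    target_name: str  # variable being assigned, e.g. "handler"
    func_name: str    # function being referenced, e.g. "my_func"
    line: int


def collect_func_refs(source):
    """Find `name = some_function` assignments to functions defined in the file."""
    tree = ast.parse(source)
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    refs = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Name)
                and node.value.id in defined):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    refs.append(FuncRef(target.id, node.value.id, node.lineno))
    return refs


def func_ref_edges(refs, source_node, base_confidence=1.0):
    """Turn references into (source, target, confidence) CALLS edges."""
    confidence = base_confidence * FUNC_REF_CONFIDENCE_FACTOR
    return [(source_node, ref.func_name, confidence) for ref in refs]
```

Scaling the confidence down reflects that a reference only suggests the function may be called later; it is not a call in itself.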
After the fix (same benchmark): 17 dead code symbols (-43%), 549 CALLS edges (+4%), 2 new execution flows, 13 new receiver-resolved calls.
Alternatives Considered
- LSP-based type inference (e.g. codegraph-rust, mcpls): Much more accurate, but it requires running a language server, adds a heavy dependency, and doesn't integrate with Axon's graph pipeline. Our lightweight approach covers the most common patterns (constructor assignment, annotations, params) with zero external dependencies.
- Full framework plugins (e.g. CodePrism's approach): Dedicated per-framework analyzers. More complete, but with a high maintenance cost. Our pattern-matching approach (decorator names, file path heuristics) covers 90% of cases with minimal code.
- Broader scan window (e.g. ±100 instead of ±10): A simpler fix for find_containing_symbol, but the window size is still arbitrary and the scan is O(N) in pathological cases. The dynamic backward scan is both correct and efficient.
Already implemented, tested, and benchmarked. Ready to submit as a single PR or split into several.
I'm currently testing it on other Python and JS/TS projects.