Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,9 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
- A stuck git command can no longer hang CodeGraph indefinitely. The git checks behind worktree detection and git-hook setup, and the installer's optional `npm install -g` step, now time out and fail gracefully instead of blocking forever — this matters most for the background MCP server, where an unbounded git hang (network filesystems, a wedged fsmonitor) could previously freeze it long enough for the safety watchdog to kill it. Thanks @inth3shadows for the report. (#1139)
- The context hook's new plain-words matching works immediately on projects indexed by an older CodeGraph version. The word lookup it relies on is built at index time, so a project indexed before the upgrade had an empty one, and the hook would silently find nothing until something else happened to refresh the index; the hook now fills it in on first use (a one-time step — normally the background MCP server's startup catch-up gets there first). Thanks @inth3shadows for the report. (#1142)
- Several accuracy fixes to the plain-words matching: a renamed symbol (for example a NestJS route after its module prefix is applied) stays findable under its new name (#1141); a word that only appears in your code as an import statement's package name is no longer presented as a matched symbol (#1144); plural words no longer generate garbled lookup keys ("services" no longer also looks up "servic") (#1145); and a name matching both the singular and plural of one word can no longer squeeze out a genuine two-word match (#1146). Thanks @inth3shadows for the reports.
- Heavily-reflected Unreal Engine C++ classes are no longer dropped from the index. Reflection markup that decorates members — `UPROPERTY(...)`, `UFUNCTION(...)`, `UCLASS(...)`, `GENERATED_BODY()`, `UE_DEPRECATED_*(...)`, `DECLARE_DELEGATE_*(...)` — are no-semicolon macro calls that tree-sitter doesn't recognize, so each drops into error recovery; in a big class the errors pile up until the whole `class_specifier` collapses and the class, its base clause, and its members vanish (`UCharacterMovementComponent`, with ~240 such macros, disappeared entirely, breaking every subclass/type-hierarchy and blast-radius query that went through it). These line-leading annotation macros are now blanked (offset-preserving) before parsing so the class survives. Thanks @luoyxy for the report and root-cause analysis. (#1093 follow-up)
- Unreal Engine members and methods prefixed by an export/visibility macro are no longer lost. The `*_API` macro doesn't only sit on the class header — it prefixes almost every exported member of a large UE class (`ENGINE_API virtual void Tick(...)`, `static ENGINE_API void AddReferencedObjects(...)`); the parser read the macro as an extra type token and each such declaration fell into error recovery, so on headers like `Actor.h` and `World.h` hundreds of return types piled up as orphan errors and could still tip the class into collapse. Member/method-level `*_API` / `*_EXPORT` / `*_ABI` macros (Unreal, Qt/Boost, LLVM) are now blanked before parsing, mirroring the existing class-header recovery. (#1093 follow-up)
- Unreal Engine annotation macros that appear mid-line — an enum value's `UMETA(DisplayName=...)`, a parameter's `UPARAM(ref)`, or a deprecation tag wedged into a `using` alias (`using FOnNetTick UE_DEPRECATED(5.5, "...") = ...;`, which alone collapsed `UWorld` in `World.h`) — are now stripped too. These sit in positions the line-leading recovery structurally can't reach, and a single one could take down the surrounding enum or class. They are matched by an Unreal-only name list (`UMETA`, `UPARAM`, `UE_DEPRECATED*`) so no standard-C++ or other-library code is affected. Together these three fixes recover the main class of every large Unreal Engine header tested (`Actor`, `ActorComponent`, `SkeletalMeshComponent`, `World`, `LightComponent`, `CharacterMovementComponent`). (#1093 follow-up)

## [1.2.0] - 2026-07-02

Expand Down
175 changes: 174 additions & 1 deletion __tests__/extraction.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ import * as os from 'os';
import { CodeGraph } from '../src';
import { extractFromSource, scanDirectory, buildDefaultIgnore, discoverEmbeddedRepoRoots, buildScopeIgnore } from '../src/extraction';
import { detectLanguage, isLanguageSupported, getSupportedLanguages, initGrammars, loadAllGrammars, isSourceFile } from '../src/extraction/grammars';
import { stripCppTemplateArgs, blankCppExportMacros, blankCppInlineMacros, blankMetalAttributes, recoverMangledCppName } from '../src/extraction/languages/c-cpp';
import { stripCppTemplateArgs, blankCppExportMacros, blankCppInlineMacros, blankMetalAttributes, blankCppAnnotationMacroCalls, blankCppApiPrefixMacros, blankCppInlineAnnotationMacros, recoverMangledCppName } from '../src/extraction/languages/c-cpp';
import { normalizePath } from '../src/utils';

beforeAll(async () => {
Expand Down Expand Up @@ -2928,6 +2928,179 @@ kernel void computeBlur(texture2d<float, access::read> inTexture [[texture(0)]],
});
});

describe('C++ in-body reflection-macro annotations do not collapse the class (UE)', () => {
// Unreal reflection markup — `UPROPERTY(...)`, `UFUNCTION(...)`,
// `GENERATED_BODY()`, `UE_DEPRECATED_*(...)`, `DECLARE_DELEGATE_*(...)` — are
// no-semicolon macro CALLS decorating members. tree-sitter doesn't know they
// are macros, so each drops into error recovery; in a heavily-reflected class
// the errors accumulate until the enclosing class_specifier can't close and
// the whole class (its base clause and members) collapses into an ERROR node
// and disappears from the graph. blankCppAnnotationMacroCalls strips them,
// offset-preserving, so the class parses normally.
it('recovers a heavily-reflected class with multiple inheritance + members', () => {
const code = `UCLASS(MinimalAPI)
class UMyMovement : public UPawnMovementComponent, public IRVOAvoidanceInterface, public INetworkPredictionInterface
{
\tGENERATED_BODY()
public:
\tUE_DEPRECATED_FORGAME(5.0, "Deprecated; note the commas, and (parens) inside the string")
\tUPROPERTY(Category="Move", EditAnywhere, BlueprintReadWrite, meta=(ClampMin="0", UIMin="0"))
\tfloat MaxWalkSpeed;

\tUFUNCTION(BlueprintCallable, Category="Move")
\tfloat ComputeSpeed() const { return MaxWalkSpeed * 2.0f; }
};
`;
const result = extractFromSource('movement.cpp', code);
const cls = result.nodes.find((n) => n.kind === 'class' && n.name === 'UMyMovement');
expect(cls).toBeTruthy();
// The class body parses, so its inline method definition is extracted too —
// proof the class_specifier closed instead of collapsing into an ERROR node.
expect(result.nodes.some((n) => n.name === 'ComputeSpeed')).toBe(true);
// The base clause survives (inheritance queries keep working).
expect(
result.unresolvedReferences.find(
(r) => r.referenceKind === 'extends' && r.referenceName === 'UPawnMovementComponent'
)
).toBeTruthy();
});

it('strips line-leading no-semicolon ALL-CAPS calls, offset-preserving', () => {
const inp = `\tUPROPERTY(EditAnywhere, meta=(ClampMin="0"))\n\tfloat X;\n`;
const out = blankCppAnnotationMacroCalls(inp);
expect(out.length).toBe(inp.length); // every byte offset preserved
expect(out).not.toContain('UPROPERTY');
expect(out).toContain('float X;');
// A macro whose args carry commas/parens inside a string still balances.
const inp2 = `UE_DEPRECATED_FORGAME(5.0, "a, b (c)")\nUPROPERTY(Foo)\nfloat Y;\n`;
const out2 = blankCppAnnotationMacroCalls(inp2);
expect(out2.length).toBe(inp2.length);
expect(out2).not.toContain('UE_DEPRECATED_FORGAME');
expect(out2).not.toContain('UPROPERTY');
expect(out2).toContain('float Y;');
});

it('does NOT blank expression / condition / statement / init-list macro uses', () => {
for (const c of [
'void f() {\n\tif (CHECK_FLAG(x)) { g(); }\n}', // condition — not line-leading
'void f() {\n\tLOG_MESSAGE("hi");\n}', // statement call — trailing ;
'C::C()\n\t: MEMBER_A(1)\n\t, MEMBER_B(2)\n{}', // init-list — comma / not line-leading
'C::C() :\n\tMEMBER_A(1),\n\tMEMBER_B(2)\n{}', // init-list wrapped — trailing , / {
'auto y =\n\tMAKE_THING(a) + 1;', // line-leading but an expression fragment
]) {
expect(blankCppAnnotationMacroCalls(c)).toBe(c);
}
});
});

describe('C++ member/method-level export macros do not orphan declarations (UE)', () => {
// The `*_API` visibility macro doesn't only prefix the class header — it
// prefixes almost every exported member/method of a big UE class
// (`ENGINE_API virtual void Tick(…)`, `static ENGINE_API void Foo(…)`).
// blankCppExportMacros only recovers the class-HEADER form; without blanking
// the member form, tree-sitter reads `MACRO <ret> <name>(` as an extra type
// token and each declaration drops into error recovery.
it('recovers a class + base + members when members are *_API-prefixed', () => {
const code = `class ENGINE_API AActor : public UObject
{
\tGENERATED_BODY()
public:
\tENGINE_API virtual void Tick(float DeltaSeconds);
\tstatic ENGINE_API void AddReferencedObjects(int32 Count);
\tENGINE_API float GetLifeSpan() const { return LifeSpan; }
};
`;
const result = extractFromSource('actor.cpp', code);
expect(result.nodes.some((n) => n.kind === 'class' && n.name === 'AActor')).toBe(true);
// The inline definition (its body prefixed by ENGINE_API) is extracted —
// proof the class_specifier closed instead of collapsing into an ERROR.
expect(result.nodes.some((n) => n.name === 'GetLifeSpan')).toBe(true);
// The base clause survives (inheritance queries keep working).
expect(
result.unresolvedReferences.find(
(r) => r.referenceKind === 'extends' && r.referenceName === 'UObject'
)
).toBeTruthy();
});

it('blanks only the suffix macro before a declaration, offset-preserving', () => {
const inp = `ENGINE_API void Tick();\nstatic MYMOD_EXPORT int32 X;\nLLVM_ABI bool Y();\n`;
const out = blankCppApiPrefixMacros(inp);
expect(out.length).toBe(inp.length); // every byte offset preserved
expect(out).not.toContain('ENGINE_API');
expect(out).not.toContain('MYMOD_EXPORT');
expect(out).not.toContain('LLVM_ABI');
expect(out).toContain('void Tick();');
expect(out).toContain('int32 X;');
expect(out).toContain('bool Y();');
expect(out).toMatch(/static\s+int32 X;/); // `static` kept, only the macro blanked
});

it('does NOT blank an *_API token used as a value or in non-declaration position', () => {
for (const c of [
'int x = SOME_API;', // rvalue — trailing ;
'if (mode == FOO_API) { g(); }', // comparison — trailing )
'return DEFAULT_API, other;', // comma operand
'auto v = NS_API::Make();', // qualified name — trailing ::
'x = A_API + B_API;', // operands of + / trailing ;
]) {
expect(blankCppApiPrefixMacros(c)).toBe(c);
}
});

it('leaves a genuine _API-suffixed word alone when it is itself the name', () => {
// A longer word merely CONTAINING _API (not ending in it) must not match.
const inp = 'FOO_APIENTRY handler;';
expect(blankCppApiPrefixMacros(inp)).toBe(inp);
});
});

describe('C++ mid-line UE annotation macros do not collapse the enum/class (UE)', () => {
// UMETA / UPARAM / UE_DEPRECATED can sit MID-LINE (not line-leading), where
// blankCppAnnotationMacroCalls structurally can't reach them: an enum value's
// `UMETA(...)`, or a deprecation tag wedged into a class-scope `using`
// (`using X UE_DEPRECATED(5.5, "…") = …;`) — which alone collapsed UWorld in
// World.h. blankCppInlineAnnotationMacros strips them, offset-preserving.
it('recovers a class whose in-body using-alias carries a mid-line UE_DEPRECATED', () => {
const code = `class ENGINE_API UWorld : public UObject
{
\tGENERATED_BODY()
public:
\tusing FOnNetTickEvent UE_DEPRECATED(5.5, "use TMulticastDelegate<void(float)>") = TMulticastDelegate<void(float)>;
\tENGINE_API float GetTimeSeconds() const { return TimeSeconds; }
};
`;
const result = extractFromSource('world.cpp', code);
expect(result.nodes.some((n) => n.kind === 'class' && n.name === 'UWorld')).toBe(true);
// The member after the poison using-alias is reached — the class closed.
expect(result.nodes.some((n) => n.name === 'GetTimeSeconds')).toBe(true);
expect(
result.unresolvedReferences.find(
(r) => r.referenceKind === 'extends' && r.referenceName === 'UObject'
)
).toBeTruthy();
});

it('blanks mid-line UMETA/UPARAM/UE_DEPRECATED with balanced parens, offset-preserving', () => {
const inp = `enum class EMode : uint8 {\n\tWalk UMETA(DisplayName="Walk (fast), safe"),\n\tRun\n};\n`;
const out = blankCppInlineAnnotationMacros(inp);
expect(out.length).toBe(inp.length);
expect(out).not.toContain('UMETA');
expect(out).toContain('Walk');
expect(out).toContain('Run');
const inp2 = `void F(UPARAM(ref) int& x) {}\n`;
const out2 = blankCppInlineAnnotationMacros(inp2);
expect(out2.length).toBe(inp2.length);
expect(out2).not.toContain('UPARAM');
expect(out2).toContain('int& x');
});

it('does NOT touch source without those UE-only macro names', () => {
const c = 'enum class E { A, B };\nvoid metadata(int meta) { return; }\n';
expect(blankCppInlineAnnotationMacros(c)).toBe(c);
});
});

describe('C++ forward declarations do not mint phantom class nodes (#1093)', () => {
// `class Foo;` parses as a bodiless class_specifier. Repeated across headers,
// each forward decl minted a phantom bodiless `class` node that crowded out —
Expand Down
Loading