feat(cpp): improve C++ parser for inheritance, methods, and calls #734

Open
jannahopp wants to merge 1 commit into CodeGraphContext:main from jannahopp:feat/cpp-parser-improvements
Conversation

@jannahopp
Contributor

Summary

Major improvements to the C++ tree-sitter parser that enable proper graph construction for C++ codebases. On a real 300-file C++ project, these changes increased CALLS edges from 1,287 to 13,042 (roughly 10x) and INHERITS edges from 0 to 79.

Changes

1. Inheritance extraction

Replace hardcoded "bases": [] placeholder with _extract_base_classes() that walks base_class_clause children. Handles public/private/protected/virtual inheritance, multiple inheritance, qualified base names (ns::Base), and template base classes (strips template args for graph matching).
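
To illustrate the normalization described above, here is a minimal string-level sketch. It is not the PR's `_extract_base_classes()`, which walks tree-sitter `base_class_clause` nodes; the helper names `normalize_bases`, `split_top_level`, and `strip_template_args` are hypothetical and exist only for this example.

```python
from __future__ import annotations

# Keywords that may precede a base name and carry no identity for graph matching.
ACCESS_KEYWORDS = {"public", "protected", "private", "virtual"}


def strip_template_args(name: str) -> str:
    """Drop balanced <...> groups so 'Base<T, U<V>>' becomes 'Base'."""
    out, depth = [], 0
    for ch in name:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out).strip()


def split_top_level(clause: str) -> list[str]:
    """Split on commas that are not nested inside template angle brackets."""
    parts, current, depth = [], [], 0
    for ch in clause:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def normalize_bases(clause: str) -> list[str]:
    """Turn ': public ns::Base, virtual private Mixin<T>' into plain base names."""
    clause = clause.strip().lstrip(":")
    bases = []
    for part in split_top_level(clause):
        tokens = [t for t in part.split() if t not in ACCESS_KEYWORDS]
        name = strip_template_args(" ".join(tokens))
        if name:
            bases.append(name)
    return bases
```

For example, `normalize_bases(": public ns::Base, virtual private Mixin<T, U<V>>")` yields `["ns::Base", "Mixin"]`: qualified names survive, access keywords and template arguments do not.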

2. Qualified method definitions

Add qualified_identifier to the functions tree-sitter query so ClassName::method() definitions in .cpp files are captured with correct class_context. Previously only free functions and simple identifiers matched, missing most C++ method bodies.
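
As a rough sketch of what capturing `ClassName::method` involves, the snippet below splits a qualified declarator name into a class context and a method name. The function `split_qualified_name` is hypothetical, and keeping the full scope (including namespaces) as the class context is an assumption; the PR may store only the innermost class name.

```python
from __future__ import annotations


def split_qualified_name(declarator_name: str) -> tuple[str | None, str]:
    """Return (class_context, method_name) for a name such as 'Widget::draw'.

    A name without '::' is treated as a free function with no class context.
    """
    scope, sep, name = declarator_name.rpartition("::")
    if not sep:
        return None, declarator_name
    # A leading '::' (global scope) leaves an empty scope, reported as None.
    return (scope or None), name
```

With this, `split_qualified_name("Widget::draw")` gives `("Widget", "draw")`, while `split_qualified_name("main")` gives `(None, "main")`.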

3. Scoped and arrow call patterns

Add qualified_identifier and field_expression handling to the calls query. Now captures Class::staticMethod(), ptr->method(), and this->method() with proper inferred_obj_type for cross-file resolution.
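
The three call shapes can be told apart by the right-most separator in the callee text. The sketch below does this on plain strings for illustration only; the PR works on tree-sitter `qualified_identifier` and `field_expression` nodes, and the `classify_callee` helper and its result keys (`kind`, `receiver`, `name`) are hypothetical. How `inferred_obj_type` is derived from the receiver is not shown here.

```python
from __future__ import annotations

# (separator, label) pairs for the call shapes named in the description.
SEPARATORS = (("->", "arrow"), (".", "dot"), ("::", "scoped"))


def classify_callee(callee: str) -> dict:
    """Split a callee such as 'ptr->method' into receiver and method name."""
    callee = callee.strip()
    best_pos, best_sep, best_kind = -1, "", "plain"
    for sep, kind in SEPARATORS:
        pos = callee.rfind(sep)
        if pos > best_pos:
            best_pos, best_sep, best_kind = pos, sep, kind
    if best_pos < 0:
        # No separator at all: an unqualified call like 'helper'.
        return {"kind": "plain", "receiver": None, "name": callee}
    return {
        "kind": best_kind,
        "receiver": callee[:best_pos] or None,
        "name": callee[best_pos + len(best_sep):],
    }
```

Here `classify_callee("Class::staticMethod")` reports a `scoped` call with receiver `Class`, and `classify_callee("this->update")` reports an `arrow` call with receiver `this`.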

4. Fix _get_parent_context for C++

Use class_specifier (the correct C++ tree-sitter node type) instead of class_definition (Python's type). Also handle qualified_identifier and field_identifier when traversing function declarators.
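
Why the node type matters can be seen in a small parent-walk sketch. `FakeNode` and `find_parent_context` are hypothetical stand-ins for tree-sitter nodes and the real `_get_parent_context`; only the node type names `class_specifier`, `class_definition`, `field_declaration_list`, and `function_definition` come from the grammars.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeNode:
    """Minimal stand-in for a syntax tree node (hypothetical)."""
    type: str
    name: Optional[str] = None
    parent: Optional["FakeNode"] = None


def find_parent_context(node: FakeNode, context_types=("class_specifier",)):
    """Walk up the parent chain and return (name, type) of the first match."""
    current = node.parent
    while current is not None:
        if current.type in context_types:
            return current.name, current.type
        current = current.parent
    return None, None


# A method defined inline in a C++ class body.
cls = FakeNode("class_specifier", "Widget")
body = FakeNode("field_declaration_list", parent=cls)
method = FakeNode("function_definition", "draw", parent=body)
```

Looking for `class_specifier` finds `Widget`, whereas looking for Python's `class_definition` walks off the top of the tree and returns `(None, None)`, which matches the symptom the fix addresses.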

5. Deduplicate pre_scan_cpp paths

Prevent duplicate entries in the imports map that caused the inheritance resolver's len(paths) == 1 check to fail for classes with multiple qualified method definitions.
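
A minimal sketch of the deduplication idea, assuming the imports map is a dict from symbol name to a list of file paths (the exact structure used by `pre_scan_cpp` is not shown in this PR description, and `build_imports_map` is a hypothetical name):

```python
from __future__ import annotations


def build_imports_map(records: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Collect paths per name, keeping first-seen order and dropping repeats."""
    imports_map: dict[str, list[str]] = {}
    for name, path in records:
        paths = imports_map.setdefault(name, [])
        if path not in paths:
            paths.append(path)
    return imports_map


# Two qualified method definitions of the same class in one file.
records = [("Widget", "src/widget.cpp"), ("Widget", "src/widget.cpp")]
imports_map = build_imports_map(records)
```

Without the membership check, `Widget` would map to two identical paths and a `len(paths) == 1` test would reject it; with it, the list holds a single entry.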

6. Fix _find_lambda_assignments indentation

Correct a pre-existing indentation bug where the lambda type check ran outside the capture_name == 'name' block, and update a stale class_definition reference.
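
The shape of the corrected control flow might look like the sketch below. The tuple layout of `captures` and the function name `lambda_variable_names` are hypothetical; only the `capture_name == 'name'` condition comes from the description.

```python
def lambda_variable_names(captures):
    """captures: iterable of (capture_name, text, value_type) tuples (assumed shape)."""
    names = []
    for capture_name, text, value_type in captures:
        if capture_name == "name":
            # The type check is nested inside the 'name' branch,
            # so captures with other names never reach it.
            if value_type == "lambda_expression":
                names.append(text)
    return names
```

If the inner `if` were dedented one level, every capture whose value is a lambda would be recorded, regardless of its capture name.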

Note on args field

The args field on call data is now [] instead of a list of parameter dicts. The old code was incorrectly extracting the function definition's parameters rather than the call's arguments.

Tests

Added tests/unit/parsers/test_cpp_parser.py with 14 tests covering inheritance (5), qualified methods (2), call patterns (4), enums (2), and a realistic integration test (1).

Depends on

🤖 Generated with Claude Code

Major improvements to the C++ tree-sitter parser that enable proper
graph construction for C++ codebases:

1. Inheritance extraction: Replace hardcoded bases placeholder with
   _extract_base_classes() that walks base_class_clause children.
2. Qualified method definitions: Add qualified_identifier to the
   functions query so Class::method() definitions are captured.
3. Scoped and arrow call patterns: Capture Class::staticMethod(),
   ptr->method(), and this->method() with inferred_obj_type.
4. Fix _get_parent_context: Use class_specifier instead of
   class_definition (Python node type vs C++ node type).
5. Deduplicate pre_scan_cpp paths to fix inheritance resolution.
6. Fix _find_lambda_assignments indentation bug.

Note: call args field is now [] - the old code incorrectly extracted
the function definition's params, not the call arguments.

Depends on: fix/cpp-find-enums-nameerror

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel

vercel bot commented Mar 17, 2026

@jannahopp is attempting to deploy a commit to the shashankss1205's projects Team on Vercel.

A member of the Team first needs to authorize it.
