
Add bender kg command and knowledge graph subsystem #306

Open
aottavianoTT wants to merge 1 commit into pulp-platform:master from aottavianoTT:aottaviano/kg


@aottavianoTT aottavianoTT commented May 19, 2026

This PR introduces a knowledge graph (KG) pipeline that uses bender-slang to parse SystemVerilog designs into a queryable graph. The graph is exposed through the bender kg CLI and through an MCP server for AI assistant integration.

Architecture

The pipeline is split across six new crates:

| Crate | Role |
| --- | --- |
| bender-kg-models | Shared IR types (ModuleData, PortInfo, ParamInfo, InstantiationInfo, …) used as the common data contract between crates |
| bender-kg-extract | Drives bender-slang to parse SV source files and emit structured IR |
| bender-kg-store | Persists the graph using Grafeo, augmented with an HNSW vector index for similarity lookups and BM25 for keyword search |
| bender-kg-similarity | Produces dense per-module embeddings via a Model2Vec/ONNX pipeline, powering structural similarity queries |
| bender-kg-core | Orchestrates the above into a typed Engine API; callers issue structured queries and get typed results back |
| bender-kg-mcp | Wraps bender-kg-core in a stdio MCP server so AI assistants can call all queries as MCP tools |
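
To make the "common data contract" idea concrete, here is a rough sketch of what the shared IR could look like. Only the type names (ModuleData, PortInfo, ParamInfo, InstantiationInfo) come from this PR; every field, the `Direction` enum, and the `child_modules` helper are assumptions made for illustration and may not match the actual crate.

```rust
// Hypothetical field layout: the PR names these types but does not list their fields.
#[derive(Debug, Clone, PartialEq)]
pub enum Direction {
    Input,
    Output,
    Inout,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PortInfo {
    pub name: String,
    pub direction: Direction,
    pub width: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ParamInfo {
    pub name: String,
    pub default: Option<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct InstantiationInfo {
    pub instance_name: String,
    pub module_name: String,
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct ModuleData {
    pub name: String,
    pub ports: Vec<PortInfo>,
    pub params: Vec<ParamInfo>,
    pub instantiations: Vec<InstantiationInfo>,
}

impl ModuleData {
    /// Sorted, de-duplicated names of the modules this module instantiates.
    /// (Illustrative helper, not part of the PR.)
    pub fn child_modules(&self) -> Vec<&str> {
        let mut names: Vec<&str> = self
            .instantiations
            .iter()
            .map(|inst| inst.module_name.as_str())
            .collect();
        names.sort_unstable();
        names.dedup();
        names
    }
}
```

With plain data types like these, the extract, store, and core crates could each depend only on the models crate rather than on one another.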

Query surface

Once the KG is built (or incrementally updated), it supports:

  • Module search — find modules by name or natural-language description
  • Hierarchy traversal — walk instantiation trees up and down
  • Port / parameter / signal tracing — trace connectivity across module boundaries
  • Structural similarity — find modules that are structurally similar based on dense embeddings
  • Connectivity checks — verify expected connections exist in the design
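
As an illustration of the hierarchy traversal item, the sketch below walks an instantiation graph in both directions. It is a standalone in-memory model, not the Grafeo-backed store or the Engine API from this PR; the `Hierarchy` type and its method names are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory instantiation graph keyed by module name.
#[derive(Default)]
pub struct Hierarchy {
    children: BTreeMap<String, BTreeSet<String>>,
    parents: BTreeMap<String, BTreeSet<String>>,
}

impl Hierarchy {
    /// Record that `parent` instantiates `child`.
    pub fn add_instantiation(&mut self, parent: &str, child: &str) {
        self.children
            .entry(parent.to_string())
            .or_default()
            .insert(child.to_string());
        self.parents
            .entry(child.to_string())
            .or_default()
            .insert(parent.to_string());
    }

    /// All modules reachable by walking down from `module`.
    pub fn descendants(&self, module: &str) -> BTreeSet<String> {
        Self::reach(&self.children, module)
    }

    /// All modules reachable by walking up from `module`.
    pub fn ancestors(&self, module: &str) -> BTreeSet<String> {
        Self::reach(&self.parents, module)
    }

    /// Iterative depth-first search; the `seen` set guards against revisiting nodes.
    fn reach(edges: &BTreeMap<String, BTreeSet<String>>, start: &str) -> BTreeSet<String> {
        let mut seen = BTreeSet::new();
        let mut stack = vec![start.to_string()];
        while let Some(current) = stack.pop() {
            if let Some(next) = edges.get(&current) {
                for name in next {
                    if seen.insert(name.clone()) {
                        stack.push(name.clone());
                    }
                }
            }
        }
        seen
    }
}
```

Keeping a reverse edge map makes the "up" query as cheap as the "down" query, at the cost of storing each edge twice.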

…igns

Introduces the `bender kg` command and six supporting crates:

- bender-kg-models: IR types (ModuleData, PortInfo, ParamInfo, InstantiationInfo, …)
- bender-kg-extract: SystemVerilog → IR extraction pipeline via bender-slang
- bender-kg-store: Grafeo-backed graph + HNSW vector + BM25 text store
- bender-kg-similarity: dense module embeddings (Model2Vec / ONNX)
- bender-kg-core: orchestration layer with typed Engine query API
- bender-kg-mcp: stdio MCP server exposing all queries as MCP tools

The kg subsystem supports build, incremental update, and a rich query surface
(module search, hierarchy traversal, port/parameter/signal tracing, structural
similarity, connectivity checks) consumable from both the CLI and AI assistants.
@aottavianoTT aottavianoTT changed the title Add bender knowledge-graph command Add bender kg command and knowledge graph subsystem May 19, 2026
@aottavianoTT aottavianoTT marked this pull request as ready for review May 25, 2026 00:30
Member

@micprog micprog left a comment


This looks like a very large change/addition, created with a lot of AI support. I added some initial feedback to encourage separation of concerns, but I'm sceptical whether this can be reasonably integrated. As I do see the use of having access to bender's internal information, I'm wondering if it would make more sense to push this into a separate project and extend bender itself to run plugins that have access to some of the internal functions, e.g., WASM plugins with a bender-internal WASM runtime. I don't think we can confidently maintain such a large addition inside bender itself. What do you think?

Comment thread Cargo.toml
Comment on lines +108 to +115
slang = [
"dep:bender-slang",
"dep:bender-kg-core",
"dep:bender-kg-extract",
"dep:bender-kg-similarity",
"dep:bender-kg-mcp",
"dep:bender-kg-models",
]

I would prefer if we could gate the new bender-kg behind an additional feature flag, disabled by default. This really looks like an AI-first feature, and I would argue we can keep it disabled for most use-cases.

Comment thread README.md
```


### `kg` --- Build and query the design knowledge graph

In a rebase, this will need to be moved to the book.

Comment thread Cargo.toml
"crates/bender-kg-mcp",
]

[workspace.dependencies]

Is there a way to move these dependencies to the corresponding packages, or can we gate them only for when required? Just wondering regarding clutter.

Comment thread src/sess.rs
Comment on lines +456 to +459
// `.vh` headers are de-facto SystemVerilog macro files;
// downstream tools (VCS, slang, verilator) parse them as part
// of the unit, so classify them as Verilog here too.
Some("sv") | Some("v") | Some("vp") | Some("svh") | Some("vh") => {

This is an unrelated change; if it is required, please move it to a separate PR.
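
For reference, the behaviour of the quoted match arm can be reproduced in isolation roughly as follows. Only the Verilog arm mirrors the hunk; `SourceKind`, `classify`, and the VHDL and fallback arms are assumptions for illustration and are not taken from src/sess.rs.

```rust
use std::path::Path;

/// Hypothetical classification result; the real type in src/sess.rs may differ.
#[derive(Debug, PartialEq)]
pub enum SourceKind {
    Verilog,
    Vhdl,
    Unknown,
}

/// Classify a source file by its extension.
pub fn classify(path: &Path) -> SourceKind {
    match path.extension().and_then(|ext| ext.to_str()) {
        // Same extension set as the arm quoted in the hunk, including `.vh`.
        Some("sv") | Some("v") | Some("vp") | Some("svh") | Some("vh") => SourceKind::Verilog,
        // Assumed for illustration only.
        Some("vhd") | Some("vhdl") => SourceKind::Vhdl,
        _ => SourceKind::Unknown,
    }
}
```

A standalone function like this is also easy to cover with a unit test in a separate PR, since the `.vh` case needs nothing from the kg subsystem.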

Comment thread src/cmd.rs
Comment on lines +19 to +20
#[cfg(feature = "slang")]
pub mod kg;

As mentioned in the Cargo.toml comment, I'd move this to a separate new feature, not the slang feature. This pattern is repeated across several files; I will not mention it explicitly everywhere.
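
A minimal sketch of what gating on a dedicated feature could look like on the Rust side. The feature name `kg` is an assumption (no name has been agreed in this thread), and `available_commands` is a hypothetical helper rather than existing bender code; the feature would also need to be declared in Cargo.toml and left out of the default set.

```rust
// Compiled only when the hypothetical `kg` feature is enabled.
#[cfg(feature = "kg")]
pub mod kg {
    pub const NAME: &str = "kg";
}

/// Lists the subcommands compiled into this build (illustrative subset).
pub fn available_commands() -> Vec<&'static str> {
    // `mut` is only needed when the feature is on, so silence the lint otherwise.
    #[allow(unused_mut)]
    let mut cmds = vec!["sources", "script"];
    #[cfg(feature = "kg")]
    cmds.push(kg::NAME);
    cmds
}
```

Built without the feature, the `kg` module and its dependencies are not compiled at all, which would address the default-off request.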
