Generalize the request type passed down the framework plugins: rename LLM -> Inference (#2673)
Conversation
✅ Deploy Preview for gateway-api-inference-extension ready!
Hi @RyanRosario. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. Tip: we noticed you've done this a few times! Consider joining the org to skip this step. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Force-pushed from 1cce76e to f6b5ec6.
@zetxqx For now, I've had to commit some other files into this PR to get Prow to pass. I'm not sure what the issue is here, but I want to keep the ball rolling.
@zetxqx I still have a few comments to address, but wanted to handle the rest in this PR.
The latency-predictor-related changes look fine now, as I only see the name change.
@ahg-g Please review when you get a chance.
/lgtm

@ahg-g Copying from the PR description: as discussed with @RyanRosario, we want to split the refactoring into the following three PRs.
/retest
Force-pushed from a1dfc37 to 9bb89eb.
/lgtm

/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: ahg-g, RyanRosario. The full list of commands accepted by this bot can be found here, and the pull request process is described here. Approvers can indicate their approval by writing `/approve` in a comment.
Starting from this PR, image builds are failing; I assume it's related.
The failed build logs: https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/post-inference-extension-push-images/2042704260231073792
Created #2832
…on#2673) Co-authored-by: Ryan Rosario <6713180+RyanRosario@users.noreply.github.com>
What type of PR is this?
/kind feature
What this PR does / why we need it:
Enables the framework to be applied directly across various GenAI model protocols, not only the OpenAI format, without rewriting the core admission, mutation, or scheduling flows. Pluggable parsers can now intercept raw request bytes and construct a generic InferenceRequest up front, giving the EPP the flexibility to route, process, and score payloads transparently regardless of the original protocol.
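To illustrate the parser-based flow described above, here is a minimal Go sketch. The `RequestParser` interface, the `InferenceRequest` fields, and `openAIParser` are all hypothetical names chosen for this example; they are not taken from the project's actual API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// InferenceRequest is a protocol-agnostic view of an inference request.
// Field names here are illustrative, not the project's real struct.
type InferenceRequest struct {
	TargetModel string
	Prompt      string
}

// RequestParser turns raw request body bytes into an InferenceRequest.
// A pluggable parser per protocol lets the EPP route and score payloads
// without knowing the wire format.
type RequestParser interface {
	Parse(body []byte) (*InferenceRequest, error)
}

// openAIParser handles OpenAI-style completion payloads.
type openAIParser struct{}

func (openAIParser) Parse(body []byte) (*InferenceRequest, error) {
	var req struct {
		Model  string `json:"model"`
		Prompt string `json:"prompt"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, err
	}
	return &InferenceRequest{TargetModel: req.Model, Prompt: req.Prompt}, nil
}

func main() {
	var p RequestParser = openAIParser{}
	r, err := p.Parse([]byte(`{"model":"m1","prompt":"hi"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(r.TargetModel, r.Prompt)
}
```

Adding support for another GenAI protocol would then mean registering another `RequestParser` implementation, with no changes to the downstream scheduling code.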
Which issue(s) this PR fixes:
Related to #2447
Does this PR introduce a user-facing change?:
This is a series of three PRs:
1. This PR simply renames LLMRequest to InferenceRequest.
2. The second PR moves RequestBody from scheduling to requesthandling (#2808).
3. The third PR separates the parser from the directory (#2810).
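Since this PR is a pure rename, one common Go migration pattern is worth sketching: keeping the old name as a type alias so downstream plugins keep compiling during the transition. This is a hypothetical illustration of the pattern, not necessarily what this PR does, and the struct fields are invented for the example.

```go
package main

import "fmt"

// InferenceRequest is the renamed, protocol-neutral request type
// (fields here are illustrative).
type InferenceRequest struct {
	TargetModel string
}

// LLMRequest is a compatibility alias: a Go type alias makes the old and
// new names fully interchangeable, so callers migrate at their own pace.
//
// Deprecated: use InferenceRequest.
type LLMRequest = InferenceRequest

func main() {
	old := LLMRequest{TargetModel: "m1"}
	var renamed InferenceRequest = old // no conversion needed; same type
	fmt.Println(renamed.TargetModel)
}
```

Because `type LLMRequest = InferenceRequest` is an alias rather than a new defined type, values of the two names are identical to the compiler, which is what makes a staged rename across plugin repositories safe.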