Generalize the request type passed down the framework plugins: rename LLM->Inference #2673

Merged
k8s-ci-robot merged 1 commit into kubernetes-sigs:main from RyanRosario:issue-2447
Apr 10, 2026

Conversation

@RyanRosario
Contributor

@RyanRosario RyanRosario commented Mar 23, 2026

What type of PR is this?

/kind feature

What this PR does / why we need it:

Lets the EPP handle requests for a variety of GenAI model protocols, not only the OpenAI format, without rewriting the core admission, mutation, or scheduling flows. Pluggable parsers can now intercept the raw request bytes and construct a generic InferenceRequest upfront, so the EPP can route, process, and score payloads regardless of the original protocol.
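
For illustration only, the parser idea described above might be sketched as follows. This is not the code from this PR: the `Parser` interface, the two parser types, the `describe` helper, and the fields on this simplified `InferenceRequest` are all hypothetical names chosen for the example, and the "model on the first line" format is made up to stand in for a non-OpenAI protocol.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// InferenceRequest is a simplified, hypothetical stand-in for the generic
// request type named in this PR; the real type's fields are not reproduced here.
type InferenceRequest struct {
	TargetModel string
	Prompt      string
}

// Parser is a hypothetical interface: one implementation per wire protocol.
type Parser interface {
	Parse(raw []byte) (*InferenceRequest, error)
}

// completionsParser reads an OpenAI-style JSON body with "model" and "prompt".
type completionsParser struct{}

func (completionsParser) Parse(raw []byte) (*InferenceRequest, error) {
	var body struct {
		Model  string `json:"model"`
		Prompt string `json:"prompt"`
	}
	if err := json.Unmarshal(raw, &body); err != nil {
		return nil, fmt.Errorf("decode completions body: %w", err)
	}
	if body.Model == "" {
		return nil, errors.New("completions body has no model")
	}
	return &InferenceRequest{TargetModel: body.Model, Prompt: body.Prompt}, nil
}

// lineParser reads a made-up non-OpenAI format: "<model>\n<prompt>".
type lineParser struct{}

func (lineParser) Parse(raw []byte) (*InferenceRequest, error) {
	model, prompt, found := strings.Cut(string(raw), "\n")
	if !found || model == "" {
		return nil, errors.New("expected <model>, a newline, then <prompt>")
	}
	return &InferenceRequest{TargetModel: model, Prompt: prompt}, nil
}

// describe stands in for downstream admission/scheduling code, which only
// sees the generic type and never the original protocol.
func describe(req *InferenceRequest) string {
	return fmt.Sprintf("model=%s promptBytes=%d", req.TargetModel, len(req.Prompt))
}

func main() {
	inputs := []struct {
		parser Parser
		raw    string
	}{
		{completionsParser{}, `{"model":"m1","prompt":"hi"}`},
		{lineParser{}, "m2\nhello"},
	}
	for _, in := range inputs {
		req, err := in.parser.Parse([]byte(in.raw))
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Println(describe(req))
	}
}
```

The point of the sketch is that `describe` never branches on protocol: supporting another format would mean adding another `Parser` implementation, leaving the downstream code untouched.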

Which issue(s) this PR fixes:
Related to #2447

Does this PR introduce a user-facing change?:

Enables the user to use protocols other than OpenAI via the generic InferenceRequest interface.

This is a series of 3 PRs.

(1) This PR simply renames LLMRequest to InferenceRequest
(2) The second PR moves RequestBody from scheduling to requesthandling (#2808)
(3) The third PR separates the parser from the director (#2810)

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 23, 2026
@netlify

netlify Bot commented Mar 23, 2026

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 9230996
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/69d80edceb60bc000821dab8
😎 Deploy Preview https://deploy-preview-2673--gateway-api-inference-extension.netlify.app

To edit notification comments on pull requests, go to your Netlify project configuration.

@k8s-ci-robot k8s-ci-robot requested review from ahg-g and liu-cong March 23, 2026 22:06
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 23, 2026
@k8s-ci-robot
Contributor

Hi @RyanRosario. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 23, 2026
Comment thread pkg/epp/framework/interface/scheduling/types.go Outdated
Comment thread design_request_handling_refactor.md Outdated
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2026
@RyanRosario RyanRosario force-pushed the issue-2447 branch 2 times, most recently from 1cce76e to f6b5ec6 Compare March 25, 2026 17:58
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2026
@RyanRosario
Contributor Author

@zetxqx For now, I've had to commit some other files into this PR to get PROW to pass. I am not sure what the issue is here, but I want to keep the ball rolling. go.mod, go.sum, Makefile and kal.yaml should/will not be part of the final PR.

Comment thread .github/workflows/kal.yml Outdated
Comment thread pkg/bbr/server/options_test.go
Comment thread pkg/common/observability/logging/options_test.go
Comment thread pkg/epp/datalayer/data_graph_test.go
Comment thread pkg/epp/framework/interface/scheduling/types_test.go
Comment thread pkg/epp/framework/interface/requesthandling/types.go Outdated
Comment thread pkg/epp/framework/interface/scheduling/types.go Outdated
Comment thread pkg/epp/requestcontrol/admission.go Outdated
Comment thread pkg/epp/requestcontrol/director.go
Comment thread pkg/epp/handlers/server.go
Comment thread pkg/epp/framework/interface/requestcontrol/plugins.go Outdated
Comment thread pkg/epp/framework/plugins/scheduling/scorer/predictedlatency/scorer_test.go Outdated
Comment thread pkg/epp/framework/plugins/scheduling/scorer/runningrequests/running_test.go Outdated
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 26, 2026
@RyanRosario
Contributor Author

@zetxqx I still have a few comments to address but wanted to address the rest in this PR.

Comment thread pkg/epp/framework/interface/scheduling/types.go Outdated
@RyanRosario RyanRosario changed the title Generalize the request type passed down the framework plugins Generalize the request type passed down the framework plugins: move parser out of director Mar 27, 2026
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 28, 2026
@RyanRosario RyanRosario changed the title Generalize the request type passed down the framework plugins: move parser out of director [WIP] Generalize the request type passed down the framework plugins: move parser out of director Mar 28, 2026
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 28, 2026
Comment thread pkg/epp/framework/interface/requestcontrol/plugins.go Outdated
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Apr 8, 2026
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 8, 2026
@kaushikmitr
Contributor

the latency predictor related changes look fine now, as i only see the name change.

@RyanRosario
Contributor Author

@ahg-g Please review when you get a chance.

@zetxqx
Contributor

zetxqx commented Apr 8, 2026

/lgtm

@ahg-g Copying from the PR description: as discussed with @RyanRosario, we want to split the refactoring into the following three PRs.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 8, 2026
@zetxqx
Contributor

zetxqx commented Apr 9, 2026

/retest

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 9, 2026
@RyanRosario RyanRosario force-pushed the issue-2447 branch 3 times, most recently from a1dfc37 to 9bb89eb Compare April 9, 2026 19:33
@zetxqx
Contributor

zetxqx commented Apr 9, 2026

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 9, 2026
@ahg-g
Contributor

ahg-g commented Apr 10, 2026

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, RyanRosario

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 10, 2026
@k8s-ci-robot k8s-ci-robot merged commit 9260cda into kubernetes-sigs:main Apr 10, 2026
11 checks passed
@nirrozenbaum
Contributor

starting from this PR image builds are failing, I assume it's related.

@ahg-g
Contributor

ahg-g commented Apr 13, 2026

> starting from this PR image builds are failing, I assume it's related.

The failed build logs: https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/post-inference-extension-push-images/2042704260231073792

@ahg-g ahg-g mentioned this pull request Apr 13, 2026
@ahg-g
Contributor

ahg-g commented Apr 13, 2026

Created #2832

elevran pushed a commit to llm-d/llm-d-inference-scheduler that referenced this pull request Apr 23, 2026
…on#2673)

Co-authored-by: Ryan Rosario <6713180+RyanRosario@users.noreply.github.com>

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
kind/feature Categorizes issue or PR as related to a new feature.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

7 participants