Add DataProducer (PrepareData) and Admission control plugins #1796

rahulgurnani · 2025-10-31T16:22:18Z

Add PrepareData and AdmitRequest plugins based on recommendations in evolving datalayer changes

Also, some refactor of director.go

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:

Addresses #1743

Does this PR introduce a user-facing change?:

Yes, enables writing prepare data and admit request plugins for users of IGW.

netlify · 2025-10-31T16:22:24Z

✅ Deploy Preview for gateway-api-inference-extension ready!

Name	Link
🔨 Latest commit	`a7354ca`
🔍 Latest deploy log	https://app.netlify.com/projects/gateway-api-inference-extension/deploys/690a8475b3f9960008b1aabb
😎 Deploy Preview	https://deploy-preview-1796--gateway-api-inference-extension.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

k8s-ci-robot · 2025-10-31T16:22:26Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rahulgurnani
Once this PR has been reviewed and has the lgtm label, please assign nirrozenbaum for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2025-10-31T16:22:29Z

Hi @rahulgurnani. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kfswain · 2025-10-31T20:50:01Z

/ok-to-test

kfswain · 2025-10-31T20:11:13Z

pkg/epp/requestcontrol/director.go

+	loggerDebug := log.FromContext(ctx).V(logutil.DEBUG)
+	for _, plugin := range d.requestControlPlugins.admitRequestPlugins {
+		loggerDebug.Info("Running AdmitRequest plugin", "plugin", plugin.TypedName())
+		if !plugin.Admit(ctx, request, pods) {


We should allow the plugin to return a string explaining the reason for rejection. We can then just treat the empty string as the allow mechanism (less opinionated on this part tho).

Updated, thanks!

kfswain · 2025-10-31T20:19:29Z

pkg/epp/datalayer/attributemap.go

 	"sync"
 )

+// TODO(rahulgurnani): Deprecate this AttributeMap in favor of AttributeMap added in scheduling layer.


I think we may want to invert this. The attribute map is a component of the data system, and so we should refactor the data layer attribute map to fit our needs. The scheduling layer is informed by the data layer, so our structures should reflect that

kfswain · 2025-10-31T20:30:52Z

pkg/epp/requestcontrol/director.go

-// It always returns the requestContext even in the error case, as the request context is used in error handling.
-func (d *Director) HandleRequest(ctx context.Context, reqCtx *handlers.RequestContext) (*handlers.RequestContext, error) {
-	logger := log.FromContext(ctx)
+// getInferenceObjective creates inferenceObjective based on reqCtx.


update comment to reflect the fetch, and it only creates (with a default prio) should no infObjective be found

pkg/epp/requestcontrol/director.go

kfswain · 2025-10-31T20:36:55Z

pkg/epp/requestcontrol/director.go

-	// Parse Request, Resolve Target Models, and Determine Parameters
+// resolveTargetModel is a helper that resolves targetModel
+// and updates the reqCtx and ctx.
+func (d *Director) resolveTargetModel(reqCtx *handlers.RequestContext) error {


Principle of Least Surprise - Recommend returning the reqCtx here, generally we want to avoid modifying param structures without explicitly returning. And if we do need to modify the object in mem, our goDocs should specify that we do.

I think we do need to modify the reqCtx, copying it may be an overkill. Also, the comments of the method state it updates reqCtx. Please let me know if its unclear presently. Thanks!

I think the intention was NOT to copy, but to explicitly return the pointer.

Updated the function. Thanks!

pkg/epp/requestcontrol/plugins.go

kfswain · 2025-10-31T20:45:42Z

pkg/epp/scheduling/types/types.go

+
+// Attributes provides a goroutine-safe implementation of AttributeMap.
+type Attributes struct {
+	data sync.Map // key: attribute name (string), value: attribute value (opaque, Cloneable)


Typically I am all for using prebuild libs to handle this type of complexity.

But since writes to specific attributes will lock the entire data object, we may have high lock contention here. Did we explore having a lock per attribute key?

That would let locks be at the granularity of a specific endpoint & a specific attribute, which should hopefully reduce lock contention & let our system be more performant.

Thanks for the suggestion. This attribute map is a per request copy where we take a snapshot of the attributes so that we can use them in the scheduling layer. Given the per request nature of the map, I think it won't have contention because it will take like t < microsecond to update the map. I think its reasonable to use sync map here.

Can we scale test to ensure we don't have any regression? We can consider baseline metrics what we have here: #1458

pkg/epp/scheduling/types/types.go

kfswain · 2025-10-31T20:52:08Z

pkg/epp/requestcontrol/director_test.go

 			targetModelName:      model,
 		},
+		{
+			name: "successful chat completions request with prepare data plugins",


Can we have a test cast that functionally validates the intent of the plugin?

Currently we have a thin set of tests that basically just validate that we can successfully call a noop function.

kfswain · 2025-10-31T21:24:37Z

pkg/epp/requestcontrol/director.go

+	snapshotOfCandidatePods := d.toSchedulerPodMetrics(candidatePods)
+
+	// Prepare per request data
+	d.runPrepareDataPlugins(ctx, reqCtx.SchedulingRequest, snapshotOfCandidatePods)


The design doc detailed adding timeouts and error limits to these prepare data plugins as to not become a blocker in the event that the plugins are performing poorly, allowing the system to remain resilient. Was this considered in this PR?

I am planning to do that in a follow up PR. Keeping it simpler in the initial change to ensure the SLO aware routing work is unblocked and can use this. Also I think it would be simpler to comprehend/debug if we don't have retires/timeouts. Added TODO for now. Hope that's okay. Thanks!

Keeping it simpler in the initial change to ensure the SLO aware routing work is unblocked and can use this.

SLO aware routing is actually the use case I am concerned about, as it has known performance issues. In an effort to defend the systems current operations we need timeouts & retires to safely implement this functionality

I see. Updated the methods to make prepare data execute parallely for all plugins with timeout and retries. PTAL. Thanks!

nirrozenbaum · 2025-11-02T13:37:00Z

pkg/epp/requestcontrol/director.go

-func (d *Director) HandleRequest(ctx context.Context, reqCtx *handlers.RequestContext) (*handlers.RequestContext, error) {
-	logger := log.FromContext(ctx)
+// getInferenceObjective fetches the inferenceObjective from the datastore otherwise creates a new one based on reqCtx.
+func (d *Director) getInferenceObjective(logger logr.Logger, reqCtx *handlers.RequestContext) *v1alpha2.InferenceObjective {


can we remove logger from the arguments?
This comes back to a discussion we had on a different PR. the convention we use in the project is contextual logging, we take logger from context and use it instead of passing around logger as arg.
e.g.,

log.FromContext(ctx).V(logutil.VERBOSE).Info(...)

nirrozenbaum · 2025-11-03T12:03:20Z

pkg/epp/scheduling/types/types.go

+// AttributeMap is used to store flexible metadata or traits
+// across different aspects of an inference server.
+// Stored values must be Cloneable. This is a per-request snapshot of the attributes.
+type AttributeMap interface {


why do we duplicate this code from datalayer?

nirrozenbaum · 2025-11-03T12:08:02Z

pkg/epp/requestcontrol/plugins.go

+type PrepareData interface {
+	plugins.Plugin
+	PrepareData(ctx context.Context, request *types.LLMRequest, pods []types.Pod)
+}


I'm bit confused by this PR.

we've added recently two interfaces for ProducerPlugin and ConsumerPlugin.
ProducerPlugin is a plugin that produces data.
why do we add another interface for a plugin the produces data?
shouldn't PrepareData function be on the same ProducerPlugin interface and validate that the produced data types are matching the types declaration?

Recommended reading: https://docs.google.com/document/d/1EQwXL2pCuUyM1B917FUgP_8pFS3VF8F_bUfjy8IE7gM/edit?tab=t.vmaefhinvkl5#heading=h.s9mr6kynb3ls

But the TL;DR is

Consume/Produce are expressions of intent only called on startup.

PrepareData is the actual generation of request specific data

right.
would it be correct to say that in order for a plugin to implement PrepareData it should always declare which data it would produce? meaning these two should always go together in the same interface?

if that’s not the case, can you give an example where one is used but the other isn’t?

Definitely, some data producers will still specify the data they produce, but not generate request specific data, and so do not need to implement PrepareData()

So while its true that an implementer of PrepareData() will always implement Produces(). Not all plugins that implement Produces() will implement PrepareData(). The Metrics scrapers are a great example of this

right. metrics collectors only produce, and not prepare per request.

an implementer of PrepareData() will always implement Produces()

I think the above should be forced, for example through the interface:

type PrepareData interface { plugins.ProducerPlugin // < THIS IS THE CHANGE PrepareData(ctx context.Context, request *types.LLMRequest, pods []types.Pod) }

as a side note - I think the naming selection is critical for readability.
PrepareData is too generic and doesn't say anything about a "per request".
it might be worth calling the function - PrepareRequestData (and rename the interface, e.g., DataProducer or something like that).

That all sounds reasonable to me

Updated. Thanks for the suggestions!

nirrozenbaum · 2025-11-03T12:12:04Z

pkg/epp/requestcontrol/plugins.go

+
+// AdmitRequest is called by the director after the PrepareData plugins and before scheduling.
+// AdmitRequest plugin is implemented by plugins for admission control. These plugins need to implement Admit method.
+type AdmitRequest interface {


naming selection should be changed IMO.
more specifically, struct/interface names should be nouns, e.g., AdmissionPlugin, while function names should be verbs based, e.g., AdmitRequest.
additionally, the documentation should explain what happens when one is using multiple admission plugins (e.g., the request should pass all admission checks).

Updated, thanks

nirrozenbaum · 2025-11-03T12:14:44Z

pkg/epp/requestcontrol/director.go

 		return reqCtx, errutil.Error{Code: errutil.ServiceUnavailable, Msg: "failed to find candidate pods for serving the request"}
 	}
-
+	// TODO(rahulgurnani/lukevandrie): Perhaps, refactor/implement Admit plugin for Admission control.


this comment is important. what is the relation between admission controller Admit and the admitRequest plugins?

++

This would be a break in contract of how flow control operates. This specific plugin is for request specific semantics. We have currently do not have Flow Control considering request specific semantics, and there hasnt been a proposal suggesting this change. I think we should remove this todo until we have strong reasoning to actually do this work.

my apologize, but I don't understand the intention of this new Admission plugin.
I thought we want to have admission check pluggable, but it seems now that we have two types of admission checks, with two different interfaces.
this seems wrong.

Removed the comment. Thanks for the catch!

pkg/epp/requestcontrol/director.go

nirrozenbaum · 2025-11-03T12:24:06Z

pkg/epp/requestcontrol/director.go

-	result, err := d.scheduler.Schedule(ctx, reqCtx.SchedulingRequest, d.toSchedulerPodMetrics(candidatePods))
+	// Prepare per request data
+	// TODO(rahulgurnani): Add retries and timeout in the preparedata step.
+	d.runPrepareDataPlugins(ctx, reqCtx.SchedulingRequest, snapshotOfCandidatePods)


why do we create snapshot of candidate pods?
we should work with candidatePods and create a snapshot only when calling the scheduler.
this is true for both prep data and admit request.
helper functions in the director should not rely on the internal scheduler representation of the endpoints.

both prep data and admit request are request specific, and so if we add request specific data to the shared endpoints that could risk data corruption.

Snapshotting before these steps ensures that this data lifecycle is only in the context it is consumed in.

the intention of converting the endpoints representation to scheduler internal structure was only for the purpose of sending it to the scheduler.
PodMetrics has MetricsState behind atomic pointer and reading the metrics is an atomic operation (read all metrics in one operation).
I must be missing something although I've read the proposal doc.

…in parallel

pkg/epp/requestcontrol/director.go

nirrozenbaum · 2025-11-04T08:22:04Z

pkg/epp/requestcontrol/request_control_config.go


 // Config provides a configuration for the requestcontrol plugins.
 type Config struct {
+	admitRequestPlugins      []AdmissionPlugin


Suggested change

admitRequestPlugins []AdmissionPlugin

admissionPlugins []AdmissionPlugin

nirrozenbaum · 2025-11-04T08:23:41Z

pkg/epp/requestcontrol/request_control_config.go

+// WithAdmitRequestPlugins sets the given plugins as the AdmitRequest plugins.
+func (c *Config) WithAdmitRequestPlugins(plugins ...AdmissionPlugin) *Config {
+	c.admitRequestPlugins = plugins
+	return c
+}


Suggested change

// WithAdmitRequestPlugins sets the given plugins as the AdmitRequest plugins.

func (c *Config) WithAdmitRequestPlugins(plugins ...AdmissionPlugin) *Config {

c.admitRequestPlugins = plugins

return c

}

// WithAdmissionPlugins sets the given plugins as the AdmissionPlugin plugins.

func (c *Config) WithAdmissionPlugins(plugins ...AdmissionPlugin) *Config {

c.admissionPlugins = plugins

return c

}

nirrozenbaum · 2025-11-04T08:24:01Z

pkg/epp/requestcontrol/request_control_config.go

 	return c
 }

+// WithPrepareDataPlugins sets the given plugins as the PrepareData plugins.


Suggested change

// WithPrepareDataPlugins sets the given plugins as the PrepareData plugins.

// WithDataProducers sets the given plugins as the DataProducer plugins.

nirrozenbaum · 2025-11-04T08:35:58Z

pkg/epp/requestcontrol/director.go


-	result, err := d.scheduler.Schedule(ctx, reqCtx.SchedulingRequest, d.toSchedulerPodMetrics(candidatePods))
+	// Run admit request plugins
+	if !d.runAdmitRequestPlugins(ctx, reqCtx.SchedulingRequest, snapshotOfCandidatePods) {


can you give an example of how AdmissionPlugin is using candidate pods?
IMO admission check should include only the request (including its metadata like headers) and the system state (like saturation or other system metrics).

@kfswain I really think we should think carefully on the interfaces and align on them.
it doesn't make sense to me to pass the candidate pods to admission check.

I would be happy to get an example for an admission check that is per request data and depends on the candidate pods (other than saturation that is checked separately).
I'm a bit concerned about over complication here, and unless there is a real use case for that we should probably wait with this addition (talking only about admission, data producer/consumer is clearly needed).

separately, I think there is a missing struct, like RequestState, that conceptually should be similar to PluginState we have today. RequestState should be created at the beginning of the request, and passed around for RequestDataProducers to be filled, and later on to ConsumerPlugins (probably instead of CycleState).

I was mostly following recommendation in https://docs.google.com/document/d/1EQwXL2pCuUyM1B917FUgP_8pFS3VF8F_bUfjy8IE7gM/edit?tab=t.vmaefhinvkl5#heading=h.5qzcvbilfn6d

I think we would be adding the first AdmitRequest plugin with latency predictor.

@kfswain please keep me honest here. Thanks!

k8s-ci-robot · 2025-11-04T23:03:24Z

@rahulgurnani: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-gateway-api-inference-extension-verify-main	`a7354ca`	link	true	`/test pull-gateway-api-inference-extension-verify-main`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

rahulgurnani added 2 commits October 30, 2025 17:47

Refactor director to split into smaller functions

9cfe65f

Add AdmitRequest and PrepareData plugins

dfea9cf

k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 31, 2025

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 31, 2025

k8s-ci-robot requested review from ahg-g and elevran October 31, 2025 16:22

k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 31, 2025

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 31, 2025

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 31, 2025

kfswain reviewed Oct 31, 2025

View reviewed changes

Add unit tests and comments

bf49fb5

rahulgurnani force-pushed the preparedata-changes branch from 63208cf to bf49fb5 Compare October 31, 2025 22:56

Add comments

d6c81d9

nirrozenbaum reviewed Nov 2, 2025

View reviewed changes

rahulgurnani force-pushed the preparedata-changes branch 2 times, most recently from e9f0b7f to 3e16069 Compare November 3, 2025 03:05

Address review comments

e3021c7

rahulgurnani force-pushed the preparedata-changes branch from 3e16069 to e3021c7 Compare November 3, 2025 03:13

nirrozenbaum reviewed Nov 3, 2025

View reviewed changes

pkg/epp/requestcontrol/director.go Show resolved Hide resolved

nirrozenbaum reviewed Nov 3, 2025

View reviewed changes

Make PrepareData step time bound and execute all preparedata plugins …

9d38b44

…in parallel

nirrozenbaum reviewed Nov 4, 2025

View reviewed changes

pkg/epp/requestcontrol/director.go Outdated Show resolved Hide resolved

nirrozenbaum reviewed Nov 4, 2025

View reviewed changes

rahulgurnani changed the title ~~Add PrepareData and AdmitRequest plugins~~ Add DataProducer (PrepareData) and Admission control plugins Nov 4, 2025

Update interface names based on suggestions

a7354ca

rahulgurnani force-pushed the preparedata-changes branch from 6b5e3c4 to a7354ca Compare November 4, 2025 22:55

	admitRequestPlugins []AdmissionPlugin
	admissionPlugins []AdmissionPlugin

	// WithPrepareDataPlugins sets the given plugins as the PrepareData plugins.
	// WithDataProducers sets the given plugins as the DataProducer plugins.

Add DataProducer (PrepareData) and Admission control plugins #1796

Are you sure you want to change the base?

Add DataProducer (PrepareData) and Admission control plugins #1796

Conversation

rahulgurnani commented Oct 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

netlify bot commented Oct 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for gateway-api-inference-extension ready!

Uh oh!

k8s-ci-robot commented Oct 31, 2025

Uh oh!

k8s-ci-robot commented Oct 31, 2025

Uh oh!

kfswain commented Oct 31, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rahulgurnani Nov 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nirrozenbaum Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kfswain Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nirrozenbaum Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

rahulgurnani commented Oct 31, 2025 •

edited

Loading

netlify bot commented Oct 31, 2025 •

edited

Loading

rahulgurnani Nov 4, 2025 •

edited

Loading

nirrozenbaum Nov 3, 2025 •

edited

Loading

kfswain Nov 3, 2025 •

edited

Loading

nirrozenbaum Nov 3, 2025 •

edited

Loading

nirrozenbaum Nov 3, 2025 •

edited

Loading

rahulgurnani Nov 3, 2025 •

edited

Loading

nirrozenbaum Nov 3, 2025 •

edited

Loading

kfswain Nov 3, 2025 •

edited

Loading