Add Initial Gateway API Inference Extension Support #10411

@danehans

Description

Gloo Edge Product: Open Source

Gloo Edge Version: main

Is your feature request related to a problem? Please describe.

Gateway API Inference Extension (formerly llm-instance-gateway) is a project that originated from wg-serving and is sponsored by SIG Network. The project provides APIs, a load balancing algorithm, ext-proc code, and controllers to support advanced routing of LLM traffic.

Describe the solution you'd like

Add the following support:

  • Create an enhancement proposal that provides the API design and implementation details: Adds EP-10411: Gateway API Inference Extension Support #10420.
  • Update the GatewayClassParameters API to surface user-facing configuration for supported inference extensions: Adds InferenceExtension to GatewayParameters #10601. Not currently needed, since the feature is auto-enabled when the inference extension CRDs are present in the cluster.
  • Update the configuration API to enable/disable the gateway-api-inference-extension feature. Not currently needed, for the same reason.
  • Update Helm charts to install k8sgateway with the gateway-api-inference-extension feature based on the provided configuration. Not currently needed, for the same reason.
  • Add gateway-api-inference-extension as a supported extension.
  • Add controllers that reconcile gateway-api-inference-extension custom resources, e.g. InferencePool. The controller should be optional, i.e. only run if the configuration option is enabled and the gateway-api-inference-extension CRDs exist.
  • Update RBAC rules to allow gateway-api-inference-extension controllers to get, list, watch, etc. gateway-api-inference-extension custom resources.
  • Update the deployer pkg to manage the required gateway-api-inference-extension resources, e.g. Deployment, to run the ext-proc server.
  • Add InferencePool as a supported HTTPRoute backend reference.
  • Update the translator pkg to translate HTTPRoutes referencing an InferencePool resource.
  • Update the proxy_syncer pkg to translate gateway-api-inference-extension CRs into Gloo Proxies and sync the proxy client with the newly translated proxies.
  • Update the reporter pkg to support reporting gateway-api-inference-extension CRD status.
  • Add initial e2e tests for this feature.
  • Update CI to run e2e tests.
  • Add initial user docs: InferenceExtension in kgateway kgateway.dev#70. Owner: @artberger.
  • Add failureMode support as a follow-up to this issue (#10411).
  • Update the deployer to support an HTTPRoute switching between Service and InferencePool backendRefs (xref).
  • Improve EPP RBAC based on EPP: Use Dedicated Service Account kubernetes-sigs/gateway-api-inference-extension#224. Either convert the ClusterRole and ClusterRoleBinding into a Role and RoleBinding, or have the first EPP create a common CR/CRB if one does not exist; additional InferencePools would then add their ServiceAccount to the subjects of the common ClusterRoleBinding and remove their entry upon InferencePool deletion. This added complexity may not be worth the benefit of having a common CR/CRB for all EPPs.
  • Track Extension Auto-Provisioning kubernetes-sigs/gateway-api-inference-extension#507 for the status of auto-provisioning InferencePool infra and adjust the deployer accordingly.
  • For multiple backends on one route, investigate using RouteAction_WeightedClusters, which may either keep the current ExtProcPerRoute approach or attach the ExtProc as an upstream Cluster filter (xref).
  • Investigate reusing the standard EDS cluster created for the ext-proc service instead of creating a separate one (xref). Consider changing the model from one endpoint picker per upstream to one per Gateway.
  • Investigate whether to remove the finalizer from the InferencePool controller.
  • The usedPools field of endpointPickerPass should be map[string]map[types.NamespacedName]*ir.InferencePool to support per-filter-chain (xref).
  • Implement Override Host LB policy (Envoy PR and Infer Ext PR).
  • Run benchmarks and publish results.
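
To make the HTTPRoute backend-reference item above concrete, here is a minimal sketch of an HTTPRoute whose backendRef points at an InferencePool instead of a Service. The route, gateway, and pool names are made up for illustration; the group and kind follow the Gateway API Inference Extension CRDs:

```yaml
# Hypothetical example: route LLM traffic to an InferencePool backend.
# Names ("inference-gateway", "llama-pool") are illustrative only.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
spec:
  parentRefs:
    - name: inference-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - group: inference.networking.x-k8s.io
          kind: InferencePool
          name: llama-pool
```

The translator pkg would recognize this backendRef group/kind and wire the route to the endpoint-picker ext-proc flow rather than a plain Service cluster.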
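
For the controller and deployer items, a sketch of the custom resource being reconciled may help. Field names follow the project's v1alpha2 InferencePool API as currently understood; the selector labels, port, and EPP service name are illustrative assumptions:

```yaml
# Sketch of an InferencePool; values are hypothetical.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: llama-pool
spec:
  # Selects the model-server pods that belong to this pool.
  selector:
    app: llama-server
  targetPortNumber: 8000
  # The endpoint picker (EPP) ext-proc service for this pool,
  # which the deployer would manage (e.g. its Deployment).
  extensionRef:
    name: llama-pool-epp
```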
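
For the RBAC item, a sketch of the additional rules the controllers would need. The exact resource list and verbs depend on the final design; this assumes the inference.networking.x-k8s.io API group and a ClusterRole named for illustration:

```yaml
# Hypothetical RBAC fragment for the inference extension controllers.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kgateway-inference-extension
rules:
  # Read access to the inference extension custom resources.
  - apiGroups: ["inference.networking.x-k8s.io"]
    resources: ["inferencepools", "inferencemodels"]
    verbs: ["get", "list", "watch"]
  # Status reporting on reconciled resources (see the reporter pkg item).
  - apiGroups: ["inference.networking.x-k8s.io"]
    resources: ["inferencepools/status", "inferencemodels/status"]
    verbs: ["update", "patch"]
```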

Describe alternatives you've considered

Do not support the Gateway API Inference Extension project.

Additional Context

No response

Metadata

Labels

  • Area: K8S Gateway API (Issues related to the Kubernetes Gateway API)
  • Prioritized (Indicating issue prioritized to be worked on in RFE stream)
  • Priority: High (Required in next 3 months to make progress, bugs that affect multiple users, or very bad UX)
  • Size: XL (>2 weeks)
  • Type: Enhancement (New feature or request)
  • scope/2.0