Skip to content

Conversation

@ntkathole
Copy link
Member

What this PR does / why we need it:

This PR adds !isRestService check in the grpcEnabled && restEnabled branch to prevent setting the annotation on the REST service.

When both gRPC and REST APIs are enabled for the registry service with OpenShift TLS, the TLS certificate annotation (service.beta.openshift.io/serving-cert-secret-name) was being set on both the gRPC and REST services. This caused OpenShift's service serving certificate controller to potentially create a certificate with the REST service hostname as the Common Name (CN) instead of the gRPC service hostname.

This resulted in gRPC client connections failing with the error:
InactiveRpcError: Peer name feast-registry.tests.svc.cluster.local is not in peer certificate

The error occurred because:

  • The gRPC client connects using the gRPC service hostname (e.g., feast-registry.tests.svc.cluster.local)
  • But the certificate CN was set to the REST service hostname (e.g., feast-registry-rest.tests.svc.cluster.local)
  • Even though both hostnames were in the SANs, gRPC validates against the CN

Root Cause

The setService() function is called twice when both services are enabled:

  1. Once for the gRPC service (isRestService=false) via createService()
  2. Once for the REST service (isRestService=true) via createRestService()

The original code set the TLS annotation on both services when grpcEnabled && restEnabled was true, without checking which service was being processed. OpenShift's certificate controller creates certificates based on the service where the annotation is set, and if the REST service annotation was processed first or separately, it would create a certificate with the wrong CN.

So, Modified the setService() function to only set the TLS annotation on the gRPC service when both services are enabled by adding a !isRestService check. This ensures:

  1. OpenShift creates the certificate with the gRPC service name as the CN (correct for gRPC connections)
  2. Both hostnames are included as Subject Alternative Names (SANs) via the serving-cert-sans annotation
  3. The REST service can still use the same certificate (mounted in the pod from the gRPC service secret)

@ntkathole ntkathole requested a review from a team as a code owner November 6, 2025 13:43
@ntkathole ntkathole self-assigned this Nov 6, 2025
@ntkathole ntkathole marked this pull request as draft November 6, 2025 13:44
@ntkathole ntkathole marked this pull request as ready for review November 13, 2025 05:32
@ntkathole ntkathole force-pushed the fix_tls_san branch 2 times, most recently from 86ac153 to be7c6e6 Compare November 13, 2025 06:22
Copy link
Contributor

@Srihari1192 Srihari1192 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested feature store instance creation with multiple configurations by enabling grpc: true. Verified that Feast Workbench connections and Feast method invocations are working as expected.

Signed-off-by: ntkathole <nikhilkathole2683@gmail.com>
@ntkathole ntkathole merged commit 75d13db into feast-dev:master Nov 17, 2025
23 of 25 checks passed
franciscojavierarceo pushed a commit that referenced this pull request Dec 16, 2025
# [0.58.0](v0.57.0...v0.58.0) (2025-12-16)

### Bug Fixes

* Add java proto ([#5719](#5719)) ([fc3ea20](fc3ea20))
* Add possibility to force full features names for materialize ops ([#5728](#5728)) ([55c9c36](55c9c36))
* Fixed file registry cache sync ([09505d4](09505d4))
* Handle hyphon in sqlite project name ([#5575](#5575)) ([#5749](#5749)) ([b8346ff](b8346ff))
* Pinned substrait to fix protobuf issue ([d0ef4da](d0ef4da))
* Set TLS certificate annotation only on gRPC service ([#5715](#5715)) ([75d13db](75d13db))
* SQLite online store deletes tables from other projects in shared registry scenarios ([#5766](#5766)) ([fabce76](fabce76))
* Validate not existing entity join keys for preventing panic ([0b93559](0b93559))

### Features

* Add annotations for pod templates ([534e647](534e647))
* Add Pytorch template ([#5780](#5780)) ([6afd353](6afd353))
* Add support for extra options for stream source ([#5618](#5618)) ([18956c2](18956c2))
* Added matched_tag field search api results with fuzzy search capabilities ([#5769](#5769)) ([4a9ffae](4a9ffae))
* Added support for enabling metrics in Feast Operator ([#5317](#5317)) ([#5748](#5748)) ([a8498c2](a8498c2))
* Configure CacheTTLSecondscache,CacheMode for file-based registry in Feast Operator([#5708](#5708)) ([#5744](#5744)) ([f25f83b](f25f83b))
* Implemented Tiling Support for Time-Windowed Aggregations ([#5724](#5724)) ([7a99166](7a99166))
* Offline Store historical features retrieval based on datetime range for spark ([#5720](#5720)) ([27ec8ec](27ec8ec))
* Offline Store historical features retrieval based on datetime range in dask ([#5717](#5717)) ([a16582a](a16582a))
* Production ready feast operator with v1 apiversion ([#5771](#5771)) ([49359c6](49359c6))
* Support for Map value data type ([#5768](#5768)) ([#5772](#5772)) ([b99a8a9](b99a8a9))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants