This document provides an overview of Feast's feature serving architecture, which enables low-latency feature retrieval for online inference and batch scoring. Feature serving is the layer that reads features from online and offline stores and delivers them to ML models in production environments.
For detailed information about specific server implementations, see the dedicated pages for the Python, Go, and Java feature servers.
For information about the underlying stores that feature servers read from, see Online Stores and Offline Stores.
Feast's feature serving layer provides multiple mechanisms for retrieving features at inference time. The system supports three main patterns: embedding the Python SDK directly in an application, calling a deployed feature server over HTTP or gRPC, and running managed feature-serving deployments on Kubernetes.
The serving layer is implemented in three languages (Python, Go, Java), each optimized for different deployment scenarios and ecosystems.
Multi-Language Feature Server Architecture
Sources: docs/SUMMARY.md38 infra/charts/feast/README.md1-82 README.md225-232
| Server | Language | Primary Use Case | Key Features |
|---|---|---|---|
| Python Feature Server | Python | Feature-rich deployments, experimentation | Native transformations, full SDK access, FastAPI |
| Go Feature Server | Go | High-performance production | Low latency, minimal memory, efficient concurrency |
| Java Feature Server | Java/Spring Boot | JVM ecosystems, enterprise | Spring Boot integration, gRPC/HTTP, Helm charts |
The Python feature server is implemented using FastAPI and supports both HTTP REST and gRPC protocols. It provides the most complete feature set including native support for on-demand transformations.
Key Components:
- `feast serve` CLI command to start the server
- `FeatureStore.get_online_features()` for direct SDK access

Deployment:
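The server is started with `feast serve`; once running, its HTTP surface can be exercised with any client. The sketch below assumes the default port (6566), the `/get-online-features` endpoint, and the feature view and entity names from the Feast quickstart:

```python
# A hedged sketch: querying a locally running Python feature server.
# Port, endpoint, and feature names are assumptions based on the quickstart;
# adjust them to your own feature repository.
import requests

payload = {
    # Fully qualified feature references: <feature_view>:<feature>
    "features": [
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
    ],
    # One list of entity values per join key
    "entities": {"driver_id": [1001, 1002]},
}

resp = requests.post(
    "http://localhost:6566/get-online-features",
    json=payload,
    timeout=5,
)
resp.raise_for_status()
print(resp.json())
```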
Sources: docs/getting-started/quickstart.md136-164 docs/how-to-guides/running-feast-in-production.md185-204 README.md136-164
The Go feature server is optimized for high-throughput, low-latency scenarios. It delegates transformation logic to the Python transformation service while handling feature retrieval directly.
Architecture:
Go Feature Server Request Flow
Configuration:
The Go server requires transformation_service_endpoint in feature_store.yaml when on-demand features are used.
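As a sketch of where that setting lives, the fragment below shows the general shape of a `feature_store.yaml` for this setup; only `transformation_service_endpoint` comes from this page, and the endpoint value and remaining keys are placeholders:

```python
# Illustrative only: a feature_store.yaml fragment for the Go feature server
# with on-demand feature views enabled. Values are placeholders, not defaults.
GO_FEATURE_STORE_YAML = """
project: my_project
provider: local
registry: data/registry.db
online_store:
  type: redis
  connection_string: localhost:6379
# Lets the Go server delegate on-demand transformations to the Python
# transformation service (host:port value is an assumption).
transformation_service_endpoint: localhost:6569
"""
```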
Sources: infra/charts/feast/charts/feature-server/values.yaml13-16 docs/reference/registries/metadata.md28-29 CHANGELOG.md25
The Java feature server is built on Spring Boot and provides a production-ready gRPC and HTTP interface. It's designed for deployment in JVM-based enterprise environments.
Deployment Configuration:
| Configuration | Description | Default |
|---|---|---|
| application.yaml | Default Spring Boot config | Included |
| application-override.yaml | Custom overrides (ConfigMap) | User-provided |
| application-secret.yaml | Secrets (Secret) | User-provided |
| javaOpts | JVM options | None |
Helm Chart Structure:
- `infra/charts/feast/charts/feature-server/charts/transformation-service/`

Sources: infra/charts/feast/README.md1-82 infra/charts/feast/charts/feature-server/README.md1-68 java/pom.xml30-35
| Server | HTTP REST | gRPC | Protocol Buffers |
|---|---|---|---|
| Python | ✓ | ✓ | ✓ |
| Go | ✓ | ✓ | ✓ |
| Java | ✓ | ✓ | ✓ |
Online Feature Retrieval Sequence
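For the in-process variant of this sequence, the Python SDK performs the same lookup against the online store directly; a minimal sketch (feature and entity names follow the quickstart example):

```python
# Direct SDK retrieval: no feature server involved; the SDK reads the
# online store configured in feature_store.yaml.
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # directory containing feature_store.yaml

online_features = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
    entity_rows=[{"driver_id": 1001}, {"driver_id": 1002}],
).to_dict()

print(online_features)
```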
Sources: docs/getting-started/quickstart.md136-164 docs/how-to-guides/running-feast-in-production.md181-204
The Transformation Service is a separate Python gRPC server that handles on-demand feature transformations. This separation allows non-Python feature servers (Go, Java) to leverage Python-based transformation logic.
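The transformation logic that this service executes is authored as ordinary Python on-demand feature views. A self-contained sketch using a request-time source (names are illustrative, and the exact decorator signature may vary by Feast version):

```python
# Illustrative on-demand feature view: the kind of Python transformation the
# transformation service runs on behalf of the Go and Java feature servers.
import pandas as pd

from feast import Field, RequestSource, on_demand_feature_view
from feast.types import Float64, Int64

# Values supplied by the caller at request time.
vals_to_add = RequestSource(
    name="vals_to_add",
    schema=[Field(name="val_to_add", dtype=Int64)],
)

@on_demand_feature_view(
    sources=[vals_to_add],
    schema=[Field(name="val_plus_one", dtype=Float64)],
)
def add_one(inputs: pd.DataFrame) -> pd.DataFrame:
    # Compute the derived feature from the request-time input.
    out = pd.DataFrame()
    out["val_plus_one"] = inputs["val_to_add"] + 1.0
    return out
```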
Transformation Service Architecture
Deployment: The transformation service is deployed alongside feature servers using Helm:
Service Configuration:
Sources: infra/charts/feast/requirements.yaml7-11 infra/charts/feast/charts/transformation-service/values.yaml1-37 docs/reference/registries/metadata.md27-29
Use Case: Python microservices, notebooks, local development
Advantages:
Limitations:
Sources: docs/getting-started/quickstart.md136-164 README.md136-164
Use Case: Polyglot services, centralized serving, load balancing
Advantages:
Limitations:
Sources: docs/how-to-guides/running-feast-in-production.md185-204
Use Case: Production deployments, high availability, auto-scaling
Deployment Options:
Sources: infra/feast-operator/README.md1-166 docs/how-to-guides/feast-on-kubernetes.md1-72 infra/feast-operator/config/samples/kustomization.yaml1-7
The Offline Feature Server provides Arrow Flight-based access to historical features from offline stores. It supports the same query patterns as the Python SDK but over a network protocol.
Offline Feature Server Architecture
Supported Operations:
- `get_historical_features()` - Point-in-time correct training data
- `pull_all_from_table_or_query()` - Full table retrieval
- `pull_latest_from_table_or_query()` - Latest values retrieval
- `offline_write_batch()` - Write features to offline store

CLI Usage:
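The CLI invocation itself is not reproduced here (the offline server is typically started with `feast serve_offline`). On the client side, the call is the same whether the offline store is local or remote; a sketch with placeholder feature names:

```python
# Client-side sketch: point-in-time correct retrieval via the SDK. Whether
# this runs locally or against a remote offline feature server is decided by
# the offline_store section of feature_store.yaml.
import pandas as pd

from feast import FeatureStore

store = FeatureStore(repo_path=".")

entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002],
        "event_timestamp": pd.to_datetime(["2024-01-01", "2024-01-02"], utc=True),
    }
)

training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
).to_df()

print(training_df.head())
```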
Kubernetes Deployment:
Sources: docs/reference/feature-servers/offline-feature-server.md1-60 README.md231
Single-Service Deployment
Configuration Example:
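The original example is not captured here; as a sketch, a single-service configuration reduces to one `feature_store.yaml` read by every replica (values are placeholders):

```python
# Illustrative only: a minimal feature_store.yaml for a single feature server
# deployment backed by Redis, with a shared registry in object storage.
SINGLE_SERVICE_FEATURE_STORE_YAML = """
project: my_project
provider: local
registry: s3://my-bucket/registry.db
online_store:
  type: redis
  connection_string: redis.internal:6379
"""
```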
Sources: docs/how-to-guides/running-feast-in-production.md209-221
Multi-Service Deployment with Transformations
Helm Deployment:
Sources: infra/charts/feast/README.md13-58 infra/charts/feast/charts/feature-server/values.yaml1-141
The Feast Operator manages the full lifecycle of Feast deployments using Kubernetes Custom Resources.
FeatureStore CR Example:
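As a sketch of the resource shape (field names such as `feastProject` follow the operator samples but may differ by operator version):

```python
# Illustrative only: a minimal FeatureStore custom resource for the Feast
# Operator. apiVersion and spec fields are assumptions, not verified defaults.
FEATURE_STORE_CR = """
apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: example
spec:
  feastProject: my_project
"""
```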
Components Created:
Sources: infra/feast-operator/README.md1-166 docs/how-to-guides/feast-on-kubernetes.md14-66
The Registry Server exposes the Feast registry as a gRPC/REST service, enabling remote access to feature metadata. This is useful when the registry is stored in a location not directly accessible to all services.
Registry Server Architecture
Configuration:
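A sketch of a client configuration pointing the SDK at a remote registry server; the `registry_type: remote` form and the port are assumptions and may differ by Feast version:

```python
# Illustrative only: client-side feature_store.yaml resolving feature metadata
# from a registry server instead of a local or object-store registry file.
REMOTE_REGISTRY_FEATURE_STORE_YAML = """
project: my_project
provider: local
registry:
  registry_type: remote
  path: registry.internal:6570
online_store:
  type: redis
  connection_string: redis.internal:6379
"""
```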
Sources: README.md232 docs/roadmap.md67
Feature servers enforce Role-Based Access Control (RBAC) when configured. Permissions are validated at request time.
| Endpoint | Resource Type | Permission | Description |
|---|---|---|---|
| get_online_features | FeatureView | Read Online | Retrieve online features |
| push | FeatureView | Write Online | Push features to online store |
| get_historical_features | FeatureView | Read Offline | Retrieve historical features |
| materialize | FeatureView | Write Online | Materialize features |
Configuration Example:
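The original example is not captured here; as a hedged sketch, permissions are declared as Python objects in the feature repository (class names and import paths follow the Feast permissions model but may vary by version):

```python
# Illustrative only: grant read-online access on feature views to callers
# holding the "reader" role. Import paths and enum names are assumptions.
from feast import FeatureView
from feast.permissions.action import AuthzedAction
from feast.permissions.permission import Permission
from feast.permissions.policy import RoleBasedPolicy

online_reader = Permission(
    name="online-reader",
    types=[FeatureView],
    policy=RoleBasedPolicy(roles=["reader"]),
    actions=[AuthzedAction.READ_ONLINE],
)
```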
Sources: docs/reference/feature-servers/offline-feature-server.md44-60 CHANGELOG.md11-12
Feature servers can be scaled horizontally by increasing replica count. Key considerations:
- Registry cache TTL (`cache_ttl_seconds`)

Scaling Commands:
Sources: infra/charts/feast/charts/feature-server/values.yaml1-2 docs/how-to-guides/running-feast-in-production.md1-240
Registry Caching:
Connection Pooling:
JVM Tuning (Java Server):
Sources: infra/charts/feast/charts/feature-server/values.yaml34-35 docs/how-to-guides/running-feast-in-production.md18
Feature servers expose metrics for monitoring:
Go Server OTEL Integration: The Go feature server includes OpenTelemetry instrumentation for distributed tracing.
Sources: CHANGELOG.md103 docs/reference/registries/metadata.md13
Liveness Probe:
Readiness Probe:
Sources: infra/charts/feast/charts/feature-server/values.yaml43-69
Feature servers support environment variable interpolation in feature_store.yaml:
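A sketch of what that looks like, assuming the `${ENV_VAR}` substitution syntax:

```python
# Illustrative only: feature_store.yaml pulling connection details from
# environment variables so one file serves dev, staging, and production.
ENV_INTERPOLATED_FEATURE_STORE_YAML = """
project: my_project
provider: local
registry: ${REGISTRY_PATH}
online_store:
  type: redis
  connection_string: ${REDIS_HOST}:${REDIS_PORT}
"""
```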
This enables the same configuration to work across multiple environments (dev, staging, production).
Sources: docs/how-to-guides/running-feast-in-production.md209-221