This document describes Feast's feature transformation system, which allows users to define derived features by applying custom functions to existing features and request-time data. Feature transformations in Feast are primarily implemented through OnDemandFeatureView (ODFV), which enables computation of new features at request time or during materialization.
For information about batch transformations during materialization, see Materialization and Compute Engines. For data sources and how they provide raw features, see Data Sources.
The OnDemandFeatureView class serves as the primary mechanism for defining feature transformations in Feast. It wraps a user-defined function (UDF) and specifies its input sources and output schema.
Key Classes and Attributes:
| Attribute | Type | Purpose |
|---|---|---|
| feature_transformation | Transformation | The transformation logic to execute |
| mode | str | Execution mode: "pandas", "python", or "substrait" |
| sources | List[OnDemandSourceType] | Input sources (FeatureViews, RequestSources) |
| schema | List[Field] | Output feature schema |
| write_to_online_store | bool | Whether to materialize to the online store |
| singleton | bool | Row-by-row execution mode (Python only) |
Pandas mode is the default. It operates on pandas DataFrames and is best for transformations that leverage pandas vectorized operations.
Implementation:
- Transformation class: PandasTransformation
- Input: pd.DataFrame with columns for each source feature
- Output: pd.DataFrame with columns for each output feature

Example Pattern:
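As a minimal sketch of what a pandas-mode UDF body might look like, the function below takes a DataFrame with one column per source feature and returns a DataFrame with one column per output feature. The function name and the column names (conv_rate, acc_rate, val_to_add) are hypothetical, and the Feast registration step is left out so that the snippet runs with pandas alone.

```python
import pandas as pd


def conv_rate_features(inputs: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical pandas-mode UDF: one output column per derived feature."""
    out = pd.DataFrame()
    # Vectorized arithmetic over whole columns; no Python-level row loop.
    out["conv_rate_plus_val"] = inputs["conv_rate"] + inputs["val_to_add"]
    out["conv_to_acc_ratio"] = inputs["conv_rate"] / inputs["acc_rate"]
    return out


# Illustrative input: two stored features plus one request-time value.
batch = pd.DataFrame(
    {
        "conv_rate": [0.5, 0.25],
        "acc_rate": [0.5, 0.5],
        "val_to_add": [1, 2],
    }
)
result = conv_rate_features(batch)
```

The output frame contains only the derived columns, matching the idea that the schema describes output features rather than inputs.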
Python mode is a native Python mode that operates on dictionaries of lists. It supports both batch and singleton execution.
Implementation:
- Transformation class: PythonTransformation
- Data format: dict[str, list[Any]] for batch, dict[str, Any] for singleton
- Singleton execution: singleton=True

Singleton Constraint: The singleton mode is only supported with Python mode, as it requires native Python types for row-level processing.
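The difference between the two data shapes can be sketched in plain Python. Both functions below are hypothetical UDF bodies with illustrative feature names. The first follows the batch shape (each key maps to a list) and the second follows the singleton shape (each key maps to one value).

```python
from typing import Any


def add_value_batch(inputs: dict[str, list[Any]]) -> dict[str, list[Any]]:
    """Batch shape: each key maps to a list with one element per row."""
    return {
        "conv_rate_plus_val": [
            rate + val
            for rate, val in zip(inputs["conv_rate"], inputs["val_to_add"])
        ]
    }


def add_value_singleton(inputs: dict[str, Any]) -> dict[str, Any]:
    """Singleton shape: each key maps to a single scalar for one row."""
    return {"conv_rate_plus_val": inputs["conv_rate"] + inputs["val_to_add"]}


batch_out = add_value_batch({"conv_rate": [0.5, 0.25], "val_to_add": [1, 2]})
single_out = add_value_singleton({"conv_rate": 0.5, "val_to_add": 1})
```

The singleton form avoids list handling entirely, which is one plausible reason it depends on native Python types.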
Substrait mode is a cross-language transformation mode that uses Substrait for portable execution plans.
Implementation:
- Transformation class: SubstraitTransformation
FeatureViews provide batch or stream features as inputs to transformations. When a FeatureView is specified as a source, its projection is automatically extracted.
RequestSources define features that are provided at request time rather than from stored data. These are commonly used for contextual features like user inputs or request metadata.
During get_historical_features, transformations are applied after point-in-time joins complete.
Key Method:
- OnDemandFeatureView.get_requested_odfvs()
- RetrievalJob.to_df() after offline store retrieval
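The ordering described here (join first, transform second) can be illustrated with a small pure-Python sketch. It does not use Feast's offline store or RetrievalJob. The helper point_in_time_lookup, the transformation double_conv_rate, and all data are hypothetical stand-ins.

```python
from datetime import datetime
from typing import Any, Optional


def point_in_time_lookup(
    history: list[dict[str, Any]], driver_id: int, as_of: datetime
) -> Optional[dict[str, Any]]:
    """Return the latest stored row for driver_id at or before as_of, if any."""
    candidates = [
        row
        for row in history
        if row["driver_id"] == driver_id and row["event_timestamp"] <= as_of
    ]
    return max(candidates, key=lambda row: row["event_timestamp"], default=None)


def double_conv_rate(inputs: dict[str, list[Any]]) -> dict[str, list[Any]]:
    """Hypothetical on-demand transformation over the joined columns."""
    return {"conv_rate_x2": [rate * 2 for rate in inputs["conv_rate"]]}


# Illustrative stored feature rows (stand-in for an offline store).
history = [
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 1), "conv_rate": 0.25},
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 3), "conv_rate": 0.5},
]
# Entity rows to enrich, each with its own event timestamp.
entity_rows = [
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 2)},
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 4)},
]

# Step 1: the point-in-time join picks the value known at each timestamp.
joined = [
    point_in_time_lookup(history, row["driver_id"], row["event_timestamp"])
    for row in entity_rows
]
# Step 2: only after the join completes is the transformation applied.
columns = {"conv_rate": [row["conv_rate"] for row in joined]}
transformed = double_conv_rate(columns)
```

Because the transformation only sees joined values, it never observes feature data from after an entity row's timestamp.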
For online serving, transformations execute synchronously after fetching base features from the online store.
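A rough sketch of that request path, using a plain dictionary in place of an online store, might look like the following. ONLINE_STORE, serve, and the feature names are hypothetical. The real serving code path is not reproduced here.

```python
from typing import Any

# Stand-in for an online store: entity key -> latest feature values.
ONLINE_STORE: dict[int, dict[str, Any]] = {
    1001: {"conv_rate": 0.5, "acc_rate": 0.75},
}


def conv_rate_plus_val(inputs: dict[str, list[Any]]) -> dict[str, list[Any]]:
    """Hypothetical python-mode transformation over stored and request data."""
    return {
        "conv_rate_plus_val": [
            rate + val
            for rate, val in zip(inputs["conv_rate"], inputs["val_to_add"])
        ]
    }


def serve(entity_rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Fetch base features, merge request-time values, then transform in-line."""
    columns: dict[str, list[Any]] = {"conv_rate": [], "val_to_add": []}
    for row in entity_rows:
        stored = ONLINE_STORE[row["driver_id"]]  # 1. read base features
        columns["conv_rate"].append(stored["conv_rate"])
        columns["val_to_add"].append(row["val_to_add"])  # 2. request-time input
    # 3. The transformation runs synchronously before the response is returned.
    return {**columns, **conv_rate_plus_val(columns)}


response = serve([{"driver_id": 1001, "val_to_add": 2}])
```

Since step 3 sits on the request path, the cost of the UDF adds directly to serving latency in this model.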
OnDemandFeatureViews can be materialized to the online store by setting write_to_online_store=True. This pre-computes transformations during batch materialization rather than at serving time.
Requirements:
- write_to_online_store=True must be set
- Entities must be defined (not DUMMY_ENTITY)
- online=True
When materializing ODFVs with write_to_online_store=True:
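One way to picture the write-time path is the sketch below, where the transformation runs once while writing and the read path only returns stored values. The dictionary online_store and the functions total_rate, materialize, and read are hypothetical illustrations, not Feast's materialization engine.

```python
from typing import Any

online_store: dict[int, dict[str, Any]] = {}


def total_rate(row: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical row-level transformation."""
    return {"total_rate": row["conv_rate"] + row["acc_rate"]}


def materialize(batch: list[dict[str, Any]]) -> None:
    """Write path: transform each row once and persist the derived values."""
    for row in batch:
        online_store[row["driver_id"]] = total_rate(row)


def read(driver_id: int) -> dict[str, Any]:
    """Read path: return the stored result; the UDF is not invoked here."""
    return online_store[driver_id]


materialize([{"driver_id": 1001, "conv_rate": 0.5, "acc_rate": 0.25}])
served = read(1001)
```

In this sketch the stored values are keyed by driver_id, which hints at why a precomputed view needs an entity to key its rows.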
OnDemandFeatureViews support automatic schema inference from the transformation UDF and source features.
Inference Method:
The infer_features() method executes the transformation with an empty input to determine the output schema.
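The idea of running a transformation on empty input can be sketched in plain Python. The helper infer_output_names is hypothetical and is not Feast's infer_features(). Because empty Python lists carry no element type, this sketch recovers only the output names. Recovering types as well would require typed containers, which is an assumption not shown here.

```python
from typing import Any, Callable

Columns = dict[str, list[Any]]


def infer_output_names(
    udf: Callable[[Columns], Columns], input_names: list[str]
) -> list[str]:
    """Run the UDF on zero rows and report which output columns it produces."""
    empty_input: Columns = {name: [] for name in input_names}
    return list(udf(empty_input).keys())


def rate_features(inputs: Columns) -> Columns:
    """Hypothetical transformation producing two derived columns."""
    pairs = list(zip(inputs["conv_rate"], inputs["acc_rate"]))
    return {
        "rate_sum": [conv + acc for conv, acc in pairs],
        "rate_diff": [conv - acc for conv, acc in pairs],
    }


inferred = infer_output_names(rate_features, ["conv_rate", "acc_rate"])
```

This only works when the UDF tolerates zero rows, so a transformation that indexes into its input unconditionally would fail under this approach.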
Feast provides a decorator for convenient ODFV definition with automatic UDF capture.
Usage Pattern:
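The general shape of a decorator that captures a UDF can be sketched without Feast installed. Everything named here (ViewSketch, on_demand_view_sketch, the source and feature names) is a hypothetical stand-in and should not be read as Feast's actual decorator or its signature.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ViewSketch:
    """Hypothetical stand-in for the object a decorator would return."""

    name: str
    mode: str
    sources: list[str]
    schema: list[str]
    udf: Callable[..., Any]


def on_demand_view_sketch(
    *, sources: list[str], schema: list[str], mode: str = "pandas"
):
    """Hypothetical decorator factory: configuration first, then the UDF."""

    def decorator(udf: Callable[..., Any]) -> ViewSketch:
        # The decorated function is captured; its name becomes the view name.
        return ViewSketch(udf.__name__, mode, sources, schema, udf)

    return decorator


@on_demand_view_sketch(
    sources=["driver_hourly_stats", "vals_to_add"],
    schema=["conv_rate_plus_val"],
    mode="python",
)
def transformed_conv_rate(inputs: dict[str, list[Any]]) -> dict[str, list[Any]]:
    return {
        "conv_rate_plus_val": [
            rate + val
            for rate, val in zip(inputs["conv_rate"], inputs["val_to_add"])
        ]
    }


# After decoration the name refers to the view object, not the bare function.
output = transformed_conv_rate.udf({"conv_rate": [0.5], "val_to_add": [1]})
```

Returning an object rather than the function keeps the configuration and the UDF together in one definition.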
Implementation Details:
- PandasTransformation or PythonTransformation
Apply Flow:
write_to_online_store=True)
The PassthroughProvider handles ODFV execution by delegating to appropriate engines:
| Operation | Delegation Target | Notes |
|---|---|---|
| Historical retrieval | OfflineStore | Transformations in RetrievalJob |
| Online retrieval | OnlineStore | Transformations in get_online_features |
| Materialization | ComputeEngine | Batch transformation execution |
| Error Type | Condition | Error Message Source |
|---|---|---|
| Mode mismatch | singleton=True with non-Python mode | ODFVErrorMessages.singleton_mode_requires_python() |
| Missing entities | write_to_online_store=True without entities | ODFVErrorMessages.online_store_requires_entities() |
| No transformation | Neither udf nor feature_transformation provided | ODFVErrorMessages.no_transformation_provided() |
| Duplicate sources | Source names overlap | ODFVErrorMessages.duplicate_source_names() |
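The four conditions in the table can be expressed as a small validation function. This is a sketch only: validate_view_config, collect_error, the use of ValueError, and the message strings are all hypothetical, and the actual text produced by ODFVErrorMessages is not reproduced here.

```python
from typing import Any, Callable, Optional


def validate_view_config(
    *,
    mode: str,
    singleton: bool,
    write_to_online_store: bool,
    entities: list[str],
    source_names: list[str],
    udf: Optional[Callable[..., Any]] = None,
    feature_transformation: Optional[object] = None,
) -> None:
    """Raise ValueError on the first condition from the table that is violated."""
    if singleton and mode != "python":
        raise ValueError("singleton=True requires mode='python'")
    if write_to_online_store and not entities:
        raise ValueError("write_to_online_store=True requires entities")
    if udf is None and feature_transformation is None:
        raise ValueError("either udf or feature_transformation must be provided")
    if len(set(source_names)) != len(source_names):
        raise ValueError("source names must be unique")


def collect_error(**overrides: Any) -> str:
    """Demo helper: return the error message, or '' if the config is valid."""
    config = dict(
        mode="python",
        singleton=False,
        write_to_online_store=False,
        entities=["driver"],
        source_names=["stats", "request"],
        udf=len,  # any callable works as a placeholder UDF
    )
    config.update(overrides)
    try:
        validate_view_config(**config)
    except ValueError as exc:
        return str(exc)
    return ""


errors = [
    collect_error(),
    collect_error(mode="pandas", singleton=True),
    collect_error(write_to_online_store=True, entities=[]),
    collect_error(udf=None),
    collect_error(source_names=["stats", "stats"]),
]
```

Checking these conditions at definition time means a misconfigured view fails before any data is read.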
OnDemandFeatureViews serialize to protobuf for registry storage:
Key Proto Fields:
- user_defined_function: Contains serialized UDF code and mode
- on_demand_sources: List of source FeatureViews and RequestSources
- features: Output feature schema
Tests validate transformation behavior across different modes and configurations:
Example Test Pattern:
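A test in this spirit could check that batch and singleton forms of the same transformation agree. The sketch below uses the standard unittest module with hypothetical UDFs (add_one_batch, add_one_singleton). It is not taken from Feast's test suite.

```python
import unittest
from typing import Any


def add_one_batch(inputs: dict[str, list[Any]]) -> dict[str, list[Any]]:
    """Hypothetical UDF under test, batch form."""
    return {"value_plus_one": [v + 1 for v in inputs["value"]]}


def add_one_singleton(inputs: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical UDF under test, singleton form."""
    return {"value_plus_one": inputs["value"] + 1}


class TransformationModeTest(unittest.TestCase):
    def test_batch_and_singleton_agree(self) -> None:
        values = [1, 2, 3]
        batch = add_one_batch({"value": values})
        # Each singleton call should reproduce one element of the batch output.
        for index, value in enumerate(values):
            with self.subTest(value=value):
                single = add_one_singleton({"value": value})
                self.assertEqual(
                    single["value_plus_one"], batch["value_plus_one"][index]
                )

    def test_output_has_one_value_per_input_row(self) -> None:
        batch = add_one_batch({"value": [10, 20]})
        self.assertEqual(batch["value_plus_one"], [11, 21])


# Run the suite programmatically so the snippet also works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TransformationModeTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```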