This document provides a high-level introduction to Feast, explaining its role as a feature store for machine learning, its core value propositions, and its architectural components. For detailed architectural patterns, see System Architecture. For installation and CLI usage, see Getting Started and CLI. For deep dives into specific subsystems, refer to sections 2 (Core Concepts) through 8 (Advanced Topics).
Feast (Feature Store) is an open source feature store for machine learning that provides a unified interface for managing and serving features across training and inference workloads. It acts as a data access layer that abstracts feature storage from feature retrieval, enabling ML teams to productionize analytic data for model training and online inference.
The primary entry point is the FeatureStore class (sdk/python/feast/feature_store.py105-2080), which orchestrates interactions between offline stores (historical data), online stores (low-latency serving), and a registry (metadata catalog).
Sources: README.md26-38 sdk/python/feast/feature_store.py105-115 docs/getting-started/quickstart.md1-16
Feast addresses three fundamental challenges in ML feature management:
| Challenge | Solution | Implementation |
|---|---|---|
| Training-Serving Skew | Point-in-time correct feature joins prevent future data leakage during training | get_historical_features() method in offline stores sdk/python/feast/feature_store.py1682-1845 |
| Feature Availability | Pre-computed features served at low latency (<10ms) from online stores | get_online_features() method sdk/python/feast/feature_store.py1847-2061 |
| Infrastructure Coupling | Pluggable offline/online stores enable portability across data systems | Provider abstraction sdk/python/feast/infra/provider.py49-531 |
The system ensures consistency through:
Sources: README.md32-36 docs/getting-started/quickstart.md43-59 sdk/python/feast/feature_store.py105-115
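The point-in-time join listed in the table above can be illustrated with a small standard-library sketch. This is not Feast's implementation: the function name `point_in_time_join`, the tuple layouts, and the sample data are assumptions made for illustration, and details such as feature TTLs are ignored. For each entity row, the sketch picks the most recent feature value recorded at or before the row's event timestamp, so no value from the future leaks into training data.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(entity_rows, feature_rows):
    """Attach to each (entity_id, event_ts) row the latest feature value
    whose timestamp is at or before event_ts (hypothetical helper)."""
    # Group feature history by entity and sort each history by timestamp.
    by_entity = {}
    for entity_id, ts, value in feature_rows:
        by_entity.setdefault(entity_id, []).append((ts, value))
    for history in by_entity.values():
        history.sort(key=lambda pair: pair[0])

    joined = []
    for entity_id, event_ts in entity_rows:
        history = by_entity.get(entity_id, [])
        timestamps = [ts for ts, _ in history]
        # Index of the first feature timestamp strictly after event_ts.
        idx = bisect_right(timestamps, event_ts)
        value = history[idx - 1][1] if idx > 0 else None
        joined.append(
            {"entity_id": entity_id, "event_timestamp": event_ts, "value": value}
        )
    return joined


feature_rows = [
    ("driver_1", datetime(2024, 1, 1), 0.50),
    ("driver_1", datetime(2024, 1, 3), 0.90),
    ("driver_2", datetime(2024, 1, 2), 0.70),
]
entity_rows = [
    ("driver_1", datetime(2024, 1, 2)),  # sees only the Jan 1 value
    ("driver_1", datetime(2024, 1, 4)),  # sees the Jan 3 value
    ("driver_2", datetime(2024, 1, 1)),  # no value existed yet
]
training_rows = point_in_time_join(entity_rows, feature_rows)
```

The second sample row shows why the join matters: a naive "latest value" lookup would give the Jan 3 value to the Jan 2 training row as well, which is exactly the leakage the point-in-time join avoids.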
The following diagram shows Feast's three-tier architecture and how the FeatureStore class orchestrates components:
Key Code Entities:
Sources: sdk/python/feast/feature_store.py105-220 sdk/python/feast/infra/provider.py41-46 sdk/python/feast/repo_config.py253-318
The FeatureStore class is the primary interface for all Feast operations. It lazily initializes its dependencies:
Location: sdk/python/feast/feature_store.py105-220
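Lazy initialization of dependencies can be sketched as follows. The class `LazyStore`, its attributes, and the returned dictionaries are hypothetical stand-ins and do not reflect how FeatureStore is actually written; the sketch only shows the general pattern of deferring construction until first access and reusing the result afterwards.

```python
from functools import cached_property


class LazyStore:
    """Hypothetical stand-in that builds its dependencies on first access."""

    def __init__(self, repo_path):
        self.repo_path = repo_path
        self.init_log = []  # records which dependencies were built, in order

    @cached_property
    def registry(self):
        # Runs once; the result is cached on the instance afterwards.
        self.init_log.append("registry")
        return {"path": f"{self.repo_path}/registry.db"}

    @cached_property
    def provider(self):
        self.init_log.append("provider")
        return {"name": "local"}


store = LazyStore("my_repo")
created_before_access = list(store.init_log)  # nothing built yet
first = store.registry   # builds the registry stand-in
second = store.registry  # returns the cached object
```

In this sketch the provider stand-in is never built because nothing reads it, which is the main benefit of the pattern: constructing the object stays cheap, and expensive connections are opened only when an operation needs them.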
Key Methods:
- apply() - Registers feature definitions sdk/python/feast/feature_store.py944-1123
- materialize() - Loads features into online store sdk/python/feast/feature_store.py1124-1252
- get_historical_features() - Retrieves training data sdk/python/feast/feature_store.py1682-1845
- get_online_features() - Retrieves inference features sdk/python/feast/feature_store.py1847-2061

The registry stores metadata about feature definitions. Multiple implementations exist:
| Type | Class | Storage Backend | Use Case |
|---|---|---|---|
| File | Registry | Local/S3/GCS | Development |
| SQL | SqlRegistry | PostgreSQL/MySQL | Production |
| Snowflake | SnowflakeRegistry | Snowflake | Snowflake-native |
| Remote | RemoteRegistry | gRPC server | Distributed teams |
Configuration: sdk/python/feast/repo_config.py136-184
Sources: sdk/python/feast/infra/registry/registry.py sdk/python/feast/feature_store.py206-243
Offline stores provide historical feature values for training and batch scoring. The base interface:
Interface: sdk/python/feast/infra/offline_stores/offline_store.py
Implementations: 9+ stores including BigQuery, Snowflake, Redshift, Spark, DuckDB sdk/python/feast/repo_config.py91-107
Online stores provide low-latency feature retrieval for inference:
Interface: sdk/python/feast/infra/online_stores/online_store.py35-252
Implementations: 9+ stores including Redis, DynamoDB, SQLite, Cassandra, Milvus (vector DB) sdk/python/feast/repo_config.py68-89
Example - SQLite: sdk/python/feast/infra/online_stores/sqlite.py117-431
Sources: sdk/python/feast/infra/online_stores/online_store.py35-71 sdk/python/feast/infra/online_stores/sqlite.py103-116
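The shape of a pluggable online store can be sketched with an abstract base class and an in-memory implementation. The names `SketchOnlineStore`, `write_batch`, and `read`, along with their signatures, are assumptions made for this illustration; the real interface and its signatures are defined in the online_store.py file cited above.

```python
from abc import ABC, abstractmethod
from datetime import datetime


class SketchOnlineStore(ABC):
    """Hypothetical minimal interface: write rows, read by entity key."""

    @abstractmethod
    def write_batch(self, table, rows):
        """rows: iterable of (entity_key, {feature: value}, timestamp)."""

    @abstractmethod
    def read(self, table, entity_keys, feature_names):
        """Return one dict (or None when missing) per requested entity key."""


class InMemoryOnlineStore(SketchOnlineStore):
    def __init__(self):
        # (table, entity_key) -> (timestamp, {feature: value})
        self._data = {}

    def write_batch(self, table, rows):
        for entity_key, values, ts in rows:
            current = self._data.get((table, entity_key))
            # Keep only the most recent row per entity key.
            if current is None or ts >= current[0]:
                self._data[(table, entity_key)] = (ts, dict(values))

    def read(self, table, entity_keys, feature_names):
        results = []
        for entity_key in entity_keys:
            stored = self._data.get((table, entity_key))
            if stored is None:
                results.append(None)
            else:
                results.append({name: stored[1].get(name) for name in feature_names})
        return results


online = InMemoryOnlineStore()
online.write_batch(
    "driver_stats",
    [
        ("driver_1", {"conv_rate": 0.5, "acc_rate": 0.8}, datetime(2024, 1, 1)),
        ("driver_1", {"conv_rate": 0.9, "acc_rate": 0.7}, datetime(2024, 1, 3)),
    ],
)
lookup = online.read("driver_stats", ["driver_1", "driver_404"], ["conv_rate"])
```

Because callers depend only on the abstract class, a different backend could be swapped in without changing the code that reads features, which is the portability argument made for pluggable stores.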
The Provider interface orchestrates offline stores, online stores, and materialization:
The PassthroughProvider is the default implementation that delegates to configured stores sdk/python/feast/infra/passthrough_provider.py58-128
Sources: sdk/python/feast/infra/provider.py49-105 sdk/python/feast/infra/passthrough_provider.py58-90
The following diagram shows how feature definitions flow through the system:
Key Functions:
- parse_repo() - Extracts objects from Python files sdk/python/feast/repo_operations.py114-221
- apply() - Validates and registers objects sdk/python/feast/feature_store.py944-1123
- plan() - Dry-run for changes sdk/python/feast/feature_store.py795-880

Sources: sdk/python/feast/repo_operations.py114-221 sdk/python/feast/feature_store.py944-1068 protos/feast/core/Registry.proto
Materialization copies features from offline to online stores:
Key Methods:
- materialize() - Time-range materialization sdk/python/feast/feature_store.py1124-1252
- materialize_incremental() - Incremental materialization sdk/python/feast/feature_store.py1253-1320
- materialize_single_feature_view() - Provider implementation sdk/python/feast/infra/passthrough_provider.py492-686

Compute Engines:
Sources: sdk/python/feast/feature_store.py1124-1252 sdk/python/feast/infra/passthrough_provider.py492-686 sdk/python/feast/repo_config.py46-53
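The idea behind time-range and incremental materialization can be sketched in plain Python. The function `materialize_window`, the row layout, and the half-open window convention (start exclusive, end inclusive) are assumptions made for this illustration and are not taken from Feast's code. The sketch copies the latest value per entity within a window into an online table, then runs a second, incremental pass that starts where the first one ended.

```python
from datetime import datetime


def materialize_window(offline_rows, online_table, start, end):
    """Copy the latest row per entity with start < ts <= end into
    online_table (hypothetical helper). Returns the number of entities written."""
    latest = {}
    for entity_id, ts, values in offline_rows:
        if not (start < ts <= end):
            continue
        if entity_id not in latest or ts > latest[entity_id][0]:
            latest[entity_id] = (ts, values)
    for entity_id, (ts, values) in latest.items():
        online_table[entity_id] = {"event_timestamp": ts, **values}
    return len(latest)


offline_rows = [
    ("driver_1", datetime(2024, 1, 1), {"conv_rate": 0.5}),
    ("driver_1", datetime(2024, 1, 3), {"conv_rate": 0.9}),
    ("driver_2", datetime(2024, 1, 2), {"conv_rate": 0.7}),
    ("driver_2", datetime(2024, 1, 5), {"conv_rate": 0.2}),
]
online_table = {}

# First run: an explicit time range.
first_end = datetime(2024, 1, 3)
written_first = materialize_window(
    offline_rows, online_table, datetime(2023, 12, 31), first_end
)

# Incremental run: resume from the end of the previous window.
written_second = materialize_window(
    offline_rows, online_table, first_end, datetime(2024, 1, 6)
)
```

The incremental pass touches only driver_2, because driver_1 has no rows after the first window ended. Tracking the end of the last window is what lets an incremental run avoid rescanning history.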
Feast is configured via feature_store.yaml:
The RepoConfig class (sdk/python/feast/repo_config.py253-318) parses this configuration and provides properties for:
- registry - Registry configuration sdk/python/feast/repo_config.py365-383
- offline_store - Offline store configuration sdk/python/feast/repo_config.py385-398
- online_store - Online store configuration sdk/python/feast/repo_config.py431-443
- batch_engine - Materialization engine configuration sdk/python/feast/repo_config.py445-459

Sources: sdk/python/feast/repo_config.py253-460 docs/getting-started/quickstart.md106-117
Feast supports multiple serving patterns:
Declarative deployment using FeatureStore CRD infra/feast-operator/README.md1-13
Sources: README.md136-164 docs/getting-started/quickstart.md136-157 infra/feast-operator/README.md1-13
Feast's architecture is designed for extensibility through well-defined interfaces:
| Component | Interface | Custom Implementation Guide |
|---|---|---|
| Offline Store | OfflineStore | See Adding a new offline store |
| Online Store | OnlineStore | See Adding a new online store |
| Registry | BaseRegistry | See Registry documentation |
| Provider | Provider | See Creating a custom provider |
| Batch Engine | ComputeEngine | See Custom materialization engine |
Class Loading: Custom implementations are loaded via the importer module sdk/python/feast/importer.py using fully qualified class names in configuration.
Sources: sdk/python/feast/repo_config.py39-121 sdk/python/feast/infra/provider.py533-542 docs/getting-started/third-party-integrations.md1-30
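Loading a class from a fully qualified name, as described under Class Loading, generally follows the pattern below. This is a generic sketch built on the standard library's importlib, not the code in sdk/python/feast/importer.py; the helper name `import_class` and its error messages are assumptions, and a standard-library class is used as the target so the example runs without Feast installed.

```python
import importlib


def import_class(qualified_name):
    """Resolve 'package.module.ClassName' to the class object (hypothetical helper)."""
    module_path, _, class_name = qualified_name.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected 'module.ClassName', got {qualified_name!r}")
    module = importlib.import_module(module_path)
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        raise ImportError(
            f"Module {module_path!r} has no attribute {class_name!r}"
        ) from exc


# A configuration value would supply the string; here it is hard-coded.
ordered_dict_cls = import_class("collections.OrderedDict")
instance = ordered_dict_cls(a=1)
```

Resolving the class at runtime from a string is what allows a custom store or provider to live in a separate package and be selected purely through configuration.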
Feast supports multiple deployment patterns:
The architecture diagram at README.md38-43 illustrates the minimal deployment with SQLite and local files.
Sources: README.md38-68 docs/how-to-guides/running-feast-in-production.md1-20 docs/how-to-guides/feast-on-kubernetes.md1-8
Feast uses a custom type system that maps between Python types, Pandas types, database types, and Protocol Buffer types:
- PrimitiveFeastType, Array, and complex types

This type system ensures consistency across offline stores (databases), online stores (key-value stores), and the feature server API (protobuf).
For details, see Type System.
Sources: sdk/python/feast/types.py sdk/python/feast/type_map.py protos/feast/types/Value.proto
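The kind of mapping a type system like this performs can be sketched as a function from Python values to type labels. The function `infer_type` and the label strings it returns are illustrative assumptions; they are not the definitions in sdk/python/feast/types.py or sdk/python/feast/type_map.py.

```python
from datetime import datetime


def infer_type(value):
    """Map a Python value to an illustrative type label (hypothetical helper)."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "Bool"
    if isinstance(value, int):
        return "Int64"
    if isinstance(value, float):
        return "Float64"
    if isinstance(value, str):
        return "String"
    if isinstance(value, bytes):
        return "Bytes"
    if isinstance(value, datetime):
        return "UnixTimestamp"
    if isinstance(value, list):
        if not value:
            raise ValueError("Cannot infer the element type of an empty list")
        element_types = {infer_type(item) for item in value}
        if len(element_types) != 1:
            raise TypeError(f"Mixed element types: {sorted(element_types)}")
        return f"Array({element_types.pop()})"
    raise TypeError(f"Unsupported value type: {type(value).__name__}")


labels = [infer_type(v) for v in (True, 3, 2.5, "x", [1.0, 2.5])]
```

A single canonical label per value is what makes it possible to translate consistently to a database column type on one side and a protobuf field on the other.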