Commit 2e36106

fix: Fixed blog location
Signed-off-by: ntkathole <nikhilkathole2683@gmail.com>
1 parent a0b552a commit 2e36106

2 files changed

Lines changed: 41 additions & 17 deletions

docs/blog/README.md

Lines changed: 0 additions & 4 deletions
@@ -19,7 +19,3 @@ Welcome to the Feast blog! Here you'll find articles about feature store develop
1919
{% content-ref url="rbac-role-based-access-controls.md" %}
2020
[rbac-role-based-access-controls.md](rbac-role-based-access-controls.md)
2121
{% endcontent-ref %}
22-
23-
{% content-ref url="feast-mlflow-kubeflow.md" %}
24-
[feast-mlflow-kubeflow.md](feast-mlflow-kubeflow.md)
25-
{% endcontent-ref %}

docs/blog/feast-mlflow-kubeflow.md renamed to infra/website/docs/blog/feast-mlflow-kubeflow.md

Lines changed: 41 additions & 13 deletions
@@ -23,7 +23,7 @@ These tools are not competitors. Each one occupies a distinct role:
2323

2424
Together they form a complete, open-source foundation for operationalizing ML.
2525

26-
This topic has been explored by the community before — the post ["Feast with AI: Feed Your MLflow Models with Feature Store"](https://blog.qooba.net/2021/05/22/feast-with-ai-feed-your-mlflow-models-with-feature-store/) by [@qooba](https://github.com/qooba) is an excellent early look at combining Feast and MLflow. This post builds on that work and extends the story to include Kubeflow Pipelines and the Kubeflow Training Operator.
26+
This topic has been explored by the community before — the post ["Feast with AI: Feed Your MLflow Models with Feature Store"](https://blog.qooba.net/2021/05/22/feast-with-ai-feed-your-mlflow-models-with-feature-store/) by [@qooba](https://github.com/qooba) is an excellent early look at combining Feast and MLflow. For a hands-on, end-to-end example of Feast and Kubeflow working together, see ["From Raw Data to Model Serving: A Blueprint for the AI/ML Lifecycle with Kubeflow and Feast"](/blog/kubeflow-fraud-detection-e2e) by Helber Belmiro. This post builds on that prior work and brings all three tools — Feast, MLflow, and Kubeflow — into a single narrative.
2727

2828
---
2929

@@ -132,7 +132,7 @@ from feast import on_demand_feature_view, Field
132132
from feast.types import Float64
133133

134134
@on_demand_feature_view(
135-
sources=["driver_hourly_stats"],
135+
sources=[driver_stats],
136136
schema=[Field(name="conv_acc_ratio", dtype=Float64)],
137137
)
138138
def driver_ratios(inputs):
@@ -141,6 +141,8 @@ def driver_ratios(inputs):
141141
return df[["conv_acc_ratio"]]
142142
```
143143

144+
Here `driver_stats` is the `FeatureView` object defined earlier. The `sources` parameter accepts `FeatureView`, `RequestSource`, or `FeatureViewProjection` objects.
145+
144146
Using `on_demand_feature_view` ensures that the same transformation logic is applied whether features are retrieved from the offline store for training or from the online store at inference time, preventing transformation skew.
145147

146148
### Feature lineage
@@ -158,20 +160,47 @@ print(feature_view.source) # upstream data source
158160
print(feature_view.schema) # feature schema
159161
```
160162

163+
For cross-system lineage that extends beyond Feast into upstream data pipelines and downstream model training, Feast also supports native [OpenLineage integration](/blog/feast-openlineage-integration). Enabling it in your `feature_store.yaml` automatically emits lineage events on `feast apply` and `feast materialize`, letting you visualize the full data flow in tools like [Marquez](https://marquezproject.ai/).
164+
161165
### Data quality monitoring
162166

163-
Feast integrates with data quality frameworks to detect feature drift, stale data, and schema violations before they silently degrade model performance. You can attach expectations to a feature view so that data is validated during materialization:
167+
Feast integrates with data quality frameworks like [Great Expectations](https://greatexpectations.io/) to detect feature drift, stale data, and schema violations before they silently degrade model performance. The workflow centers on Feast's `SavedDataset` and `ValidationReference` APIs: you save a profiled dataset during training, define a profiler using Great Expectations, and then validate new feature data against that reference in subsequent runs.
164168

165169
```python
166-
from feast import FeatureView
167-
from feast.data_quality import DataQualityStats
170+
from feast import FeatureStore
171+
from feast.dqm.profilers.ge_profiler import ge_profiler
172+
from great_expectations.core import ExpectationSuite
173+
from great_expectations.dataset import PandasDataset
174+
175+
store = FeatureStore(repo_path=".")
176+
177+
@ge_profiler
178+
def my_profiler(dataset: PandasDataset) -> ExpectationSuite:
179+
dataset.expect_column_values_to_be_between("conv_rate", min_value=0, max_value=1)
180+
dataset.expect_column_values_to_be_between("acc_rate", min_value=0, max_value=1)
181+
return dataset.get_expectation_suite()
168182

169-
# After materialization, inspect stats tracked by Feast
170-
stats = store.get_saved_dataset("driver_stats_validation").to_df()
171-
print(stats.describe())
183+
reference_job = store.get_historical_features(
184+
entity_df=entity_df,
185+
features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:acc_rate"],
186+
)
187+
188+
dataset = store.create_saved_dataset(
189+
from_=reference_job,
190+
name="driver_stats_validation",
191+
storage=storage,
192+
)
193+
194+
reference = dataset.as_reference(name="driver_stats_ref", profiler=my_profiler)
195+
196+
new_job = store.get_historical_features(
197+
entity_df=new_entity_df,
198+
features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:acc_rate"],
199+
)
200+
new_job.to_df(validation_reference=reference)
172201
```
173202

174-
Monitoring feature distributions over time — and comparing them to the distributions seen during training — allows you to detect training–serving skew early, before it causes silent model degradation in production.
203+
If validation fails, Feast raises a `ValidationFailed` exception with details on which expectations were violated. Monitoring feature distributions over time — and comparing them to the distributions seen during training — allows you to detect training–serving skew early, before it causes silent model degradation in production.
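As a rough illustration of the validate-against-a-reference idea described above, the sketch below re-implements a range check in plain Python. It does not use Feast or Great Expectations: `RangeExpectation`, `RangeCheckFailed`, and `validate_rows` are hypothetical names invented for this example, and the inclusive bounds are an assumption chosen to mirror the `min_value=0, max_value=1` checks in the profiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeExpectation:
    """Hypothetical expectation: every value in `column` lies in [min_value, max_value]."""
    column: str
    min_value: float
    max_value: float


class RangeCheckFailed(Exception):
    """Hypothetical stand-in for a validation error; not a Feast class."""

    def __init__(self, violations):
        super().__init__(f"{len(violations)} expectation(s) violated")
        # Maps column name -> list of offending row indices.
        self.violations = violations


def validate_rows(rows, expectations):
    """Return rows unchanged if all expectations hold, else raise RangeCheckFailed."""
    violations = {}
    for exp in expectations:
        bad = [
            i for i, row in enumerate(rows)
            if not (exp.min_value <= row[exp.column] <= exp.max_value)
        ]
        if bad:
            violations[exp.column] = bad
    if violations:
        raise RangeCheckFailed(violations)
    return rows


# Expectations derived from the "reference" data (assumed here, not profiled).
reference_expectations = [
    RangeExpectation("conv_rate", 0.0, 1.0),
    RangeExpectation("acc_rate", 0.0, 1.0),
]

new_rows = [
    {"conv_rate": 0.42, "acc_rate": 0.91},
    {"conv_rate": 1.30, "acc_rate": 0.75},  # conv_rate is out of range
]

try:
    validate_rows(new_rows, reference_expectations)
except RangeCheckFailed as exc:
    # Report which columns and rows broke the reference expectations.
    print(exc.violations)
```

The design mirrors the workflow in the post only loosely: expectations are fixed once from reference data, and every later batch is checked against them, failing loudly rather than returning silently degraded features.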
175204

176205
### Feast Feature Registry vs. MLflow Model Registry
177206

@@ -276,7 +305,6 @@ Kubeflow Pipelines lets you compose the entire workflow — feature retrieval, t
276305

277306
```python
278307
from kfp import dsl
279-
from kfp.components import func_to_container_op
280308

281309
@dsl.component(base_image="python:3.10-slim", packages_to_install=["feast", "mlflow", "scikit-learn", "pandas", "pyarrow"])
282310
def retrieve_features(entity_df_path: str, feature_store_repo: str, output_path: dsl.Output[dsl.Dataset]):
@@ -353,12 +381,12 @@ mlflow.register_model(f"runs:/{best_run_id}/model", "driver_conversion_model")
353381

354382
### Step 3: Promote to production
355383

356-
Transitioning the model to "Production" in MLflow signals that it is ready for deployment. At this point, you also know the exact subset of Feast features required by that model — these are the features to materialize and serve.
384+
Promoting the model in MLflow signals that it is ready for deployment. At this point, you also know the exact subset of Feast features required by that model — these are the features to materialize and serve.
357385

358386
```python
359387
client = mlflow.tracking.MlflowClient()
360-
client.transition_model_version_stage(
361-
name="driver_conversion_model", version="3", stage="Production"
388+
client.set_registered_model_alias(
389+
name="driver_conversion_model", alias="production", version="3"
362390
)
363391
```
364392
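Step 3 above observes that a promoted model pins down the subset of Feast features to materialize. One way to turn that into a concrete list is sketched below, assuming the model's features were recorded as `view:feature` reference strings like those used earlier in the post. `views_to_materialize` and `model_feature_refs` are hypothetical names, not part of Feast or MLflow.

```python
def views_to_materialize(feature_refs):
    """Group 'view:feature' reference strings by feature view name."""
    grouped = {}
    for ref in feature_refs:
        view, sep, feature = ref.partition(":")
        if not sep or not view or not feature:
            raise ValueError(f"expected 'view:feature', got {ref!r}")
        grouped.setdefault(view, []).append(feature)
    return grouped


# Hypothetical list, e.g. read back from wherever the training run recorded it.
model_feature_refs = [
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
]

for view, features in views_to_materialize(model_feature_refs).items():
    print(view, features)
```

The resulting view names could then be used to restrict materialization to only the views the promoted model needs, if your Feast version supports selecting feature views at materialization time.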
