File: infra/website/docs/blog/feast-dbt-integration.md
Lines changed: 84 additions & 50 deletions
@@ -11,39 +11,50 @@ authors: ["Francisco Javier Arceo", "Yassin Nouh"]
11
11
12
12
# Streamlining ML Feature Engineering with Feast and dbt
13
13
14
-
If you're building machine learning models in production, you've likely faced the challenge of managing features consistently across training and serving environments. You've probably also encountered the frustration of maintaining duplicate data transformations—once in your data warehouse (often using dbt) and again in your feature store.
14
+
If you're a dbt user, you know the power of well-crafted data models. You've invested time building clean, tested, and documented transformations that your team relies on. Your dbt models represent the single source of truth for analytics, reporting, and, increasingly, machine learning features.
15
15
16
-
We're excited to share how Feast's dbt integration solves this problem by allowing you to automatically import your dbt models as Feast FeatureViews, eliminating redundant work and accelerating your ML development workflow.
16
+
But here's the challenge: when your ML team wants to use these models for production predictions, they often need to rebuild the same transformations in their feature store. Your beautiful dbt models, with all their logic and documentation, end up getting reimplemented elsewhere. This feels like wasted effort, and it is.
17
17
18
-
## The Challenge: Duplicate Feature Definitions
18
+
What if you could take your existing dbt models and put them directly into production for ML without rewriting anything? That's exactly what Feast's dbt integration enables.
19
19
20
-
Many organizations use [dbt (data build tool)](https://www.getdbt.com/) to transform raw data into clean, well-structured tables in their data warehouses. Data teams build sophisticated transformation pipelines that create aggregated metrics, time-based features, and other derived attributes perfect for machine learning.
20
+
## Your dbt Models Are Already ML-Ready
21
21
22
-
But here's the problem: when it comes time to use these transformations for ML, data scientists often need to manually recreate the same logic in their feature store. This leads to:
22
+
You've already done the hard work with dbt:
23
23
24
-
- **Duplicate work**: Writing the same transformations twice
25
-
- **Inconsistency**: Features drift between warehouse and feature store implementations
26
-
- **Maintenance burden**: Changes must be synchronized across two systems
- **Transformed raw data** into clean, aggregated tables
25
+
- **Documented your models** with column descriptions and metadata
26
+
- **Tested your logic** to ensure data quality
27
+
- **Organized your transformations** into a maintainable codebase
28
28
29
-
## The Solution: Feast's dbt Integration
29
+
These models are perfect for machine learning features. The aggregations you've built for your daily reports? Those are features. The customer attributes you've enriched? Features. The time-based calculations you've perfected? You guessed it—features.
30
30
31
-
Feast's dbt integration bridges this gap by automatically importing dbt model metadata into Feast, generating ready-to-use Entity, DataSource, and FeatureView definitions. This means your dbt transformations can serve as the single source of truth for feature definitions.
31
+
The problem isn't your models; they're great. The problem is getting them into a system that can serve them for real-time ML predictions with low latency and retrieve them with point-in-time correctness for training.
32
32
33
-
### How It Works
33
+
## How Feast Brings Your dbt Models to Production ML
34
34
35
-
The integration operates on dbt's compiled artifacts (`manifest.json`), extracting model metadata including:
35
+
Feast's dbt integration is designed with one principle in mind: **your dbt models should be the single source of truth**. Instead of asking you to rewrite your transformations, Feast reads your dbt project and automatically generates everything needed to serve those models for ML predictions.
36
36
37
-
- Column names and data types
38
-
- Model descriptions and documentation
39
-
- Table locations and schemas
40
-
- Tags and custom properties
37
+
Here's how it works:
41
38
42
-
Feast then generates Python code that defines corresponding Feast objects, maintaining full compatibility with your existing feature store infrastructure.
39
+
1. **Tag your dbt models** that you want to use as features (just add `tags: ['feast']` to your config)
40
+
2. **Run `feast dbt import`** to automatically generate Feast definitions from your dbt metadata
41
+
3. **Deploy to production** using Feast's feature serving infrastructure
43
42
44
-
## Getting Started: A Practical Example
43
+
Feast reads your `manifest.json` (the compiled output from `dbt compile`) and extracts:
45
44
46
-
Let's walk through a complete example of using Feast with dbt to build driver features for a ride-sharing application.
45
+
- Column names, types, and descriptions from your schema files
46
+
- Table locations from your dbt models
47
+
- All the metadata you've already documented
48
+
49
+
Then it generates Python code defining Feast entities, data sources, and feature views—all matching your dbt models exactly. Your documentation becomes feature documentation. Your data types become feature types. Your models become production-ready features.
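To make the extraction step concrete, here is a minimal sketch of pulling tagged models and their column metadata out of a dbt manifest. It is not Feast's implementation: the `tagged_models` helper and the sample model names are hypothetical, and the sketch assumes only the standard manifest layout (a top-level `nodes` mapping whose entries carry `resource_type`, `tags`, `columns`, `database`, `schema`, and `alias`).

```python
import json
from typing import Any


def tagged_models(manifest: dict[str, Any], tag: str = "feast") -> list[dict[str, Any]]:
    """Summarize every dbt model in the manifest that carries `tag`."""
    summaries = []
    for node in manifest.get("nodes", {}).values():
        # Skip tests, seeds, snapshots, and models without the tag.
        if node.get("resource_type") != "model" or tag not in node.get("tags", []):
            continue
        # Table location: database.schema.alias (falling back to the model name).
        parts = [node.get("database"), node.get("schema"), node.get("alias") or node["name"]]
        summaries.append(
            {
                "name": node["name"],
                "table": ".".join(p for p in parts if p),
                "description": node.get("description", ""),
                "columns": {
                    col_name: {
                        "data_type": col.get("data_type"),
                        "description": col.get("description", ""),
                    }
                    for col_name, col in node.get("columns", {}).items()
                },
            }
        )
    return summaries


# In a real project you would load the compiled artifact instead, e.g.:
#   manifest = json.loads(pathlib.Path("target/manifest.json").read_text())
# The dictionary below is a hand-written stand-in with made-up values.
sample_manifest = {
    "nodes": {
        "model.demo.driver_features": {
            "resource_type": "model",
            "name": "driver_features",
            "database": "analytics",
            "schema": "features",
            "alias": "driver_features",
            "tags": ["feast"],
            "description": "Daily driver metrics",
            "columns": {
                "driver_id": {"data_type": "bigint", "description": "Driver key"},
                "avg_rating": {"data_type": "float", "description": "Mean rating"},
            },
        },
        "model.demo.staging_trips": {
            "resource_type": "model",
            "name": "staging_trips",
            "database": "analytics",
            "schema": "staging",
            "alias": "staging_trips",
            "tags": [],
            "columns": {},
        },
    }
}

models = tagged_models(sample_manifest)
print(json.dumps(models, indent=2))
```

Only the tagged model is returned, along with the table location and column documentation that a generator could turn into entity, data source, and feature view definitions.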
50
+
51
+
The best part? **You don't change your dbt workflow at all.** Keep building models the way you always have. The integration simply creates a bridge from your dbt project to production ML serving.
52
+
53
+
## See It In Action: From dbt Model to Production Features
54
+
55
+
Let's walk through a real example. Imagine you're a data engineer at a ride-sharing company, and you've already built dbt models to track driver performance. Your analytics team loves these models, and now your ML team wants to use them to predict which drivers are likely to accept rides.
56
+
57
+
Perfect use case. Let's take your existing dbt models to production ML in just a few steps.
47
58
48
59
### Step 1: Install Feast with dbt Support
49
60
@@ -53,15 +64,15 @@ First, ensure you have Feast installed with dbt support:
53
64
pip install 'feast[dbt]'
54
65
```
55
66
56
-
### Step 2: Create Your dbt Model
67
+
### Step 2: Tag Your Existing dbt Model
57
68
58
-
In your dbt project, create a model that computes driver features. Tag it with `feast` to mark it for import:
69
+
You already have a dbt model that computes driver metrics. All you need to do is add one tag to mark it for Feast:
That's it. One tag. Your model doesn't change—it keeps working exactly as before for your analytics workloads.
104
+
105
+
### Step 3: Use Your Existing Documentation
93
106
94
-
Define column types and descriptions in your schema file. This metadata will be preserved in your Feast definitions:
107
+
You're probably already documenting your dbt models (and if you're not, you should be!). Feast uses this exact same documentation—no duplication needed:
95
108
96
109
{% code title="models/features/schema.yml" %}
97
110
```yaml
@@ -124,20 +137,22 @@ models:
124
137
```
125
138
{% endcode %}
126
139
127
-
### Step 4: Compile Your dbt Project
140
+
Your column descriptions and data types become the feature documentation in Feast automatically. Write it once, use it everywhere.
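As a rough illustration of how documented data types could carry over, the sketch below maps warehouse type strings from a schema file to feature type names. The mapping table and the `feature_type_for` helper are assumptions made for this example; they are not Feast's actual conversion rules, and the fallback to `String` is an arbitrary choice.

```python
from typing import Optional

# Hypothetical mapping for illustration only; not Feast's real conversion table.
WAREHOUSE_TO_FEATURE_TYPE = {
    "int": "Int64",
    "integer": "Int64",
    "bigint": "Int64",
    "float": "Float64",
    "double": "Float64",
    "boolean": "Bool",
    "varchar": "String",
    "string": "String",
    "timestamp": "UnixTimestamp",
}


def feature_type_for(dbt_type: Optional[str]) -> str:
    """Map a documented column type such as 'VARCHAR(255)' to a feature type name."""
    if not dbt_type:
        return "String"  # assumed fallback when the schema file omits data_type
    # Normalize case and drop any length/precision suffix like "(255)".
    base = dbt_type.strip().lower().split("(")[0].strip()
    return WAREHOUSE_TO_FEATURE_TYPE.get(base, "String")


for documented in ["bigint", "VARCHAR(255)", "Timestamp", None]:
    print(documented, "->", feature_type_for(documented))
```

Keeping the mapping in one small table makes it easy to review which documented types are recognized and which fall back to the default.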
128
141
129
-
Generate the manifest file that Feast will read:
142
+
### Step 4: Compile Your dbt Project (As Usual)
143
+
144
+
This is your normal dbt workflow—nothing special here:
130
145
131
146
```bash
132
147
cd your_dbt_project
133
148
dbt compile
134
149
```
135
150
136
-
This creates `target/manifest.json` with all your model metadata.
151
+
This creates `target/manifest.json` with all your model metadata—the same artifact you're already generating.
137
152
138
-
### Step 5: Preview Available Models
153
+
### Step 5: See What Feast Found
139
154
140
-
Use the Feast CLI to discover which models are tagged for import:
155
+
Use the Feast CLI to discover your tagged models:
141
156
142
157
```bash
143
158
feast dbt list target/manifest.json --tag-filter feast
@@ -154,9 +169,9 @@ Found 1 model(s) with tag 'feast':
154
169
Tags: feast
155
170
```
156
171
157
-
### Step 6: Import to Feast
172
+
### Step 6: Import Your dbt Model to Feast
158
173
159
-
Generate Feast feature definitions from your dbt model:
174
+
Now for the magic—automatically generate production-ready feature definitions from your dbt model:
You just went from dbt model to production ML features without rewriting a single line of transformation logic. Your dbt model—with all its carefully crafted SQL, documentation, and testing—is now:
253
+
254
+
- **Serving features in milliseconds** for real-time predictions
255
+
- **Maintaining point-in-time correctness** to prevent data leakage during training
256
+
- **Syncing with your data warehouse** automatically as your dbt models update
257
+
- **Self-documenting** using the descriptions you already wrote
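Point-in-time correctness, mentioned in the list above, means that a training example only sees feature values that were known at the moment of the event. The pure-Python sketch below shows the idea in isolation; it does not use Feast, and the `point_in_time_lookup` helper and sample rows are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_lookup(feature_rows, entity_id, event_ts):
    """Return the latest feature value for entity_id recorded at or before event_ts.

    feature_rows is an iterable of (entity_id, timestamp, value) tuples.
    Returns None when no value was known yet at event_ts.
    """
    history = sorted((ts, value) for eid, ts, value in feature_rows if eid == entity_id)
    timestamps = [ts for ts, _ in history]
    # Index of the first row strictly after event_ts; the row before it is the answer.
    idx = bisect_right(timestamps, event_ts)
    return history[idx - 1][1] if idx else None


# Made-up acceptance-rate history for two drivers.
rows = [
    (1001, datetime(2024, 1, 1), 0.80),
    (1001, datetime(2024, 1, 3), 0.95),
    (1002, datetime(2024, 1, 2), 0.60),
]

# A ride request on Jan 2 must not see the Jan 3 value for driver 1001.
print(point_in_time_lookup(rows, 1001, datetime(2024, 1, 2)))
```

Using the later value for the Jan 2 event would leak future information into the training set, which is the data leakage the bullet refers to.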
258
+
259
+
And here's the best part: when you update your dbt model (maybe you add a new column or refine your logic), just re-run `feast dbt import` and `feast apply`. Your production features stay in sync with your dbt source of truth.
260
+
261
+
## Advanced Use Cases for dbt Users
236
262
237
263
### Multiple Entity Support
238
264
@@ -369,14 +395,15 @@ Commit the generated Python files to your repository. This provides:
369
395
- Code review visibility for dbt-to-Feast mappings
370
396
- Rollback capability if needed
371
397
372
-
## Real-World Impact
398
+
## Why dbt Users Love This
373
399
374
-
Organizations using Feast's dbt integration report significant benefits:
400
+
Data teams using Feast with dbt are seeing real impact:
375
401
376
-
- **50-70% reduction in feature engineering time**: No more duplicating transformations
377
-
- **Improved consistency**: Single source of truth for feature logic
378
-
- **Faster experimentation**: Analysts can create ML-ready features without ML engineering expertise
379
-
- **Better collaboration**: Data engineers and ML engineers work from the same definitions
402
+
- **"We stopped rewriting features twice"**: Data engineers build once in dbt, ML teams use directly
403
+
- **50-70% faster ML deployment**: From dbt model to production features in minutes, not weeks
404
+
- **Single source of truth**: When dbt models update, ML features stay in sync
405
+
- **Analytics expertise becomes ML expertise**: Your dbt knowledge directly translates to ML feature engineering
406
+
- **Better collaboration**: No more "Can you rewrite this SQL in Python?" conversations
380
407
381
408
## Current Limitations and Future Roadmap
382
409
@@ -400,25 +427,32 @@ If you encounter issues or have questions:
- **Issues**: Report bugs or request features on [GitHub](https://github.com/feast-dev/feast/issues)
402
429
403
-
## Conclusion
430
+
## Conclusion: Your dbt Models Deserve Production ML
431
+
432
+
You've invested time and care into your dbt models. They're clean, documented, tested, and trusted by your organization. They shouldn't have to be rewritten to power machine learning—they should work as-is.
404
433
405
-
Feast's dbt integration represents a significant step toward reducing friction in ML feature engineering. By leveraging your existing dbt transformations as feature definitions, you can:
434
+
Feast's dbt integration makes that possible. Your dbt models become production ML features with:
406
435
407
-
- Eliminate duplicate work
408
-
- Maintain consistency across environments
409
-
- Accelerate ML development cycles
410
-
- Enable better collaboration between data and ML teams
436
+
- ✅ No rewriting or duplication
437
+
- ✅ No changes to your dbt workflow
438
+
- ✅ All your documentation preserved
439
+
- ✅ Real-time serving for predictions
440
+
- ✅ Point-in-time correctness for training
411
441
412
-
The integration is designed to fit naturally into existing workflows, requiring minimal changes to your dbt projects while unlocking powerful feature store capabilities.
442
+
If you're a dbt user who's been asked to "make those models work for ML," this is your answer.
413
443
414
-
Ready to get started? Install Feast with dbt support today and transform your feature engineering workflow:
444
+
Ready to see your dbt models in production? Install Feast and try it out:
We're excited to see what you build with Feast and dbt. Share your experiences with us on [Slack](http://slack.feast.dev/) or [Twitter](https://twitter.com/feast_dev)!
453
+
Your models are already great. Now make them do more.
454
+
455
+
Join us on [Slack](http://slack.feast.dev/) to share your dbt + Feast success stories, or check out the [full documentation](https://docs.feast.dev/how-to-guides/dbt-integration) to dive deeper.