Commit a2eb5a5

Add terraform files to setup AWS file source and registry in module 2
Signed-off-by: Danny Chiao <danny@tecton.ai>
1 parent d1aad86 commit a2eb5a5

28 files changed

Lines changed: 302 additions & 169 deletions

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -1,4 +1,7 @@
 **/registry.db
 **/online.db
 .DS_Store
-**/__pycache__
+**/__pycache__
+terraform.tfstate
+terraform.tfstate.backup
+.terraform*

README.md

Lines changed: 14 additions & 76 deletions
@@ -1,18 +1,8 @@
-# Feast Spark + Kafka + Redis workshop
+# Learning Feast
 
 ## Overview
 
-This workshop aims to teach basic Feast concepts and walk you through focuses on how to achieve a common architecture:
-
-TODO: add architecture diagram
-
-- **Data sources**: Kafka + File source
-- **Online store**: Redis
-- **Use case**: Predicting churn for drivers
-- Batch scoring via offline store
-- Real time scoring via online store
-
-We will generate a model that will predict whether a driver will churn.
+This workshop aims to teach basic Feast concepts and walk you through focuses on how to achieve common architectures
 
 ## Pre-requisites
 This workshop assumes you have the following installed:
@@ -21,67 +11,15 @@ This workshop assumes you have the following installed:
 - pip
 - Docker & Docker Compose (e.g. `brew install docker docker-compose`)
 
-## Setup
-
-### Docker + Kafka + Redis
-First, we install Feast with Redis support:
-```
-pip install "feast[redis]"
-```
-
-We then use Docker Compose to spin up a local Kafka cluster and automatically publish events to it.
-- This leverages a script (in `kafka_demo/`) that creates a topic, reads from `feature_repo/data/driver_stats.parquet`, generates newer timestamps, and emits them to the topic.
-
-```
-docker-compose up
-```
-
-### Setting up Feast
-
-Install Feast using pip
-
-```
-pip install 'feast[redis]'
-```
-
-We have already set up a feature repository in [feature_repo/](feature_repo/).
-
-Deploy the feature store by running `apply` from within the `feature_repo/` folder
-```
-cd feature_repo/
-feast apply
-```
-
-Output:
-```
-Created entity driver
-Created feature view driver_hourly_stats
-Created feature view driver_daily_features
-Created on demand feature view transformed_conv_rate
-Created feature service convrate_plus100
-
-Deploying infrastructure for driver_hourly_stats
-Deploying infrastructure for driver_daily_features
-```
-
-Next we load features into the online store using the `materialize-incremental` command. This command will load the
-latest feature values from a data source into the online store.
-
-```
-CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
-feast materialize-incremental $CURRENT_TIME
-```
-
-Output:
-```
-Materializing 2 feature views to 2022-04-28 12:38:01-04:00 into the redis online store.
-
-driver_hourly_stats from 1748-07-13 16:38:03-04:56:02 to 2022-04-28 12:38:01-04:00:
-100%|████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 759.42it/s]
-driver_daily_features from 1748-07-13 16:38:03-04:56:02 to 2022-04-28 12:38:01-04:00:
-100%|███████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 1081.28it/s]
-```
-
-## Continue with the workshop
-
-Now run the Jupyter notebook ([feature_repo/workshop.ipynb](feature_repo/workshop.ipynb))
+## This workshop is composed of several modules
+
+| Description | Module |
+| --- | --- |
+| Feast Concepts and basic flows | [Quickstart](https://docs.feast.dev/getting-started/quickstart) |
+| Powering low latency online feature retrieval with Kafka, Spark, and Redis | [Module 1](module_1/README.md) |
+| Using remote registry and file sources, platform vs client user flows, on demand transformations | [Module 2](module_2/README.md) |
+| Fetching features for batch scoring | TBD |
+| Feast Web UI | TBD |
+| Versioning features / models in Feast | TBD |
+| Data quality monitoring in Feast | TBD |
+| Deploying a feature server to AWS Lambda | TBD |

client_dir/test_python.py

Lines changed: 0 additions & 16 deletions
This file was deleted.

feature_repo/.DS_Store

-6 KB
Binary file not shown.

feature_repo/data/.DS_Store

-6 KB
Binary file not shown.

feature_repo/feature_store.yaml

Lines changed: 0 additions & 10 deletions
This file was deleted.

module_1/README.md

Lines changed: 74 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
1+
# Module 1: Serving fresh online features with Feast, Kafka, Redis
2+
3+
In module 1, we focus on building some test features and go through common flows in Feast
4+
5+
- **Data sources**: Kafka + File source
6+
- **Online store**: Redis
7+
- **Use case**: Predicting churn for drivers
8+
- Batch scoring via offline store
9+
- Real time scoring via online store
10+
11+
## Setup
12+
13+
### Docker + Kafka + Redis
14+
First, we install Feast with Redis support:
15+
```
16+
pip install "feast[redis]"
17+
```
18+
19+
We then use Docker Compose to spin up a local Kafka cluster and automatically publish events to it.
20+
- This leverages a script (in `kafka_demo/`) that creates a topic, reads from `feature_repo/data/driver_stats.parquet`, generates newer timestamps, and emits them to the topic.
21+
22+
```
23+
docker-compose up
24+
```
25+
26+
### Setting up Feast
27+
28+
Install Feast using pip
29+
30+
```
31+
pip install 'feast[redis]'
32+
```
33+
34+
We have already set up a feature repository in [feature_repo/](feature_repo/).
35+
36+
Deploy the feature store by running `apply` from within the `feature_repo/` folder
37+
```
38+
cd feature_repo/
39+
feast apply
40+
```
41+
42+
Output:
43+
```
44+
Created entity driver
45+
Created feature view driver_hourly_stats
46+
Created feature view driver_daily_features
47+
Created on demand feature view transformed_conv_rate
48+
Created feature service convrate_plus100
49+
50+
Deploying infrastructure for driver_hourly_stats
51+
Deploying infrastructure for driver_daily_features
52+
```
53+
54+
Next we load features into the online store using the `materialize-incremental` command. This command will load the
55+
latest feature values from a data source into the online store.
56+
57+
```
58+
CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
59+
feast materialize-incremental $CURRENT_TIME
60+
```
61+
62+
Output:
63+
```
64+
Materializing 2 feature views to 2022-04-28 12:38:01-04:00 into the redis online store.
65+
66+
driver_hourly_stats from 1748-07-13 16:38:03-04:56:02 to 2022-04-28 12:38:01-04:00:
67+
100%|████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 759.42it/s]
68+
driver_daily_features from 1748-07-13 16:38:03-04:56:02 to 2022-04-28 12:38:01-04:00:
69+
100%|███████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 1081.28it/s]
70+
```
71+
72+
## Continue with the workshop
73+
74+
Now run the Jupyter notebook ([feature_repo/workshop.ipynb](feature_repo/workshop.ipynb))
File renamed without changes.
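
Note on the `materialize-incremental` step in `module_1/README.md`: the shell snippet there passes the current UTC time, formatted as `%Y-%m-%dT%H:%M:%S`, as the end timestamp. For readers who drive the workshop from Python instead of a shell, the same argument list could be built roughly as follows. This is a minimal sketch and is not part of the commit; the helper name `materialize_incremental_command` is hypothetical, and it only constructs the command without running it.

```python
from datetime import datetime, timezone
from typing import List, Optional


def materialize_incremental_command(now: Optional[datetime] = None) -> List[str]:
    """Build the argv for `feast materialize-incremental <end timestamp>`.

    Hypothetical helper: it only reproduces the timestamp formatting of the
    README's shell snippet and does not invoke Feast.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Mirror `date -u +"%Y-%m-%dT%H:%M:%S"`: convert to UTC and drop the offset.
    # A naive datetime would be interpreted as local time by astimezone().
    end_ts = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    return ["feast", "materialize-incremental", end_ts]


if __name__ == "__main__":
    # Print the command instead of executing it.
    print(" ".join(materialize_incremental_command()))
```

Assuming the `feast` CLI is on the `PATH`, the returned list could be handed to `subprocess.run(cmd, check=True, cwd="feature_repo")` to run the same step the README shows.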
