2 changes: 1 addition & 1 deletion docs/getting-started/concepts/entity.md
@@ -6,7 +6,7 @@ An entity is a collection of semantically related features. Users define entitie
driver = Entity(name='driver', value_type=ValueType.STRING, join_key='driver_id')
```

-Entities are typically defined as part of feature views. Entities are used to identify the primary key on which feature values should be stored and retrieved. These keys are used during the lookup of feature values from the online store and the join process in point-in-time joins. It is possible to define composite entities \(more than one entity object\) in a feature view. It is also possible for feature views to have zero entities. See [feature view](feature-view.md) for more details.
+Entities are typically defined as part of feature views. Entity name is used to reference the entity from a feature view definition and join key is used to identify the physical primary key on which feature values should be stored and retrieved. These keys are used during the lookup of feature values from the online store and the join process in point-in-time joins. It is possible to define composite entities \(more than one entity object\) in a feature view. It is also possible for feature views to have zero entities. See [feature view](feature-view.md) for more details.

Entities should be reused across feature views.
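
The distinction between an entity's name and its join key can be sketched in plain Python. This is only an illustration under stated assumptions and does not use the Feast API: `EntitySpec`, `FeatureViewSpec`, `resolve_join_keys`, and the `customer` entity are hypothetical names introduced here. It shows a view referencing entities by name, with each name resolved to the physical join-key column, including the composite and zero-entity cases mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EntitySpec:
    """Hypothetical stand-in for an entity: a logical name plus a physical join key."""
    name: str
    join_key: str


@dataclass(frozen=True)
class FeatureViewSpec:
    """Hypothetical stand-in for a feature view that references entities by name."""
    name: str
    entities: List[str]


def resolve_join_keys(view: FeatureViewSpec, registry: Dict[str, EntitySpec]) -> List[str]:
    """Map the entity names used by a view to the join-key columns used in storage."""
    return [registry[entity_name].join_key for entity_name in view.entities]


driver = EntitySpec(name="driver", join_key="driver_id")
customer = EntitySpec(name="customer", join_key="customer_id")  # hypothetical second entity
registry = {e.name: e for e in (driver, customer)}

# A view with a composite entity (two entity objects) and a view with zero entities.
trips = FeatureViewSpec(name="trips", entities=["driver", "customer"])
global_stats = FeatureViewSpec(name="global_stats", entities=[])

trip_keys = resolve_join_keys(trips, registry)
global_keys = resolve_join_keys(global_stats, registry)
```

Because the registry is keyed by entity name, the same `driver` entity can be reused by any number of views while the storage column name stays in one place.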

5 changes: 4 additions & 1 deletion docs/getting-started/concepts/feature-retrieval.md
@@ -20,7 +20,10 @@ online_features = fs.get_online_features(
'driver_locations:lon',
'drivers_activity:trips_today'
],
-entity_rows=[{'driver': 'driver_1001'}]
+entity_rows=[
+# {join_key: entity_value}
+{'driver': 'driver_1001'}
+]
)
```
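
How entity rows drive an online lookup can be sketched with an in-memory dictionary. This is a simplified illustration, not how Feast implements its online store: `ONLINE_STORE`, `lookup_online`, and the stored values are hypothetical. Each entity row is turned into a key, and each `view:feature` reference is looked up under that key.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical in-memory "online store": (view name, entity key) -> feature values.
ONLINE_STORE: Dict[Tuple[str, Tuple[Tuple[str, Any], ...]], Dict[str, Any]] = {
    ("driver_locations", (("driver", "driver_1001"),)): {"lon": 103.8},
    ("drivers_activity", (("driver", "driver_1001"),)): {"trips_today": 7},
}


def lookup_online(feature_refs: List[str], entity_rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return one result dict per entity row, with missing values as None."""
    results = []
    for row in entity_rows:
        # Sort the key/value pairs so the same row always produces the same key.
        entity_key = tuple(sorted(row.items()))
        result = dict(row)
        for ref in feature_refs:
            view_name, feature_name = ref.split(":", 1)
            stored = ONLINE_STORE.get((view_name, entity_key), {})
            result[ref] = stored.get(feature_name)
        results.append(result)
    return results


rows = lookup_online(
    ["driver_locations:lon", "drivers_activity:trips_today"],
    [{"driver": "driver_1001"}, {"driver": "driver_9999"}],
)
```

In this sketch an unknown entity value yields `None` for every requested feature rather than raising an error.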

18 changes: 14 additions & 4 deletions docs/getting-started/quickstart.md
@@ -95,14 +95,16 @@ driver_hourly_stats = FileSource(

# Define an entity for the driver. You can think of entity as a primary key used to
# fetch features.
-driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
+# Entity has a name used for later reference (in a feature view, eg)
+# and join_key to identify physical field name used in storages
+driver = Entity(name="driver", value_type=ValueType.INT64, join_key="driver_id", description="driver id",)

# Our parquet files contain sample data that includes a driver_id column, timestamps and
# three feature columns. Here we define a Feature View that will allow us to serve this
# data to our model online.
driver_hourly_stats_view = FeatureView(
name="driver_hourly_stats",
-entities=["driver_id"],
+entities=["driver"], # reference entity by name
ttl=Duration(seconds=86400 * 1),
features=[
Feature(name="conv_rate", dtype=ValueType.FLOAT),
@@ -162,14 +164,16 @@ driver_hourly_stats = FileSource(

# Define an entity for the driver. You can think of entity as a primary key used to
# fetch features.
-driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
+# Entity has a name used for later reference (in a feature view, eg)
+# and join_key to identify physical field name used in storages
+driver = Entity(name="driver", value_type=ValueType.INT64, join_key="driver_id", description="driver id",)

# Our parquet files contain sample data that includes a driver_id column, timestamps and
# three feature columns. Here we define a Feature View that will allow us to serve this
# data to our model online.
driver_hourly_stats_view = FeatureView(
name="driver_hourly_stats",
-entities=["driver_id"],
+entities=["driver"], # reference entity by name
ttl=Duration(seconds=86400 * 1),
features=[
Feature(name="conv_rate", dtype=ValueType.FLOAT),
@@ -213,8 +217,13 @@ from feast import FeatureStore
# The entity dataframe is the dataframe we want to enrich with feature values
entity_df = pd.DataFrame.from_dict(
{
+# entity's join key -> entity values
"driver_id": [1001, 1002, 1003],
+
+# label name -> label values
"label_driver_reported_satisfaction": [1, 5, 3],
+
+# "event_timestamp" (reserved key) -> timestamps
"event_timestamp": [
datetime.now() - timedelta(minutes=11),
datetime.now() - timedelta(minutes=36),
@@ -320,6 +329,7 @@ feature_vector = store.get_online_features(
"driver_hourly_stats:avg_daily_trips",
],
entity_rows=[
+# {join_key: entity_value}
{"driver_id": 1004},
{"driver_id": 1005},
],
@@ -34,6 +34,7 @@ fs = FeatureStore(repo_path="path/to/feature/repo")
online_features = fs.get_online_features(
features=features,
entity_rows=[
+# {join_key: entity_value, ...}
{"driver_id": 1001},
{"driver_id": 1002}]
).to_dict()
7 changes: 6 additions & 1 deletion docs/tutorials/driver-stats-on-snowflake.md
@@ -124,7 +124,12 @@ fs.materialize_incremental(end_date=datetime.now())
{% code title="test.py" %}
```python
online_features = fs.get_online_features(
-features=features, entity_rows=[{"driver_id": 1001}, {"driver_id": 1002}],
+features=features,
+entity_rows=[
+# {join_key: entity_value}
+{"driver_id": 1001},
+{"driver_id": 1002}
+],
).to_dict()
```
{% endcode %}
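
The snippet above ends with `.to_dict()`. Assuming that call yields a column-oriented mapping (one list per key, aligned by row position), the following sketch pivots such a mapping back into one dict per entity row. `columns_to_rows` is a hypothetical helper, and the `conv_rate` values are made up purely to show the shape.

```python
from typing import Any, Dict, List


def columns_to_rows(columns: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Pivot {name: [v0, v1, ...]} into [{name: v0, ...}, {name: v1, ...}]."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of rows")
    n_rows = lengths.pop() if lengths else 0
    return [{name: values[i] for name, values in columns.items()} for i in range(n_rows)]


# Made-up values standing in for a response; only the shape matters here.
example_columns = {
    "driver_id": [1001, 1002],
    "conv_rate": [0.25, 0.75],
}
example_rows = columns_to_rows(example_columns)
```

A row-oriented view like this can be convenient when each entity's features are passed to a model one request at a time.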