Commit 821b1f2

Add Set types to Feast type system
- Add 8 Set value types (BYTES_SET, STRING_SET, INT32_SET, INT64_SET, DOUBLE_SET, FLOAT_SET, BOOL_SET, UNIX_TIMESTAMP_SET)
- Implement Set class with base type validation (no nested Sets/Maps)
- Add type conversion logic with duplicate removal
- Generate protobuf bindings and update type stubs
- Add comprehensive tests and documentation for Set types

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
1 parent 62fe664 commit 821b1f2
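
The commit message mentions base type validation that rejects nested Sets and Maps. The `types.py` change is not shown on this page, so the following is only a minimal sketch of that idea. `Primitive` and `SetOf` are hypothetical names used for illustration and are not Feast's actual classes.

```python
from dataclasses import dataclass
from enum import Enum


class Primitive(Enum):
    """Hypothetical stand-in for the primitive types a Set may wrap."""

    BYTES = "Bytes"
    STRING = "String"
    INT32 = "Int32"
    INT64 = "Int64"
    FLOAT32 = "Float32"
    FLOAT64 = "Float64"
    BOOL = "Bool"
    UNIX_TIMESTAMP = "UnixTimestamp"


@dataclass(frozen=True)
class SetOf:
    """Hypothetical set type that accepts only a primitive base type."""

    base_type: object

    def __post_init__(self) -> None:
        # Anything that is not a primitive (another SetOf, a map, ...) is rejected.
        if not isinstance(self.base_type, Primitive):
            raise ValueError(
                f"Set base type must be a primitive type, got {self.base_type!r}"
            )

    def __str__(self) -> str:
        return f"Set({self.base_type.value})"
```

Under these assumptions, `SetOf(Primitive.STRING)` is accepted, while `SetOf(SetOf(Primitive.STRING))` raises `ValueError` at construction time, so an invalid schema fails early rather than at serialization.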

14 files changed: +729 -112 lines changed

docs/reference/type-system.md

Lines changed: 43 additions & 2 deletions
@@ -3,7 +3,7 @@
 ## Motivation

 Feast uses an internal type system to provide guarantees on training and serving data.
-Feast supports primitive types, array types, and map types for feature values.
+Feast supports primitive types, array types, set types, and map types for feature values.
 Null types are not supported, although the `UNIX_TIMESTAMP` type is nullable.
 The type system is controlled by [`Value.proto`](https://github.com/feast-dev/feast/blob/master/protos/feast/types/Value.proto) in protobuf and by [`types.py`](https://github.com/feast-dev/feast/blob/master/sdk/python/feast/types.py) in Python.
 Type conversion logic can be found in [`type_map.py`](https://github.com/feast-dev/feast/blob/master/sdk/python/feast/type_map.py).
@@ -40,6 +40,23 @@ All primitive types have corresponding array (list) types:
 | `Array(Bool)` | `List[bool]` | List of booleans |
 | `Array(UnixTimestamp)` | `List[datetime]` | List of timestamps |

+### Set Types
+
+All primitive types (except Map) have corresponding set types for storing unique values:
+
+| Feast Type | Python Type | Description |
+|------------|-------------|-------------|
+| `Set(Int32)` | `Set[int]` | Set of unique 32-bit integers |
+| `Set(Int64)` | `Set[int]` | Set of unique 64-bit integers |
+| `Set(Float32)` | `Set[float]` | Set of unique 32-bit floats |
+| `Set(Float64)` | `Set[float]` | Set of unique 64-bit floats |
+| `Set(String)` | `Set[str]` | Set of unique strings |
+| `Set(Bytes)` | `Set[bytes]` | Set of unique binary data |
+| `Set(Bool)` | `Set[bool]` | Set of unique booleans |
+| `Set(UnixTimestamp)` | `Set[datetime]` | Set of unique timestamps |
+
+**Note:** Set types automatically remove duplicate values. When converting from lists or other iterables to sets, duplicates are eliminated.
+
 ### Map Types

 Map types allow storing dictionary-like data structures:
@@ -60,7 +77,7 @@ from datetime import timedelta
 from feast import Entity, FeatureView, Field, FileSource
 from feast.types import (
     Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp,
-    Array, Map
+    Array, Set, Map
 )

 # Define a data source
@@ -101,6 +118,12 @@ user_features = FeatureView(
         Field(name="notification_settings", dtype=Array(Bool)),
         Field(name="login_timestamps", dtype=Array(UnixTimestamp)),

+        # Set types (unique values only)
+        Field(name="visited_pages", dtype=Set(String)),
+        Field(name="unique_categories", dtype=Set(Int32)),
+        Field(name="tag_ids", dtype=Set(Int64)),
+        Field(name="preferred_languages", dtype=Set(String)),
+
         # Map types
         Field(name="user_preferences", dtype=Map),
         Field(name="metadata", dtype=Map),
@@ -110,6 +133,24 @@ user_features = FeatureView(
 )
 ```

+### Set Type Usage Examples
+
+Sets store unique values and automatically remove duplicates:
+
+```python
+# Simple set
+visited_pages = {"home", "products", "checkout", "products"}  # "products" appears twice
+# Feast will store this as: {"home", "products", "checkout"}
+
+# Integer set
+unique_categories = {1, 2, 3, 2, 1}  # duplicates will be removed
+# Feast will store this as: {1, 2, 3}
+
+# Converting a list with duplicates to a set
+tag_list = [100, 200, 300, 100, 200]
+tag_ids = set(tag_list)  # {100, 200, 300}
+```
+
 ### Map Type Usage Examples

 Maps can store complex nested data structures:
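
A note on the documentation change in this file: it states that duplicates are eliminated when lists or other iterables are converted to sets. The conversion code in `type_map.py` is not shown on this page, so the snippet below is only a sketch of one common way to do that while keeping first-seen order. The helper name `unique_values` is hypothetical and is not part of Feast.

```python
from typing import Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def unique_values(values: Iterable[T]) -> List[T]:
    """Drop duplicates while keeping the order in which values first appear.

    dict keys are unique and preserve insertion order (Python 3.7+),
    so dict.fromkeys acts as an ordered de-duplication step.
    """
    return list(dict.fromkeys(values))


tag_ids = unique_values([100, 200, 300, 100, 200])
pages = unique_values(["home", "products", "checkout", "products"])
```

Returning a list rather than a Python `set` keeps the result deterministic, which can make serialized values easier to compare in tests. Whether Feast preserves order is not stated in the source.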

protos/feast/types/Value.proto

Lines changed: 44 additions & 0 deletions
@@ -45,6 +45,14 @@ message ValueType {
         NULL = 19;
         MAP = 20;
         MAP_LIST = 21;
+        BYTES_SET = 22;
+        STRING_SET = 23;
+        INT32_SET = 24;
+        INT64_SET = 25;
+        DOUBLE_SET = 26;
+        FLOAT_SET = 27;
+        BOOL_SET = 28;
+        UNIX_TIMESTAMP_SET = 29;
     }
 }

@@ -72,6 +80,14 @@ message Value {
         Null null_val = 19;
         Map map_val = 20;
         MapList map_list_val = 21;
+        BytesSet bytes_set_val = 22;
+        StringSet string_set_val = 23;
+        Int32Set int32_set_val = 24;
+        Int64Set int64_set_val = 25;
+        DoubleSet double_set_val = 26;
+        FloatSet float_set_val = 27;
+        BoolSet bool_set_val = 28;
+        Int64Set unix_timestamp_set_val = 29;
     }
 }

@@ -107,6 +123,34 @@ message BoolList {
     repeated bool val = 1;
 }

+message BytesSet {
+    repeated bytes val = 1;
+}
+
+message StringSet {
+    repeated string val = 1;
+}
+
+message Int32Set {
+    repeated int32 val = 1;
+}
+
+message Int64Set {
+    repeated int64 val = 1;
+}
+
+message DoubleSet {
+    repeated double val = 1;
+}
+
+message FloatSet {
+    repeated float val = 1;
+}
+
+message BoolSet {
+    repeated bool val = 1;
+}
+
 message Map {
     map<string, Value> val = 1;
 }
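
A note on the proto change in this file: `unix_timestamp_set_val` reuses `Int64Set`, and each `*Set` message is a plain `repeated` field, which does not enforce uniqueness by itself. The writer therefore has to de-duplicate before populating the field. The sketch below shows one way that could look for timestamps. The function name is hypothetical, and both the epoch-seconds precision and the treatment of naive datetimes as UTC are assumptions, not confirmed by the diff.

```python
from datetime import datetime, timezone
from typing import Iterable, List


def timestamps_to_int64_values(values: Iterable[datetime]) -> List[int]:
    """Convert datetimes to unique epoch seconds, sorted ascending.

    Assumptions (not taken from the commit): second precision, and
    naive datetimes are interpreted as UTC.
    """
    seconds = set()
    for ts in values:
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        seconds.add(int(ts.timestamp()))
    # Sorting gives a deterministic order for the repeated int64 field.
    return sorted(seconds)


epoch_values = timestamps_to_int64_values(
    [
        datetime(2024, 1, 1),                       # naive, treated as UTC
        datetime(2024, 1, 1, tzinfo=timezone.utc),  # same instant, duplicate
        datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc),
    ]
)
```

Note that two datetimes differing only below one second collapse to the same value under this sketch, so the resulting set can be smaller than the input set.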

sdk/python/feast/protos/feast/core/DatastoreTable_pb2.pyi

Lines changed: 13 additions & 13 deletions
@@ -1,19 +1,19 @@
 """
 @generated by mypy-protobuf. Do not edit manually!
 isort:skip_file
-
-* Copyright 2021 The Feast Authors
-*
-* Licensed under the Apache License, Version 2.0 (the "License");
-* you may not use this file except in compliance with the License.
-* You may obtain a copy of the License at
-*
-* https://www.apache.org/licenses/LICENSE-2.0
-*
-* Unless required by applicable law or agreed to in writing, software
-* distributed under the License is distributed on an "AS IS" BASIS,
-* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-* See the License for the specific language governing permissions and
+
+* Copyright 2021 The Feast Authors
+*
+* Licensed under the Apache License, Version 2.0 (the "License");
+* you may not use this file except in compliance with the License.
+* You may obtain a copy of the License at
+*
+* https://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
 * limitations under the License.
 """
 import builtins

sdk/python/feast/protos/feast/core/Entity_pb2.pyi

Lines changed: 13 additions & 13 deletions
@@ -1,19 +1,19 @@
 """
 @generated by mypy-protobuf. Do not edit manually!
 isort:skip_file
-
-* Copyright 2020 The Feast Authors
-*
-* Licensed under the Apache License, Version 2.0 (the "License");
-* you may not use this file except in compliance with the License.
-* You may obtain a copy of the License at
-*
-* https://www.apache.org/licenses/LICENSE-2.0
-*
-* Unless required by applicable law or agreed to in writing, software
-* distributed under the License is distributed on an "AS IS" BASIS,
-* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-* See the License for the specific language governing permissions and
+
+* Copyright 2020 The Feast Authors
+*
+* Licensed under the Apache License, Version 2.0 (the "License");
+* you may not use this file except in compliance with the License.
+* You may obtain a copy of the License at
+*
+* https://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
 * limitations under the License.
 """
 import builtins

sdk/python/feast/protos/feast/core/FeatureViewProjection_pb2.pyi

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ else:
 DESCRIPTOR: google.protobuf.descriptor.FileDescriptor

 class FeatureViewProjection(google.protobuf.message.Message):
-    """A projection to be applied on top of a FeatureView.
+    """A projection to be applied on top of a FeatureView.
     Contains the modifications to a FeatureView such as the features subset to use.
     """

sdk/python/feast/protos/feast/core/Project_pb2.pyi

Lines changed: 13 additions & 13 deletions
@@ -1,19 +1,19 @@
 """
 @generated by mypy-protobuf. Do not edit manually!
 isort:skip_file
-
-* Copyright 2020 The Feast Authors
-*
-* Licensed under the Apache License, Version 2.0 (the "License");
-* you may not use this file except in compliance with the License.
-* You may obtain a copy of the License at
-*
-* https://www.apache.org/licenses/LICENSE-2.0
-*
-* Unless required by applicable law or agreed to in writing, software
-* distributed under the License is distributed on an "AS IS" BASIS,
-* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-* See the License for the specific language governing permissions and
+
+* Copyright 2020 The Feast Authors
+*
+* Licensed under the Apache License, Version 2.0 (the "License");
+* you may not use this file except in compliance with the License.
+* You may obtain a copy of the License at
+*
+* https://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
 * limitations under the License.
 """
 import builtins

sdk/python/feast/protos/feast/core/Registry_pb2.pyi

Lines changed: 13 additions & 13 deletions
@@ -1,19 +1,19 @@
 """
 @generated by mypy-protobuf. Do not edit manually!
 isort:skip_file
-
-* Copyright 2020 The Feast Authors
-*
-* Licensed under the Apache License, Version 2.0 (the "License");
-* you may not use this file except in compliance with the License.
-* You may obtain a copy of the License at
-*
-* https://www.apache.org/licenses/LICENSE-2.0
-*
-* Unless required by applicable law or agreed to in writing, software
-* distributed under the License is distributed on an "AS IS" BASIS,
-* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-* See the License for the specific language governing permissions and
+
+* Copyright 2020 The Feast Authors
+*
+* Licensed under the Apache License, Version 2.0 (the "License");
+* you may not use this file except in compliance with the License.
+* You may obtain a copy of the License at
+*
+* https://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
 * limitations under the License.
 """
 import builtins
