Add feature and feature set labels, for metadata#536
Add feature and feature set labels, for metadata#536feast-ci-bot merged 25 commits intofeast-dev:masterfrom
Conversation
|
Hi @imjuanleonard. Thanks for your PR. I'm waiting for a gojek member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
cced904 to
359fd3d
Compare
|
/ok-to-test |
|
Demo List Features from feast import Client,Entity, ValueType, FeatureSet, Feature
from pprint import pprint
client = Client(core_url="localhost:6565", serving_url="localhost6566")
client.set_project("test")
client_features = client.list_features()
pprint(vars(client_features['city']))Output |
|
Demo List Feature Set Requirements def __repr__(self):
return "<Test name:%s dtype:%s labels:%s>" % (self._name, self.dtype, self.labels)Create the python function to list feature set and print the features from feast import Client,Entity, ValueType, FeatureSet, Feature
client = Client(core_url="localhost:6565", serving_url="localhost6566")
client.set_project('juan')
test = client.list_feature_sets(project="juan",name="driver",version="1")
for featureset in test:
print(featureset.features)Result: |
|
Example Set Labels from google.protobuf.duration_pb2 import Duration
from feast import Client,Entity, ValueType, FeatureSet, Feature
client = Client(core_url="localhost:6565", serving_url="localhost6566")
client.set_project('juan')
feature1=Feature(name='total_transaction4',dtype=ValueType.INT64, labels=dict())
feature1.set_label("test","tost")
feature1.set_label("tast","tust")
feature1.set_label("bost","bust")
customer_fs = FeatureSet("customer_transactions",
entities=[Entity(name='customer_id4', dtype=ValueType.INT64, labels=dict())],
features=[feature1],
max_age=Duration(seconds=432000))
client.apply(customer_fs)Result DB |
|
/retest |
sdk/python/feast/field.py
Outdated
There was a problem hiding this comment.
It is of type dictionary it is handled on the initiation below, this is to make the labels Optional so the function signature doesn't change
There was a problem hiding this comment.
You can use Optional[Dict] there I think
There was a problem hiding this comment.
If what you mean is typing.Optional
I have tried to use it before, it is not behaving as optional parameter instead according to the documentation it is behaving as optional value
There was a problem hiding this comment.
It's not entirely clear what you mean. Can you please refer to this explanation of Optional type hints: https://stackoverflow.com/questions/51710037/how-should-i-use-the-optional-type-hint
There was a problem hiding this comment.
Okay, I guess after python 3.5 they are using the Optional aforementioned for type hinting. Have added it
|
@imjuanleonard we absolutely need tests in order to sign this off. Can you please add them in the same PR? |
fa1f75d to
33be193
Compare
|
The test has been added |
sdk/python/feast/field.py
Outdated
There was a problem hiding this comment.
Is there a reason why some code is on the feature and some of it is on the field? Can we label entities? If not, why not?
There was a problem hiding this comment.
I am following the current discussion that the use case people need are for the feature currently. So I don't want to assume that people need for the entities as well, Do we need metadata for the entity?
There was a problem hiding this comment.
It's not so much about entity vs feature as it is feature vs field. Entities inherit from Fields. If the Field can have a label then an Entity can have a label. So we should either limit this entirely to features, or we should add labels to both Entities and Features, which may happen through the Field class.
There was a problem hiding this comment.
Looking at the .proto files on this branch, I can see that labels field is only applied to Features (not Entities or Fields). Not sure how to proceed with this one as I'm not familiar with requirements/use cases of the Python client. Any suggestion?
There was a problem hiding this comment.
Actually Fields are the carrier of (feature and entity) values—actual data, not the metadata of registry specifications. I think these changes to field.py should probably be backed out.
From discussion Friday, it sounds like we're in agreement to leave entity labeling out of scope for now. Useful thing to have someday (perhaps when #405 gets focus), but it wasn't initially planned for Feast 0.5. So I think we should proceed with adding map<string, string> labels to FeatureSetSpec as we discussed and providing the API for that.
We might have to defer Python SDK at least in part to the Gojek team, we don't currently use it. It's the primary means of verifying end-to-end functional integration in the project, but maybe as a compromise we can demonstrate the implementation using grpc_cli examples as a test plan, and Python SDK could be finished in a second PR.
@woop Does this sound reasonable?
There was a problem hiding this comment.
Btw @suwik have a look through tests/e2e directory in the root of the project for examples of Python SDK, useful for learning the system. They're kicked off for CI by scripts in infra/scripts (used to be in .prow/scripts at the point of our internal fork).
There was a problem hiding this comment.
Thanks for suggestion @ches. I'm thinking it would be good to add some logic testing the labels to the e2e tests (and later we could take that to our repo too). I tried to follow the installation instructions in the readme, but after running docker-compose up none of the mentioned ports are open. Any suggestion how to get my environment running for this repo so that I'm able to run the e2e tests?
EDIT: ok, seems like I had COMPOSE_FILE set to our internal fork file and that's why it wasn't starting all the services that I was expecting to.
There was a problem hiding this comment.
I have removed the python SDK changes related to labels (implemented inside field.py). I did that in a separate commit so it should be easy to rollback in case we actually need them.
|
I was a bit misled by the PR title and trying to follow the history from #499—this PR is not only Python SDK, it's complete implementation of feature labels. Conflicts need to be fixed now (mostly generated files that have been removed from Git)—if in the process you could rebase and squash some of the false-start commits like |
84486a5 to
505d8f7
Compare
|
|
||
| // Labels that this field belongs to | ||
| @Column(name = "labels", columnDefinition = "text") | ||
| private String labels; |
There was a problem hiding this comment.
Hmm so I guess we're going to want to move this to the Feature entity class in #655… this was mentioned before but I guess I wasn't understanding the implications until I saw #655 reduced from #612.
Creates a bit of a conundrum for us since I thought the labels feature could be a straightforward backport (internal, assuming Feast isn't going to backport features). That's not going to be the case if this has to wait on #655 and substantial SQL schema migrations are going to be required for it.
|
/test test-end-to-end-batch |
|
Looks good to me. Will you let me know when its ready to be merged? |
|
Thanks for being so patient here @ches. I guess the conversation around Field, Entity, and Feature has temporarily been answered, although I personally don't think we have reached our destination yet. I do feel like the door is left more open with our current approach. |
| Map<String, String> featureSetLabels = | ||
| new HashMap<>() { | ||
| { | ||
| put("description", "My precious feature set"); |
tests/e2e/grpc-based-registration.py
Outdated
| LABEL_KEY = "my" | ||
| LABEL_VALUE = "label" | ||
|
|
||
|
|
There was a problem hiding this comment.
The tests look good, but possibly add a TODO to migrate this over to the python sdk once that has been updated?
|
/test test-end-to-end |
| return get_feature_set_response.feature_set | ||
|
|
||
| @pytest.mark.timeout(45) | ||
| @pytest.mark.run(order=9) |
|
/test test-end-to-end |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: imjuanleonard, woop The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/lgtm |
Backports feast-dev#536 to v0.3
* Add feature and feature set labels Backports #536 to v0.3 * Fix feature labels test for round trip It appears that this test was only exercising the test setup code, not covering that labels work when applied. This change should be forward-ported to master. Co-authored-by: Suwinski, Krzysztof (Agoda) <Krzysztof.Suwinski@agoda.com>

What this PR does / why we need it:
-> Extension for #463
Update (@suwik):
Which issue(s) this PR fixes:
Fixes #463
Does this PR introduce a user-facing change?: