-
Notifications
You must be signed in to change notification settings - Fork 343
Add pipeline snippets for Firestore #919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| # [START basic_read] | ||
| pipeline = client.pipeline().collection("users") | ||
| for result in pipeline.execute(): | ||
| print(result.id + " => " + result.data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f-strings are usually preferred to concatenation: print(f"{result.id} => {result.data}")
| # [START pipeline_concepts] | ||
| pipeline = client.pipeline() \ | ||
| .collection("cities") \ | ||
| .where(Field.of("population").greater_than(100000)) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Python lets you break up large numbers, so you could do 100_000 for readability (and consider this for other large numbers in this file)
| # [START basic_read] | ||
| pipeline = client.pipeline().collection("users") | ||
| for result in pipeline.execute(): | ||
| print(result.id + " => " + result.data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like result.data is a method and not a property, so I think you need to call it: result.data()
| # [START field_or_constant] | ||
| pipeline = client.pipeline() \ | ||
| .collection("cities") \ | ||
| .where(Field.of("name").equal(Constant("Toronto"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can also do Constant.of("Toronto"), to match Field.of. I'm not sure which is preferred
|
|
||
| default_app = firebase_admin.initialize_app() | ||
| client = firestore.client(default_app, "your-new-enterprise-database") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have an async version of everything.
Often, we re-write the samples for both sync and async, and show both options in the language selection pickers. I'm not sure how you want to handle that here.
Maybe a couple samples for pipeline_initialization_async and basic_read_async is enough to show it exists, since the API is almost identical. Or maybe samples for async aren't needed for preview
| result = client.pipeline() \ | ||
| .collection("books") \ | ||
| .select( | ||
| And( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you should include the import in the sample? from google.cloud.firestore_v1.pipeline_expressions import And. I guess this feedback would apply to Field and Constant too though...
Our usual policy was that samples should all include relevant imports. Most of these pipeline tests don't really need imports. But when we start adding Ands and Ors and Nots, I could see users getting confused, since they look so close to built-in methods
|
|
||
| def cosine_distance_function(): | ||
| # [START cosine_distance] | ||
| sample_vector = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have a Vector type, maybe we should recommend users use that instead of a raw list? Vector([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
We support both though, so if you think this is easier/cleaner, this works too
| .execute() | ||
| # [END vector_length] | ||
| for res in result: | ||
| print(res) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if we could have a test file that exercises these functions. Even just running them without verifying outputs would be helpful to make sure no exceptions are thrown, but reading back results would be ideal
It looks like the other samples in this repo aren't tested though, so maybe that would be too much to set up
| .execute() | ||
| # [END vector_length] | ||
| for res in result: | ||
| print(res) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually use the black code formatter for our Python code. If you haven't used it, it might be good to run it over this file to see if it has any changes
Nothing really stands out to me as badly formatted, and I don't think using that formatting should be a requirement for samples. But it might be worth checking
| Field.of("rating").average().as_("avg_rating"), | ||
| groups=[Field.of("genre")] | ||
| ) \ | ||
| .execute() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to get rid of all the \, IIRC think you could do something like this:
results = (
client.pipeline()
.collection("books")
.aggregate(
Field.of("rating").average().as_("avg_rating"),
groups=[Field.of("genre")]
).execute()
)
(But test it first)
No description provided.