This repository was archived by the owner on Nov 12, 2025. It is now read-only.

Conversation

@tswast (Contributor) commented Aug 9, 2021

This will also require client changes, but I figure I'll start with a code sample covering all the types I can think of, so that I better understand how the current (low-level) client works.

Just creating a draft PR so I don't lose track of this.

Fixes #<issue_number_goes_here> 🦕

@product-auto-label product-auto-label bot added api: bigquerystorage Issues related to the googleapis/python-bigquery-storage API. samples Issues that are directly related to samples. labels Aug 9, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Aug 9, 2021
@snippet-bot (bot) commented Aug 16, 2021

Here is the summary of changes.

You are about to add 2 region tags.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.

@tswast tswast changed the title from "WIP: write client sample" to "docs: add code sample for using BigQueryWriteClient with a compiled proto2 module" Aug 16, 2021
@tswast tswast marked this pull request as ready for review August 16, 2021 21:43
@tswast tswast requested a review from a team August 16, 2021 21:43
@tswast tswast requested review from a team as code owners August 16, 2021 21:43
@tswast tswast requested review from leahecole and shollyman August 16, 2021 21:43
@shollyman (Contributor) left a comment

This sample mostly reinforces my perspective that no one should use the raw API directly; there are so many sharp edges. It's valuable info, though. Should we give it a more severe snippet tag name? Alternatively, maybe add more detail in the Python comments noting that this demonstrates using the raw API directly?

row.timestamp_col = int(delta.total_seconds()) * 1000000 + int(delta.microseconds)
proto_rows.serialized_rows.append(row.SerializeToString())
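As a stdlib-only illustration of the conversion above (the date here is made up, and `value` stands in for whatever datetime the sample is encoding): BigQuery TIMESTAMP columns written through the Storage Write API take an int64 count of microseconds since the Unix epoch.

```python
import datetime

# Illustrative values, not from the sample itself.
epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
value = datetime.datetime(2021, 8, 9, 12, 0, 0, tzinfo=datetime.timezone.utc)

# Same arithmetic as the sample line above: whole seconds scaled to
# microseconds, plus the sub-second microsecond component.
delta = value - epoch
micros = int(delta.total_seconds()) * 1000000 + int(delta.microseconds)
```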

# Since this is the second request, you only need to include the row data
Contributor:

You should only need the row data; the stream ID, schema, and trace ID all get set on the first request in the stream.

Contributor Author:

You're right. Omitting the stream name on subsequent requests worked fine.
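A hypothetical stdlib-only sketch of that pattern, using plain dicts as stand-ins for the AppendRowsRequest protos (the helper name and keys are illustrative, not the client's API):

```python
# Only the first request in the stream carries the stream name and
# writer schema; subsequent requests send just the serialized rows.
def request_stream(stream_name, writer_schema, row_batches):
    first = True
    for rows in row_batches:
        request = {"rows": rows}
        if first:
            request["write_stream"] = stream_name
            request["writer_schema"] = writer_schema
            first = False
        yield request

requests = list(
    request_stream(
        "projects/p/datasets/d/tables/t/streams/s",
        "<schema>",
        [[b"row1"], [b"row2"]],
    )
)
```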

proto_data.rows = proto_rows
request.proto_rows = proto_data

# Offset is optional, but can help detect when the server received a
Contributor:

If we want a sample where offset is optional, I'd consider either a committed or a default stream. Pending implies you care about completeness, in which case the offset is what you send and receive to confirm it.

Contributor Author:

Hmmm... The write API didn't seem to care when I omitted offset, but perhaps that's a server bug?

Contributor:

If you're omitting the offset, then you're just trusting that all rows land. The offset ensures that rows are present where you expect them, with no gaps. You can do it, but pending-commit semantics plus maybe-successful appends seems like a weird match.
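A hypothetical sketch of the offset bookkeeping being described (pure Python; the function is illustrative, not part of the client): the writer sends the offset it expects with each batch and compares it to the offset the server confirms.

```python
# Illustrative offset tracking for a pending stream: a mismatch between
# the expected and confirmed offsets signals a gap or duplicate append.
def confirm_append(expected_offset, confirmed_offset, batch_size):
    if confirmed_offset != expected_offset:
        raise RuntimeError(
            f"append landed at {confirmed_offset}, expected {expected_offset}"
        )
    return expected_offset + batch_size

# Two batches of 2 and 3 rows, each confirmed at the expected offset.
offset = 0
for batch_size, confirmed in [(2, 0), (3, 2)]:
    offset = confirm_append(offset, confirmed, batch_size)
```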


syntax = "proto2";

message SampleData {
Contributor:

Consider commenting this. It also raises the question of how we can surface the proto content in the sample itself?

Contributor Author:

Added some comments.

Seeing as we'd need to show folks how to run protoc and such, I've kind of given up hope that this could be useful as a standalone sample. To do it right, I think we'd need a proper tutorial, but perhaps that effort would be better spent on a manual client layer?

In the meantime this'll at least function as a system test. Plus it should be pretty easy to adapt without any changes to the sample code for my next step of seeing what happens when I try to write to a non-US table.

optional int64 sub_int_col = 1;
}

optional bool bool_col = 1;
Contributor:

Should we make one of the fields required? It mucks up your sample code a bit, but it gets the coverage across nullable/required/repeated.

Contributor Author:

A required "row number" field or something could be useful for testing. I forget if the wire format actually changes for required vs. optional; I thought it was just a client-side validation check.

Contributor:

There are two variations: if the proto field is required and the schema column is optional, then you can get client-side errors. If the schema column is required and the proto field is optional and unset, then you'll get an error from the backend.
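For example, a proto2 schema along the lines of the sample's (field names illustrative) with one field marked required:

```proto
syntax = "proto2";

message SampleData {
  // proto2 validates `required` on the client at serialization time;
  // whether the backend also rejects a missing value depends on
  // whether the table schema marks the column REQUIRED.
  required int64 row_num = 1;
  optional bool bool_col = 2;
}
```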


# Some stream types support an unbounded number of requests. Pass a
# generator or other iterable to the append_rows method to continuously
# write rows to the stream as requests are generated.
Contributor:

Yeah, the caveat here is that if you're not interleaving reads and writes, you can end up in a state where no more requests are accepted until you receive responses. I believe the limit is currently 1000 in-flight requests, but that's an implementation detail and not a formal part of the docs.

Contributor Author:

Good point. I wonder if I should add a warning that it's highly recommended to read the responses and/or make the writes in a background thread? Plus, I'll have to tell people not to use multiprocessing, since gRPC hasn't really been playing nice with that.
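A stdlib-only sketch of that recommendation, with a queue standing in for the bidirectional gRPC stream (everything here is illustrative, not the client's API): responses are drained on a background thread so the writer never stalls against the in-flight request limit.

```python
import queue
import threading

def run_stream(requests):
    responses = queue.Queue()
    received = []

    # Drain responses on a background thread so the writer is never
    # blocked waiting for unread responses.
    def drain():
        while True:
            item = responses.get()
            if item is None:  # sentinel: stream closed
                break
            received.append(item)

    reader = threading.Thread(target=drain)
    reader.start()

    # Stand-in for append_rows(iter(requests)): pretend each request
    # produces one response.
    for request in requests:
        responses.put(request)
    responses.put(None)

    reader.join()
    return received

results = run_stream(["req1", "req2", "req3"])
```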

@tswast (Contributor Author) commented Aug 17, 2021

Should we give it a more severe snippet tag name?

I'll add "raw" to the tag. This is not fine sushi, though.

@tswast (Contributor Author) commented Aug 17, 2021

@shollyman I think this is ready for another review pass

@tswast tswast requested a review from shollyman August 17, 2021 21:25
# write rows to the stream as requests are generated. Make sure to read
# from the response iterator as well so that the stream continues to flow.
requests = generate_sample_data(write_stream.name)
responses = write_client.append_rows(iter(requests))
Contributor Author:

I figured out how to make this work in non-default regions. Turns out it just takes one more line here... I'll update this PR rather than make a separate one for non-US.

Contributor Author:

Added!

@plamut plamut changed the base branch from master to main August 25, 2021 10:09
@plamut plamut deleted the branch googleapis:main August 26, 2021 11:29
@plamut plamut closed this Aug 26, 2021
@plamut (Contributor) commented Aug 26, 2021

This PR was accidentally closed when main was deleted, and now I cannot re-open it even if I temporarily re-create main. Please open it again against master, thanks!

