@dhermes dhermes commented May 3, 2017

Some notes:

  • For now, the build is failing unit tests, because I haven't yet made those (I'm sure very time-consuming) updates
  • The ACTUAL IMPORTANT thing to note is that the system tests in the build are passing
  • I haven't yet updated create_resumable_upload_session(), but I can (it should be fairly straightforward, but I'd rather do it in a separate PR)
  • I would really like some suggestions on what to do about the num_retries argument. IMO supporting it is a "bad idea™"
  • I have ever-so-slightly changed the behavior of upload_from_file() when size is passed in, mostly because the existing implementation is confused about what the size / chunk_size combination should mean. The implementation now does the "sane" thing: use a resumable upload IF AND ONLY IF there is a chunk size specified on the blob
  • I have also dropped the "extra" size check from the implementation and moved an actual fstat into upload_from_filename(). That size check never made sense in the generic "give me an IO[bytes] and I'll stream it" method; what's more, a resumable upload works perfectly fine when the size isn't known, so there is no good reason for the extra check
  • Previously, if chunk_size was unset (i.e. a simple upload), blob.upload_from_file would fail outright if size was not passed in. This is unnecessary, since we can just do file_obj.read() when no size is specified
  • I would really like to deprecate the rewind keyword argument (related: Dropping internal usage of rewind in Blob.upload_from_string(). #3365)
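The upload-strategy change described above boils down to a simple dispatch on `chunk_size`. A minimal sketch of that rule, assuming a toy `Blob` class (the tuple return values are purely illustrative, standing in for the real multipart/resumable helpers):

```python
import io

class Blob:
    """Toy stand-in for the real Blob, just enough to show the dispatch."""

    def __init__(self, chunk_size=None):
        self.chunk_size = chunk_size

    def upload_from_file(self, file_obj, size=None, content_type=None):
        # Use a resumable upload IF AND ONLY IF a chunk size is set on
        # the blob; ``size`` no longer influences the choice.
        if self.chunk_size is None:
            # Simple upload: if no size was given, read the whole stream
            # rather than failing as the previous implementation did.
            data = file_obj.read() if size is None else file_obj.read(size)
            return ('multipart', data)
        # Resumable uploads work fine even when the total size is unknown.
        return ('resumable', file_obj, size)

blob = Blob()
kind, data = blob.upload_from_file(io.BytesIO(b'payload'))
print(kind)  # multipart
```

With `chunk_size` set (e.g. `Blob(chunk_size=262144)`), the same call routes to the resumable branch regardless of whether `size` is known.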

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label May 3, 2017
@dhermes dhermes added api: storage Issues related to the Cloud Storage API. and removed cla: yes This human has signed the Contributor License Agreement. labels May 3, 2017
.. _API reference: https://cloud.google.com/storage/\
docs/json_api/v1/objects
"""
# NOTE: This assumes `self.name` is unicode.


* An object metadata dictionary
* The ``content_type`` as a string (according to precedence)
"""
transport = self._make_transport(client)

def _do_multipart_upload(self, client, stream, content_type, size):
"""Perform a multipart upload.
Assumes ``chunk_size`` is :data:`None` on the current blob.

upload.initiate(
    transport, stream, object_metadata, content_type,
    total_bytes=size, stream_final=False)
while not upload.finished:
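The snippet above is the heart of the resumable flow: initiate once, then push chunks until the upload reports itself finished. A toy simulation of that loop shape, where `FakeUpload` is a hypothetical stand-in for google-resumable-media's upload object (no network involved):

```python
import io

class FakeUpload:
    """Hypothetical stand-in mimicking the initiate / transmit loop shape."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.finished = False
        self.bytes_sent = 0
        self._stream = None

    def initiate(self, stream, total_bytes=None, stream_final=False):
        self._stream = stream

    def transmit_next_chunk(self):
        chunk = self._stream.read(self.chunk_size)
        self.bytes_sent += len(chunk)
        # A short (or empty) read means the stream is exhausted.
        if len(chunk) < self.chunk_size:
            self.finished = True

stream = io.BytesIO(b'x' * 700)
upload = FakeUpload(chunk_size=256)
upload.initiate(stream, total_bytes=700, stream_final=False)
while not upload.finished:
    upload.transmit_next_chunk()
print(upload.bytes_sent)  # 700
```

Passing `stream_final=False` at initiation mirrors the real API's way of saying "the stream may have more data than `total_bytes` promises so far"; the loop itself is what drains it.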

content_type = self._get_content_type(content_type, filename=filename)

with open(filename, 'rb') as file_obj:
    total_bytes = os.fstat(file_obj.fileno()).st_size
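The `os.fstat` call above reads the size straight off the already-open file descriptor, so no separate `os.stat` on the path (and no seek-to-end dance) is needed. A quick self-contained check of that pattern, using a temporary file:

```python
import os
import tempfile

# Write a small file, then measure it the way upload_from_filename() does:
# fstat on the open descriptor rather than a second stat on the path.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'0123456789')
    name = tmp.name

with open(name, 'rb') as file_obj:
    total_bytes = os.fstat(file_obj.fileno()).st_size

os.remove(name)
print(total_bytes)  # 10
```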

@theacodes
Contributor

Overall this looks fine, just some small concerns.

> For now, the build is failing unit tests, because I haven't done those (what I'm sure will be very time-consuming) updates

Do you plan to address those in this PR?

> I haven't yet updated create_resumable_upload_session(), but I can (it should be fairly straightforward, but I'd rather do it in a separate PR)

Will you file a bug to track that, or do you have confidence you won't forget?

> I have ever-so-slightly changed the behavior of upload_from_file() when size is passed in. This is mostly because the existing implementation is confused about what the size / chunk_size combination should mean. The implementation now does the "sane" thing: use a resumable upload IF AND ONLY IF there is a chunk size specified on the blob

I'm okay with this.

> I would really like to deprecate the rewind keyword argument

Do it. File a bug if needed to track.

@dhermes
Contributor Author

dhermes commented May 3, 2017

> Do you plan to address those (i.e. unit tests) in this PR?

Absolutely.

> Will you file a bug to track that (that == fixing the impl. of create_resumable_upload_session), or do you have confidence you won't forget?

I realized that I would drop below 100% line coverage in the _create_upload function if I left that alone. Since _create_upload needs to go anyway, I will just make the create_resumable_upload_session() change in this PR (rather than fighting coverage).

> Do it. File a bug if needed to track.

You mean like do it in this PR?

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label May 3, 2017
@theacodes
Contributor

> You mean like do it in this PR?

Your call.

dhermes added 3 commits May 3, 2017 19:24
In addition, switched Blob.create_resumable_upload_session() over to
use google-resumable-media instead of the vendored
`google.cloud.streaming` package.
extra_headers=extra_headers)
curr_chunk_size = self.chunk_size
try:
    # Temporarily patch the chunk size. A user should still be able

This is to avoid monkey-patching the instance when "pure" behavior
will suffice.
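The snippet above temporarily overrides the blob's chunk size; the usual shape of such a patch is a try/finally that always restores the saved value, even when the body raises. A minimal sketch with a toy `Blob` class (`with_temporary_chunk_size` is a hypothetical helper, and the body stands in for the real upload call):

```python
class Blob:
    def __init__(self, chunk_size=None):
        self.chunk_size = chunk_size

def with_temporary_chunk_size(blob, chunk_size):
    """Patch ``blob.chunk_size`` for one operation, then restore it."""
    curr_chunk_size = blob.chunk_size  # save so we can restore below
    try:
        blob.chunk_size = chunk_size
        return blob.chunk_size  # stand-in for the real upload call
    finally:
        # Always runs, even if the body raises: the instance is left
        # exactly as we found it, so no lasting monkey-patching occurs.
        blob.chunk_size = curr_chunk_size

blob = Blob(chunk_size=None)
used = with_temporary_chunk_size(blob, 256 * 1024)
print(used, blob.chunk_size)  # 262144 None
```

Passing the chunk size through as a plain argument ("pure" behavior) avoids even this temporary mutation, which is the point of the review comment above.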

Also removed the transport from Blob._get_upload_arguments().
@dhermes
Contributor Author

dhermes commented May 4, 2017

Merging this now after discussions with @lukesneeringer and @jonparrott.

This needs a follow-up PR ASAP that supports num_retries, so I will be working on that.
