Closed
Labels
- api: spanner (Issues related to the googleapis/python-spanner API.)
- priority: p2 (Moderately-important priority. Fix may not be included in next release.)
- type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
Description
When a BYTES field is split across chunks in a streamed query result, the result iterator merges the string chunks and then tries to encode the value from str to bytes twice: once right after merging, and again when every value in the row is parsed. The second attempt fails because the value is already bytes. The fix we found is to skip parsing right after the merge, since all values, including merged ones, are parsed afterwards anyway.
Here's the problematic line:
    return _parse_value(merged, field.type_)
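To make the failure mode concrete, here is a minimal sketch that does not use the Spanner client at all. `parse_bytes` is a hypothetical stand-in for the BYTES branch of a value parser (an assumption for illustration, not the library's actual code); it only shows why encoding an already-encoded value fails.

```python
def parse_bytes(value):
    """Hypothetical stand-in for a BYTES parser: converts str to bytes."""
    return value.encode("utf8")


# Two string chunks of a base64 payload, joined the way a merge step would.
merged = "YWFh" + "YWFh"

# First parse: str -> bytes, succeeds.
once = parse_bytes(merged)

# Second parse: the value is already bytes, and bytes has no .encode().
try:
    parse_bytes(once)
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

The captured message should match the last line of the stack trace in this report.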
Environment details
- OS type and version: MacOSX 10.15.6
- Python version: Python 3.8.6
- pip version: pip 20.2.1
- google-cloud-spanner version: 3.0.0
Steps to reproduce
- Create table with the following schema:
CREATE TABLE Test (id STRING(36) NOT NULL, megafield BYTES(MAX)) PRIMARY KEY (id)
- Run the code sample below to trigger the exception
Code example
"""
CREATE TABLE Test (id STRING(36) NOT NULL, megafield BYTES(MAX)) PRIMARY KEY (id)
"""
import base64
from google.cloud import spanner
from google.auth.credentials import AnonymousCredentials
###################################
# HOTFIX
###################################
from google.cloud.spanner_v1.streamed import StreamedResultSet, _merge_by_type
def _merge_chunk(self, value):
    """Merge pending chunk with next value.

    :type value: :class:`~google.protobuf.struct_pb2.Value`
    :param value: continuation of chunked value from previous
                  partial result set.

    :rtype: :class:`~google.protobuf.struct_pb2.Value`
    :returns: the merged value
    """
    current_column = len(self._current_row)
    field = self.fields[current_column]
    merged = _merge_by_type(self._pending_chunk, value, field.type_)
    self._pending_chunk = None
    # Bug fix: return the merged value unparsed; the caller parses it later.
    return merged  # was: _parse_value(merged, field.type_)

# Uncomment this to fix the bug:
# StreamedResultSet._merge_chunk = _merge_chunk
###################################
# END OF HOTFIX
###################################
instance_id = 'test'
database_id = 'test-db'
spanner_client = spanner.Client(
    project='test',
    client_options={"api_endpoint": 'localhost:9010'},
    credentials=AnonymousCredentials()
)
instance = spanner_client.instance(instance_id)
database = instance.database(database_id)

# This must be large enough that the SDK will split the megafield payload across two query chunks
# and try to recombine them, causing the error:
data = base64.standard_b64encode(("a" * 1000000).encode("utf8"))

with database.batch() as batch:
    batch.insert(
        table="Test",
        columns=("id", "megafield"),
        values=[
            (1, data),
        ],
    )

with database.snapshot() as snapshot:
    results = snapshot.execute_sql(
        "SELECT * FROM Test"
    )
    for row in results:
        print("Id: ", row[0])
        print("Megafield: ", row[1][:100])
Stack trace
Traceback (most recent call last):
  File "/Users/user1/Code/test.py", line 55, in <module>
    for row in results:
  File "/Users/user1/.pyenv/versions/project-3.8.6/lib/python3.8/site-packages/google/cloud/spanner_v1/streamed.py", line 139, in __iter__
    self._consume_next()
  File "/Users/user1/.pyenv/versions/project-3.8.6/lib/python3.8/site-packages/google/cloud/spanner_v1/streamed.py", line 132, in _consume_next
    self._merge_values(values)
  File "/Users/user1/.pyenv/versions/project-3.8.6/lib/python3.8/site-packages/google/cloud/spanner_v1/streamed.py", line 103, in _merge_values
    self._current_row.append(_parse_value(value, field.type_))
  File "/Users/user1/.pyenv/versions/project-3.8.6/lib/python3.8/site-packages/google/cloud/spanner_v1/_helpers.py", line 170, in _parse_value
    result = value.encode("utf8")
AttributeError: 'bytes' object has no attribute 'encode'
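The traceback shows the failure in the second parse, inside `_merge_values`. The following is a simplified model of that flow, useful for reasoning about the fix without a Spanner instance or emulator. All names here (`parse_bytes`, `merge_chunk_buggy`, `merge_chunk_fixed`, `build_row`) are hypothetical and only loosely mirror the library's structure; this is a sketch under that assumption, not the actual implementation.

```python
def parse_bytes(value):
    """Hypothetical BYTES parser: converts str to bytes."""
    return value.encode("utf8")


def merge_chunk_buggy(pending, value):
    """Merges two string chunks, then parses too early."""
    return parse_bytes(pending + value)


def merge_chunk_fixed(pending, value):
    """Merges two string chunks and leaves the result unparsed."""
    return pending + value


def build_row(values, merge_chunk, pending=None):
    """Parses every value once; the first value may continue a pending chunk."""
    row = []
    for index, value in enumerate(values):
        if index == 0 and pending is not None:
            value = merge_chunk(pending, value)
        row.append(parse_bytes(value))  # the single, final parse
    return row


# With the fixed merge, the merged value is parsed exactly once.
fixed_row = build_row(["YWFh"], merge_chunk_fixed, pending="YWFh")

# With the buggy merge, the final parse receives bytes and fails.
try:
    build_row(["YWFh"], merge_chunk_buggy, pending="YWFh")
    buggy_failed = False
except AttributeError:
    buggy_failed = True

print(fixed_row, buggy_failed)
```

In this model, the parsing responsibility sits in one place (the row builder), which is the same idea as the hotfix: the merge step returns the raw merged value and lets the later pass parse it.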