-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Allow empty indexed arrays. #3767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow empty indexed arrays. #3767
Conversation
|
@lukesneeringer So, system tests? |
|
Going to take a pass on that for this one. It is a semi-edge case that I do not really want to own a system test for, and two separate customers have already confirmed that it works for them. |
|
Can you provide a snippet that you've personally used to confirm it works with the backend? I'll check out this branch, run it myself, then we can merge? |
|
Not sure if I have the one I used a few months ago when I wrote this, but I should be able to write one. I will get back to you. |
|
Thanks |
|
|
On it. |
|
@lukesneeringer I'm not sure this fixes the issue? Notice after storing, the property isn't actually added: >>> from google.cloud import datastore
>>> client = datastore.Client()
>>> key = client.key('EntityKind', 1234)
>>> client.delete(key)
>>> client.get(key) is None
True
>>> entity = datastore.Entity(key, exclude_from_indexes=('foo',))
>>> entity['foo'] = []
>>> entity
<Entity('EntityKind', 1234) {'foo': []}>
>>> entity.exclude_from_indexes
{'foo'}
>>> client.put(entity)
>>>
>>> retrieved = client.get(key)
>>> retrieved
<Entity('EntityKind', 1234) {}>
>>> retrieved.exclude_from_indexes
set() |
|
Odd that two customers said it worked for them. It almost looks like the current behavior just ignores the empty list entirely rather than actually store an empty list in Datastore. |
|
@lukesneeringer I'm fairly certain they are using a different client to actually store the data / use our low-level gRPC client to send the empty list. I'll try to do that right now. |
|
@lukesneeringer It's unclear to me what >>> mutation
upsert {
key {
partition_id {
project_id: "PROJECT"
}
path {
kind: "EntityKind"
id: 1234
}
}
properties {
key: "foo"
value {
array_value {
values {
string_value: "a"
exclude_from_indexes: true
}
}
}
}
} |
|
@dhermes That might be why the fix is correct then? The problem is that it was raising an exception on an empty list (because programmatically the users were unsure whether the list was empty or not). |
|
@alercunha @MeteKem Can you elucidate this for us? |
|
@lukesneeringer Gotcha. Have you run your snippet against |
|
@lukesneeringer @dhermes In my case the issue manifested when retrieving empty array values from DS which was set by another client (Java in my case). The actual value stored was an empty array. |
|
@lukesneeringer @dhermes @alercunha Same here. In my case the other client was a node client. |
| array_pb = array_val_pb.array_value.values | ||
|
|
||
| unindexed_value_pb1 = array_pb.add() | ||
| unindexed_value_pb1.integer_value = 10 |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
If I create an entity using this library and set a certain property as an empty array there is no value stored (it's like the property was never added). However in other client libraries that's not always the case. So for example if I create data doing this: From the console it will look like this: The first entity has no property children defined. However if I edit the entity manually and set a property with an empty array like this: Then back to the list of entities now the first entity does have a children property with an empty value in it. Now if I query the data I get the error mentioned earlier. This bug seems to be caused by other libraries saving data in a way that this library wouldn't do. So I'm not sure within this library context it will be possible to reproduce the problem in a unit test. However it should still be considered a bug since the Datastore accepts it when coming from other libraries as a valid value. |
|
@lukesneeringer you're right, the current code ignores a key-pair value if the value is an empty array: https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/datastore/google/cloud/datastore/helpers.py#L214 I tried removing this behavior but then I get a 400 error from the google API call. |
If it's possible to do at all, we can still do it with raw protocol buffers. |
|
Any updates? I'm still struggling with my development cycle because of this. |
|
@MeteKem Can you help us with the reproduction of the issue? For an offending entity, can you retrieve it with the low-level API? (I'll produce a snippet shortly.) |
>>> from google.cloud.proto.datastore.v1 import datastore_pb2
>>> from google.cloud import datastore
>>>
>>> client = datastore.Client()
>>> key = client.key('Foo', 'Bar')
>>>
>>> key_pb = key.to_protobuf()
>>> read_options = datastore_pb2.ReadOptions()
>>> gapic_client = client._datastore_api
>>> lookup_response = gapic_client.lookup(
... client.project, read_options, [key_pb])
>>>
>>> lookup_response
missing {
entity {
key {
partition_id {
project_id: "${PROJECT}"
}
path {
kind: "Foo"
name: "Bar"
}
}
}
version: 1
}
>>> |
|
@dhermes To reproduce the issue:
I did confirm that the low-level API can retrieve the offending entity. It was clear to me that the entity can be retrieved with the low-level API anyways. It's just the conversion of |
Can you provide a JS snippet?
Sorry I didn't communicate clearly. I was interested in you doing that so we could use the returned protobuf to reproduce this error, not as validation that the API could store / send the value. In that vein, can you share the returned protobuf from the low-level call? |
|
@dhermes It looks something like this. I had to strip it down as the original entity includes proprietary data that I cannot publicly share. |
|
Great, so the offenders seem to be: |
|
Just hit this as well, hoping for fix soon. In my case it's also due to them being created by a node.js client, and me trying to query in python. |
|
@jgrabenstein Definitely hoping to get a fix soon. As you can see, this one has been quite a struggle. :-) |
|
@lukesneeringer I'm working a repro right now. |
|
Setting up an environment: and running we have In [1]: from google.cloud import datastore
In [2]: from google.cloud.proto.datastore.v1 import datastore_pb2
In [3]: from google.cloud.proto.datastore.v1 import entity_pb2
In [4]: client = datastore.Client()
In [5]: key = client.key('Foo', 'Bar')
In [6]: entity = datastore.Entity(key=key)
In [7]: batch = client.batch()
In [8]: batch.begin()
In [9]: batch.put(entity)
In [10]: mutations = batch.mutations
In [11]: len(mutations)
Out[11]: 1
In [12]: mutation = mutations[0]
In [13]: mutation
Out[13]:
upsert {
key {
partition_id {
project_id: "prahj-ekt"
}
path {
kind: "Foo"
name: "Bar"
}
}
}
In [14]: value_pb = mutation.upsert.properties.get_or_create('name-name')
In [15]: value_pb.array_value.CopyFrom(entity_pb2.ArrayValue(values=[]))
In [16]: mutation
Out[16]:
upsert {
key {
partition_id {
project_id: "prahj-ekt"
}
path {
kind: "Foo"
name: "Bar"
}
}
properties {
key: "name-name"
value {
array_value {
}
}
}
}
In [17]: gapic_api = client._datastore_api
In [18]: gapic_api
Out[18]: <google.cloud.datastore._gax.GAPICDatastoreAPI at 0x7f56310b9588>
In [19]: mode = datastore_pb2.CommitRequest.NON_TRANSACTIONAL
In [20]: commit_response_pb = gapic_api.commit(
...: batch.project, mode, mutations, transaction=None)
...:
In [21]: commit_response_pb
Out[21]:
mutation_results {
version: 1503505990610000
}
index_updates: 1
In [22]: key_pbs = [key.to_protobuf()]
In [23]: read_options = datastore_pb2.ReadOptions()
In [24]: lookup_response = gapic_api.lookup(
...: client.project, read_options, key_pbs)
...:
In [25]: lookup_response
Out[25]:
found {
entity {
key {
partition_id {
project_id: "prahj-ekt"
}
path {
kind: "Foo"
name: "Bar"
}
}
properties {
key: "name-name"
value {
array_value {
}
}
}
}
version: 1503505990610000
}
In [26]: %xmode plain
Exception reporting mode: Plain
In [27]: client.get(key)
Traceback (most recent call last):
File "<ipython-input-27-0924c86e3e51>", line 1, in <module>
client.get(key)
File ".../venv/lib/python3.6/site-packages/google/cloud/datastore/client.py", line 309, in get
deferred=deferred, transaction=transaction)
File ".../venv/lib/python3.6/site-packages/google/cloud/datastore/client.py", line 370, in get_multi
for entity_pb in entity_pbs]
File ".../venv/lib/python3.6/site-packages/google/cloud/datastore/client.py", line 370, in <listcomp>
for entity_pb in entity_pbs]
File ".../venv/lib/python3.6/site-packages/google/cloud/datastore/helpers.py", line 140, in entity_from_protobuf
raise ValueError('For an array_value, subvalues must either '
ValueError: For an array_value, subvalues must either all be indexed or all excluded from indexes.
In [28]: |
|
@lukesneeringer You can add the above as a system test for the explicit purpose of guarding against regression? Running it on this branch: In [27]: client.get(key)
Out[27]: <Entity('Foo', 'Bar') {'name-name': []}>
In [28]: client.delete(key)
In [29]: |
|
bump. |
|
@dhermes Any updates? |
|
Would be nice to see some movement on this |
|
@dhermes @lukesneeringer Any updates? PS: This change fixed the issue for me as well. Thanks @lukesneeringer :-) |
| array_pb = array_val_pb.array_value.values | ||
|
|
||
| unindexed_value_pb1 = array_pb.add() | ||
| unindexed_value_pb1.integer_value = 10 |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
Hi, Thanks |
|
I ran into this. In my case, Ruby and Java clients set array empty. |
|
Superseded by #4915. |



Fixes #3152