Skip to content

Conversation

@azymnis
Copy link
Contributor

@azymnis azymnis commented Jul 12, 2017

Problem: I need to compare a new schema with an existing table schema. Easiest way would be to use set comparison operators. However this is not possible with SchemaField because there is no implementation of __hash__.

Solution: Add a simple implementation of __hash__ using tuple keys. This is not the most efficient implementation of a hashcode but should probably be ok for most purposes.

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If your company signed a CLA, they designated a Point of Contact who decides which employees are authorized to participate. You may need to contact the Point of Contact for your company and ask to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the project maintainer to go/cla#troubleshoot.
  • In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again.

@googlebot googlebot added the cla: no This human has *not* signed the Contributor License Agreement. label Jul 12, 2017
@azymnis
Copy link
Contributor Author

azymnis commented Jul 12, 2017

I signed!

@googlebot
Copy link

CLAs look good, thanks!

@googlebot googlebot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Jul 12, 2017
@dhermes
Copy link
Contributor

dhermes commented Jul 12, 2017

I was a bit confused by what is needed, but this example clarifies for me:

>>> class A(object):
...     def __init__(self, val):
...         self.val = val
...     def __repr__(self):
...         return 'A({!r})'.format(self.val)
...     def __eq__(self, other):
...         if not isinstance(other, A):
...             return NotImplemented
...         return other.val == self.val
...     def __ne__(self, other):
...         if not isinstance(other, A):
...             return NotImplemented
...         return other.val != self.val
...
>>> a1 = A(10)
>>> a2 = A(11)
>>> a3 = A(11)
>>> a1 == a2
False
>>> a1 == a3
False
>>> a2 == a3
True
>>> set([a1, a2, a3])
set([A(10), A(11), A(11)])

self.fields = fields

def __eq__(self, other):
def __key(self):

This comment was marked as spam.

This comment was marked as spam.

self.field_type.lower(),
self.mode,
self.description,
self.fields)

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@dhermes dhermes added the api: bigquery Issues related to the BigQuery API. label Jul 12, 2017
@azymnis
Copy link
Contributor Author

azymnis commented Jul 13, 2017

@dhermes @tseaver what do you think of this?

self.mode = mode
self.description = description
self.fields = fields
self.fields = None if(fields is None) else tuple(fields)

This comment was marked as spam.

This comment was marked as spam.

@azymnis
Copy link
Contributor Author

azymnis commented Jul 17, 2017

@dhermes @tseaver i modified the default arg for fields to be (). The question is what to do with how this resource is represented. Should I use () there too? This is what I did, so if you would rather have me use [] I can change that.

FYI: this broke a few tests and I'm wondering if it will break any of the actual api calls.

info['description'] = field.description
if field.fields is not None:
info['fields'] = _build_schema_resource(field.fields)
info['fields'] = tuple(_build_schema_resource(field.fields))

This comment was marked as spam.

"""
if 'fields' not in info:
return None
return ()

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

def __hash__(self):
return hash(self._key())

def __repr__(self):

This comment was marked as spam.

This comment was marked as spam.

@azymnis
Copy link
Contributor Author

azymnis commented Jul 17, 2017

Hmm looks like I broke some tests in circleci, let me go fix that

@dhermes dhermes force-pushed the azymnis/schema_hashcode branch from 5dc1f46 to 66a3d66 Compare July 17, 2017 17:11
@googlebot
Copy link

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that they're okay with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

@googlebot googlebot added cla: no This human has *not* signed the Contributor License Agreement. and removed cla: yes This human has signed the Contributor License Agreement. labels Jul 17, 2017
@dhermes
Copy link
Contributor

dhermes commented Jul 17, 2017

@azymnis We (@jonparrott, @lukesneeringer and I) just had a discussion about this and determined SchemaField should be immutable in order to be hashable.

Rather than put the burden on you, I have sent a commit into this PR to make the class immutable.

Copy link
Contributor

@lukesneeringer lukesneeringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One style nitpick.

'schema': {'fields': [
{'name': 'full_name', 'type': 'STRING', 'mode': 'REQUIRED'},
{'name': 'age', 'type': 'INTEGER', 'mode': 'REQUIRED'},
{'name':

This comment was marked as spam.

self._fields,
)

def __eq__(self, other):

This comment was marked as spam.

This comment was marked as spam.

@dhermes dhermes force-pushed the azymnis/schema_hashcode branch from 66a3d66 to 8882df3 Compare July 17, 2017 17:33
@azymnis
Copy link
Contributor Author

azymnis commented Jul 17, 2017

@dhermes makes sense, thanks for the update

@dhermes dhermes merged commit 59fd1e4 into googleapis:master Jul 17, 2017
@azymnis azymnis deleted the azymnis/schema_hashcode branch July 17, 2017 18:53
@azymnis
Copy link
Contributor Author

azymnis commented Jul 17, 2017

Thank you all!

landrito pushed a commit to landrito/google-cloud-python that referenced this pull request Aug 21, 2017
* Add a __hash__ implementation to SchemaField
* Modify default list of subfields to be the empty tuple
* Making SchemaField immutable.
* Adding SchemaField.__ne__.
landrito pushed a commit to landrito/google-cloud-python that referenced this pull request Aug 22, 2017
* Add a __hash__ implementation to SchemaField
* Modify default list of subfields to be the empty tuple
* Making SchemaField immutable.
* Adding SchemaField.__ne__.
landrito pushed a commit to landrito/google-cloud-python that referenced this pull request Aug 22, 2017
* Add a __hash__ implementation to SchemaField
* Modify default list of subfields to be the empty tuple
* Making SchemaField immutable.
* Adding SchemaField.__ne__.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: bigquery Issues related to the BigQuery API. cla: no This human has *not* signed the Contributor License Agreement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants