-
Notifications
You must be signed in to change notification settings - Fork 46
Add get_pop_obs api #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
antaresc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing this! I added a number of comments about documentation and placement of code.
Can you also update the .rst files in /docs/ and make sure the docstrings render locally in the sphinx build?
datacommons/places.py
Outdated
| return result | ||
|
|
||
|
|
||
| def get_pop_obs(dcid): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move this to the Populations module? The language of get_pop_obs documentation matches more get_populations and get_observations, than it does get_places_in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Later on will add get_place_obs, get_place_kml, similar to this one, can be thought of "properties" or "nodes" about a place. So in the places module, a user can find all apis to get information about a place. Also get_pop_obs can return observation directly linked to a place (I know the function name is not accurate then), where no population is present at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay that's fine. This raises a couple questions for me actually.
- I think the distinction between get_place_obs and get_pop_obs is a bit confusing now. I would have assumed that observations linked directly to place would be returned by get_place_obs.
- Also how is the structure of the return value if the observation is linked directly to the place dcid (in particular, how do you differentiate Observations linked by "location" vs. "childhoodLocation"?
- Is it possible to separate get_pop_obs into two functions with one returning population observation pairs, and the other returning observations linked directly to the Place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also wondering if it all needs to be returned in one call. As is, the call seems very browser specific. Also, are these calls place specific? If I understand correctly, we could have statistical population on non-place nodes.
Would we run into limit issues with this call? Or is everything going to be in the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay that's fine. This raises a couple questions for me actually.
- I think the distinction between get_place_obs and get_pop_obs is a bit confusing now. I would have assumed that observations linked directly to place would be returned by get_place_obs.
They do have things in common, kinda like get_triples and get_property_values. get_place_obs get observations for a particular population spec, while get_pop_obs get everything. We need to inform users from the comments.
- Also how is the structure of the return value if the observation is linked directly to the place dcid (in particular, how do you differentiate Observations linked by "location" vs. "childhoodLocation"?
There is an "observations" field to contain all those. Currently there is no way to tell that, this is a bug.
- Is it possible to separate get_pop_obs into two functions with one returning population observation pairs, and the other returning observations linked directly to the Place?
Might be easier to put them in one, this is like get_triples to get everything. get_place_obs is like get_prop_values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only from cache, so limit is not an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_place_obs gives observation of places, with the population specification fixed.
get_pop_obs actually or (ultimately) get observations for a place, i would see the population here is a by-product. This is like get city in USA, the county is just a by-product.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this could return non-place results, should this move to the populations module? Unless the mixer currently only returns place results. Seeing the layout of functions across modules now, I think this belongs with populations.. sorry if I misunderstood your comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, moved to populations module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
datacommons/places.py
Outdated
| 'marginOfError': 119, | ||
| 'measuredProp': 'count', | ||
| 'measuredValue': 225, | ||
| 'measurementMethod': 'CenusACS5yrSurvey', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we choose BLS as the example population? We should try to avoid any observation that has measurementMethod "Cenus"ACS5YrSurvey until that's fixed in the graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Good point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah it's back to CenusACS5yrSurvey?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
um, I made it Census now..
datacommons/examples/places.py
Outdated
| pd_frame = dc.flatten_frame(pd_frame) | ||
| print(pd_frame) | ||
|
|
||
| # Get all population and observation data of Mountain View. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move to datacommons/examples/populations.py. See comment above about moving get_pop_obs to populations submodule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See reply on the other comment.
shifucun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for comments on the docstring!
datacommons/places.py
Outdated
| return result | ||
|
|
||
|
|
||
| def get_pop_obs(dcid): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Later on will add get_place_obs, get_place_kml, similar to this one, can be thought of "properties" or "nodes" about a place. So in the places module, a user can find all apis to get information about a place. Also get_pop_obs can return observation directly linked to a place (I know the function name is not accurate then), where no population is present at all.
datacommons/examples/places.py
Outdated
| pd_frame = dc.flatten_frame(pd_frame) | ||
| print(pd_frame) | ||
|
|
||
| # Get all population and observation data of Mountain View. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See reply on the other comment.
datacommons/places.py
Outdated
| 'marginOfError': 119, | ||
| 'measuredProp': 'count', | ||
| 'measuredValue': 225, | ||
| 'measurementMethod': 'CenusACS5yrSurvey', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Good point!
antaresc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a bunch of nits, but otherwise looks good!
PTAL at my questions from a comment above before submitting.
datacommons/places.py
Outdated
| return result | ||
|
|
||
|
|
||
| def get_pop_obs(dcid): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also wondering if it all needs to be returned in one call. As is, the call seems very browser specific. Also, are these calls place specific? If I understand correctly, we could have statistical population on non-place nodes.
Would we run into limit issues with this call? Or is everything going to be in the cache.
datacommons/places.py
Outdated
| 'marginOfError': 119, | ||
| 'measuredProp': 'count', | ||
| 'measuredValue': 225, | ||
| 'measurementMethod': 'CenusACS5yrSurvey', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah it's back to CenusACS5yrSurvey?
* Fix typo in places.py. (#71) (#84) * Added iPynb notebooks and fixed issue #67 (#86) * Added notebooks * Added --upgrade flag * Update requirements and setup.py * Added maintenance information to the readme * Update setup.py and pypi long description (#87) * Update setup.py and requirements * Add long description for pypi package * Use README.md as the long description * Use single quote * Fixed docs and updated changelog (#89) * Amended documentation for creating a key. * Updated CHANGELOG * Spacing * Added 0.4.3 changes * Added 0.x to changelog * Add get_pop_obs api (#93) * Add get_pop_obs api * Updated comments with more detailed description. * Update comment and rst file * Updated comments * Move to populations module * Add missing change * Correct typo * More typo correction * Add comments on observations field * Add the step to join google group (#95) * Add more details for joining google group (#96) * Implemented get_place_obs and reimplemented query (#94) * Implemented get_place_obs * Reimplemented query as a function instead of a class. * query now returns a list. Amended docstrings. * Fixed docstring typo * Add observation_date as argument to get_place_obs (#97) * Add observation_date as argument to get_place_obs * Update doc string * Update test * fix link to getting started on read the docs home page (#98) * Fixed bug regarding indexing (#100) * Updated the changelog (#101)
No description provided.