Merged
27 commits
ea9559c
add header for old and new
lucyking140 Sep 20, 2025
a468864
adding a passed-in param to account for calls made from MCP, etc.
lucyking140 Sep 20, 2025
d985ef4
adding a passed-in param to account for calls made from MCP, etc.
lucyking140 Sep 20, 2025
de9cd61
adding flag to other fetch calls
lucyking140 Sep 20, 2025
af0d321
adding basic validation:
lucyking140 Sep 20, 2025
1cdd6c3
updating test suite to account for new metadata tag
lucyking140 Sep 22, 2025
99a1323
removing old client changes because it uses V1
lucyking140 Sep 22, 2025
0155738
cleaning up docs
lucyking140 Sep 22, 2025
0ded695
removing -new because we only have one endpoint here
lucyking140 Sep 22, 2025
2654012
removing -new because we only have one endpoint here
lucyking140 Sep 22, 2025
a8a22e5
removing -new because we only have one endpoint here
lucyking140 Sep 22, 2025
2852900
updating run_test to temporarily change click versions
lucyking140 Sep 22, 2025
12d08e0
adding support for mcp-version
lucyking140 Sep 22, 2025
1015c3c
adding -y to uninstall click command to make it work without input
lucyking140 Sep 22, 2025
5c05728
correcting tests
lucyking140 Sep 22, 2025
c072f33
converting to a param passed in on init
lucyking140 Sep 22, 2025
4657b00
pushing to see if i have build privileges
lucyking140 Sep 22, 2025
fad438e
cleaned up tests to match new format
lucyking140 Sep 22, 2025
c19bcec
converting to an attribute of the DCC
lucyking140 Sep 23, 2025
dc45042
correcting typos
lucyking140 Sep 23, 2025
81b6a23
cleaning up docs
lucyking140 Sep 23, 2025
76b573d
adding tests to make sure the surface header is propogated correctly
lucyking140 Sep 23, 2025
b3b21b2
making api key an explicit param in build-headers
lucyking140 Sep 23, 2025
08261ea
fixing spacing
lucyking140 Sep 23, 2025
1874732
linting
lucyking140 Oct 2, 2025
6d12ff5
undoing notebook changes:
lucyking140 Oct 2, 2025
235ec86
Merge branch 'master' of https://github.com/datacommonsorg/api-python…
lucyking140 Oct 2, 2025
84 changes: 47 additions & 37 deletions CHANGELOG.md
@@ -1,5 +1,17 @@
# Changelog

## 2.1.2

**Date** - 09/23/2025

**Release Tag** - [py2.1.2](https://github.com/datacommonsorg/api-python/releases/tag/py2.1.2)

**Release Status** - Current head of branch [`master`](https://github.com/datacommonsorg/api-python/tree/master)

This update adds an optional `surface_header_value` parameter to the Data Commons Client that is used internally by Data Commons
to track usage across its platforms. Other Data Commons services make calls to the Python client and pass in this parameter, but it
is not intended for public use and does not affect the behavior of the client.

## 2.1.1

**Date** - 06/10/2025
@@ -19,17 +31,20 @@ This is an under-the-hood update to use `pydantic` for data models.
**Release Status** - Current head of branch [`master`](https://github.com/datacommonsorg/api-python/tree/master)

Bugs fixed:

- Remove auto-flattening for unpack_arcs
- Fix unpack_arcs when multiple arcs are in the node response

Other improvements:

- Clarify parent_entity requirements for observations_dataframe
- Updated some tutorials and notebooks to use the v2 client
- Fix install command and refine documentation
- Handle empty/malformed REST API node responses
- Make renamed methods backwards compatible

New features:

- Add helpers to extract data from NodeResponse arcs
- Add convenient ways to fetch parents and children of given entities

@@ -65,7 +80,7 @@ Bugs fixed in new release

New features added to the Python API

- Added batching to `get_stat_all` to handle querying for many StatisticalVariables across many Places.
- Added batching to `get_stat_all` to handle querying for many StatisticalVariables across many Places.

## 1.4.1

@@ -77,9 +92,9 @@ New features added to the Python API

New features added to the Python API

- `get_stat_value`: returns a single value for the specified Place and StatisticalVariable.
- `get_stat_series`: returns a single time series dict for the specified Place and StatisticalVariable.
- `get_stat_all`: returns a nested dictionary of all possible time series for each Place and StatisticalVariable pair.
- `get_stat_value`: returns a single value for the specified Place and StatisticalVariable.
- `get_stat_series`: returns a single time series dict for the specified Place and StatisticalVariable.
- `get_stat_all`: returns a nested dictionary of all possible time series for each Place and StatisticalVariable pair.

## 1.3.0

@@ -91,12 +106,12 @@ New features added to the Python API

New features added to the Python API

- Option to use the API without providing API key.
- New options to `get_stats`: `measurement_method`, `unit`, and `obs_period` for finer-grain control over returned statistics.
- Option to use the API without providing API key.
- New options to `get_stats`: `measurement_method`, `unit`, and `obs_period` for finer-grain control over returned statistics.

Bugs fixed in new release

- Elegantly handle sparse responses from `query`.
- Elegantly handle sparse responses from `query`.

## 1.2.0

@@ -108,12 +123,11 @@ Bugs fixed in new release

New features added to the Python API

- Add get_stats API to get observations given a StatisticalVariable and place dcids.
- Add get_stats API to get observations given a StatisticalVariable and place dcids.

Bugs fixed in new release

- Check Null and empty data in REST API response field to avoid KeyError.

- Check Null and empty data in REST API response field to avoid KeyError.

## 1.1.0

@@ -125,11 +139,11 @@ Bugs fixed in new release

New features added to the Python API

- Handle and ignore NaN in API argument.
- Handle and ignore NaN in API argument.

Bugs fixed in new release

- Various small fix.
- Various small fix.

## 1.0.9

@@ -141,7 +155,7 @@ Bugs fixed in new release

New features added to the Python API

- Use six package for urllib.
- Use six package for urllib.

## 1.0.7

@@ -153,7 +167,7 @@ New features added to the Python API

New features added to the Python API

- Support python 2.7.
- Support python 2.7.

## 1.0.6

@@ -165,9 +179,7 @@ New features added to the Python API

New features added to the Python API

- Add a new API for getting related places.


- Add a new API for getting related places.

## 1.0.5

@@ -179,9 +191,8 @@ New features added to the Python API

New features added to the Python API

- Remove the dependency on Pandas and Numpy in package dependency.
- Replace requests with urllib.

- Remove the dependency on Pandas and Numpy in package dependency.
- Replace requests with urllib.

## 1.0.2

@@ -193,8 +204,7 @@ New features added to the Python API

New features added to the Python API

- Remove the dependency on Pandas.

- Remove the dependency on Pandas.

## 1.0.1

@@ -206,14 +216,14 @@ New features added to the Python API

New features added to the Python API

- Added two new functions `get_pop_obs` and `get_place_obs`
- SPARQL query is now supported as a function `query` instead of a class.
- Added documentation on how to provision an API key and provide it to the API
- Added two new functions `get_pop_obs` and `get_place_obs`
- SPARQL query is now supported as a function `query` instead of a class.
- Added documentation on how to provision an API key and provide it to the API

Bugs fixed in new release

- Fixed various typos and formatting issues in the documentation.
- If the index of the `pandas.Series` passed into functions such as `get_populations` and `get_observations` was not contiguous, then the assignment step would not properly align the values returned by calling the function. This is because the `pandas.Series` returned by the function would have a different index than the given series. This is fixed by assigning the index of the returned series to that of the given series.
- Fixed various typos and formatting issues in the documentation.
- If the index of the `pandas.Series` passed into functions such as `get_populations` and `get_observations` was not contiguous, then the assignment step would not properly align the values returned by calling the function. This is because the `pandas.Series` returned by the function would have a different index than the given series. This is fixed by assigning the index of the returned series to that of the given series.

## 1.0.0

@@ -223,15 +233,15 @@ Bugs fixed in new release

New release of the Python API.

- New functions in the API built on top of the [Data Commons REST API](https://github.com/datacommonsorg/mixer).
- `get_property_labels`
- `get_property_values`
- `get_triples`
- `get_populations`
- `get_observations`
- `get_places_in`
- New tests and examples checked into `datacommons/test` and `datacommons/examples`
- Full documentation released on [readthedocs](https://datacommons.readthedocs.io/en/latest/)
- New functions in the API built on top of the [Data Commons REST API](https://github.com/datacommonsorg/mixer).
- `get_property_labels`
- `get_property_values`
- `get_triples`
- `get_populations`
- `get_observations`
- `get_places_in`
- New tests and examples checked into `datacommons/test` and `datacommons/examples`
- Full documentation released on [readthedocs](https://datacommons.readthedocs.io/en/latest/)

## 0.4.3

@@ -243,7 +253,7 @@ New release of the Python API.

Patch release that fixes bugs in `datacommons.Client`.

- Functions `get_cities` and `get_states` now provides `typeOf` constraints in their datalog queries.
- Functions `get_cities` and `get_states` now provides `typeOf` constraints in their datalog queries.

## 0.x

Expand Down
2 changes: 1 addition & 1 deletion datacommons_client/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "2.1.1"
+__version__ = "2.1.2rc1"
"""
Data Commons Client Package

Expand Down
18 changes: 10 additions & 8 deletions datacommons_client/client.py
@@ -32,13 +32,12 @@ class DataCommonsClient:

   """

-  def __init__(
-      self,
-      api_key: Optional[str] = None,
-      *,
-      dc_instance: Optional[str] = "datacommons.org",
-      url: Optional[str] = None,
-  ):
+  def __init__(self,
+               api_key: Optional[str] = None,
+               *,
+               dc_instance: Optional[str] = "datacommons.org",
+               url: Optional[str] = None,
+               surface_header_value: Optional[str] = None):
     """
     Initializes the DataCommonsClient.

@@ -54,7 +53,10 @@ def __init__(
       dc_instance = None

     # Create an instance of the API class which will be injected to the endpoints
-    self.api = API(api_key=api_key, dc_instance=dc_instance, url=url)
+    self.api = API(api_key=api_key,
+                   dc_instance=dc_instance,
+                   url=url,
+                   surface_header_value=surface_header_value)

     # Create instances of the endpoints
     self.node = NodeEndpoint(api=self.api)
Expand Down
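The `client.py` change above only threads `surface_header_value` from the client constructor into the `API` object, which owns the request headers. The following is a minimal, self-contained sketch of that hand-off. `SketchAPI` and `SketchClient` are hypothetical stand-ins, not the library's classes, and the tag `"mcp-1.0"` is an assumed example value; the sketch also skips the validation step that the real `API.__init__` performs.

```python
from typing import Dict, Optional


class SketchAPI:
    """Hypothetical stand-in for the library's API class."""

    def __init__(self,
                 api_key: Optional[str] = None,
                 surface_header_value: Optional[str] = None):
        # "clientlib-python" is the default tag shown in the diff;
        # a caller-supplied surface value replaces it.
        self.headers: Dict[str, str] = {
            "Content-Type": "application/json",
            "x-surface": surface_header_value or "clientlib-python",
        }
        if api_key:
            self.headers["X-API-Key"] = api_key


class SketchClient:
    """Hypothetical stand-in for DataCommonsClient."""

    def __init__(self,
                 api_key: Optional[str] = None,
                 *,
                 surface_header_value: Optional[str] = None):
        # The client only forwards the value; it does not inspect it.
        self.api = SketchAPI(api_key=api_key,
                             surface_header_value=surface_header_value)


plain = SketchClient()
tagged = SketchClient(api_key="example-key", surface_header_value="mcp-1.0")
```

In this sketch, `plain.api.headers` carries the default surface tag and no API key, while `tagged.api.headers` carries the supplied tag and key. Keeping the parameter keyword-only mirrors the signature in the diff.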
61 changes: 50 additions & 11 deletions datacommons_client/endpoints/base.py
@@ -1,6 +1,8 @@
+import re
 from typing import Any, Dict, Optional

-from datacommons_client.utils.request_handling import build_headers
+from datacommons_client.utils.error_handling import InvalidSurfaceHeaderValueError
+from datacommons_client.utils.error_handling import VALID_SURFACE_HEADER_VALUES
 from datacommons_client.utils.request_handling import check_instance_is_valid
 from datacommons_client.utils.request_handling import post_request
 from datacommons_client.utils.request_handling import resolve_instance_url
@@ -19,6 +21,7 @@ def __init__(
       api_key: Optional[str] = None,
       dc_instance: Optional[str] = None,
       url: Optional[str] = None,
+      surface_header_value: Optional[str] = None,
   ):
     """
     Initializes the API instance.
@@ -30,6 +33,8 @@ def __init__(
       url: A fully qualified URL for the base API. This may be useful if more granular control
         of the API is required (for local development, for example). If provided, dc_instance`
         should not be provided.
+      surface_header_value: indicates which DC surface (MCP server, etc.) makes a call to the python library.
+        If the call originated internally, this is null and we pass in "clientlib-python" as the surface header

     Raises:
       ValueError: If both `dc_instance` and `url` are provided.
@@ -40,15 +45,25 @@ def __init__(
     if not dc_instance and not url:
       dc_instance = "datacommons.org"

-    self.headers = build_headers(api_key)
-
     if url is not None:
       # Use the given URL directly (strip trailing slash)
       self.base_url = check_instance_is_valid(url.rstrip("/"))
     else:
       # Resolve from dc_instance
       self.base_url = resolve_instance_url(dc_instance)

+    # if this call originates from another DC product (MCP server, DataGemma, etc.), we indicate that to Mixer
+    # otherwise, the 'x-surface' header is 'clientlib-python'
+    if surface_header_value:
+      # use patterns to support tags like mcp-{VERSION}
+      if not any(
+          re.fullmatch(pattern, surface_header_value)
+          for pattern in VALID_SURFACE_HEADER_VALUES):
+        raise InvalidSurfaceHeaderValueError
+
+    self.headers = self.build_headers(surface_header_value=surface_header_value,
+                                      api_key=api_key)
+
   def __repr__(self) -> str:
     """Returns a readable representation of the API object.

@@ -60,14 +75,12 @@ def __repr__(self) -> str:
     has_auth = " (Authenticated)" if "X-API-Key" in self.headers else ""
     return f"<API at {self.base_url}{has_auth}>"

-  def post(
-      self,
-      payload: dict[str, Any],
-      endpoint: Optional[str] = None,
-      *,
-      all_pages: bool = True,
-      next_token: Optional[str] = None,
-  ) -> Dict[str, Any]:
+  def post(self,
+           payload: dict[str, Any],
+           endpoint: Optional[str] = None,
+           *,
+           all_pages: bool = True,
+           next_token: Optional[str] = None) -> Dict[str, Any]:
     """Makes a POST request using the configured API environment.

     If `endpoint` is provided, it will be appended to the base_url. Otherwise,
@@ -91,12 +104,38 @@ def post(
       raise ValueError("Payload must be a dictionary.")

     url = (self.base_url if endpoint is None else f"{self.base_url}/{endpoint}")
+
     return post_request(url=url,
                         payload=payload,
                         headers=self.headers,
                         all_pages=all_pages,
                         next_token=next_token)

+  def build_headers(self,
+                    surface_header_value: Optional[str],
+                    api_key: Optional[str] = None) -> dict[str, str]:
+    """Build request headers for API requests.
+
+    Includes JSON content type. If an API key is provided, add it as `X-API-Key`.
+
+    Args:
+      self: the API, which includes API key and surface header if available
+
+    Returns:
+      A dictionary of headers for the request.
+    """
+    headers = {
+        "Content-Type": "application/json",
+        "x-surface": "clientlib-python"
+    }
+    if api_key:
+      headers["X-API-Key"] = api_key
+
+    if surface_header_value:
+      headers["x-surface"] = surface_header_value
+
+    return headers
+


 class Endpoint:
   """Represents a specific endpoint within the Data Commons API.
Expand Down
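The validation added to `API.__init__` in `base.py` accepts a surface value only if it matches one of the allowed patterns in full, which is what lets versioned tags such as `mcp-{VERSION}` pass. The sketch below illustrates that check in isolation. The contents of `VALID_SURFACE_HEADER_VALUES` are not shown in this diff, so the patterns in `EXAMPLE_VALID_PATTERNS` are assumptions, and `ExampleInvalidSurfaceError` and `validate_surface` are hypothetical names standing in for the library's error class and inline check.

```python
import re
from typing import List, Optional

# Assumed patterns for illustration only; the library's real
# VALID_SURFACE_HEADER_VALUES list is not visible in this diff.
EXAMPLE_VALID_PATTERNS: List[str] = [r"mcp-\d+(\.\d+)*", r"datagemma"]


class ExampleInvalidSurfaceError(ValueError):
    """Hypothetical stand-in for InvalidSurfaceHeaderValueError."""


def validate_surface(value: Optional[str]) -> Optional[str]:
    """Return the value unchanged if it is empty or fully matches a pattern."""
    if not value:
        # No surface supplied: the default tag is used later, nothing to check.
        return value
    # fullmatch (unlike match or search) rejects values with extra
    # leading or trailing characters around an otherwise valid tag.
    if not any(re.fullmatch(p, value) for p in EXAMPLE_VALID_PATTERNS):
        raise ExampleInvalidSurfaceError(
            f"Unrecognized surface header value: {value!r}")
    return value
```

Under these assumed patterns, `validate_surface("mcp-1.2.3")` returns the value, while `validate_surface("mcp-1.2.3-extra")` raises, because `re.fullmatch` requires the whole string to match.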