BigQuery: Add timeout parameter to to_dataframe() #7612

@alixhami

Description

Opening this feature request for discussion. There is currently no way to set a timeout on to_dataframe(), so the call can keep running and eventually stall when a query returns results that are too large for a pandas DataFrame.

A timeout given to QueryJob.to_dataframe should probably pass this timeout to the result() function, and use the remaining time to construct the DataFrame.
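The "remaining time" idea can be sketched independently of BigQuery. The helper below is a minimal illustration, not the library's implementation: call_with_shared_timeout, wait_for_rows, and build_dataframe are hypothetical names, and it assumes both steps accept a timeout keyword. It fixes a single deadline up front, gives the first step the full budget, and passes whatever is left to the second step.

```python
import time


def call_with_shared_timeout(wait_for_rows, build_dataframe, timeout=None):
    """Run two steps under one overall timeout (hypothetical helper).

    wait_for_rows(timeout=...) stands in for waiting on the query result;
    build_dataframe(rows, timeout=...) stands in for DataFrame construction.
    """
    if timeout is None:
        # No limit requested: neither step gets a timeout.
        return build_dataframe(wait_for_rows(timeout=None), timeout=None)

    # Fix the deadline once so both steps draw from the same budget.
    deadline = time.monotonic() + timeout
    rows = wait_for_rows(timeout=timeout)

    # Whatever the first step did not use is left for the second step.
    remaining = deadline - time.monotonic()
    if remaining <= 0:
        raise TimeoutError("timeout exhausted before DataFrame construction")
    return build_dataframe(rows, timeout=remaining)
```

Under this sketch, a QueryJob-level to_dataframe(timeout=...) could plausibly pass the job's result() as the first step. Whether the DataFrame construction step can actually honor a timeout is an open question of this request.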

There is some ambiguity in how to handle RowIterator.to_dataframe because it does not call result(), so there are two separate timeouts that can be given:
client.query(sql).result(timeout=10).to_dataframe(timeout=10)

This would likely confuse users: a timeout given to to_dataframe() on a QueryJob would apply to both the query job and the DataFrame construction, while the same timeout given on a RowIterator would apply only to the DataFrame construction.

Metadata

Labels

api: bigquery (Issues related to the BigQuery API.)
type: feature request ('Nice-to-have' improvement, new feature or different behavior or design.)
